Coding a stdev() Function in Python
To calculate the standard deviation of a dataset, we’re going to rely on our
variance() function. We’re also going to use the
sqrt() function from the
math module of the Python standard library. Here’s a function called
stdev() that takes the data from a population and returns its standard deviation:
>>> import math

>>> # We rely on our previous implementation for the variance
>>> def variance(data, ddof=0):
...     n = len(data)
...     mean = sum(data) / n
...     return sum((x - mean) ** 2 for x in data) / (n - ddof)
...

>>> def stdev(data):
...     var = variance(data)
...     std_dev = math.sqrt(var)
...     return std_dev
...

>>> stdev([4, 8, 6, 5, 3, 2, 8, 9, 2, 5])
2.4
Our stdev() function takes some
data and returns the population standard deviation. To do that, we rely on our previous
variance() function to calculate the variance and then we use
math.sqrt() to take the square root of the variance.
If we want to use
stdev() to estimate the population standard deviation using a sample of data, then we just need to calculate the variance with n – 1 degrees of freedom as we saw before. Here’s a more generic
stdev() that allows us to pass in degrees of freedom as well:
>>> def stdev(data, ddof=0):
...     return math.sqrt(variance(data, ddof))
...

>>> stdev([4, 8, 6, 5, 3, 2, 8, 9, 2, 5])
2.4

>>> stdev([4, 8, 6, 5, 3, 2, 8, 9, 2, 5], ddof=1)
2.5298221281347035
With this new implementation, we can use
ddof=0 to calculate the standard deviation of a population, or we can use
ddof=1 to estimate the standard deviation of a population using a sample of data.
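As a quick cross-check, the two results differ only in the divisor, so the sample value should equal the population value multiplied by sqrt(n / (n - 1)). The following is a minimal sketch that restates variance() and stdev() from this section so that it runs on its own; the variable names population_sd, sample_sd, and rescaled are illustrative only.

```python
import math


def variance(data, ddof=0):
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - ddof)


def stdev(data, ddof=0):
    return math.sqrt(variance(data, ddof))


data = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]
n = len(data)

population_sd = stdev(data)        # divides by n
sample_sd = stdev(data, ddof=1)    # divides by n - 1

# Rescaling the population value by sqrt(n / (n - 1)) should give the sample value
rescaled = population_sd * math.sqrt(n / (n - 1))
print(math.isclose(sample_sd, rescaled))
```

Because n / (n - 1) is always greater than 1, the ddof=1 result is always the larger of the two, and the gap shrinks as the dataset grows.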
Using Python’s pstdev() and stdev()
Python's statistics module also provides functions to calculate the standard deviation. We can find
pstdev() and stdev(). The first function takes the data of an entire population and returns its standard deviation. The second function takes data from a sample and returns an estimate of the population standard deviation.
Here’s how these functions work:
>>> import statistics

>>> statistics.pstdev([4, 8, 6, 5, 3, 2, 8, 9, 2, 5])
2.4000000000000004

>>> statistics.stdev([4, 8, 6, 5, 3, 2, 8, 9, 2, 5])
2.5298221281347035
We first need to import the
statistics module. Then, we can call
statistics.pstdev() with data from a population to get its standard deviation.
If we don’t have the data for the entire population, which is a common scenario, then we can use a sample of data and use
statistics.stdev() to estimate the population standard deviation.
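To make the sampling scenario concrete, here is a small sketch under made-up assumptions: a hypothetical population of 1,000 values generated with random.gauss(), and a sample of 50 values drawn from it with a fixed seed. None of these numbers come from a real dataset.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 values centered on 50 with a spread of about 10
population = [random.gauss(50, 10) for _ in range(1000)]

# In practice we often only see a small sample of the population
sample = random.sample(population, 50)

true_sd = statistics.pstdev(population)  # uses every value in the population
estimate = statistics.stdev(sample)      # estimates it from the sample alone

print(f"population standard deviation: {true_sd:.2f}")
print(f"estimate from 50 values:       {estimate:.2f}")
```

The estimate will generally not match the population value exactly, but it tends to get closer as the sample grows.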
Python's statistics module provides a function called stdev(), which calculates the standard deviation. stdev() treats its input as a sample of data rather than an entire population.
To calculate the standard deviation of an entire population, use pstdev() instead.
Standard deviation is a measure of spread in statistics. It quantifies how much the values in a dataset vary. It is closely related to the variance: the standard deviation measures the typical deviation from the mean, whereas the variance measures the squared deviation.
A low standard deviation indicates that the data are clustered close to the mean, whereas a high standard deviation shows that the data are spread far from the mean. A useful property of the standard deviation is that, unlike the variance, it is expressed in the same units as the data.
Standard deviation is calculated by:

$$s = \sqrt{\frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$

where $x_1, x_2, x_3, \ldots, x_N$ are the observed values in the sample data, $\bar{x}$ is the mean value of the observations, and $N$ is the number of sample observations.
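The sample standard deviation, which divides the sum of squared deviations from the mean by N - 1 before taking the square root, can be checked against the library with a short sketch. The helper name sample_stdev and the observation values are hypothetical, chosen only for illustration.

```python
import math
import statistics


def sample_stdev(values):
    """Hand-written sample standard deviation (hypothetical helper)."""
    n = len(values)
    x_bar = sum(values) / n                                 # mean of the observations
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))          # divide by N - 1


observations = [2.5, 3.1, 4.0, 4.4, 5.2]  # made-up data

by_hand = sample_stdev(observations)
by_library = statistics.stdev(observations)
print(math.isclose(by_hand, by_library))
```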
Syntax : stdev( [data-set], xbar )
[data-set] : An iterable of real-valued numbers.
xbar (Optional) : The actual mean of the data-set. If it is omitted, the mean is calculated automatically.
Return type : Returns the sample standard deviation of the values passed as parameter.
StatisticsError is raised when the data-set contains fewer than 2 values.
The result is meaningless or imprecise when the value provided as xbar doesn't match the actual mean of the data-set.
Code #1 :
Standard Deviation of the sample is 1.5811388300841898
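A minimal sketch of a basic call follows. The sample values are an assumption, since the input is not shown: the integers 1 through 5 have a sample variance of exactly 2.5, whose square root agrees with the output above.

```python
import statistics

# Assumed sample: consistent with the output shown, but not taken from the source
sample = [1, 2, 3, 4, 5]

sd = statistics.stdev(sample)
print(f"Standard Deviation of the sample is {sd}")
```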
Code #2 : Demonstrate stdev() on a varying set of data types
The Standard Deviation of Sample1 is 3.9761191895520196
The Standard Deviation of Sample2 is 1.8708286933869707
The Standard Deviation of Sample3 is 7.8182478855559445
The Standard Deviation of Sample4 is 0.41967844833872525
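The following sketch shows stdev() applied to positive integers, negative integers, a mix of both, and floats. The four tuples are assumptions, chosen because they give values that agree with the output above; the last digits may differ slightly between Python versions.

```python
import statistics

# Assumed inputs covering different kinds of numeric data
sample1 = (1, 2, 5, 4, 8, 9, 12)           # positive integers
sample2 = (-2, -4, -3, -1, -5, -6)         # negative integers
sample3 = (-9, -1, 0, 2, 1, 3, 4, 19)      # mixed positive and negative integers
sample4 = (1.23, 1.45, 2.1, 2.2, 1.9)      # floating-point values

results = [statistics.stdev(s) for s in (sample1, sample2, sample3, sample4)]

for number, value in enumerate(results, start=1):
    print(f"The Standard Deviation of Sample{number} is {value}")
```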
Code #3 : Demonstrate the difference between results of variance() and stdev()
Standard Deviation of the sample is 1.5811388300841898
Variance of the sample is 2.5
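The relationship between the two results can be sketched as follows: the standard deviation is the square root of the variance. The sample is an assumption (the integers 1 through 5), picked because its sample variance is 2.5.

```python
import math
import statistics

# Assumed sample: consistent with a variance of 2.5
sample = [1, 2, 3, 4, 5]

sd = statistics.stdev(sample)
var = statistics.variance(sample)

print(f"Standard Deviation of the sample is {sd}")
print(f"Variance of the sample is {var}")

# Squaring the standard deviation recovers the variance (up to rounding)
print(math.isclose(sd ** 2, var))
```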
Code #4 : Demonstrate the use of xbar parameter
Standard Deviation of Sample set is 0.6047037842337906
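When the mean has already been computed, it can be passed as xbar so that stdev() does not recompute it. This is a minimal sketch; the sample values are an assumption, chosen because they give a value that agrees with the output above.

```python
import statistics

# Assumed sample: not taken from the source
sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)

# Compute the mean once and reuse it
m = statistics.mean(sample)

sd_with_xbar = statistics.stdev(sample, xbar=m)
sd_default = statistics.stdev(sample)

print(f"Standard Deviation of Sample set is {sd_with_xbar}")
```

Passing any value other than the true mean as xbar gives an incorrect result, so the safest pattern is to pass the output of statistics.mean() on the same data.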
Code #5 : Demonstrates StatisticsError
Traceback (most recent call last):
  File "/home/f921f9269b061f1cc4e5fc74abf6ce10.py", line 12, in <module>
    print(statistics.stdev(sample))
  File "/usr/lib/python3.5/statistics.py", line 617, in stdev
    var = variance(data, xbar)
  File "/usr/lib/python3.5/statistics.py", line 555, in variance
    raise StatisticsError('variance requires at least two data points')
statistics.StatisticsError: variance requires at least two data points
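Rather than letting the exception end the program, it can be caught. This is a minimal sketch that assumes a one-element sample, since the input that produced the traceback is not shown; the exact message text can differ between Python versions.

```python
import statistics

# Assumed input: a single value, which is too few for a sample standard deviation
sample = [1]

message = None
try:
    statistics.stdev(sample)
except statistics.StatisticsError as error:
    # stdev() needs at least two data points
    message = str(error)
    print("StatisticsError:", message)
```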
- Standard deviation is essential in statistics, where it is commonly used to measure confidence in statistical conclusions. For example, the margin of error in the marks of an exam can be determined by calculating the expected standard deviation in the results if the same exam were conducted multiple times.
- It is also very useful in finance, where it helps to assess the margin of profit and loss. For example, the standard deviation of the rate of return on an investment is a measure of the investment's volatility.