NumPy: Compute the mean, standard deviation, and variance of a given array along the second axis
NumPy Statistics: Exercise-7 with Solution
Write a NumPy program to compute the mean, standard deviation, and variance of a given array along the second axis.
From Wikipedia: There are several kinds of means in various branches of mathematics (especially statistics).
For a data set, the arithmetic mean, also called the mathematical expectation or average, is the central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values. The arithmetic mean of a set of numbers is typically denoted by , pronounced " bar". If the data set were based on a series of observations obtained by sampling from a statistical population, the arithmetic mean is the sample mean (denoted ) to distinguish it from the mean of the underlying distribution.
In probability and statistics, the population mean, or expected value, are a measure of the central tendency either of a probability distribution or of the random variable characterized by that distribution. In the case of a discrete probability distribution of a random variable , the mean is equal to the sum over every possible value weighted by the probability of that value; that is, it is computed by taking the product of each possible value of and its probability , and then adding all these products together, giving . An analogous formula applies to the case of a continuous probability distribution. Not every probability distribution has a defined mean; see the Cauchy distribution for an example. Moreover, for some distributions the mean is infinite.
Sample Solution:
Python Code:
# Importing the NumPy library
import numpy as np
# Creating an array 'x' using arange with 6 elements
x = np.arange(6)
# Displaying the original array 'x'
print("\nOriginal array:")
print(x)
# Calculating the mean of the array 'x' using np.mean()
r1 = np.mean(x)
# Calculating the average of the array 'x' using np.average()
r2 = np.average(x)
# Asserting if the results from np.mean() and np.average() are close
assert np.allclose(r1, r2)
# Displaying the calculated mean of the array 'x'
print("\nMean: ", r1)
# Calculating the standard deviation of the array 'x' using np.std()
r1 = np.std(x)
# Calculating the standard deviation manually
r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2))
# Asserting if the results from np.std() and manual calculation are close
assert np.allclose(r1, r2)
# Displaying the calculated standard deviation of the array 'x'
print("\nstd: ", r1)
# Calculating the variance of the array 'x' using np.var()
r1 = np.var(x)
# Calculating the variance manually
r2 = np.mean((x - np.mean(x)) ** 2)
# Asserting if the results from np.var() and manual calculation are close
assert np.allclose(r1, r2)
# Displaying the calculated variance of the array 'x'
print("\nvariance: ", r1)
Sample Output:
Original array: [0 1 2 3 4 5] Mean: 2.5 std: 1 variance: 2.9166666666666665
Explanation:
In the above code –
- x = np.arange(6): This line creates a NumPy array x containing the numbers from 0 to 5.
- r1 = np.mean(x): This line calculates the mean of the numbers in x.
- r2 = np.average(x): This line calculates the weighted average of the numbers in x, where each number has an equal weight. Since all the weights are equal, np.average(x) is equivalent to np.mean(x).
- assert np.allclose(r1, r2): This assertion tests whether the values of r1 and r2 are close enough (within a certain tolerance) to be considered equal. If the assertion fails, it will raise an error.
- r1 = np.std(x): This line calculates the standard deviation of the numbers in x.
- r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2 )): This line calculates the standard deviation of the numbers in x using the formula sqrt(mean((x - mean(x))**2)). This is another way to calculate the standard deviation, where the mean of the squared differences from the mean is calculated first and then the square root is taken.
- assert np.allclose(r1, r2): This assertion tests whether the values of r1 and r2 are close enough (within a certain tolerance) to be considered equal. If the assertion fails, it will raise an error.
- r1= np.var(x): This line calculates the variance of the numbers in x.
- r2 = np.mean((x - np.mean(x)) ** 2 ): This line calculates the variance of the numbers in x using the formula mean((x - mean(x))**2). This is another way to calculate the variance, where the mean of the squared differences from the mean is calculated directly.
- assert np.allclose(r1, r2): This assertion tests whether the values of r1 and r2 are close enough (within a certain tolerance) to be considered equal. If the assertion fails, it will raise an error.
Python-Numpy Code Editor:
Have another way to solve this solution? Contribute your code (and comments) through Disqus.
Previous: Write a NumPy program to compute the weighted of a given array.
Next: Write a NumPy program to compute the covariance matrix of two given arrays.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
It will be nice if you may share this link in any developer community or anywhere else, from where other developers may find this content. Thanks.
https://www.w3resource.com/python-exercises/numpy/python-numpy-stat-exercise-7.php
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics