scipy.stats.describe() computes a variety of descriptive statistics for a dataset, including the number of observations, min/max, mean, variance, skewness and kurtosis. It is a convenient one-stop summary tool for numerical data analysis.
Example:
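The listing for this example is not reproduced here. The following is a minimal sketch that yields the output shown, assuming the input is the seven integers 3 through 9 (an assumption inferred from the reported count, min/max, mean and variance, not confirmed by the source):

```python
import numpy as np
from scipy.stats import describe

# Assumed input: seven evenly spaced integers (inferred, not from the source).
data = np.array([3, 4, 5, 6, 7, 8, 9])

# Default settings: axis=0, ddof=1, bias=True, nan_policy='propagate'
res = describe(data)
print(res)
```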
Output
DescribeResult(nobs=np.int64(7), minmax=(np.int64(3), np.int64(9)), mean=np.float64(6.0), variance=np.float64(4.666666666666667), skewness=np.float64(0.0), kurtosis=np.float64(-1.25))
Explanation: The data is symmetric (skewness=0) and flatter than a normal distribution (kurtosis < 0).
Syntax: scipy.stats.describe(a, axis=0, ddof=1, bias=True, nan_policy='propagate')
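Parameters:
a: input data (array-like).
axis: axis along which the statistics are computed (default 0); if None, they are computed over the whole array.
ddof: delta degrees of freedom, used for the variance only (default 1).
bias: if False, the skewness and kurtosis are corrected for statistical bias (default True).
nan_policy: how NaN values are handled: 'propagate' returns NaN (default), 'raise' throws an error, 'omit' ignores them.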
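As a rough illustration of how ddof and bias from the signature change the result, the sketch below compares the defaults with ddof=0 and bias=False. The data is an arbitrary illustrative choice, not taken from the source.

```python
import numpy as np
from scipy.stats import describe

data = np.array([3, 4, 5, 6, 7, 8, 9])  # illustrative data (an assumption)

sample = describe(data)                 # defaults: ddof=1, bias=True
population = describe(data, ddof=0)     # variance divides by n instead of n - 1
corrected = describe(data, bias=False)  # bias-corrected skewness and kurtosis

print(sample.variance, population.variance)
print(sample.kurtosis, corrected.kurtosis)
```

Only the variance responds to ddof, and only the skewness and kurtosis respond to bias; the count, min/max and mean are the same in all three results.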
Returns: A DescribeResult namedtuple with fields: nobs (number of observations), minmax (tuple of min and max), mean, variance (sample), skewness (asymmetry) and kurtosis (Fisher’s definition; 0 for a normal distribution).
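Because the result is a namedtuple, its fields can be read by name or unpacked in order. A short sketch, using made-up data for illustration:

```python
import numpy as np
from scipy.stats import describe

data = np.array([2.0, 4.0, 4.0, 5.0, 10.0])  # illustrative data (an assumption)

res = describe(data)

# Access individual statistics by field name.
low, high = res.minmax
print(res.nobs, low, high, res.mean)

# Or unpack the whole result like any tuple, in field order.
nobs, minmax, mean, variance, skewness, kurtosis = res
print(variance)
```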
Example 1: NumPy Array with axis=None
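The code for this example is missing from the page. Below is a sketch consistent with the explanation that follows, assuming a one-dimensional input of [1, 2, NaN, 4, 5] and nan_policy='omit' (both inferred rather than confirmed by the source):

```python
import numpy as np
from scipy.stats import describe

# Assumed input; the position of the NaN is a guess.
a = np.array([1, 2, np.nan, 4, 5])

# axis=None flattens the input before the statistics are computed;
# nan_policy='omit' drops the NaN instead of propagating it.
res = describe(a, axis=None, nan_policy='omit')

print(res.nobs, res.mean, res.variance)
```

For a one-dimensional input like this one, axis=None gives the same result as the default axis=0; the difference only shows up with multi-dimensional arrays.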
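Explanation: With axis=None, the input is flattened before the statistics are computed. With nan_policy='omit', the NaN is ignored, so the statistics are computed for [1, 2, 4, 5], and the count, mean and variance are returned without error.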
Example 2: 2D Array with axis=0 (Column-wise Stats)
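The listing is not shown, so here is a minimal sketch that reproduces the output below, assuming a 3x3 input whose columns run from 1 to 7, 2 to 8 and 3 to 9 (inferred from the per-column min, max and mean, not confirmed by the source):

```python
import numpy as np
from scipy.stats import describe

# Assumed input, inferred from the reported per-column statistics.
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

res = describe(a, axis=0)  # one set of statistics per column
print(res)
```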
Output
DescribeResult(nobs=np.int64(3), minmax=(array([1, 2, 3]), array([7, 8, 9])), mean=array([4., 5., 6.]), variance=array([9., 9., 9.]), skewness=array([0., 0., 0.]), kurtosis=array([-1.5, -1.5, -1.5]))
Explanation: Calculates stats column-wise. Each column has 3 values with the same spread, so the variance, skewness and kurtosis are identical across columns, while the min/max and mean differ.
Example 3: Handling NaN with nan_policy='omit'
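The code for this example did not survive on the page. The sketch below is consistent with the output that follows, assuming the input [1, 2, NaN, 4, 5] (the non-NaN values match the reported statistics; the position of the NaN is a guess):

```python
import numpy as np
from scipy.stats import describe

# Assumed input; only the non-NaN values are implied by the output.
a = np.array([1, 2, np.nan, 4, 5])

res = describe(a, nan_policy='omit')
print(res)

# Zero-dimensional masked values convert to ordinary floats when needed.
print(float(res.minmax[0]), float(res.minmax[1]), float(res.skewness))
```

The exact repr of the fields can vary between SciPy and NumPy versions, so the printed result may not match the output character for character.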
Output
DescribeResult(nobs=np.int64(4), minmax=(masked_array(data=1.,
mask=False,
fill_value=1e+20), masked_array(data=5.,
mask=False,
fill_value=1e+20)), mean=np.float64(3.0), variance=np.float64(3.3333333333333335), skewness=masked_array(data=0.,
mask=False,
fill_value=1e+20), kurtosis=np.float64(-1.64))
Explanation: With nan_policy='omit', the NaN is excluded and the statistics are computed on the remaining 4 values. Some fields may come back as masked arrays, but the results are still valid and usable for analysis.