What is descriptive statistics in machine learning?

rishabhdwivedi062
Aug 29, 2022

Updated: Oct 6, 2022

Descriptive statistics are usually calculated for numerical variables (pandas, for instance, summarizes only the numeric columns by default). They give a compact summary of a dataset: the mean, standard deviation, minimum value, maximum value, and so on.

A descriptive statistics function typically reports:

Count of non-missing values for each variable
Mean
Median (the 50th percentile)
Standard deviation
Minimum Value
Maximum Value
Percentiles (25%, 50%, 75%)

You can use the pandas describe() function to get these descriptive statistics.
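As a minimal sketch of that call, the snippet below builds a small DataFrame and summarizes it. The data and the column names (age, salary, city) are made up for illustration and are not from any real dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for this example.
df = pd.DataFrame({
    "age": [22, 25, 31, 38, 44],
    "salary": [30000, 42000, 50000, 61000, 75000],
    "city": ["Pune", "Delhi", "Pune", "Mumbai", "Delhi"],
})

# On a mixed DataFrame, describe() summarizes only the numeric columns by default.
summary = df.describe()
print(summary)

# include="all" also covers non-numeric columns (count, unique, top, freq).
summary_all = df.describe(include="all")
print(summary_all)
```

The rows of the default summary are count, mean, std, min, 25%, 50%, 75% and max; the 50% row is the median.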

Uses of descriptive statistics:

Build a better understanding of data.
Identify and treat missing values.
Identify any outliers and anomalies.
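The last two uses can be sketched with the same summary numbers. The example below is only an illustration: the income column is invented, the 1.5 x IQR rule is one common rule of thumb for flagging outliers (an assumption here, not the only option), and filling with the median is just one possible treatment for missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value and one suspicious entry.
df = pd.DataFrame({"income": [32, 35, 31, np.nan, 34, 33, 250]})

stats = df["income"].describe()

# count ignores NaN, so a count below the number of rows signals missing values.
n_missing = len(df) - int(stats["count"])

# Assumed rule of thumb: flag values more than 1.5 * IQR beyond the quartiles.
iqr = stats["75%"] - stats["25%"]
lower = stats["25%"] - 1.5 * iqr
upper = stats["75%"] + 1.5 * iqr
outliers = df.loc[(df["income"] < lower) | (df["income"] > upper), "income"]

# One possible treatment: fill the missing value with the median (the 50% row).
filled = df["income"].fillna(stats["50%"])

print(n_missing)
print(outliers.tolist())
```

Note how the single extreme value pulls the mean well above the median, which is itself a hint that something unusual is in the column.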

The mean (average) tells us where the center of the values lies, while the standard deviation tells us how far the values typically spread from that mean.

If the standard deviation is low: most of the values are close to the mean.

If the standard deviation is high: the values are spread out over a wider range, with many of them far from the mean.
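To make the contrast concrete, here is a small sketch with two made-up series that share the same mean but differ in spread. The names tight and wide are hypothetical.

```python
import pandas as pd

# Two invented series with the same mean (50) but very different spread.
tight = pd.Series([48, 49, 50, 51, 52])
wide = pd.Series([10, 30, 50, 70, 90])

# Same center, so the mean alone cannot tell the two apart.
print(tight.mean(), wide.mean())

# The standard deviation separates them: small for tight, large for wide.
print(tight.std(), wide.std())
```

The mean is identical for both, so only the standard deviation reveals that the second series is far more spread out.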

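The standard deviation formula, in its sample form (dividing by n - 1, which is also the pandas default), is shown below, where xi is each value, μ is the mean, and n is the number of values: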

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}}$$
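The formula can be checked by hand on a small sample. The sketch below uses made-up values, follows the formula step by step, and compares the result with pandas, whose std() also divides by n - 1 by default.

```python
import math

import pandas as pd

# Invented sample values for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(values)
mean = sum(values) / n                              # the mean, mu in the formula
squared_diffs = [(x - mean) ** 2 for x in values]   # (xi - mu)^2 for each value
sample_std = math.sqrt(sum(squared_diffs) / (n - 1))

# pandas uses ddof=1 by default, i.e. the same n - 1 denominator.
pandas_std = pd.Series(values).std()

print(sample_std, pandas_std)
```

The two printed numbers should agree up to floating-point rounding.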