Analyzing the interquartile range (IQR) of a dataset
The IQR also measures the spread or variability of a dataset. It is simply the distance between the first quartile (Q1) and the third quartile (Q3), that is, IQR = Q3 - Q1. The IQR is a very useful statistic, especially when we need to identify where the middle 50% of values in a dataset lie. Unlike the range, which can be distorted by very high or low values (outliers), the IQR is largely unaffected by outliers because it focuses on the middle 50% of the data. It is also useful when we need to identify outliers in a dataset.
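As a rough illustration of the points above, the following sketch compares the range with the IQR on a small made-up list of values (not the COVID-19 data) and flags outliers. The 1.5 × IQR fences used here are a common convention and an assumption on our part, not something this recipe prescribes.

```python
import numpy as np

# Hypothetical toy data; 100 is a deliberately extreme value
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100])

# First and third quartiles (NumPy's default linear interpolation)
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# The range is driven by the extreme value; the IQR is not
data_range = values.max() - values.min()

# Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("Range:", data_range)
print("IQR:", iqr)
print("Outliers:", outliers)
```

In this toy example, the single extreme value inflates the range but leaves the IQR unchanged, which is why the IQR is often preferred as a measure of spread when outliers are present.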
To analyze the IQR of a dataset, we will use the iqr function from the stats module of the scipy library in Python.
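Before turning to the real data, here is a minimal sketch of calling the iqr function on a small hypothetical array. The values and the missing entry are invented for illustration; the nan_policy argument is shown only as one way to handle missing values and may not be needed for the dataset in this recipe.

```python
import numpy as np
from scipy import stats

# Hypothetical values with one missing entry (not the COVID-19 data)
values = np.array([12, 15, 11, 20, 18, np.nan, 14])

# By default, iqr uses the 25th and 75th percentiles (rng=(25, 75));
# nan_policy="omit" skips missing values instead of returning nan
iqr_value = stats.iqr(values, nan_policy="omit")

# Cross-check against the definition Q3 - Q1, ignoring NaNs
q1, q3 = np.nanpercentile(values, [25, 75])
iqr_manual = q3 - q1

print("IQR from scipy:", iqr_value)
print("IQR from Q3 - Q1:", iqr_manual)
```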
Getting ready
We will work with the COVID-19 cases dataset again for this recipe.
How to do it…
We will explore how to compute the IQR using the scipy library:
- Import pandas and import the stats module from the scipy library:
  import pandas as pd
  from scipy import stats
- Load the .csv into a dataframe using read_csv. Then subset...