AP Statistics study notes

1.7.3 Calculating Measures of Variability

AP Syllabus focus:
‘- Introduction to measures of variability: range, interquartile range (IQR), and standard deviation.

- Demonstrating how to calculate the range as the difference between the maximum and minimum data values.

- Detailing the calculation of the IQR as $Q_3 - Q_1$ and the formula for sample standard deviation, $s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$, emphasizing its role in assessing data spread.

- Skill 2.C: Enhancing skills in calculating and interpreting variability measures to describe the distribution of quantitative data.’

Variation describes how spread out data values are, and understanding this spread is essential for interpreting quantitative variables. Measures of variability summarize how much observations differ from one another.

Understanding Measures of Variability

Measures of variability quantify the spread, or dispersion, of a distribution. They help reveal how typical or unusual individual observations may be when compared to the overall dataset. In AP Statistics, students focus on three core measures: the range, the interquartile range (IQR), and the standard deviation. Each of these statistics captures a different aspect of variability and contributes to a deeper understanding of a dataset’s distributional structure.

Why Variability Matters

Variability provides essential information that measures of center alone cannot convey. Two datasets may share the same mean or median but have dramatically different spreads. Understanding variability allows students to compare datasets, assess consistency, and identify potential irregularities such as unusually large or small values. Because variability summarizes the degree of difference from one observation to another, it directly influences how data are interpreted in context.

The Range

The range is the simplest measure of variability and describes the total span of the data.

Range: The difference between the maximum and minimum values in a dataset, showing the full extent of variation.

The range is straightforward to compute and gives a quick sense of overall spread. However, it is highly sensitive to outliers because it uses only the two most extreme data points. While useful as a preliminary measure, the range does not provide detailed information about how data values are distributed throughout the dataset.

EQUATION

$\text{Range} = \text{maximum value} - \text{minimum value}$
maximum value = Largest observed value
minimum value = Smallest observed value

Because the range incorporates no additional information beyond these extremes, it should be interpreted cautiously, particularly when distributions contain potential outliers or gaps.
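As a quick illustration, the range calculation can be sketched in Python. The helper name `data_range` and both datasets are made up for this example; the second list appends one extreme value to show how strongly the range reacts to it.

```python
def data_range(values):
    """Return the maximum minus the minimum of a non-empty list of numbers."""
    if not values:
        raise ValueError("range is undefined for an empty dataset")
    return max(values) - min(values)


scores = [12, 18, 25, 32, 40]   # illustrative data
with_outlier = scores + [95]    # same data plus one extreme value

print(data_range(scores))        # 28
print(data_range(with_outlier))  # 83
```

A single added observation nearly triples the range here, even though the other five values are unchanged.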

The Interquartile Range (IQR)

The interquartile range (IQR) focuses on the middle portion of the data and is therefore a more resistant measure of spread.

Interquartile Range (IQR): The measure of spread that captures the range of the middle 50% of ordered data, calculated as $Q_3 - Q_1$.

The IQR is built from the first quartile (Q1) and third quartile (Q3), which mark the 25th and 75th percentiles of the data.

This diagram shows the five-number summary on a number line, highlighting the interquartile range (IQR) as the middle 50% of the data and marking an outlier beyond the whisker.

Because it ignores the smallest and largest quarters of the dataset, the IQR is resistant to outliers and offers a more stable description of typical variability.

EQUATION

$\text{IQR} = Q_3 - Q_1$
$Q_3$ = Third quartile; value with 75% of observations below it
$Q_1$ = First quartile; value with 25% of observations below it

The IQR is especially useful when describing skewed distributions or when extreme values distort the overall spread.
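The calculation can be sketched in Python as follows. This is one possible approach, not the only one: it assumes the "median of each half" convention, leaving the overall median out of both halves when the number of observations is odd. Other conventions and software defaults can give slightly different quartiles. The function names and the dataset are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of observations, the overall median
    is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to form two halves")
    half = n // 2
    lower = ordered[:half]       # values below the median position
    upper = ordered[n - half:]   # values above the median position
    return median(lower), median(upper)


def iqr(values):
    """Return Q3 - Q1 under the convention used by quartiles()."""
    q1, q3 = quartiles(values)
    return q3 - q1


data = [9, 3, 15, 7, 11, 5, 13, 8]   # illustrative, deliberately unsorted
print(quartiles(data))  # (6.0, 12.0)
print(iqr(data))        # 6.0
```

The data are sorted first because quartiles are positions in the ordered list, not in the order the values were recorded.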

Standard Deviation

The standard deviation is the most widely used measure of variability in statistical analysis because it incorporates every data value and reflects how far individual observations typically fall from the mean.

Standard Deviation (s): A measure of spread that represents the typical distance between each data point and the sample mean.

Standard deviation quantifies the average amount of variability within a dataset.

This figure depicts how data points lie relative to the mean, with arrows showing one standard deviation above and below it, illustrating standard deviation as the typical distance from the mean.

Because it is based on deviations from the mean, the standard deviation captures the overall pattern of spread more fully than the range or IQR. However, it is nonresistant, meaning it is heavily influenced by outliers and strong skewness.

EQUATION

$s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$
$s$ = Sample standard deviation
$n$ = Number of observations
$x_i$ = Individual data value
$\bar{x}$ = Sample mean

The formula shows that standard deviation is rooted in squared deviations, which amplify larger differences. Dividing by $n - 1$ rather than $n$ adjusts for bias when estimating population spread from a sample; the quantity $n - 1$ is known as the degrees of freedom.
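The formula can be followed step by step in a short Python sketch. The function name `sample_std` and the five data values are invented for illustration; the result is cross-checked against the standard library's `statistics.stdev`, which also divides by n − 1.

```python
from math import sqrt
from statistics import stdev


def sample_std(values):
    """Sample standard deviation: sqrt of summed squared deviations over n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to divide by n - 1")
    x_bar = sum(values) / n                          # sample mean
    squared_devs = [(x - x_bar) ** 2 for x in values]
    return sqrt(sum(squared_devs) / (n - 1))


data = [2, 4, 4, 6, 9]   # mean 5; squared deviations 9, 1, 1, 1, 16
print(round(sample_std(data), 3))  # 2.646
print(round(stdev(data), 3))       # 2.646
```

Here the squared deviations sum to 28, so s = sqrt(28 / 4) = sqrt(7). The single value 9 contributes 16 of that 28, which shows how squaring gives the largest deviation the most weight.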

A dataset with small standard deviation has observations clustered closely around the mean, while a large standard deviation indicates widely dispersed values.

This graph compares two normal distributions with equal means but different standard deviations, illustrating how smaller standard deviation produces a narrower curve while larger standard deviation creates a wider shape.

Because standard deviation is sensitive to extreme observations, students should always consider the shape of the distribution before relying on it to describe variability.

Comparing the Three Measures of Variability

Each measure provides complementary information about spread.

  • Range highlights total variation but is sensitive to extremes.

  • IQR focuses on the middle 50% and is resistant to outliers.

  • Standard deviation summarizes typical deviation from the mean but is nonresistant.

Understanding when to use each measure is a key analytical skill. For symmetric distributions without outliers, standard deviation is informative and appropriate. For skewed or outlier-contaminated distributions, the IQR provides a clearer picture of typical spread. The range, while limited, offers a quick initial impression of overall variability.
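To see the three measures side by side, the sketch below computes each one for a small dataset and for the same dataset with its largest value replaced by an outlier. Both datasets and the helper `spread_summary` are hypothetical. Quartiles come from Python's `statistics.quantiles` with its default "exclusive" method, so other quartile conventions may give slightly different IQR values.

```python
from statistics import quantiles, stdev


def spread_summary(values):
    """Return the range, IQR, and sample standard deviation of a dataset."""
    q1, _, q3 = quantiles(values, n=4)   # default method is "exclusive"
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": stdev(values),
    }


typical = [10, 12, 13, 14, 15, 16, 17, 18, 20]
with_outlier = [10, 12, 13, 14, 15, 16, 17, 18, 60]   # 20 replaced by 60

for label, data in [("typical", typical), ("with outlier", with_outlier)]:
    s = spread_summary(data)
    print(f"{label}: range={s['range']}, IQR={s['iqr']}, s={s['sd']:.2f}")
```

In this example the range grows from 10 to 50 and the standard deviation from roughly 3.1 to roughly 15.4, while the IQR stays at 5.0 because the middle half of the data is untouched.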

FAQ

The IQR focuses on the central 50% of the data, making it resistant to extreme values. This is especially useful when distributions are skewed or contain outliers.

The range, by contrast, uses only the minimum and maximum values, which may give a misleading sense of spread if either extreme is unusual.
The IQR therefore provides a more stable and representative measure of typical variability.

A visual check of the data distribution is often enough:

• If values cluster closely around a central value, the standard deviation is likely to be small.
• If values are widely dispersed across the scale, the standard deviation is likely to be large.
• Sudden jumps, extreme values, or long gaps between points also hint at higher variability.

Using a dotplot, stem-and-leaf plot, or rough sketch can make this assessment quicker and more intuitive.

The standard deviation increases when new values lie far from the mean.

This can happen when:
• An outlier is added.
• A cluster of values is introduced at one end of the distribution.
• The new values shift the mean, resulting in larger deviations for existing data points.

Values close to the existing mean tend to decrease or have minimal effect on the standard deviation.
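A small Python check illustrates both effects. The base dataset and the two added values are made up for this example.

```python
from statistics import mean, stdev

base = [10, 12, 14, 16, 18]        # illustrative data with mean 14
near_mean = base + [14]            # new value equal to the existing mean
far_from_mean = base + [40]        # new value far above the existing mean

print(mean(base))                      # 14
print(round(stdev(base), 2))           # 3.16
print(round(stdev(near_mean), 2))      # 2.83
print(round(stdev(far_from_mean), 2))  # 10.98
```

Adding a value at the mean leaves the sum of squared deviations unchanged but increases n − 1, so the standard deviation falls slightly. Adding 40 both contributes a large deviation of its own and pulls the mean upward, which enlarges the deviations of the original values.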

Dividing by n − 1 corrects for bias in estimating population variability from a sample. Using n would underestimate the true variability because the mean is calculated from the same dataset and tends to sit closer to the sample values than the population mean would.

Using n − 1 compensates for this by slightly increasing the calculated variance, making the estimate more accurate for inferential work.
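The bias can be demonstrated exactly on a tiny example. The sketch below assumes a hypothetical population of four values and lists every possible ordered sample of size 2 drawn with replacement, then averages the sample variance computed with each divisor. It works with the variance (the square of the standard deviation), which is the quantity the n − 1 correction makes unbiased.

```python
from itertools import product


def variance(values, divisor):
    """Sum of squared deviations from the mean of `values`, over `divisor`."""
    center = sum(values) / len(values)
    return sum((x - center) ** 2 for x in values) / divisor


population = [2, 4, 6, 8]                                 # hypothetical population
population_var = variance(population, len(population))    # true variance

sample_size = 2
samples = list(product(population, repeat=sample_size))   # all 16 ordered samples

avg_div_n = sum(variance(s, sample_size) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(variance(s, sample_size - 1) for s in samples) / len(samples)

print(population_var)      # 5.0
print(avg_div_n)           # 2.5
print(avg_div_n_minus_1)   # 5.0
```

Averaged over all possible samples, dividing by n gives 2.5, half the true variance of 5.0, while dividing by n − 1 recovers 5.0 exactly.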

The range is helpful when a rapid, broad sense of spread is needed and when extreme values themselves are meaningful.

It is particularly useful when:
• Analysing the full span of possible outcomes matters, such as in quality control limits.
• The dataset is small and contains no apparent outliers.
• You want a quick comparison of distributions before calculating more resistant measures such as the IQR.

Practice Questions

Question 1 (3 marks)

A dataset contains the values: 12, 18, 25, 32, and 40.
(a) State the range of this dataset.
(b) Identify the interquartile range (IQR).

Question 1

(a) Range correctly stated as 40 − 12 = 28.
• 1 mark for correct calculation.
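(b) Correct identification of Q1 = 18, Q3 = 32, so IQR = 14. These quartiles include the median, 25, in both halves of the data. If the median is excluded from both halves, as many AP textbooks and graphing calculators do, then Q1 = 15, Q3 = 36, and IQR = 21.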
• 1 mark for correct quartiles.
• 1 mark for correct IQR.
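Quartiles for a dataset this small depend on the convention used. As a rough check, Python's `statistics.quantiles` offers two methods. For this dataset, "inclusive" reproduces Q1 = 18 and Q3 = 32, while "exclusive" matches the result of taking the medians of the lower and upper halves with the overall median left out.

```python
from statistics import quantiles

data = [12, 18, 25, 32, 40]

# "inclusive" treats the minimum and maximum as the 0th and 100th percentiles.
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")
# "exclusive" (the default) allows for more extreme values beyond the sample.
q1_exc, _, q3_exc = quantiles(data, n=4, method="exclusive")

print(q1_inc, q3_inc, q3_inc - q1_inc)  # 18.0 32.0 14.0
print(q1_exc, q3_exc, q3_exc - q1_exc)  # 15.0 36.0 21.0
```

When reporting an IQR for a small dataset, it helps to state which quartile convention was used.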

Question 2 (6 marks)

A researcher records the daily number of customers visiting a small shop over 10 days. The data (in number of customers) are:
24, 30, 29, 41, 28, 26, 37, 35, 33, 42.
(a) Calculate the mean number of customers.
(b) Calculate the sample standard deviation.
(c) Comment on what the standard deviation suggests about the consistency of customer visits.

Question 2

(a) Mean correctly calculated as 32.5.
• 1 mark for correct summation of values.
• 1 mark for correct final mean.
(b) Sample standard deviation correctly calculated as approximately 6.2 (6.17 to two decimal places).
• 1 mark for correct calculation of deviations from the mean.
• 1 mark for correct calculation of squared deviations and summation.
• 1 mark for correct application of division by n − 1 and square root.
(c) Clear and accurate interpretation, e.g. “The standard deviation of about 6.2 suggests that daily customer counts vary moderately around the mean, indicating reasonably consistent but not uniform customer traffic.”
• 1 mark for a correct and contextually relevant interpretation.
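The arithmetic for parts (a) and (b) can be verified with a short Python sketch using the standard `statistics` module. The variable names are chosen for this example only.

```python
from statistics import mean, stdev

customers = [24, 30, 29, 41, 28, 26, 37, 35, 33, 42]

x_bar = mean(customers)                                # total 325 over 10 days
sum_sq_dev = sum((x - x_bar) ** 2 for x in customers)  # sum of squared deviations
s = stdev(customers)                                   # divides by n - 1 = 9

print(x_bar)        # 32.5
print(sum_sq_dev)   # 342.5
print(round(s, 2))  # 6.17
```

The squared deviations sum to 342.5, so s = sqrt(342.5 / 9), which is about 6.17.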
