AP Syllabus focus:
‘Enduring Understanding VAR-1: Discuss the premise that variation in data may be random or not, leading to uncertain conclusions. Learning Objective VAR-1.G: Focus on identifying questions that arise from observing variation in statistics for samples drawn from the same population. Essential Knowledge VAR-1.G.1: Elaborate on how variation in sample statistics can occur randomly or non-randomly when samples are taken from the same population, influencing how we interpret data and draw conclusions.’
Understanding how and why sample statistics vary is essential for interpreting data responsibly. Recognizing that variation can be random or non-random helps students evaluate patterns and avoid misleading conclusions.
Variation in Sample Statistics
Sample statistics almost never match the population parameter exactly. When we take repeated samples from the same population, the resulting statistics—such as the sample mean, sample proportion, or sample median—typically differ from one another. This variability is expected and is a key reason why statistical reasoning emphasizes uncertainty and probability rather than certainty.

This histogram displays the distribution of 1,000 simulated sample means from repeated random samples of the same population, showing how sample statistics vary across samples. The bars cluster around a central value, illustrating the concept of sampling variability. The Airbnb context includes extra details beyond the AP syllabus but directly supports the idea of repeated sampling from one population.
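The idea of repeated sampling can be sketched with a short simulation. This is a minimal illustration, not the method behind the histogram: the population (10,000 values, roughly normal with mean 100 and standard deviation 15), the sample size of 40, and the function name `simulate_sample_means` are all assumptions chosen for demonstration.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, num_samples, seed=0):
    """Draw repeated simple random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical population (an assumption for illustration only).
pop_rng = random.Random(42)
population = [pop_rng.gauss(100, 15) for _ in range(10_000)]

# 1,000 samples of size 40, all taken from the same population.
sample_means = simulate_sample_means(population, sample_size=40, num_samples=1_000)

print("population mean:     ", round(statistics.mean(population), 2))
print("smallest sample mean:", round(min(sample_means), 2))
print("largest sample mean: ", round(max(sample_means), 2))
print("mean of sample means:", round(statistics.mean(sample_means), 2))
```

Individual sample means differ from one another and from the population mean, yet their average sits close to the population mean, which is the pattern of sampling variability described above.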
Random Variation in Sample Statistics
Random variation refers to differences in statistics that arise purely because we select a sample rather than the entire population. Even when all sampling conditions are ideal and unbiased, each sample includes a slightly different mix of individuals. These natural fluctuations produce differing values for the statistics we compute.
Random Variation: Differences in sample outcomes that occur by chance when repeatedly sampling from the same population.
Small samples tend to show more random variability than large samples.

This graph displays sampling distributions of the sample mean for three different sample sizes drawn from the same normally distributed population. Narrower curves for larger samples demonstrate how increasing sample size reduces variability. The mathematical labels extend slightly beyond the AP syllabus but strongly reinforce the principle that larger samples produce more stable statistics.
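The effect of sample size on random variability can be checked with a simulation. The sketch below assumes a hypothetical population (20,000 values, roughly normal with mean 50 and standard deviation 10) and three illustrative sample sizes; none of these values come from the graph above.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, num_samples=2_000, seed=1):
    """Return the standard deviation of many simulated sample means."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)


# Hypothetical population (an assumption for illustration only).
pop_rng = random.Random(7)
population = [pop_rng.gauss(50, 10) for _ in range(20_000)]

# Compare the spread of sample means for small, medium, and large samples.
spreads = {n: spread_of_sample_means(population, n) for n in (10, 40, 160)}
for n, spread in spreads.items():
    print(f"n = {n:>3}: SD of sample means is about {spread:.2f}")
```

In this setup, each fourfold increase in sample size roughly halves the spread of the sample means, consistent with the standard deviation of the sample mean shrinking in proportion to the square root of the sample size.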
Non-Random Variation in Sample Statistics
Non-random variation occurs when differences in sample statistics stem from systematic influences, not chance. These influences may include the sampling method, the sampling frame, or external disruptions in data collection.
Non-Random Variation: Differences in sample outcomes caused by systematic factors rather than chance, such as biased sampling procedures or measurement errors.
Non-random variation is especially important to recognize because it may invalidate a study’s conclusions. In AP Statistics, identifying the presence or absence of systematic influence is a crucial step in evaluating the quality of data.

This figure contrasts a centered measurement distribution reflecting only random error with a shifted distribution reflecting systematic error. The visual illustrates the distinction between chance-based variation and variation caused by bias. Although the context involves continuous measurement, the concept directly parallels how systematic influences distort sample statistics.
Sources of Random and Non-Random Variation
Understanding where variation comes from helps students make sense of observed differences and determine whether those differences are meaningful.
Common Sources of Random Variation
Random variation arises even under perfect conditions:
• Natural differences among individuals in a population
• Chance differences in which individuals appear in each sample
• Small sample sizes amplifying natural fluctuations
Common Sources of Non-Random Variation
Non-random variation typically results from avoidable design issues:
• Sampling bias arising from unrepresentative sampling procedures
• Undercoverage when some groups are left out of the sampling frame
• Nonresponse bias when certain individuals are more likely to opt out
• Measurement error from poor instruments or inconsistent recording
In all such cases, the statistic may consistently miss the true population parameter in a specific direction, making the variation predictable rather than due to chance.
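A small simulation can make the "specific direction" of non-random variation concrete. This sketch models undercoverage under assumed conditions: a hypothetical population made of two groups with different typical values, where the biased sampling frame leaves out group B entirely. The group sizes, means, and names are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population with two groups (assumed values for illustration).
group_a = [rng.gauss(60, 8) for _ in range(7_000)]  # included in the biased frame
group_b = [rng.gauss(45, 8) for _ in range(3_000)]  # left out of the biased frame
population = group_a + group_b
true_mean = statistics.mean(population)


def sample_means(frame, sample_size, num_samples, rng):
    """Means of repeated simple random samples drawn from a sampling frame."""
    return [
        statistics.mean(rng.sample(frame, sample_size))
        for _ in range(num_samples)
    ]


fair_means = sample_means(population, 50, 1_000, rng)  # full sampling frame
biased_means = sample_means(group_a, 50, 1_000, rng)   # undercoverage of group B

# Share of sample means that land above the true population mean.
share_fair_above = sum(m > true_mean for m in fair_means) / len(fair_means)
share_biased_above = sum(m > true_mean for m in biased_means) / len(biased_means)

print("true population mean:", round(true_mean, 2))
print("share above (full frame):  ", share_fair_above)
print("share above (biased frame):", share_biased_above)
```

With the full frame, sample means fall above and below the parameter about equally often. With the biased frame, nearly all of them fall on the same side, which is the signature of systematic rather than chance variation.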
Interpreting Variation and Forming Questions
The syllabus emphasizes that students should generate thoughtful questions when they observe variation in sample statistics. Such questions help determine whether observed differences can be attributed to chance or may reflect non-random influences.
Key Questions to Ask When Observing Variation
When analyzing variation in sample statistics, students should consider:
• Could this difference reasonably occur by chance?
• Does the sampling method introduce any systematic patterns?
• Are the sample sizes large enough to keep random variability small?
• Do the statistics differ in a way that suggests bias rather than randomness?
• What additional information would clarify the source of the variation?
This reflective questioning connects directly to the idea that variation leads to uncertain conclusions and that interpreting data requires distinguishing between randomness and systematic effects.
Why Distinguishing Random from Non-Random Variation Matters
The ability to differentiate between these two types of variation supports stronger statistical reasoning:
• When variation is purely random, statistics from repeated samples center on the true parameter, and larger samples cluster more tightly around it, which supports trust in probabilistic inference.
• Non-random variation signals a deeper issue and requires revising the data collection process before reliable inference can occur.
Ignoring this distinction may lead students to false conclusions about differences between groups or trends in data.
How Variation Influences Data Interpretation
Because variation affects every sample, statistics must always be interpreted with an understanding of uncertainty. Students must avoid assuming that a single sample statistic reflects the population perfectly. Instead, variation should prompt them to consider:
• The reliability and representativeness of their data
• Whether the statistic aligns with expected random fluctuation
• Whether patterns observed are meaningful or simply artifacts of sampling
Acknowledging and investigating variation prepares students to approach data with healthy skepticism and analytical rigor, which is a central aim of AP Statistics.
FAQ
A practical way is to consider whether the difference seems plausible under typical random sampling variation for the given sample size. Large differences are more suspicious when samples are large, because bigger samples usually vary less.
Useful indicators include:
• The magnitude of the difference relative to expected random fluctuation
• Whether the difference is consistent with what would occur by chance in repeated sampling
• Whether the same pattern persists across multiple independent samples
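One way to judge whether a difference is plausible under chance is to simulate it. The sketch below uses hypothetical figures: two samples of 50 with proportions 0.62 and 0.54 (31 and 27 individuals), and an assumed true proportion of 0.58. It also assumes the population is much larger than the samples, so each response is modeled as an independent draw.

```python
import random


def simulate_count(n, p, rng):
    """Number of 'yes' responses in one random sample of size n."""
    return sum(rng.random() < p for _ in range(n))


rng = random.Random(11)
n = 50
assumed_p = 0.58      # assumed true proportion (hypothetical)
observed_gap = 4      # 31 vs 27 out of 50, i.e. 0.62 vs 0.54

# Draw many pairs of samples from the SAME population and record how often
# the two counts differ by at least as much as the observed gap.
num_pairs = 5_000
at_least_as_large = 0
for _ in range(num_pairs):
    gap = abs(simulate_count(n, assumed_p, rng) - simulate_count(n, assumed_p, rng))
    if gap >= observed_gap:
        at_least_as_large += 1

share = at_least_as_large / num_pairs
print(f"share of simulated pairs with a gap of {observed_gap} or more: {share:.2f}")
```

Under these assumptions, a gap this large appears in a substantial share of simulated pairs, so chance alone is a reasonable explanation. A gap that almost never appears in the simulation would instead prompt questions about non-random influences.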
Increasing the sample size reduces the typical amount of random variation, but any single large sample can still land some distance from the parameter by chance. Larger samples give more stable statistics because individual extreme values have less influence.
However, larger samples do not reduce variation caused by systematic issues such as bias in the sampling method, nonresponse problems, or poor measurement procedures.
Yes. Even if individuals are selected randomly, problems can arise after selection. For example, certain groups may decline to participate more often than others, or measurement tools may record values inconsistently.
These factors introduce systematic influences that are not due to the sampling method itself but affect the final statistics in predictable ways.
If the population is effectively unchanged and both samples come from the same random sampling procedure, the differences are most plausibly due to chance. Each random sample captures a slightly different mix of individuals, producing different outcomes.
Short time gaps do not eliminate random variation because the samples are still independent selections from the same population.
Researchers can consider:
• Are the sampling procedures identical and truly random?
• Could any group be underrepresented due to nonresponse or accessibility?
• Are measurement instruments consistent across samples?
• Does the magnitude of the difference exceed what would normally occur by chance?
These questions help identify whether variation is due to natural sampling fluctuation or underlying bias.
Practice Questions
Question 1 (1–3 marks)
A researcher repeatedly draws simple random samples of size 40 from the same population and calculates the sample mean each time. Explain why the sample means differ from one sample to the next.
Question 1
• 1 mark: States that samples differ because they contain different individuals or different combinations of individuals.
• 1 mark: Recognises that this leads to natural or chance-based variation in the sample mean.
• 1 mark: Notes that this occurs even when sampling is carried out correctly and the population remains unchanged.
Question 2 (4–6 marks)
A school surveys two different simple random samples of 50 students each to estimate the proportion who regularly revise at home. The first sample reports a proportion of 0.62, while the second reports 0.54.
(a) State two possible reasons why the two sample proportions differ.
(b) Identify which of these reasons would indicate random variation, and which would indicate non-random variation.
(c) Explain why distinguishing between random and non-random variation is important when interpreting the survey results.
Question 2
(a)
• 1 mark: States random sampling variation as a reason (e.g., chance differences in who is selected).
• 1 mark: Gives a non-random reason such as sampling bias, undercoverage, nonresponse bias, or measurement issues.
(b)
• 1 mark: Correctly labels random sampling variation as random variation.
• 1 mark: Correctly labels the systematic influences (e.g., bias) as non-random variation.
(c)
• 1–2 marks: Explains that distinguishing the two matters because random variation is expected and does not necessarily imply a problem, while non-random variation could invalidate conclusions or indicate biased results.
• Award full 2 marks if the explanation clearly links the distinction to making reliable inferences about the proportion of students who revise at home.
