AP Syllabus focus:
‘Verify independence by ensuring data is collected from two independent random samples or through a randomized experiment. If sampling without replacement, the sample size for each group should be less than 10% of its population (n1/N1 ≤ 10% and n2/N2 ≤ 10%).’
Understanding how to check independence for inference with two population proportions ensures that statistical conclusions are valid, reliable, and grounded in properly collected data.
Why Independence Matters in Two-Proportion Inference
Independence is a foundational requirement for conducting a two-sample z-test or building a confidence interval for the difference between two population proportions. Independence ensures that the behavior of one sample does not influence the behavior of the other, allowing variability to be modeled accurately. When independence fails, the standard error formulas and the assumed sampling distribution may no longer hold, which undermines the inference.
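For reference, the standard error these procedures usually rely on can be written as below. This is the common textbook form used for a confidence interval; the section itself does not state the formula, so the notation (sample proportions p̂₁ and p̂₂, sample sizes n₁ and n₂) is an assumption added for illustration.

```latex
SE_{\hat{p}_1 - \hat{p}_2}
  = \sqrt{\frac{\hat{p}_1\,(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2\,(1-\hat{p}_2)}{n_2}}
```

The two variance terms are simply added, with no covariance term between the samples. That step is only justified when the two samples are independent of each other.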
Core Requirement: Two Independent Random Samples or a Randomized Experiment
The syllabus emphasizes two acceptable conditions that justify independence:
Data must come from two independent random samples, or
The study must use a randomized experiment where units are randomly assigned to treatment groups.
Random Sampling and Independence
Random sampling protects against systematic bias and ensures that each sample represents its population. Two samples are considered independent when:

This diagram shows two separate populations, each with its own independently drawn sample. The samples do not overlap, illustrating that selection in one group does not influence selection in the other. The statistical labels included exceed the syllabus requirements but still emphasize the concept of independent samples.
Individuals selected for one sample play no role in the selection of individuals for the other sample.
No overlapping membership occurs between samples.
The sampling mechanism for one group does not alter or restrict sampling for the other.
Random Sample: A sampling method in which every member of the population has an equal chance of being selected, supporting unbiased estimation.
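The idea of two independent random samples can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name `draw_independent_samples`, the ID lists, and the seed are all hypothetical, and the population and sample sizes are chosen only for the example.

```python
import random

def draw_independent_samples(frame_1, frame_2, n_1, n_2, seed=None):
    """Draw one simple random sample from each of two separate sampling frames."""
    rng = random.Random(seed)
    # random.sample selects without replacement, and each call
    # looks only at its own frame, so neither draw restricts the other.
    sample_1 = rng.sample(frame_1, n_1)
    sample_2 = rng.sample(frame_2, n_2)
    return sample_1, sample_2

# Hypothetical sampling frames for two distinct populations
population_a = [f"A{i}" for i in range(8000)]
population_b = [f"B{i}" for i in range(10000)]

s1, s2 = draw_independent_samples(population_a, population_b, 400, 400, seed=1)
```

Because the two frames share no IDs, no individual can appear in both samples, which matches the non-overlap requirement listed above.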
Random assignment in experiments also creates independence, because treatments are allocated using chance, balancing out unknown confounding variables across groups.
Randomized Experiment: A study design in which participants are randomly assigned to different treatment groups, ensuring independence between groups for causal inference.
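Random assignment can be sketched in a few lines of Python. The helper `randomly_assign`, the participant labels, and the seed are hypothetical; an equal split into two groups is assumed only to keep the example short.

```python
import random

def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two treatment groups."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)    # chance alone decides the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels
participants = [f"P{i:02d}" for i in range(20)]
treatment, control = randomly_assign(participants, seed=7)
```

Every participant lands in exactly one group, and group membership depends only on the shuffle, not on any characteristic of the participant.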
Ensuring Independence When Sampling Without Replacement
When samples are selected without replacement, selection probability changes slightly after each draw.

This figure contrasts sampling with and without replacement. The lower panel highlights that once a unit is selected it cannot be chosen again, which is the condition relevant for applying the 10% rule to justify approximate independence. The upper panel provides additional context but is not required by the syllabus.
This can introduce dependence among observations if the sample constitutes a large portion of the population. To control this, AP Statistics requires verification of the 10% condition.
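A small calculation shows how much the selection probability shifts after one draw. The sketch below assumes hypothetical populations in which 40% of members are "successes"; the function name and the numbers are illustrative only.

```python
from fractions import Fraction

def second_draw_success_prob(successes, population_size, first_was_success):
    """P(second draw is a success | outcome of first draw), without replacement."""
    remaining_successes = successes - (1 if first_was_success else 0)
    return Fraction(remaining_successes, population_size - 1)

# Both hypothetical populations start with a 40% success rate.
small = second_draw_success_prob(4, 10, first_was_success=True)       # 3 of 9 remain
large = second_draw_success_prob(400, 1000, first_was_success=True)   # 399 of 999 remain
```

In the population of 10, one draw moves the probability from 0.40 to about 0.33. In the population of 1,000, it barely moves from 0.40, which is why dependence matters only when the sample is a large share of the population.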
The 10% Condition
For each sample:
The sample size must satisfy n₁ ≤ 0.10N₁ for population 1.
The sample size must satisfy n₂ ≤ 0.10N₂ for population 2.
These conditions ensure that the sample is small enough relative to the population that the draws behave approximately as if independent, even though selection occurs without replacement.
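The check itself is simple arithmetic and can be sketched as follows. The function names are hypothetical; the comparison is written with integers so that floating-point rounding cannot affect a borderline case.

```python
def meets_ten_percent_condition(n, N):
    """Return True when sample size n is at most 10% of population size N."""
    return 10 * n <= N   # integer form of n <= 0.10 * N

def both_samples_pass(n1, N1, n2, N2):
    """The condition must hold for each sample separately."""
    return meets_ten_percent_condition(n1, N1) and meets_ten_percent_condition(n2, N2)

# Samples of 400 from populations of 8,000 and 10,000
result = both_samples_pass(400, 8000, 400, 10000)
```

Here `result` is `True`, since 400 is 5% of 8,000 and 4% of 10,000. A single failing sample is enough to make the combined check fail.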

This table compares probabilities under independent and non-independent sampling for different classroom sizes. As the population grows relative to the fixed sample size, the probabilities converge, illustrating why the 10% condition allows draws to be treated as approximately independent. The football-themed example extends beyond the syllabus but reinforces the underlying reasoning.
10% Condition: A rule stating that when sampling without replacement, the sample size should be no more than 10% of the population to maintain approximate independence.
Why the 10% Condition Protects Independence
The 10% threshold limits dependence because removing such a small fraction of the population barely alters the composition of the remaining population. In practice, this ensures:
Variability calculations based on independent trials remain accurate.
The sampling distribution of the difference in sample proportions maintains its expected behavior.
Approximate independence justifies the use of standard formulas for standard error.
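One way to quantify these points is the finite population correction factor, √((N − n)/(N − 1)), which scales the standard error when a simple random sample is drawn without replacement. This factor is a standard sampling result that goes beyond what this section states, so it is included here as an added assumption rather than as part of the syllabus. The population and sample sizes below are hypothetical.

```python
import math

def finite_population_correction(n, N):
    """Factor that scales the standard error under sampling without replacement."""
    return math.sqrt((N - n) / (N - 1))

fpc_at_10_percent = finite_population_correction(100, 1000)   # sample is 10% of population
fpc_at_50_percent = finite_population_correction(500, 1000)   # sample is 50% of population
```

At the 10% boundary the factor is roughly 0.95, so the independent-trials formula is off by only about 5%. At 50% the factor drops to roughly 0.71, and treating the draws as independent would noticeably overstate the standard error.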
Independence Between the Two Samples
In addition to independence within each sample, the two samples must also be independent of each other. This means:
Sampling from one population should not influence sampling from the second.
There must be no overlap in individuals or units.
The populations themselves should be conceptually distinct unless random assignment created the groups.
Situations That Violate Independence
Students should be aware of common violations of independence, such as:
Paired or matched samples, which require different inference procedures.
Clusters or groups where individuals share characteristics causing dependence.
Respondents belonging to both samples, creating overlap.
Self-selection into groups where individuals choose their category rather than being sampled from separate populations.
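Of these violations, overlapping membership is the easiest to detect mechanically. A minimal sketch, assuming each respondent carries a unique ID (the IDs and the helper name `shared_members` are hypothetical):

```python
def shared_members(sample_1_ids, sample_2_ids):
    """Return the IDs that appear in both samples (an empty set means no overlap)."""
    return set(sample_1_ids) & set(sample_2_ids)

# Hypothetical respondent IDs; "r03" was recorded in both samples
overlap = shared_members(["r01", "r02", "r03"], ["r03", "r04"])
```

A non-empty result flags respondents who belong to both samples. The other violations (pairing, clustering, self-selection) come from the study design and cannot be detected from ID lists alone.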
Key Checks for Independence in Practice
To meet the AP requirements, verify the following:
1. Sampling or Assignment Method
Was each group formed using random sampling?
If experimental, were subjects randomly assigned to treatment conditions?
Does the study design explicitly prevent relationships between samples?
2. Population Structure
Are the populations distinct or treated independently?
Is there any risk of overlap or cross-influence?
3. Sample Size Relative to Population
Apply the 10% condition for both groups:
n₁/N₁ ≤ 10%
n₂/N₂ ≤ 10%
4. Conceptual Independence
Could responses or behaviors in one group affect those in the other?
Does the context suggest natural dependence (e.g., siblings in samples, repeated measures)?
Summary of What Independence Ensures for Inference
Students should recognize that independence enables:
Valid modeling of sampling variability
Accurate standard error calculations
Justifiable use of the two-sample z procedures for proportions
FAQ
Independence is based on the sampling process, not on how similar the populations appear. Two samples remain independent as long as individuals in one sample are not eligible for selection into the other and the sampling procedures do not influence each other.
If both samples use separate sampling frames and no individual can be selected twice, independence is maintained even when populations share demographic characteristics.
Overlapping collection periods do not by themselves break independence. Independence concerns the mechanism of selection, not the timing.
Sampling periods can overlap as long as:
• the samples draw from distinct populations or non-overlapping lists
• the selection of individuals in one sample does not affect who is selected in the other
Time overlap becomes relevant only if behaviour in one group could alter responses in the other.
When populations are very large (for example, millions), the 10% condition is almost always satisfied because samples rarely exceed even a small fraction of the population.
However, it is still good practice to reference the condition to demonstrate that dependence caused by sampling without replacement is negligible. Explicitly showing that the sample is tiny relative to the population strengthens the justification for independence.
Independence can fail even when both samples are random. Random sampling alone does not guarantee independence if the sampling frames overlap or if the sampling methods interact.
Common issues include:
• selecting households for both samples when individuals may appear in multiple household units
• using the same random seed or method that links selection across groups
• logistical constraints that cause one sample to depend on the other’s outcomes
Ensuring separate sampling frames prevents these problems.
Matched-pair designs deliberately link observations across groups, meaning responses in one group are dependent on those in the other. This violates the requirement for independent samples.
Cluster sampling can also break independence if both samples draw individuals from shared clusters, creating correlated responses.
Because both designs introduce structural dependence, they require different inference methods and cannot be treated as independent two-sample proportion settings.
Practice Questions
Question 1 (1–3 marks)
A researcher selects two separate simple random samples: one from a population of 8,000 adults and another from a different population of 10,000 adults. Each sample contains 400 individuals. Explain whether the independence condition for inference on two population proportions is satisfied.
Question 1
• 1 mark for stating that each sample is a simple random sample from its own population.
• 1 mark for checking the 10% condition for at least one sample (e.g., 400 is 5% of 8,000; 400 is 4% of 10,000).
• 1 mark for concluding that independence is satisfied because both samples meet the 10% condition and are drawn from distinct populations.
Question 2 (4–6 marks)
A school district wants to compare the proportion of students who participate in after-school activities between two independent secondary schools. School A has 1,200 students, and a random sample of 90 students is taken. School B has 950 students, and a random sample of 120 students is taken.
(a) State the conditions required to justify independence for inference on the difference between two population proportions.
(b) Verify whether each condition is met for both schools.
(c) Explain why independence is essential for valid inference in this context.
Question 2
(a) (1–2 marks)
• 1 mark for stating that the two samples must be independently obtained through random sampling or random assignment.
• 1 mark for stating that when sampling without replacement, each sample must be no more than 10% of its population.
(b) (2–3 marks)
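• 1 mark for verifying that 90 ≤ 10% of 1,200 (120) for School A, so the 10% condition is met.
• 1 mark for showing that 120 > 10% of 950 (95) for School B, so the 10% condition is not met.
• 1 mark for correctly concluding that School A meets the independence requirement but School B does not, so inference that treats School B's observations as independent should be made with caution.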
(c) (1 mark)
• 1 mark for explaining that independence ensures that the samples do not influence each other, allowing valid estimation of variability and appropriate use of two-proportion inference methods.
