AP Statistics study notes

8.3.1 Calculating the Chi-Square Statistic

AP Syllabus focus:
‘Calculate the chi-square statistic using the formula: χ² = Σ((Observed count - Expected count)² / Expected count), summing across all categories.
- Understand that the degrees of freedom for this test are calculated as (number of categories - 1).
- The chi-square statistic measures how much the observed counts deviate from the expected counts under the null hypothesis.’

This section introduces how the chi-square statistic quantifies discrepancies between observed and expected counts, forming the foundation for inference using the chi-square distribution in categorical data analysis.

Calculating the Chi-Square Statistic

The Role of the Chi-Square Statistic

The chi-square statistic is a numerical measure that reflects the extent to which observed categorical counts differ from the counts predicted under a null hypothesis. Because it evaluates the size of deviations across all categories simultaneously, it provides a single, interpretable value summarizing the evidence against the assumed distribution. A larger statistic indicates a greater mismatch between observation and expectation, suggesting the observed data may not align with the proposed model.

Observed and Expected Counts

In chi-square testing, observed counts are the actual frequencies recorded in each category, while expected counts represent the hypothetical frequencies predicted when the null hypothesis is assumed true. Expected counts are calculated before the chi-square statistic and act as a benchmark for assessing variation in the observed data.

Observed Count: The actual frequency recorded for a category in the collected sample.

Expected counts must be determined from the null hypothesis before the statistic is calculated, because they set the standard for judging how surprising the observed results are.

Expected Count: The predicted frequency for a category under the assumption that the null hypothesis is correct.
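
As a minimal sketch of how expected counts follow from a null hypothesis, the snippet below multiplies the sample size by each hypothesized proportion. The function name `expected_counts`, the sample size of 200, and the proportions are illustrative assumptions, not values from these notes.

```python
def expected_counts(sample_size, null_proportions):
    """Return the expected count for each category under the null hypothesis."""
    # The hypothesized proportions must describe a full distribution.
    if abs(sum(null_proportions) - 1.0) > 1e-9:
        raise ValueError("null proportions must sum to 1")
    return [sample_size * p for p in null_proportions]


# Hypothetical example: 200 observations, null proportions 0.5, 0.3, 0.2.
expected = expected_counts(200, [0.5, 0.3, 0.2])
print([round(e, 2) for e in expected])
```

Note that the expected counts add up to the sample size, just as the observed counts do.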

Structure of the Chi-Square Formula

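The chi-square statistic is formed by summing a standardized measure of deviation across all categories. Each deviation is squared to prevent positive and negative differences from canceling out, then divided by the expected count to adjust for scale. Dividing by the expected count expresses each squared deviation relative to the size of the count it is compared against, so a deviation of 10 counts for more when 20 were expected than when 200 were. As a result, categories with small expected counts can contribute heavily to the statistic even when their deviations are moderate.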

EQUATION

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
$\chi^2$ = The test statistic measuring total deviation
$\text{Observed}$ = Recorded count in a category
$\text{Expected}$ = Predicted count under the null hypothesis

Each category contributes its own term to the sum. Because the deviation is squared, a category's contribution grows with the square of its deviation: doubling a deviation quadruples that term.
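
The formula can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the die-roll counts are made up for the example and do not come from these notes.

```python
def chi_square_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per category")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, testing whether it is fair.
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6  # 60 rolls / 6 faces under the null hypothesis

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 2))
```

Here the six terms are 0.4, 0.4, 0.1, 0.1, 1.6 and 1.6, so the two faces that deviate by 4 account for most of the total.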

Degrees of Freedom

Every chi-square statistic is interpreted in relation to a chi-square distribution, which is determined by its degrees of freedom. In the goodness-of-fit setting, degrees of freedom depend solely on the number of categories being evaluated.

EQUATION

$$df = \text{Number of Categories} - 1$$
$df$ = Number of independent pieces of information

Why Degrees of Freedom Matter

Degrees of freedom shape the chi-square distribution used to compute a p-value, making them essential for inference. A distribution with fewer degrees of freedom is more right-skewed, while larger values lead to a shape closer to a normal distribution. This affects how extreme a given chi-square value appears. Because the test assesses deviations across multiple categories, reducing one degree of freedom accounts for the constraint created by the total sample size and the probabilities summing to one.
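
To see how the degrees of freedom change how extreme a statistic looks, here is a rough sketch that computes the upper-tail probability (the p-value) of one chi-square value under several df. The helper `chi_square_upper_tail` is a hypothetical name, and the statistic of 6.0 is an arbitrary example. The helper uses a standard series for the incomplete gamma function so that only Python's standard library is needed; in practice a calculator or statistics package would handle this step.

```python
import math


def chi_square_upper_tail(statistic, df):
    """Approximate P(X >= statistic) for a chi-square variable with df degrees of freedom."""
    if statistic <= 0:
        return 1.0
    a, z = df / 2, statistic / 2
    # Series for the lower incomplete gamma function:
    # gamma(a, z) = z^a * e^(-z) * sum over n >= 0 of z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    # Divide by Gamma(a) to get the cumulative probability below the statistic.
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# The same statistic looks less extreme as df grows.
for df in (1, 2, 5, 10):
    print(df, round(chi_square_upper_tail(6.0, df), 4))
```

The printed tail probabilities increase with df, which is why the same chi-square value can be strong evidence against the null hypothesis with few categories and unremarkable with many.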

Interpreting the Chi-Square Statistic

The chi-square statistic functions as a measure of overall discrepancy. Its magnitude reflects how far the observed pattern diverges from theoretical expectations under the null hypothesis. Important interpretive principles include:

  • A small statistic suggests only minor deviations attributable to random variation.

  • A large statistic indicates substantial inconsistencies that may challenge the null hypothesis.

  • The statistic alone does not determine significance; its extremity must be assessed using the appropriate chi-square distribution.

Key Considerations When Calculating the Statistic

To ensure proper interpretation and valid inference, the following points guide effective calculation:

  • All categories must be included in the summation to capture the total discrepancy.

  • Expected counts should never be substituted by observed counts, as this would undermine the logic of the test.

  • Large deviations contribute quadratically, making the statistic sensitive to categories where observed behavior strongly conflicts with theory.

  • The test is inherently one-sided, as the statistic can only increase with larger discrepancies; it cannot take negative values.

Practical Interpretation in Context

Within the framework of hypothesis testing, the chi-square statistic provides essential evidence about the fit between data and model. By quantifying how unexpected the observed counts are, the statistic becomes the central link between sample data and population-level conclusions. A thorough understanding of how it is calculated ensures accurate reasoning about whether a proposed categorical distribution is plausible under the null hypothesis.

Chi-square probability density functions for several degrees of freedom. Smaller df values create sharply right-skewed curves, while larger df values produce broader, more symmetric shapes. The image includes more df values than typically used in AP Statistics but clearly illustrates how the distribution changes with df.

Bar chart showing the contribution of each category to a chi-square statistic. Taller bars represent categories with larger standardized deviations from expected counts. The chart includes specific category labels beyond AP requirements but clearly demonstrates how individual terms build the total chi-square value.

FAQ

Squaring ensures all deviations contribute positively, preventing cancellation between categories where observed counts are above or below expectations.

It also makes larger deviations disproportionately more influential, helping highlight categories with unusually large differences. This property supports the sensitivity needed for detecting departures from the null hypothesis.

For a fixed pattern of proportional differences between observed and expected counts, the chi-square statistic grows in direct proportion to the sample size. Even small proportional differences can therefore produce a large statistic when the sample is large enough.

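However, larger samples also give larger expected counts, which makes the chi-square distribution a better approximation and the test more reliable. With very small samples the expected counts can be too small for the approximation to hold, so the statistic and its p-value can be misleading.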
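
The effect of sample size can be checked with a small sketch. The counts below are hypothetical: both samples have the same proportions in each category, but the second sample is ten times larger.

```python
def chi_square_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: identical proportions, different sample sizes.
small_stat = chi_square_statistic([18, 22, 20], [20, 20, 20])          # n = 60
large_stat = chi_square_statistic([180, 220, 200], [200, 200, 200])    # n = 600

print(round(small_stat, 2), round(large_stat, 2))
```

The statistic rises from 0.4 to 4.0, a tenfold increase that matches the tenfold increase in sample size.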

Yes. Many different patterns of deviation can sum to the same overall chi-square value, because the statistic measures total discrepancy, not the specific distribution of discrepancies.

This is why it is often useful to look at individual category contributions when interpreting results, even though the chi-square test itself is based solely on the total statistic.
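
As an illustration, the sketch below compares two hypothetical sets of observed counts against the same expected counts. The numbers are invented so that the totals match while the individual contributions differ.

```python
def contributions(observed, expected):
    """Return each category's term (observed - expected)^2 / expected."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


expected = [20, 20, 20, 20]
pattern_a = contributions([23, 23, 17, 17], expected)  # four equal, small deviations
pattern_b = contributions([25, 17, 19, 19], expected)  # one larger deviation dominates

print([round(c, 2) for c in pattern_a], round(sum(pattern_a), 2))
print([round(c, 2) for c in pattern_b], round(sum(pattern_b), 2))
```

Both patterns give a chi-square statistic of 1.8, but in the second one a single category supplies most of it.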

Expected counts define the theoretical distribution under the null hypothesis, forming the baseline against which deviations are measured.

Without expected counts, there is no reference point to determine whether an observed count is unusually high or low. They also ensure each category's contribution is standardised relative to its expected frequency.

Large expected counts tend to dilute the influence of observed deviations in that category because each discrepancy is divided by a larger number.

This means:

  • A category with a very high expected count must show a very large deviation to meaningfully affect the chi-square total.

  • Conversely, categories with small expected counts are more sensitive to moderate deviations.

This imbalance is why tests require all expected counts to be sufficiently large to avoid distortion.
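
A short sketch makes the scaling concrete. The two categories below are hypothetical; each has an observed count 10 above its expected count.

```python
def contribution(observed, expected):
    """One category's term in the chi-square sum."""
    return (observed - expected) ** 2 / expected


# Same deviation of 10, very different expected counts.
small_expected_term = contribution(30, 20)     # 100 / 20
large_expected_term = contribution(210, 200)   # 100 / 200

print(small_expected_term, large_expected_term)
```

The deviation is identical, yet the category with the smaller expected count contributes ten times as much (5.0 versus 0.5).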

Practice Questions

Question 1 (1–3 marks)

A chi-square goodness-of-fit test is carried out with four categories. The observed and expected counts differ slightly in each category. Explain what the chi-square statistic measures in this context.
(1–3 marks)

Question 1
1 mark: States that the chi-square statistic measures the discrepancy or difference between observed and expected counts.
2 marks: Explains that the statistic sums these discrepancies across all categories in a standardised way.
3 marks: Clearly states that larger values indicate greater departure from what would be expected under the null hypothesis.

Question 2 (4–6 marks)

A researcher studies preferences for three types of music: Classical, Pop, and Jazz. Under the null hypothesis, each category is equally likely. In a random sample of 150 people, the observed counts are: Classical 42, Pop 63, Jazz 45.
(a) Calculate the expected count for each category under the null hypothesis.
(b) Calculate the chi-square statistic for the data using the formula sum of (Observed − Expected) squared divided by Expected.
(c) State the degrees of freedom for this test and briefly comment on whether the observed data appear to differ meaningfully from the expected distribution.
(4–6 marks)

Question 2

(a) Expected counts

  • 1 mark: Correctly states the expected count is 50 for each category (150 people equally across 3 categories).

(b) Chi-square statistic

  • 1 mark: Correct substitution of observed and expected values into the formula.

  • 1 mark: Correct intermediate calculations for each category’s contribution.

  • 1 mark: Correct chi-square value (approximately 5.16, from 1.28 + 3.38 + 0.50).

(c) Degrees of freedom and comment

  • 1 mark: Correct degrees of freedom: 2 (number of categories minus 1).

  • 1 mark: Makes a sensible comment that the chi-square statistic is not extremely large, indicating only moderate deviation from the expected distribution (no requirement to conclude significance).

Total: 6 marks.
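
One way to check working for Question 2 is to recompute each quantity from the data given in the question. The sketch below does this in plain Python; the variable names are illustrative.

```python
observed = {"Classical": 42, "Pop": 63, "Jazz": 45}

n = sum(observed.values())        # 150 people in the sample
expected = n / len(observed)      # equal likelihood under the null hypothesis

# Each category's term: (observed - expected)^2 / expected
contributions = {
    name: (count - expected) ** 2 / expected
    for name, count in observed.items()
}
statistic = sum(contributions.values())
df = len(observed) - 1

for name, value in contributions.items():
    print(f"{name}: {value:.2f}")
print(f"chi-square = {statistic:.2f}, df = {df}")
```

The contributions are 1.28, 3.38 and 0.50, giving a chi-square statistic of 5.16 with 2 degrees of freedom. Pop supplies most of the total.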
