AP Statistics study notes

7.9.3 Making a Formal Decision

AP Syllabus focus:
‘Compare the p-value to the predetermined level of significance (α). If the p-value ≤ α, reject the null hypothesis, concluding there is statistical evidence of a difference in population means. If the p-value > α, fail to reject the null hypothesis, indicating insufficient evidence to support a difference in population means.’

Formal decision-making in hypothesis testing requires comparing the p-value to a chosen significance level, allowing researchers to determine, in context, whether the sample provides convincing evidence against the null hypothesis.

Making a Formal Decision in a Two-Sample t-Test

A formal decision in statistical inference links the numerical result of a hypothesis test to a clear, contextual conclusion. AP Statistics emphasizes interpreting this decision using both the p-value and the significance level (α), ensuring that students can justify whether the data provide convincing statistical evidence against the null hypothesis (H₀). The syllabus requires understanding that the decision rule is grounded in probability and that conclusions must reflect the context of the research question rather than purely mathematical outcomes.

Understanding the Role of the p-Value and Significance Level

The p-value is a probability describing how likely it is to obtain a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.

Figure: A sampling distribution under the null hypothesis, with the p-value represented as the shaded tail area beyond the observed statistic. The figure reinforces that the p-value is the probability of obtaining results at least as extreme as the sample outcome, assuming the null hypothesis is true.

The p-value reflects the compatibility of the data with the null model and helps determine whether the observed difference in sample means is unusual enough to question that model.

p-value: The probability, calculated under the assumption that the null hypothesis is true, of obtaining a test statistic as extreme as or more extreme than the observed statistic.
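To make the "tail area" idea concrete, here is a minimal sketch that computes a two-sided p-value from a t statistic by numerically integrating the t-distribution density. The function names `t_density` and `two_sided_p_value`, the use of Simpson's rule, and the example values (t = 2.0 with 10 degrees of freedom) are illustrative assumptions, not part of the syllabus. On the exam, a calculator or software reports this value directly.

```python
import math


def t_density(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat: float, df: float, steps: int = 2000) -> float:
    """Area in both tails beyond |t_stat|, approximated with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0  # every outcome is at least as extreme as t = 0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    central_half = total * h / 3  # area between 0 and |t_stat|
    return max(0.0, 1.0 - 2.0 * central_half)


# Hypothetical example: observed t = 2.0 with 10 degrees of freedom.
print(round(two_sided_p_value(2.0, df=10), 3))
```

The larger the observed |t|, the smaller the shaded tail area, and so the smaller the p-value.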

Because the p-value is a probability, it must be compared to a predetermined benchmark: the significance level.

Significance level (α): A threshold that specifies how unlikely a sample result must be, assuming the null hypothesis is true, to justify rejecting the null hypothesis.

Researchers choose α before collecting data to control the long-run probability of committing a Type I error, defined as mistakenly rejecting a true null hypothesis. Common choices include 0.05, 0.01, or 0.10, depending on the context and consequences of errors.

Figure: A t-distribution with shaded critical regions for α = 0.05, showing that only the most extreme 5% of outcomes lead to rejecting the null hypothesis. The figure illustrates how the significance level sets the boundary for unusual results under the null model.

A meaningful comparison between the p-value and α allows a decision that is both mathematically justified and aligned with experimental goals.

The Decision Rule for Hypothesis Tests

The AP Statistics syllabus specifies a simple yet powerful rule for making the formal decision:

  • If the p-value ≤ α, reject the null hypothesis.

  • If the p-value > α, fail to reject the null hypothesis.

This rule connects the probability framework to practical inference. It ensures that claims are based on evidence strong enough to surpass a predetermined threshold, preventing overly confident conclusions from random variation alone.
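The decision rule above can be written as a short function. This is a minimal sketch: the name `formal_decision` and the input checks are illustrative choices, and the example values are the p-values used in the FAQ and practice questions of these notes.

```python
def formal_decision(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 when p-value <= alpha, otherwise fail to reject."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(formal_decision(0.028, alpha=0.05))  # reject H0
print(formal_decision(0.18, alpha=0.05))   # fail to reject H0
print(formal_decision(0.04, alpha=0.01))   # fail to reject H0
```

Note that a p-value exactly equal to α leads to rejection, because the rule uses ≤ rather than <.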

To support clear understanding, the reasoning can be expressed as follows:

  • When the p-value is small, the observed statistic would be unlikely if the null hypothesis were true. Such evidence suggests that the null hypothesis may not adequately explain the data and supports the alternative hypothesis.

  • When the p-value is large, the observed result is plausible under the null hypothesis, so there is not enough evidence to conclude a difference in population means.

Avoiding Misinterpretations and Using Appropriate Language

Students often misinterpret p-values as direct probabilities of hypotheses being true or false. However, the formal decision must remain grounded in what the p-value actually measures: the probability of obtaining data at least as extreme as observed, assuming the null hypothesis is true. Therefore, the decision should be stated using careful, contextualized language.

Correct statements emphasize evidence, not certainty. For example, rejecting H₀ does not prove the alternative hypothesis is true; it simply indicates that the sample provides statistically significant evidence supporting it. Likewise, failing to reject H₀ does not confirm that the null hypothesis is true. It means only that the test did not detect enough evidence against it given the sample size and variability.

Incorporating Context into the Final Decision

The syllabus stresses the importance of presenting the decision within the context of the research question.

Figure: A flowchart connecting reject and fail-to-reject decisions to appropriate conclusion statements, reinforcing the requirement to express decisions in context.

This means that after determining whether to reject or fail to reject the null hypothesis, the conclusion must translate the statistical result into a meaningful statement about the populations under study.

Strong contextual conclusions include three elements:

  • Decision based on the p-value and α

  • Strength of evidence, stated as convincing or not convincing at the chosen α

  • Connection to the real-world variables and populations

For example, rather than stating “Reject H₀ at α = 0.05,” an appropriate contextual conclusion would reference the difference in population means and what that difference implies about the scenario.
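One way to practice writing such conclusions is to treat them as a template with the three elements filled in. The sketch below is only an illustration: the function name `contextual_conclusion`, the exact sentence wording, and the caffeine scenario are hypothetical and are not prescribed by the syllabus.

```python
def contextual_conclusion(p_value: float, alpha: float, claim: str) -> str:
    """Combine the decision, the strength of evidence, and the context."""
    if p_value <= alpha:
        comparison, decision, evidence = "at most", "we reject H0", "convincing"
    else:
        comparison, decision, evidence = (
            "greater than", "we fail to reject H0", "not convincing")
    return (
        f"Because the p-value ({p_value}) is {comparison} alpha ({alpha}), "
        f"{decision}. There is {evidence} statistical evidence that {claim}."
    )


# Hypothetical scenario comparing two population means.
claim = ("the mean reaction time differs between the caffeine group "
         "and the placebo group")
print(contextual_conclusion(0.012, 0.05, claim))
print(contextual_conclusion(0.31, 0.05, claim))
```

Both outputs name the populations and the parameter being compared, which is what separates a contextual conclusion from a bare "reject H₀".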

Importance of Pre-Determined Significance Level

Setting α before analyzing data ensures objectivity and prevents researchers from manipulating thresholds to achieve desired outcomes. Because α represents a controlled probability of a false-positive conclusion, selecting it ahead of time preserves the integrity of hypothesis testing.

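Furthermore, α anchors the formal decision in long-term reasoning: over many repeated studies in which the null hypothesis is true, a proportion of about α of the tests would be expected to reject it. Choosing α therefore fixes the risk of a Type I error, while a smaller α makes it harder to reject H₀ and so raises the risk of a Type II error (failing to reject a false null hypothesis). In this way, hypothesis testing acknowledges that decisions are based on probabilities rather than certainties.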
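The long-run interpretation of α can be checked with a small simulation. The sketch below makes simplifying assumptions that are not part of these notes: both groups are drawn from the same normal population (so H₀ is true by construction), the population standard deviation is treated as known, and a two-sample z-test stands in for the t-test so that only the Python standard library is needed. The function name and default settings are illustrative.

```python
import random
from statistics import NormalDist, fmean


def simulate_type_i_rate(alpha: float = 0.05, n: int = 20,
                         trials: int = 5000, seed: int = 1) -> float:
    """Fraction of tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    sigma = 1.0  # assumed known, which is why a z-test is used here
    std_error = sigma * (2 / n) ** 0.5
    rejections = 0
    for _ in range(trials):
        # Both groups share one population, so the true means are equal.
        group_a = [rng.gauss(0.0, sigma) for _ in range(n)]
        group_b = [rng.gauss(0.0, sigma) for _ in range(n)]
        z = (fmean(group_a) - fmean(group_b)) / std_error
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1  # a Type I error: rejecting a true H0
    return rejections / trials


print(simulate_type_i_rate(alpha=0.05))
print(simulate_type_i_rate(alpha=0.01))
```

Each printed proportion should land close to the α that was chosen, which is the sense in which α controls the Type I error rate.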

Summary of the Decision-Making Steps

  • Determine the significance level (α) before analyzing data.

  • Compute or obtain the p-value from the t-test.

  • Compare the p-value to α.

  • Reject H₀ if p-value ≤ α; fail to reject H₀ if p-value > α.

  • State the decision clearly and relate it to the context of the population means being compared.

FAQ

Changing the significance level after observing the p-value undermines the integrity of the hypothesis test because it introduces bias.

It effectively allows the researcher to tailor the rules to obtain a preferred outcome, increasing the risk of Type I errors.

In formal statistical practice, alpha must be selected in advance to preserve objectivity and maintain the long-run error rate of the testing procedure.

Failing to reject the null hypothesis reflects insufficient evidence against it, not confirmation that it is true.

The sample may simply not provide strong enough evidence, perhaps due to limited size or high variability.

Saying “accept” implies certainty, which statistical inference cannot provide since decisions are based on probability rather than proof.

Larger samples tend to produce more precise estimates, reducing the standard error so that a real difference of a given size yields a more extreme test statistic.

This increases the chance of rejecting the null hypothesis when a true effect exists.

Smaller samples may mask meaningful differences, leading to higher p-values and more frequent failures to reject the null hypothesis.
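The effect of sample size can be seen by holding the observed difference fixed and changing only n. The sketch below uses the standard two-sample t statistic, the difference in sample means divided by its standard error. The function name and the summary values (a difference of 2.0 with standard deviations of 10.0) are hypothetical.

```python
import math


def two_sample_t_statistic(mean_diff: float, sd_1: float, sd_2: float,
                           n_1: int, n_2: int) -> float:
    """t = (difference in sample means) / (standard error of the difference)."""
    std_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return mean_diff / std_error


# Same observed difference and spread, increasing sample size per group.
for n in (10, 40, 160):
    t = two_sample_t_statistic(mean_diff=2.0, sd_1=10.0, sd_2=10.0, n_1=n, n_2=n)
    print(f"n per group = {n}: t = {t:.2f}")
```

Each time n is quadrupled, the standard error halves and the t statistic doubles, so the same observed difference gives a smaller p-value.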

Different significance levels across studies can produce different decisions even when the p-values match.

For example, a p-value of 0.04 leads to rejection at alpha 0.05 but not at alpha 0.01.

Context, risk tolerance, and domain standards influence the choice of alpha, so decisions are not universally identical.

A result can still be of interest when the p-value exceeds alpha, because a single p-value does not capture all aspects of the data.

For instance:

  • A large p-value alongside a small sample size might still be consistent with a practically meaningful difference.

  • A wide confidence interval may show values of interest even if the p-value exceeds alpha.

Interpretation must consider the context, not just the numerical comparison to alpha.

Practice Questions

Question 1 (1–3 marks)
A researcher tests whether the mean concentration of a chemical in a river differs from a known safe level of 12.0 mg/L. The significance level is set at α = 0.05.
The p-value from the test is 0.028.
State the formal decision the researcher should make and justify it using the decision rule for hypothesis testing.

Question 1

  • 1 mark for correctly comparing the p-value to α.

    • Statement such as: "0.028 is less than 0.05."

  • 1 mark for the correct decision.

    • "Reject the null hypothesis."

  • 1 mark for contextual justification.

    • E.g., "There is sufficient evidence to conclude that the true mean concentration differs from 12.0 mg/L."

Question 2 (4–6 marks)
A sports scientist investigates whether a new training method affects the average sprint time of athletes. The null hypothesis states that the mean sprint time is equal to the current average of 10.4 seconds.
A one-sample t-test is carried out at the 5% significance level and produces a p-value of 0.18.

(a) Using the decision rule involving the p-value and alpha, state whether the null hypothesis should be rejected or not.
(b) Explain, in context, what this decision means regarding the effectiveness of the new training method.
(c) Give one reason why failing to reject the null hypothesis does not prove that the training method has no effect.

Question 2

(a)

  • 1 mark for stating that the p-value (0.18) is greater than α (0.05).

  • 1 mark for correct decision: "Fail to reject the null hypothesis."

(b)

  • 1–2 marks for a clear contextual interpretation.

    • Must refer to sprint times and the new training method.

    • Example earning full credit: "There is not sufficient statistical evidence to conclude that the new training method changes the average sprint time from 10.4 seconds."

(c)

  • 1–2 marks for a correct explanation of why failing to reject H0 does not prove it true.

    • Acceptable points include:

      • Lack of evidence is not proof of no effect.

      • The sample size may be too small to detect a real difference.

      • The data may have too much variability.
