Hypothesis testing is a formal statistical procedure used to assess whether observed data provides sufficient evidence to reject a claim about a population parameter. It is one of the most important topics in A-Level Statistics, tested by all major exam boards (AQA, Edexcel, OCR), and it underpins much of the statistical reasoning used in science, medicine, business, and social research.
At A-Level, hypothesis testing is introduced using the binomial distribution (and later the normal distribution at A2 for some boards). You will learn to set up hypotheses, choose significance levels, calculate test statistics or probabilities, and draw conclusions in context. The language and logic of hypothesis testing are precise — examiners reward students who use correct terminology and structure their arguments clearly.
This guide covers the full A-Level hypothesis testing framework, with a focus on binomial tests, critical regions, and the interpretation of results.
Core Concepts
What is a Hypothesis Test?
A hypothesis test starts with a claim (or assumption) about a population parameter — typically a probability $p$ or a mean $\mu$. We collect data and ask: "Is this data consistent with the claim, or does it provide evidence against it?"
The test follows a structured process:
- Define the hypotheses.
- Choose a significance level.
- Collect data and calculate the test statistic.
- Compare with the critical value (or calculate the $p$-value).
- Draw a conclusion in context.
Null and Alternative Hypotheses
The null hypothesis $H_0$ is the default assumption — typically that nothing has changed or that a parameter takes a specified value. For example: $H_0\colon p = 0.5$.
The alternative hypothesis $H_1$ specifies what we suspect might be true instead. Writing $p_0$ for the value specified in $H_0$, it can take one of three forms:
- One-tailed (upper): $H_1\colon p > p_0$ (we suspect $p$ is larger)
- One-tailed (lower): $H_1\colon p < p_0$ (we suspect $p$ is smaller)
- Two-tailed: $H_1\colon p \neq p_0$ (we suspect $p$ is different, but don't specify the direction)
The choice of $H_1$ depends on the context of the problem and must be decided before looking at the data.
Significance Level
The significance level $\alpha$ is the probability of incorrectly rejecting $H_0$ when it is actually true (a Type I error). Common significance levels are:
- $\alpha = 0.05$ (5%) — the most common
- $\alpha = 0.01$ (1%) — more stringent
- $\alpha = 0.10$ (10%) — more lenient
The smaller the significance level, the stronger the evidence needed to reject $H_0$.
The Binomial Test
At A-Level, many hypothesis tests involve a binomial distribution. If we observe $X$ successes in $n$ independent trials, each with probability $p$ of success, then under $H_0$: $X \sim B(n, p_0)$,
where $p_0$ is the value of $p$ specified in $H_0$.
To carry out the test, we calculate the probability of obtaining a result as extreme as (or more extreme than) the observed value $x$, assuming $H_0$ is true.
For a one-tailed test ($H_1\colon p > p_0$):
Calculate $P(X \geq x)$ under $H_0$. If this probability is less than $\alpha$, reject $H_0$.
For a one-tailed test ($H_1\colon p < p_0$):
Calculate $P(X \leq x)$ under $H_0$. If this probability is less than $\alpha$, reject $H_0$.
For a two-tailed test ($H_1\colon p \neq p_0$):
Calculate the probability in the relevant tail and compare with $\alpha/2$ (since the significance level is split between both tails).
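These tail probabilities follow directly from the binomial formula $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. A minimal Python sketch (function names are illustrative, using only the standard library):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def lower_tail(x, n, p):
    """P(X <= x): used when H1 is p < p0."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

def upper_tail(x, n, p):
    """P(X >= x): used when H1 is p > p0."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

# Illustration: X ~ B(20, 0.5), probability of 15 or more successes.
print(round(upper_tail(15, 20, 0.5), 4))  # 0.0207
```

Note that `lower_tail(x - 1, ...)` and `upper_tail(x, ...)` sum to 1, which is a useful sanity check when working with cumulative tables.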
Critical Regions and Critical Values
The critical region is the set of values of the test statistic that lead to rejection of $H_0$. The critical value is the boundary of this region.
For a binomial test with $H_1\colon p < p_0$ at the 5% level, the critical region consists of all values $x$ such that $P(X \leq x) \leq 0.05$. The largest such $x$ is the critical value.
For $H_1\colon p > p_0$, the critical region is in the upper tail: values $x$ such that $P(X \geq x) \leq 0.05$.
When the observed value falls in the critical region, we reject $H_0$. When it falls outside, we do not reject $H_0$.
The Actual Significance Level
Because the binomial distribution is discrete, we usually cannot achieve a significance level of exactly $\alpha$. The actual significance level is the probability of the critical region, which is as close to $\alpha$ as possible without exceeding it.
For example, if $X \sim B(20, 0.5)$ and the critical region is $X \leq 5$, then the actual significance level is $P(X \leq 5) \approx 0.0207$, not exactly $0.05$.
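Finding a lower-tail critical region amounts to scanning for the largest value whose cumulative probability stays at or below $\alpha$. A Python sketch (function names are illustrative):

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def lower_critical_region(n, p0, alpha):
    """Largest c with P(X <= c) <= alpha under H0, plus the actual significance level."""
    c = -1
    for x in range(n + 1):
        if binom_cdf(x, n, p0) <= alpha:
            c = x
        else:
            break  # the CDF is increasing, so no larger x can qualify
    actual = binom_cdf(c, n, p0) if c >= 0 else 0.0
    return c, actual

c, actual = lower_critical_region(20, 0.5, 0.05)
print(c, round(actual, 4))  # 5 0.0207
```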
Type I and Type II Errors
A Type I error occurs when we reject $H_0$ when it is actually true. The probability of a Type I error equals the significance level $\alpha$ (or, for a discrete test, the actual significance level).
A Type II error occurs when we fail to reject $H_0$ when it is actually false. The probability of a Type II error depends on the true value of the parameter and is harder to calculate.
| | $H_0$ true | $H_0$ false |
|---|---|---|
| Reject $H_0$ | Type I error | Correct decision |
| Don't reject $H_0$ | Correct decision | Type II error |
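The Type I error rate can be seen empirically by simulation: if $H_0$ is actually true, the long-run proportion of wrongly rejected tests matches the actual significance level. A sketch, assuming an illustrative test with $X \sim B(20, 0.3)$ and lower-tail critical region $X \leq 2$ (actual significance level $\approx 0.0355$):

```python
import random

random.seed(1)

n, p0 = 20, 0.3
critical = 2  # assumed lower-tail critical region X <= 2 at the 5% level

# Simulate many samples with H0 actually true (p = p0) and count
# how often we wrongly reject: this estimates the Type I error rate.
trials = 100_000
rejections = sum(
    1 for _ in range(trials)
    if sum(random.random() < p0 for _ in range(n)) <= critical
)
rate = rejections / trials
print(rate)  # close to 0.0355, not to the nominal 0.05
```

This also illustrates why the actual significance level, not the nominal $\alpha$, is the true Type I error probability for a discrete test.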
Writing Conclusions
Conclusions must be written in context and with appropriate language:
- Reject $H_0$: "There is sufficient evidence at the [x%] significance level to reject $H_0$ and conclude that [contextual statement about $H_1$]."
- Do not reject $H_0$: "There is insufficient evidence at the [x%] significance level to reject $H_0$. There is no significant evidence that [contextual statement about $H_1$]."
Important: we never say we "accept $H_0$" — we only say we "do not reject" it, because failing to find evidence against $H_0$ is not the same as proving it true.
Strategy Tips
Tip 1: Read the Context Carefully
The wording of the question tells you which alternative hypothesis to use. Phrases like "believes the proportion has increased" suggest $H_1\colon p > p_0$; "claims it has changed" suggests $H_1\colon p \neq p_0$.
Tip 2: Set Up Hypotheses Before Calculating
Always write down $H_0$ and $H_1$ before doing any calculations. This ensures you test the correct tail and use the correct comparison.
Tip 3: Use the Correct Tail Probability
For $H_1\colon p < p_0$, calculate $P(X \leq x)$ (lower tail). For $H_1\colon p > p_0$, calculate $P(X \geq x)$ (upper tail). Mixing these up is one of the most common errors.
Tip 4: State the Distribution Under
Explicitly write "Under $H_0$, $X \sim B(n, p_0)$". This earns a method mark and shows the examiner you understand the test framework.
Tip 5: Always Conclude in Context
A conclusion that says only "reject $H_0$" without reference to the real-world situation will lose marks. Always relate your answer back to the scenario described in the question.
Worked Example: Example 1
A manufacturer claims that 10% of items produced are defective. A quality inspector tests a random sample of 20 items and finds 4 defective. Test, at the 5% significance level, whether the proportion of defective items is greater than 10%.
$H_0\colon p = 0.1$ (the proportion of defective items is 10%)
$H_1\colon p > 0.1$ (the proportion is greater than 10%)
Significance level: $\alpha = 0.05$ (one-tailed test).
Under $H_0$: $X \sim B(20, 0.1)$, where $X$ is the number of defective items.
Observed value: $x = 4$.
Calculate $P(X \geq 4)$ under $H_0$:
Using binomial tables or a calculator: $P(X \geq 4) = 1 - P(X \leq 3) \approx 1 - 0.8670 = 0.1330$
Since $0.1330 > 0.05$, we do not reject $H_0$.
Conclusion: There is insufficient evidence at the 5% significance level to conclude that the proportion of defective items is greater than 10%.
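As a check, the tail probability can be computed in a few lines of Python, taking the example's figures to be $n = 20$, $p_0 = 0.1$, $x = 4$ (illustrative values):

```python
from math import comb

n, p0, x = 20, 0.1, 4  # illustrative figures for this example

# Upper-tail p-value: P(X >= x) under H0, with X ~ B(n, p0)
p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x, n + 1))
print(round(p_value, 4))  # 0.133, which exceeds 0.05, so do not reject H0
```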
Worked Example: Example 2
A coin is suspected of being biased. It is tossed 20 times and lands on heads 15 times. Test at the 5% significance level whether the coin is biased towards heads.
$H_0\colon p = 0.5$ (the coin is fair)
$H_1\colon p > 0.5$ (the coin is biased towards heads)
Significance level: $\alpha = 0.05$ (one-tailed).
Under $H_0$: $X \sim B(20, 0.5)$.
Observed value: $x = 15$, giving $P(X \geq 15) \approx 0.0207$.
Since $0.0207 < 0.05$, we reject $H_0$.
Conclusion: There is sufficient evidence at the 5% significance level to conclude that the coin is biased towards heads.
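The same check in Python, taking the example's figures to be $n = 20$, $p_0 = 0.5$, $x = 15$ (illustrative values):

```python
from math import comb

n, p0, x, alpha = 20, 0.5, 15, 0.05  # illustrative figures for this example

# Upper-tail p-value: P(X >= x) under H0, with X ~ B(n, p0)
p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x, n + 1))
print(round(p_value, 4))  # 0.0207
print("reject H0" if p_value < alpha else "do not reject H0")  # reject H0
```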
Worked Example: Example 3
Historically, 60% of students at a school achieve a grade A in maths. After introducing a new teaching method, a random sample of 20 students is taken and 15 achieve grade A. Test at the 5% significance level whether there is evidence that the proportion has changed.
$H_0\colon p = 0.6$, $H_1\colon p \neq 0.6$ (two-tailed test)
Significance level: $\alpha = 0.05$, so each tail has $0.025$.
Under $H_0$: $X \sim B(20, 0.6)$.
Observed value: $x = 15$. Since $15 > np_0 = 12$, we test the upper tail.
Using a calculator: $P(X \geq 15) \approx 0.1256$ (approximately)
Since $0.1256 > 0.025$ (the tail probability exceeds $\alpha/2$, the comparison value for each tail in a two-tailed test), we do not reject $H_0$.
Conclusion: There is insufficient evidence at the 5% significance level to conclude that the proportion of students achieving grade A has changed following the new teaching method.
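The two-tailed comparison can be verified in Python, taking the example's figures to be $n = 20$, $p_0 = 0.6$, $x = 15$ at the 5% level (illustrative values):

```python
from math import comb

n, p0, x = 20, 0.6, 15  # illustrative figures for this example
alpha = 0.05            # two-tailed, so each tail is compared with alpha / 2

# Observed x = 15 lies above the mean n * p0 = 12, so test the upper tail.
upper = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x, n + 1))
print(round(upper, 4))    # 0.1256
print(upper < alpha / 2)  # False, so do not reject H0
```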
Worked Example: Example 4
Find the critical region for a test of $H_0\colon p = 0.25$ against $H_1\colon p < 0.25$ using $n = 30$ at the 5% significance level.
We need the largest value $c$ such that $P(X \leq c) \leq 0.05$ under $H_0$, where $X \sim B(30, 0.25)$.
$P(X \leq 3) \approx 0.0374 \leq 0.05$ ✓
$P(X \leq 4) \approx 0.0979 > 0.05$ ✗
So the critical region is $X \leq 3$, i.e., $\{0, 1, 2, 3\}$.
The actual significance level is $0.0374$ ($3.74\%$).
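The search for the boundary can be automated in Python, taking the test to be $H_0\colon p = 0.25$, $H_1\colon p < 0.25$ with $n = 30$ (illustrative values):

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, alpha = 30, 0.25, 0.05  # illustrative figures for this example

# Largest c with P(X <= c) <= alpha gives the lower-tail critical region X <= c.
c = max(x for x in range(n + 1) if binom_cdf(x, n, p0) <= alpha)
print(c, round(binom_cdf(c, n, p0), 4))  # 3 0.0374
```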
Practice Problems
Problem 1
A die is thought to be biased towards six. In 20 rolls, 7 sixes are observed. Test at the 5% level whether the die is biased towards six. ($H_0\colon p = \tfrac{1}{6}$, $H_1\colon p > \tfrac{1}{6}$.) [Hint: $X \sim B(20, \tfrac{1}{6})$ where $X$ is the number of sixes]
Problem 2
A charity claims that 30% of households donate. A survey of 25 households finds 4 donors. Test at the 5% level whether the proportion is less than 30%. [Answer: $P(X \leq 4) \approx 0.0905 > 0.05$, do not reject $H_0$]
Problem 3
Find the critical region for testing $H_0\colon p = 0.4$ against $H_1\colon p > 0.4$ with $n = 20$ at the 5% significance level. [Answer: $X \geq 13$, actual significance $\approx 2.1\%$]
Problem 4
A factory's defect rate has historically been 20%. After maintenance, a sample of 30 items reveals 2 defects. Is there evidence at the 5% level that the defect rate has decreased? [Hint: one-tailed test, calculate $P(X \leq 2)$ with $X \sim B(30, 0.2)$]
Problem 5
Explain what is meant by a Type I error in the context of Problem 1 above. State its probability.
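As a self-check, the answers to Problems 2 and 3 can be reproduced in Python, taking the figures to be $n = 25$, $p_0 = 0.3$, $x = 4$ and $n = 20$, $p_0 = 0.4$ respectively (illustrative values):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Problem 2: H0: p = 0.3, H1: p < 0.3, n = 25, observed x = 4, 5% level.
p_value = sum(binom_pmf(k, 25, 0.3) for k in range(0, 5))
print(round(p_value, 4))  # 0.0905, greater than 0.05, so do not reject H0

# Problem 3: H0: p = 0.4, H1: p > 0.4, n = 20, 5% level.
# The smallest c with P(X >= c) <= 0.05 gives the critical region X >= c.
for c in range(21):
    tail = sum(binom_pmf(k, 20, 0.4) for k in range(c, 21))
    if tail <= 0.05:
        print(c, round(tail, 4))  # 13 0.021
        break
```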
Common Mistakes
- Saying "accept $H_0$" instead of "do not reject $H_0$". This is a critical language error. We never prove $H_0$ true — we merely find insufficient evidence to reject it.
- Using the wrong tail. If $H_1\colon p > p_0$, you need the upper tail probability $P(X \geq x)$, not $P(X \leq x)$. Read carefully to determine the correct direction.
- Forgetting to halve $\alpha$ for two-tailed tests. In a two-tailed test, compare the tail probability with $\alpha/2$, not $\alpha$. Forgetting this effectively doubles the significance level.
- Not writing the distribution under $H_0$. Always state $X \sim B(n, p_0)$ explicitly. This is a required step in the method and earns marks.
- Vague or non-contextual conclusions. "Reject $H_0$" alone is not sufficient. You must relate the conclusion to the real-world scenario described in the question.
- Confusing $P(X = x)$ with $P(X \leq x)$. The $p$-value for a lower-tailed test is the cumulative probability $P(X \leq x)$, not the probability of that single value.
Frequently Asked Questions
Why don't we "accept" the null hypothesis?
Because failing to reject $H_0$ does not prove it is true. It merely means we did not find enough evidence against it. A different sample might yield different results. The correct phrase is "there is insufficient evidence to reject $H_0$".
How do I decide between a one-tailed and two-tailed test?
If the question suggests a specific direction of change (e.g., "believes the proportion has increased"), use a one-tailed test. If it says "test whether the proportion has changed" without specifying direction, use a two-tailed test.
What if my $p$-value exactly equals the significance level?
Convention varies, but at A-Level, if the $p$-value equals $\alpha$, we are on the boundary of the critical region. Most exam mark schemes treat this as "reject $H_0$" (the critical region includes the boundary), but read the question carefully.
Do I need to calculate binomial probabilities by hand?
You should be able to use the binomial probability formula and cumulative probabilities. In practice, many exam boards provide statistical tables or expect calculator use. Check your board's guidance.
What is the actual significance level, and why does it differ from $\alpha$?
The actual significance level is the exact probability of the critical region. Because the binomial distribution is discrete, we cannot always achieve exactly $\alpha$. The actual significance level is the largest possible probability that does not exceed $\alpha$.
Key Takeaways
Hypothesis testing follows a rigid structure. Define $H_0$ and $H_1$, state the significance level, identify the distribution under $H_0$, compute the probability, compare, and conclude in context.
$H_0$ represents the status quo. The null hypothesis is what we assume to be true unless the data provides sufficient evidence against it.
The significance level controls the Type I error. Choosing $\alpha = 0.05$ means we accept a 5% chance of incorrectly rejecting a true $H_0$.
Critical regions define rejection boundaries. If the observed test statistic falls in the critical region, we reject $H_0$. Otherwise, we do not.
Language matters enormously. Use "sufficient evidence to reject" and "insufficient evidence to reject" — never "accept $H_0$" or "prove $H_0$".
Context is king. Every conclusion must be expressed in terms of the original problem. Statistical jargon alone does not earn full marks.
