Chi-Square Distribution
The chi-square (χ²) distribution is a continuous probability distribution that arises as the sum of squares of independent standard normal random variables. It is right-skewed and takes only non-negative values. The chi-square distribution is fundamental to statistical inference, appearing in goodness-of-fit tests, tests of independence in contingency tables, and confidence intervals for population variances. The shape is determined entirely by its degrees of freedom parameter.
Formula
f(x) = [x^(k/2 - 1) · e^(-x/2)] / [2^(k/2) · Γ(k/2)], for x ≥ 0, where k is the degrees of freedom and Γ is the gamma function
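As a quick numeric check, the density can be implemented directly from the formula. This is a minimal sketch using Python's standard math module; the function name chi2_pdf is just illustrative.

```python
import math

def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom, per the formula above."""
    if x < 0:
        return 0.0  # the distribution has no mass below zero
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

# Density of chi-square(4) at x = 2: simplifies to e^(-1)/2
print(round(chi2_pdf(2.0, 4), 4))  # 0.1839
```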
Mean (Expected Value)
k
Variance
2k
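These moments can be sanity-checked by integrating the density numerically. This is a rough sketch: the Riemann-sum approximation, the cutoff at x = 100, and the step count are arbitrary choices, not part of the distribution's definition.

```python
import math

def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def moment(k, power, upper=100.0, steps=100_000):
    """Approximate the integral of x^power * f(x) over [0, upper]."""
    h = upper / steps
    return sum((i * h) ** power * chi2_pdf(i * h, k) for i in range(1, steps)) * h

k = 5
mean = moment(k, 1)             # should be close to k = 5
var = moment(k, 2) - mean ** 2  # E[X^2] - E[X]^2, should be close to 2k = 10
print(round(mean, 2), round(var, 2))
```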
Parameters
Degrees of freedom (k): the number of independent standard normal random variables being squared and summed. Must be a positive integer (k ≥ 1). Determines the shape, mean, and variance of the distribution.
Key Properties
- If Z₁, Z₂, ..., Z_k are independent standard normal variables, then Z₁² + Z₂² + ... + Z_k² ~ χ²(k)
- Always non-negative (x ≥ 0) and right-skewed, but becomes more symmetric as k increases
- The sum of independent chi-square variables is also chi-square: if X ~ χ²(k₁) and Y ~ χ²(k₂), then X + Y ~ χ²(k₁ + k₂)
- For large k, the chi-square distribution is approximately normal with mean k and variance 2k
- The sample variance S² from a normal population satisfies (n - 1)S²/σ² ~ χ²(n - 1)
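The defining property and the additivity property can be illustrated with a small simulation: drawing chi-square variates as sums of squared standard normals, then checking that the sum of a χ²(3) and a χ²(4) variate has the mean and variance of a χ²(7). This is a sketch; the sample size and seed are arbitrary choices.

```python
import random

random.seed(0)

def chi2_sample(k):
    """One chi-square(k) draw: the sum of k squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(k))

# If X ~ chi2(3) and Y ~ chi2(4) independently, then X + Y ~ chi2(7),
# so the simulated mean and variance should approach 7 and 2*7 = 14.
n = 100_000
draws = [chi2_sample(3) + chi2_sample(4) for _ in range(n)]
mean = sum(draws) / n
var = sum((d - mean) ** 2 for d in draws) / n
print(round(mean, 2), round(var, 2))
```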
Example
A die is rolled 60 times with results: 1 appeared 8 times, 2 appeared 12 times, 3 appeared 7 times, 4 appeared 15 times, 5 appeared 9 times, 6 appeared 9 times. Test at the 5% significance level whether the die is fair.
Expected frequency for each face = 60/6 = 10. The chi-square statistic: χ² = ∑(O - E)²/E = (8-10)²/10 + (12-10)²/10 + (7-10)²/10 + (15-10)²/10 + (9-10)²/10 + (9-10)²/10 = 0.4 + 0.4 + 0.9 + 2.5 + 0.1 + 0.1 = 4.4. Degrees of freedom = 6 - 1 = 5. Critical value χ²(0.05, 5) = 11.07.
Result: χ² = 4.4, which is less than the critical value of 11.07, so we fail to reject H₀
At the 5% significance level, there is insufficient evidence to conclude the die is unfair. The observed frequencies do not differ significantly from what we would expect from a fair die. The p-value is approximately 0.49, well above 0.05.
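The computation above is straightforward to reproduce in code. This is a sketch in plain Python; the critical value 11.07 is taken from the worked example rather than computed.

```python
observed = [8, 12, 7, 15, 9, 9]           # counts for faces 1..6 over 60 rolls
expected = sum(observed) / len(observed)  # fair die: 60/6 = 10 per face

chi2_stat = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1
critical = 11.07  # chi-square critical value for alpha = 0.05, df = 5

print(round(chi2_stat, 2), df)  # 4.4 5
print("reject H0" if chi2_stat > critical else "fail to reject H0")  # fail to reject H0
```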
When to Use
- ✓ When performing a goodness-of-fit test to determine if observed frequencies match expected frequencies
- ✓ When testing for independence between two categorical variables using a contingency table
- ✓ When constructing confidence intervals for a population variance from a normally distributed population
- ✓ When testing homogeneity of proportions across multiple populations
Common Mistakes
- ✗ Using the chi-square test when expected cell frequencies are too small. The general rule is that all expected frequencies should be at least 5 for the approximation to be valid.
- ✗ Forgetting that the chi-square test is always a right-tailed test for goodness-of-fit and independence. Larger χ² values indicate more deviation from the null hypothesis.
- ✗ Using raw data instead of frequencies. The chi-square goodness-of-fit test requires counts (observed and expected frequencies), not individual data values.
- ✗ Getting degrees of freedom wrong: for goodness-of-fit it is (categories - 1); for independence it is (rows - 1)(columns - 1).
FAQs
Common questions about Chi-Square Distribution
What is the difference between a goodness-of-fit test and a test of independence?
The goodness-of-fit test examines whether a single categorical variable follows a specified distribution (e.g., are die outcomes uniform?). It uses one-way frequency tables with df = categories - 1. The test of independence examines whether two categorical variables are related (e.g., is political party associated with voting preference?). It uses two-way contingency tables with df = (rows - 1)(columns - 1). Both use the same test statistic χ² = ∑(O - E)²/E, but the expected frequencies are calculated differently.
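To illustrate how expected frequencies are computed in a test of independence, here is a sketch using a made-up 2×3 contingency table (the counts are hypothetical): each expected cell is row total × column total / grand total.

```python
# Hypothetical 2x3 contingency table of observed counts
table = [[20, 30, 50],
         [30, 20, 50]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

# Expected frequency for each cell: E = row_total * col_total / grand_total
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(len(table)) for j in range(len(table[0])))
df = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1)(columns - 1)
print(round(chi2, 2), df)  # 4.0 2
```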
Why can't a chi-square random variable be negative?
The chi-square distribution is defined as the sum of squared standard normal random variables. Since squaring any real number produces a non-negative result, and the sum of non-negative numbers is non-negative, the chi-square random variable can never be negative. This aligns with its applications: the chi-square test statistic ∑(O - E)²/E involves squared differences, so it is always ≥ 0. A value of 0 would mean perfect agreement between observed and expected frequencies.
How does the shape of the chi-square distribution change with degrees of freedom?
With few degrees of freedom (k = 1 or 2), the chi-square distribution is heavily right-skewed with most probability near zero. As k increases, the distribution shifts rightward, the peak moves away from zero, and it becomes more symmetric and bell-shaped. The mean shifts to k and the spread increases (variance = 2k). For k > 30, the chi-square distribution is well-approximated by a normal distribution with mean k and standard deviation √(2k).
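The normal approximation for large k can be checked by simulation: estimate P(X ≤ k + √(2k)) for a chi-square variable and compare it to the normal probability Φ(1) ≈ 0.8413. This is a sketch; k = 50, the sample size, and the seed are arbitrary choices.

```python
import math
import random

random.seed(1)

k = 50
n = 20_000

def chi2_sample(k):
    """One chi-square(k) draw: the sum of k squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(k))

x = k + math.sqrt(2 * k)  # one standard deviation above the mean
p_sim = sum(chi2_sample(k) <= x for _ in range(n)) / n

# Normal approximation: mean k, standard deviation sqrt(2k), so z = 1
p_normal = 0.5 * (1 + math.erf((x - k) / (math.sqrt(2 * k) * math.sqrt(2))))

print(round(p_sim, 3), round(p_normal, 3))
```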