F-Distribution
The F-distribution is a continuous probability distribution that arises as the ratio of two independent chi-square random variables, each divided by their respective degrees of freedom. Named after Sir Ronald Fisher, it is the cornerstone distribution for analysis of variance (ANOVA) and for comparing the variances of two populations. The F-distribution is right-skewed, takes only positive values, and is characterized by two degrees of freedom parameters.
Formula
f(x) = [√(((d₁x)^d₁ · d₂^d₂) / (d₁x + d₂)^(d₁ + d₂))] / [x · B(d₁/2, d₂/2)], for x > 0, where B is the beta function
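As a quick numerical sketch (assuming NumPy and SciPy are available), the density formula above can be transcribed directly and cross-checked against `scipy.stats.f.pdf`:

```python
import numpy as np
from scipy.special import beta   # the beta function B(a, b)
from scipy.stats import f

def f_pdf(x, d1, d2):
    """F-distribution density, written term-by-term from the formula above."""
    num = np.sqrt((d1 * x) ** d1 * d2 ** d2 / (d1 * x + d2) ** (d1 + d2))
    return num / (x * beta(d1 / 2, d2 / 2))

# Evaluation point and degrees of freedom chosen arbitrarily for illustration
manual = f_pdf(1.5, 2, 27)
library = f.pdf(1.5, 2, 27)
print(manual, library)  # both ≈ 0.217
```

The two values agree to machine precision, confirming the transcription.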
Mean (Expected Value)
d₂ / (d₂ - 2) (for d₂ > 2)
Variance
[2d₂²(d₁ + d₂ - 2)] / [d₁(d₂ - 2)²(d₂ - 4)] (for d₂ > 4)
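Both moment formulas can be checked against SciPy's built-in values (a sketch; d₁ = 5 and d₂ = 10 are arbitrary illustrative choices). Note that the mean depends only on d₂:

```python
from scipy.stats import f

d1, d2 = 5, 10
mean = d2 / (d2 - 2)                                              # requires d2 > 2
var = 2 * d2**2 * (d1 + d2 - 2) / (d1 * (d2 - 2)**2 * (d2 - 4))   # requires d2 > 4
lib_mean, lib_var = f.stats(d1, d2, moments='mv')
print(mean, var)  # 1.25  1.3541...
```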
Parameters
d₁: Degrees of freedom of the numerator chi-square variable. In one-way ANOVA, d₁ = k - 1, where k is the number of groups. Must be a positive integer.
d₂: Degrees of freedom of the denominator chi-square variable. In one-way ANOVA, d₂ = N - k, where N is the total sample size. Must be a positive integer.
Key Properties
- Defined as the ratio F = (U₁/d₁) / (U₂/d₂), where U₁ ~ χ²(d₁) and U₂ ~ χ²(d₂) are independent
- Always positive (F > 0) and right-skewed
- If F ~ F(d₁, d₂), then 1/F ~ F(d₂, d₁): taking the reciprocal swaps the degrees of freedom
- The square of a t-distributed variable with ν df follows an F(1, ν) distribution: if T ~ t(ν), then T² ~ F(1, ν)
- The mean d₂/(d₂ - 2) is greater than 1 (for d₂ > 2) and approaches 1 as d₂ grows, reflecting that under H₀ the numerator and denominator are both estimates of the same variance
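Two of these properties, the reciprocal identity and the t² relationship, can be verified numerically with SciPy; the specific values of d₁, d₂, ν, and the evaluation points below are arbitrary:

```python
import numpy as np
from scipy.stats import f, t

# Reciprocal property: P(F(d1, d2) <= x) = P(F(d2, d1) >= 1/x)
d1, d2, x = 3, 12, 2.0
lhs = f.cdf(x, d1, d2)
rhs = f.sf(1 / x, d2, d1)

# t-squared property: if T ~ t(nu), then T^2 ~ F(1, nu), so
# P(F <= q) = P(-sqrt(q) <= T <= sqrt(q)) = 2*P(T <= sqrt(q)) - 1
nu, q = 8, 4.0
lhs_t = f.cdf(q, 1, nu)
rhs_t = 2 * t.cdf(np.sqrt(q), nu) - 1

print(lhs, rhs)      # equal
print(lhs_t, rhs_t)  # equal
```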
Example
Three teaching methods are tested on groups of 10 students each (N = 30 total). The ANOVA yields a between-group mean square MSB = 450 and a within-group mean square MSW = 120. Test at the 5% level whether the teaching methods produce different mean scores.
Step 1: F = MSB / MSW = 450 / 120 = 3.75. Step 2: d₁ = k - 1 = 3 - 1 = 2 (numerator df). Step 3: d₂ = N - k = 30 - 3 = 27 (denominator df). Step 4: Critical value F(0.05, 2, 27) = 3.35 from the F-table.
Result: F = 3.75 > 3.35 = F(0.05, 2, 27), so we reject H₀ at the 5% significance level.
There is statistically significant evidence at the 5% level that the three teaching methods do not all produce the same mean scores. At least one method leads to a different average performance. Post-hoc tests (like Tukey's HSD) would be needed to identify which specific methods differ.
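The worked example can be reproduced in a few lines (assuming SciPy): `f.ppf` gives the critical value and `f.sf` the right-tailed p-value.

```python
from scipy.stats import f

MSB, MSW = 450.0, 120.0
k, N = 3, 30

F = MSB / MSW            # Step 1: F = 3.75
d1, d2 = k - 1, N - k    # Steps 2-3: d1 = 2, d2 = 27
crit = f.ppf(0.95, d1, d2)   # Step 4: critical value at alpha = 0.05, ≈ 3.354
p = f.sf(F, d1, d2)          # right-tail p-value, ≈ 0.037

print(F, crit, p)
```

Since F exceeds the critical value (equivalently, p ≈ 0.037 < 0.05), the code reproduces the rejection of H₀.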
When to Use
- ✓ When performing one-way or multi-way ANOVA to compare means across three or more groups
- ✓ When testing whether two population variances are equal using an F-test (e.g., Levene's test or the classical variance ratio test)
- ✓ When evaluating the overall significance of a multiple regression model (the global F-test)
- ✓ When comparing nested statistical models to see if additional predictors significantly improve the fit
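For the first use case, `scipy.stats.f_oneway` runs a one-way ANOVA directly. The three groups below are simulated, purely hypothetical scores:

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
# Three hypothetical groups of 10 scores each (N = 30, so d1 = 2, d2 = 27)
g1 = rng.normal(70, 8, 10)
g2 = rng.normal(75, 8, 10)
g3 = rng.normal(80, 8, 10)

stat, p = f_oneway(g1, g2, g3)   # F-statistic and its right-tailed p-value
print(stat, p)
```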
Common Mistakes
- ✗ Swapping the numerator and denominator degrees of freedom when looking up the critical value. The order matters: F(d₁, d₂) ≠ F(d₂, d₁).
- ✗ Using the F-test for comparing variances when the underlying populations are not approximately normal. The F-test for variances is very sensitive to non-normality.
- ✗ Interpreting a significant ANOVA F-test as meaning all groups differ. It only means at least one group mean is different; post-hoc tests are needed to determine which.
- ✗ Forgetting that the F-test in ANOVA is always right-tailed. Large F values indicate evidence against H₀, while F values near 1 support H₀.
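The first mistake is easy to demonstrate: swapping d₁ and d₂ gives a wildly different critical value (a quick SciPy check using the example's degrees of freedom):

```python
from scipy.stats import f

right = f.ppf(0.95, 2, 27)   # correct order, numerator df first: ≈ 3.35
wrong = f.ppf(0.95, 27, 2)   # swapped: ≈ 19.5, a very different cutoff
print(right, wrong)
```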
FAQs
Common questions about the F-Distribution
How are the F-distribution and the t-distribution related?
The F-distribution and t-distribution are closely related. If T follows a t-distribution with ν degrees of freedom, then T² follows an F-distribution with d₁ = 1 and d₂ = ν. This means a two-sample t-test comparing two group means is mathematically equivalent to a one-way ANOVA with two groups. The F-statistic from the ANOVA will equal the square of the t-statistic, and both tests yield the same p-value.
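This equivalence can be checked directly on simulated data (a sketch; the two groups are arbitrary normal samples, and the pooled-variance t-test is the one that matches the ANOVA exactly):

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 15)
b = rng.normal(0.5, 1.0, 15)

t_stat, t_p = ttest_ind(a, b)   # pooled-variance two-sample t-test (equal_var=True)
f_stat, f_p = f_oneway(a, b)    # one-way ANOVA with two groups
print(f_stat, t_stat**2)        # equal
print(f_p, t_p)                 # equal
```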
Why does the F-distribution have two degrees of freedom parameters?
The F-distribution has two degrees of freedom because it is the ratio of two independent chi-square variables, each with its own degrees of freedom. The numerator df (d₁) comes from the between-group variability and depends on the number of groups. The denominator df (d₂) comes from the within-group variability and depends on the total sample size minus the number of groups. Both parameters together determine the shape of the distribution and the critical values.
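As a small illustration of how the second parameter matters, the 5% critical value for fixed d₁ = 2 falls steadily as d₂ grows, approaching χ²(2) critical value divided by 2 in the limit (a SciPy sketch; the d₂ values are arbitrary):

```python
from scipy.stats import chi2, f

# 95th-percentile critical values for fixed d1 = 2 as d2 grows
d1 = 2
crits = [f.ppf(0.95, d1, d2) for d2 in (5, 10, 27, 100, 1000)]
print([round(c, 3) for c in crits])  # decreasing: 5.786, 4.103, 3.354, ...

# In the limit d2 -> infinity, d1 * F(d1, d2) converges to chi-square(d1),
# so the critical value tends to chi2.ppf(0.95, d1) / d1
limit = chi2.ppf(0.95, d1) / d1
print(round(limit, 3))  # ≈ 2.996
```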