Understanding Statistical Distributions
A guide to the most important probability distributions in statistics. Learn the shapes, parameters, and applications of the normal, binomial, t, chi-square, and other key distributions.
What You'll Learn
- Identify and describe the key properties of major statistical distributions.
- Apply the normal and binomial distributions to solve probability problems.
- Understand how sampling distributions underpin statistical inference.
1. The Normal Distribution
The normal (Gaussian) distribution is the most important distribution in statistics. It is symmetric and bell-shaped, completely described by its mean and standard deviation. The empirical rule and z-scores provide quick probability estimates.
Key Points
- About 68% of data fall within one standard deviation, 95% within two, and 99.7% within three (the 68-95-99.7 rule).
- A z-score converts any normal value to the standard normal: z = (x - mu) / sigma.
- The Central Limit Theorem states that, for large samples of independent observations from a population with finite variance, the sampling distribution of the mean is approximately normal.
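The z-score formula and the 68-95-99.7 rule above can be checked numerically. The following is a minimal sketch using Python's standard-library NormalDist; the helper names (z_score, prob_within) and the height figures are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

def z_score(x, mu, sigma):
    """Convert a value x from a normal(mu, sigma) distribution to a z-score."""
    return (x - mu) / sigma

def prob_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Hypothetical example: heights with mean 170 cm and standard deviation 8 cm.
z = z_score(186, mu=170, sigma=8)  # 186 cm is two standard deviations above the mean

for k in (1, 2, 3):
    # Prints 0.6827, 0.9545, 0.9973: the exact values behind the 68-95-99.7 rule.
    print(k, round(prob_within(k), 4))
```

The rule's "95%" is a rounding of about 95.45%, which is why exact calculations differ slightly from empirical-rule estimates.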
2. The Binomial Distribution
The binomial distribution models the number of successes in a fixed number of independent trials, each with the same probability of success. It is the foundation for inference about proportions.
Key Points
- Parameters: n (number of trials) and p (probability of success on each trial).
- Mean = np and standard deviation = sqrt(np(1-p)).
- The binomial can be approximated by the normal when np >= 10 and n(1-p) >= 10.
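The formulas above translate directly into code. This is a small sketch, assuming nothing beyond the standard library; the function names and the example values (n = 50, p = 0.3) are hypothetical and not taken from the guide.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1-p)) of a binomial(n, p) variable."""
    return n * p, sqrt(n * p * (1 - p))

def normal_approx_ok(n, p):
    """Rule of thumb from the text: np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

# Hypothetical example: 50 trials with success probability 0.3.
n, p = 50, 0.3
mean, sd = binomial_mean_sd(n, p)   # mean 15, sd about 3.24
print(mean, round(sd, 2), normal_approx_ok(n, p))
```

With n = 20 and p = 0.1 the same check fails (np = 2), so the normal approximation would not be recommended there.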
3. Sampling Distributions and the t, Chi-Square, and F Distributions
Sampling distributions describe how a statistic varies across repeated samples. The t-distribution is used for means with unknown sigma, the chi-square for variance and categorical tests, and the F-distribution for comparing variances in ANOVA.
Key Points
- The t-distribution has heavier tails than the normal, accounting for extra uncertainty with small samples.
- The chi-square distribution is right-skewed and used in goodness-of-fit tests and tests of independence.
- The F-distribution is the ratio of two independent chi-square variables, each divided by its degrees of freedom, and is the basis of the ANOVA F-test.
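One way to build intuition for these three distributions is to construct them from standard normal draws. The simulation below is an illustrative sketch, not a substitute for a statistics library: the function names, the seed, and the degrees of freedom are all assumptions made for the example.

```python
import random
from math import sqrt

rng = random.Random(0)  # fixed seed so the run is repeatable

def chi_square_draw(df):
    """One chi-square draw: the sum of df squared standard normal values."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(df))

def t_draw(df):
    """One t draw: a standard normal divided by sqrt(chi-square / df)."""
    return rng.gauss(0, 1) / sqrt(chi_square_draw(df) / df)

def f_draw(df1, df2):
    """One F draw: ratio of two independent chi-squares, each divided by its df."""
    return (chi_square_draw(df1) / df1) / (chi_square_draw(df2) / df2)

n_draws = 20000
chi_sq = [chi_square_draw(4) for _ in range(n_draws)]
t_vals = [t_draw(3) for _ in range(n_draws)]
f_vals = [f_draw(3, 20) for _ in range(n_draws)]

chi_sq_mean = sum(chi_sq) / n_draws                  # should land near df = 4
t_tail = sum(abs(t) > 2 for t in t_vals) / n_draws   # share of |t| beyond 2

print(round(chi_sq_mean, 2), round(t_tail, 3), min(chi_sq) >= 0, min(f_vals) >= 0)
```

For a standard normal, only about 4.6% of values fall beyond 2 in absolute value, so a noticeably larger share for t with 3 degrees of freedom illustrates the heavier tails. The chi-square and F draws are never negative because they are built from squared terms.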
Key Takeaways
- The normal distribution is symmetric, so the mean, median, and mode are all equal.
- A binomial random variable counts successes; a geometric random variable counts trials until the first success.
- The t-distribution approaches the normal as degrees of freedom increase.
- Chi-square values are always non-negative because they are sums of squared terms.
Practice Questions
1. Scores on a test are normally distributed with mean 500 and SD 100. What percentage of scores are above 700?
2. A fair coin is flipped 20 times. What is the probability of exactly 10 heads?
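To check answers to the two questions above, the calculations can be sketched in a few lines of Python using only the standard library. The variable names are illustrative; the numbers come from the questions themselves.

```python
from math import comb
from statistics import NormalDist

# Question 1: scores are normal with mean 500 and SD 100; 700 is z = 2.
scores = NormalDist(mu=500, sigma=100)
p_above_700 = 1 - scores.cdf(700)      # about 0.0228, i.e. roughly 2.3%

# Question 2: exactly 10 heads in 20 flips of a fair coin (binomial, n=20, p=0.5).
p_ten_heads = comb(20, 10) * 0.5**20   # 184756 / 1048576, about 0.1762

print(round(p_above_700, 4), round(p_ten_heads, 4))
```

The empirical rule gives a quick estimate for Question 1: about 95% of scores lie within two standard deviations, leaving roughly 2.5% in the upper tail, close to the exact 2.3%.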
FAQs
Common questions about this topic
The normal distribution appears throughout statistics for two reasons. First, many natural phenomena are approximately normal. Second, the Central Limit Theorem implies that, for sufficiently large samples from a population with finite variance, sample means are approximately normal regardless of the population shape. This enables inference even when the original data are not normal.
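The Central Limit Theorem claim can be illustrated with a short simulation. This is a sketch under stated assumptions: the population is exponential (strongly right-skewed, with mean 1 and SD 1), and the sample size of 40, the 5000 repetitions, and the seed are arbitrary choices for the example.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential population (mean 1, SD 1)."""
    return mean(rng.expovariate(1.0) for _ in range(n))

n = 40
means = [sample_mean(n) for _ in range(5000)]

center = mean(means)    # should be near the population mean, 1
spread = stdev(means)   # should be near 1 / sqrt(40), about 0.158

# Share of sample means within two standard deviations of their center;
# a value near 0.95 is what a normal sampling distribution would give.
within_two = sum(abs(m - center) <= 2 * spread for m in means) / len(means)

print(round(center, 3), round(spread, 3), round(within_two, 3))
```

Even though individual exponential values are far from normal, the sample means cluster symmetrically enough that the 95% rule holds approximately.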
Match the distribution to the context. Counting successes in fixed trials? Binomial. Measuring a continuous variable? Often normal. Testing means with unknown sigma? t-distribution. Comparing variances or testing categorical data? Chi-square or F. The type of data and research question determine the distribution.