Correlation Coefficient and Significance Test
Calculate the Pearson correlation coefficient between two variables and test whether the correlation is statistically significant.
Problem Scenario
A psychologist studies the relationship between hours of sleep per night and reaction time (in milliseconds) on a cognitive test. Data from 7 participants: Sleep hours: 5, 6, 6.5, 7, 7.5, 8, 9. Reaction time: 420, 390, 370, 350, 340, 310, 280. Calculate the correlation coefficient and test its significance at alpha = 0.05.
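Given Data
- Sleep hours (x): 5, 6, 6.5, 7, 7.5, 8, 9
- Reaction time in ms (y): 420, 390, 370, 350, 340, 310, 280
- Sample size: n = 7
- Significance level: alpha = 0.05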
Requirements
- Calculate the Pearson correlation coefficient r
- Test whether r is significantly different from zero using a t-test
- Interpret the strength and direction of the relationship
Solution
Step 1:
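Calculate the necessary sums, with x = sleep hours, y = reaction time, and n = 7:
- Sum(x) = 5 + 6 + 6.5 + 7 + 7.5 + 8 + 9 = 49
- Sum(y) = 420 + 390 + 370 + 350 + 340 + 310 + 280 = 2460
- Sum(xy) = 5(420) + 6(390) + 6.5(370) + 7(350) + 7.5(340) + 8(310) + 9(280) = 2100 + 2340 + 2405 + 2450 + 2550 + 2480 + 2520 = 16845
- Sum(x^2) = 25 + 36 + 42.25 + 49 + 56.25 + 64 + 81 = 353.5
- Sum(y^2) = 176400 + 152100 + 136900 + 122500 + 115600 + 96100 + 78400 = 878000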
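The sums in this step can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names (`sleep_hours`, `reaction_ms`, and so on) are chosen here for illustration and do not come from the original problem.

```python
# Data from the problem scenario
sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_ms = [420, 390, 370, 350, 340, 310, 280]

n = len(sleep_hours)
sum_x = sum(sleep_hours)                                        # Sum(x)
sum_y = sum(reaction_ms)                                        # Sum(y)
sum_xy = sum(x * y for x, y in zip(sleep_hours, reaction_ms))   # Sum(xy)
sum_x2 = sum(x * x for x in sleep_hours)                        # Sum(x^2)
sum_y2 = sum(y * y for y in reaction_ms)                        # Sum(y^2)

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
```

The printed values should match the hand-computed sums used in the next step.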
Step 2:
Calculate the Pearson correlation coefficient using the formula:
r = [n*Sum(xy) - Sum(x)*Sum(y)] / sqrt{[n*Sum(x^2) - (Sum(x))^2] * [n*Sum(y^2) - (Sum(y))^2]}
Numerator: 7(16845) - 49(2460) = 117915 - 120540 = -2625.
Denominator: sqrt{[7(353.5) - 49^2] * [7(878000) - 2460^2]} = sqrt{[2474.5 - 2401] * [6146000 - 6051600]} = sqrt{73.5 * 94400} = sqrt{6938400} = 2634.1.
r = -2625 / 2634.1 = -0.9965.
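As a cross-check, the same raw-sums formula can be written as a small function. The helper name `pearson_r` is an assumption made for this sketch. It keeps full floating-point precision, so its result may differ from the hand calculation in the fourth decimal place.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the raw-sums (computational) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_ms = [420, 390, 370, 350, 340, 310, 280]

r = pearson_r(sleep_hours, reaction_ms)
print(round(r, 3))  # -0.997
```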
Step 3:
Interpret: r = -0.997 (rounded) indicates a very strong negative linear relationship. As sleep hours increase, reaction time decreases.
Step 4:
Test significance. H_0: rho = 0 (no linear correlation). H_a: rho != 0 (there is a linear correlation). The test statistic is t = r * sqrt(n - 2) / sqrt(1 - r^2). Because 1 - r^2 is very small here, carry an extra digit: r = -2625 / 2634.08 = -0.99655 and r^2 = 0.99311. Then t = -0.99655 * sqrt(5) / sqrt(1 - 0.99311) = -0.99655 * 2.2361 / sqrt(0.00689) = -2.2284 / 0.0830 = -26.85.
Step 5:
With df = n - 2 = 5 and t = -26.85, the two-tailed p-value is extremely small (p < 0.0001). The critical value t(0.025, 5) = 2.571. Since |t| = 26.85 >> 2.571, we reject H_0 at alpha = 0.05.
Final Answer
r = -0.997, t = -26.85, df = 5, p-value < 0.0001. The correlation is statistically significant. There is a very strong negative linear relationship between hours of sleep and reaction time: more sleep is associated with faster (lower) reaction times.
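The significance test can be sketched with the standard library alone. The helpers `t_statistic`, `t_pdf`, and `two_sided_p` are hypothetical names introduced for this example. The p-value is approximated by integrating the t density with Simpson's rule, which is an illustrative choice; a statistics package would normally be used instead. Computed without intermediate rounding, t comes out near -26.85. Rounding r to four decimals before squaring pulls it to roughly -26.6, and the conclusion is the same either way.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=20000):
    """Approximate P(|T| >= |t|) as 1 - 2 * integral of the pdf over [0, |t|].

    Uses Simpson's rule; steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3
    return max(0.0, 1 - central)

n = 7
r = -2625 / math.sqrt(73.5 * 94400)   # numerator and denominator from Step 2
t = t_statistic(r, n)
p = two_sided_p(t, n - 2)

print(round(t, 2))      # -26.85
print(p < 0.0001)       # True
print(abs(t) > 2.571)   # True: |t| exceeds the critical value, so reject H_0
```

As a sanity check on the integration, `two_sided_p(2.571, 5)` should come out close to 0.05, since 2.571 is the two-tailed critical value at alpha = 0.05 with 5 degrees of freedom.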
Key Takeaways
- Pearson's r measures the strength and direction of the linear relationship between two quantitative variables. It ranges from -1 (perfect negative) to +1 (perfect positive).
- A significant correlation does not imply causation. Other variables (e.g., caffeine consumption, age) could confound the relationship between sleep and reaction time.
- The t-test for correlation tests whether the population correlation rho is significantly different from zero. A small p-value provides evidence of a real linear association.
Common Errors to Avoid
- Interpreting correlation as causation. Even with r = -0.997, we cannot conclude that more sleep causes faster reaction times without a controlled experiment.
- Ignoring potential outliers that can dramatically affect the correlation coefficient. With only 7 observations, one outlier could change r substantially.
- Applying Pearson's r to non-linear relationships. If the relationship is curved, r can be close to zero even when there is a strong association. Always plot your data first.
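The last two errors are easy to demonstrate numerically. In the sketch below, both data sets are made up for illustration: a symmetric parabola (y = x^2), and the sleep data with the final reaction time changed from 280 to a hypothetical 480. The `pearson_r` helper is likewise an assumed name.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the raw-sums (computational) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Curved relationship: y is fully determined by x, yet there is no linear trend
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v * v for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# Single outlier: last reaction time changed from 280 to a hypothetical 480
sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_with_outlier = [420, 390, 370, 350, 340, 310, 480]
r_outlier = pearson_r(sleep_hours, reaction_with_outlier)

print(round(r_curve, 3), round(r_outlier, 3))
```

In the first case r is zero despite a perfect (non-linear) dependence. In the second, a single altered observation moves r from about -0.997 to a value close to zero.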
FAQs
Common questions about this problem type
Pearson's r measures the strength of the linear relationship between two quantitative variables. Spearman's rho is a nonparametric measure that assesses the strength of the monotonic relationship based on ranks. Use Spearman when the data are ordinal, when the relationship is monotonic but not linear, or when outliers would unduly influence Pearson's r.
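One way to see the difference is to compute Spearman's rho as Pearson's r applied to ranks. The following is a minimal sketch using the sleep data from this problem; `average_ranks`, `pearson_r`, and `spearman_rho` are names assumed for the example, and tied values receive the mean of their rank positions.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the raw-sums (computational) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_ms = [420, 390, 370, 350, 340, 310, 280]

print(round(pearson_r(sleep_hours, reaction_ms), 3))     # -0.997
print(round(spearman_rho(sleep_hours, reaction_ms), 3))  # -1.0
```

Here rho is exactly -1 because reaction time falls every time sleep increases, even though the points do not lie perfectly on a straight line.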
Common guidelines for judging the strength of a correlation: |r| < 0.3 is weak, 0.3 <= |r| < 0.7 is moderate, and |r| >= 0.7 is strong. However, what counts as "meaningful" depends on the field. In physics, r = 0.9 might be considered poor, while in psychology, r = 0.5 might be considered strong.
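These cutoffs translate directly into a small lookup. The function name `describe_strength` is an assumption for this sketch, and the thresholds are only the rule of thumb quoted above, not a universal standard.

```python
def describe_strength(r):
    """Label |r| using the rule-of-thumb cutoffs 0.3 and 0.7."""
    size = abs(r)
    if size < 0.3:
        return "weak"
    if size < 0.7:
        return "moderate"
    return "strong"

print(describe_strength(-0.997))  # strong
print(describe_strength(0.5))     # moderate
print(describe_strength(0.1))     # weak
```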