Correlation Analysis (Intermediate)

Correlation Coefficient and Significance Test

Calculate the Pearson correlation coefficient between two variables and test whether the correlation is statistically significant.

Problem Scenario

A psychologist studies the relationship between hours of sleep per night and reaction time (in milliseconds) on a cognitive test. Data from 7 participants: Sleep hours: 5, 6, 6.5, 7, 7.5, 8, 9. Reaction time: 420, 390, 370, 350, 340, 310, 280. Calculate the correlation coefficient and test its significance at alpha = 0.05.

Given Data

Sleep hours (x): 5, 6, 6.5, 7, 7.5, 8, 9
Reaction time in ms (y): 420, 390, 370, 350, 340, 310, 280
n: 7
Sums: Sum(x) = 49, Sum(y) = 2460
Means: x-bar = 7, y-bar = 351.43

Requirements

  1. Calculate the Pearson correlation coefficient r
  2. Test whether r is significantly different from zero using a t-test
  3. Interpret the strength and direction of the relationship

Solution

Step 1:

Calculate the necessary sums:

Sum(xy) = 5(420) + 6(390) + 6.5(370) + 7(350) + 7.5(340) + 8(310) + 9(280) = 2100 + 2340 + 2405 + 2450 + 2550 + 2480 + 2520 = 16845
Sum(x^2) = 25 + 36 + 42.25 + 49 + 56.25 + 64 + 81 = 353.5
Sum(y^2) = 176400 + 152100 + 136900 + 122500 + 115600 + 96100 + 78400 = 878000
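As a quick check on the arithmetic above, the three sums can be recomputed in a few lines of Python. This is a minimal sketch; the variable names (sleep_hours, reaction_ms, and so on) are illustrative and not part of the original problem.

```python
# Data from the problem statement
sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_ms = [420, 390, 370, 350, 340, 310, 280]

# Sum of products and sums of squares used in the computational formula for r
sum_xy = sum(x * y for x, y in zip(sleep_hours, reaction_ms))
sum_x2 = sum(x * x for x in sleep_hours)
sum_y2 = sum(y * y for y in reaction_ms)

print(sum_xy, sum_x2, sum_y2)  # 16845.0 353.5 878000
```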

Step 2:

Calculate the Pearson correlation coefficient using the formula:

r = [n*Sum(xy) - Sum(x)*Sum(y)] / sqrt{[n*Sum(x^2) - (Sum(x))^2] * [n*Sum(y^2) - (Sum(y))^2]}

Numerator: 7(16845) - 49(2460) = 117915 - 120540 = -2625
Denominator: sqrt{[7(353.5) - 49^2] * [7(878000) - 2460^2]} = sqrt{[2474.5 - 2401] * [6146000 - 6051600]} = sqrt{73.5 * 94400} = sqrt{6938400} = 2634.1
r = -2625 / 2634.1 = -0.9965
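The computational formula above translates directly into code. The following is a sketch using only the Python standard library, with illustrative variable names; it recomputes the numerator and denominator from the raw data without rounding in between.

```python
import math

# Data from the problem statement
sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_ms = [420, 390, 370, 350, 340, 310, 280]
n = len(sleep_hours)

sum_x = sum(sleep_hours)
sum_y = sum(reaction_ms)
sum_xy = sum(x * y for x, y in zip(sleep_hours, reaction_ms))
sum_x2 = sum(x * x for x in sleep_hours)
sum_y2 = sum(y * y for y in reaction_ms)

# r = [n*Sum(xy) - Sum(x)*Sum(y)] / sqrt{[n*Sum(x^2) - Sum(x)^2] * [n*Sum(y^2) - Sum(y)^2]}
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(numerator, round(denominator, 2), round(r, 5))
```

Without intermediate rounding, r comes out near -0.99655. The hand calculation's -0.9965 reflects the denominator having been rounded to 2634.1 first; both round to -0.997.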

Step 3:

Interpret: r = -0.997 (rounded) indicates a very strong negative linear relationship. As sleep hours increase, reaction time decreases.

Step 4:

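Test significance using the t statistic for a correlation, which has n - 2 degrees of freedom.

H_0: rho = 0 (no linear correlation)
H_a: rho != 0 (there is a linear correlation)

t = r * sqrt(n - 2) / sqrt(1 - r^2) = -0.9965 * sqrt(5) / sqrt(1 - 0.9930) = -0.9965 * 2.2361 / sqrt(0.0070) = -2.2283 / 0.0837 = -26.62

Because 1 - r^2 is so small, this result is sensitive to rounding: carrying unrounded values of r through the calculation gives t of about -26.85. The conclusion is the same either way.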
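To see how much the rounded intermediates matter, the t statistic can be recomputed from an unrounded r. This is a small sketch; the function name correlation_t_statistic is a hypothetical helper, and r is rebuilt from the numerator (-2625) and the two bracketed terms (73.5 and 94400) found in Step 2.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H_0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# r without intermediate rounding, from the quantities in Step 2
r = -2625 / math.sqrt(73.5 * 94400)
t = correlation_t_statistic(r, n=7)

print(round(t, 2))
```

The unrounded value is slightly larger in magnitude than -26.62 because sqrt(1 - r^2) sits in the denominator, where a small rounding difference is amplified.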

Step 5:

With df = n - 2 = 5 and t = -26.62, the two-tailed p-value is extremely small (p < 0.0001). The two-tailed critical value at alpha = 0.05 is t(0.025, 5) = 2.571. Since |t| = 26.62 >> 2.571, we reject H_0.
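The claim that p < 0.0001 can be checked without a statistics package by integrating the t density numerically. The sketch below uses Simpson's rule and only the standard library; the helper names t_pdf and two_sided_p_value are illustrative, and the critical value 2.571 is taken from the text rather than computed.

```python
import math

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + u * u / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=20000):
    """Two-sided p-value; Simpson's rule approximates P(0 <= T <= |t|)."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3
    return max(0.0, 1 - 2 * central_half)

df = 7 - 2
t_observed = -26.62
critical = 2.571  # t(0.025, 5), as quoted in the text

p_value = two_sided_p_value(t_observed, df)
reject_h0 = abs(t_observed) > critical and p_value < 0.05

print(p_value < 0.0001, reject_h0)  # True True
```

In practice a statistics library would normally be used for the p-value; the numerical integration here is only meant to make the calculation transparent.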

Final Answer

r = -0.997, t = -26.62, df = 5, p-value < 0.0001. The correlation is statistically significant. There is a very strong negative linear relationship between hours of sleep and reaction time: more sleep is associated with faster (lower) reaction times.

Key Takeaways

  • ✓ Pearson's r measures the strength and direction of the linear relationship between two quantitative variables. It ranges from -1 (perfect negative) to +1 (perfect positive).
  • ✓ A significant correlation does not imply causation. Other variables (e.g., caffeine consumption, age) could confound the relationship between sleep and reaction time.
  • ✓ The t-test for correlation tests whether the population correlation rho is significantly different from zero. A small p-value provides evidence of a real linear association.

Common Errors to Avoid

  • ✗ Interpreting correlation as causation. Even with r = -0.997, we cannot conclude that more sleep causes faster reaction times without a controlled experiment.
  • ✗ Ignoring potential outliers that can dramatically affect the correlation coefficient. With only 7 observations, one outlier could change r substantially.
  • ✗ Applying Pearson's r to non-linear relationships. If the relationship is curved, r can be close to zero even when there is a strong association. Always plot your data first.
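The outlier and non-linearity points in the list above can be illustrated numerically. In this sketch, both the altered data point (480 ms in place of 280 ms) and the curved data set (y = x^2) are hypothetical examples made up for illustration, not part of the original problem; pearson_r is an illustrative helper using the same formula as Step 2.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation via the computational formula used in Step 2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_ms = [420, 390, 370, 350, 340, 310, 280]

# Hypothetical outlier: the last reaction time recorded as 480 ms instead of 280 ms
with_outlier = reaction_ms[:-1] + [480]
r_original = pearson_r(sleep_hours, reaction_ms)
r_outlier = pearson_r(sleep_hours, with_outlier)

# Hypothetical curved relationship: y = x^2 on x values symmetric around zero
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_curved = pearson_r(xs, ys)

print(round(r_original, 3), round(r_outlier, 3), round(r_curved, 3))
```

With the single altered point, r moves from about -0.997 to roughly +0.06, and the perfectly determined but curved relationship gives r = 0. Both cases would be obvious in a scatter plot.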

FAQs

Common questions about this problem type

Pearson's r measures the strength of the linear relationship between two quantitative variables. Spearman's rho is a nonparametric measure that assesses the strength of the monotonic relationship based on ranks. Use Spearman when the data are ordinal, when the relationship is monotonic but not linear, or when there are significant outliers.
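For comparison, Spearman's rho can be computed for the sleep data by ranking each variable and applying the rank-difference formula. This is a minimal sketch that assumes there are no tied values (true for this data set); the function names are illustrative, and data with ties would need average ranks instead.

```python
def ranks(values):
    """Ranks from 1 to n; this simple version assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho_no_ties(xs, ys):
    """Spearman's rho via 1 - 6*Sum(d^2) / (n(n^2 - 1)), valid without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 9]
reaction_ms = [420, 390, 370, 350, 340, 310, 280]

rho = spearman_rho_no_ties(sleep_hours, reaction_ms)
print(rho)  # -1.0
```

Here rho is exactly -1 because reaction time falls every time sleep increases, so the relationship is perfectly monotonic even though Pearson's r is slightly above -1.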

Common guidelines: |r| < 0.3 is weak, 0.3 <= |r| < 0.7 is moderate, and |r| >= 0.7 is strong. However, what counts as "meaningful" depends on the field. In physics, r = 0.9 might be considered poor, while in psychology, r = 0.5 might be considered strong.
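The rule-of-thumb cutoffs above can be written as a small helper. This is only a sketch of those generic guidelines (0.3 and 0.7); the function name strength_label is hypothetical, and field-specific thresholds would differ.

```python
def strength_label(r):
    """Label |r| using the rule-of-thumb cutoffs quoted above (0.3 and 0.7)."""
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.7:
        return "moderate"
    return "strong"

print(strength_label(-0.997))  # strong
print(strength_label(0.5))     # moderate
print(strength_label(0.2))     # weak
```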
