Pearson Correlation Coefficient

r = Σ(xᵢ - x̄)(yᵢ - ȳ) / √[Σ(xᵢ - x̄)² Σ(yᵢ - ȳ)²]

The Pearson correlation coefficient measures the strength and direction of the linear relationship between two quantitative variables. It ranges from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship).
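
The formula above can be sketched directly in plain Python. The function name `pearson_r` and its error handling are illustrative assumptions, not part of the original page. It computes the sum of deviation products and the two sums of squared deviations, then divides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0
print(pearson_r([1, 2, 3], [6, 4, 2]))  # -1.0
```

The two calls show the extremes of the range: points lying exactly on an increasing line give +1, and points on a decreasing line give -1.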

Variables

r = Correlation Coefficient

The strength and direction of the linear association, between -1 and 1

xᵢ, yᵢ = Paired Observations

The individual data pairs for the two variables

x̄, ȳ = Sample Means

The means of the x and y variables, respectively

Example Calculation

Scenario

Five students' study hours (x) and exam scores (y) are: (2,65), (4,78), (5,82), (6,90), (8,95). Calculate the Pearson correlation.

Given Data

x̄ = (2+4+5+6+8)/5 = 5.0

ȳ = (65+78+82+90+95)/5 = 82.0

Σ(xᵢ - x̄)(yᵢ - ȳ) = (-3)(-17)+(-1)(-4)+(0)(0)+(1)(8)+(3)(13) = 51+4+0+8+39 = 102

Calculation

Σ(xᵢ - x̄)² = 9+1+0+1+9 = 20

Σ(yᵢ - ȳ)² = 289+16+0+64+169 = 538

r = 102 / √(20 × 538) = 102 / √10760 ≈ 102 / 103.73

Result

r ≈ 0.983

Interpretation

The correlation of 0.983 indicates a very strong positive linear relationship between study hours and exam scores in this sample. Students who studied longer tended to score higher, and the five points lie close to a straight line.
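
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using the five data pairs from the scenario. The variable names (`hours`, `scores`, `sxy`, `sxx`, `syy`) are illustrative choices.

```python
import math

# Data from the scenario: study hours (x) and exam scores (y)
hours = [2, 4, 5, 6, 8]
scores = [65, 78, 82, 90, 95]

x_bar = sum(hours) / len(hours)
y_bar = sum(scores) / len(scores)

# Sum of deviation products and sums of squared deviations
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
sxx = sum((x - x_bar) ** 2 for x in hours)
syy = sum((y - y_bar) ** 2 for y in scores)

r = sxy / math.sqrt(sxx * syy)

print(x_bar, y_bar)   # 5.0 82.0
print(sxy, sxx, syy)  # 102.0 20.0 538.0
print(round(r, 3))    # 0.983
```

On Python 3.10 or later, the standard library's `statistics.correlation(hours, scores)` should give the same value without the manual sums.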

When to Use This Formula

✓ Assessing the strength and direction of a linear relationship between two quantitative variables
✓ Determining whether a linear regression model is appropriate before fitting it
✓ Comparing the strength of association across different pairs of variables

Common Mistakes

✗ Interpreting correlation as causation without additional evidence
✗ Using Pearson correlation for nonlinear relationships, where it can understate the true association
✗ Ignoring the effect of outliers, which can dramatically inflate or deflate r
✗ Applying Pearson correlation to ordinal data, where Spearman rank correlation would be more appropriate
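
Two of these mistakes, the nonlinear case and the outlier case, are easy to see numerically. The sketch below uses small made-up data sets chosen only for illustration, and `pearson_r` is a hypothetical helper that implements the formula on this page.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for two equal-length sequences (hypothetical helper)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Nonlinear case: y is fully determined by x (y = x^2), yet r is zero
# because the relationship has no linear trend on this symmetric range.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
r_quadratic = pearson_r(xs, ys)

# Outlier case: made-up data with a weak association...
base_x = [1, 2, 3, 4, 5]
base_y = [3, 1, 4, 2, 3]
r_base = pearson_r(base_x, base_y)
# ...and the same data with one extreme point (20, 20) appended.
r_outlier = pearson_r(base_x + [20], base_y + [20])

print(round(r_quadratic, 2))  # 0.0
print(round(r_base, 2))       # 0.14
print(round(r_outlier, 2))    # 0.97
```

A single extreme point moves r from about 0.14 to about 0.97 here, which is why a scatterplot is worth inspecting before relying on the coefficient.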

FAQs

Common questions about this formula

Correlation does not imply causation: it measures association only. Two variables can be strongly correlated because of a third, confounding variable or by coincidence. Establishing causation requires controlled experiments or rigorous causal inference methods.

As a general guideline, |r| > 0.7 is often considered a strong correlation, 0.3 ≤ |r| ≤ 0.7 moderate, and |r| < 0.3 weak. However, context matters greatly. In some fields, r = 0.3 may be practically significant, while in others r = 0.9 may be expected.
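
The guideline above can be written as a small lookup. This is a sketch only: the function name `describe_strength` is hypothetical, the cutoffs are the rule-of-thumb values quoted here rather than a universal standard, and treating exactly 0.3 and 0.7 as moderate is an assumption.

```python
def describe_strength(r):
    """Label |r| using the rule-of-thumb bands quoted above (hypothetical helper)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength ignores the sign; the sign gives the direction
    if size > 0.7:
        return "strong"
    if size >= 0.3:
        # Assumption: the boundary values 0.3 and 0.7 count as moderate
        return "moderate"
    return "weak"

print(describe_strength(0.983))  # strong
print(describe_strength(-0.45))  # moderate
print(describe_strength(0.1))    # weak
```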
