1. The Least Squares Regression Line
Logic: How do we draw the “perfect” straight line through a messy cloud of dots? We calculate the line that minimizes the sum of the squared errors, where each error (residual) is the vertical distance from a dot to the line.
Equation: $\hat{y} = a + bx$
- $a$: The y-intercept.
- $b$: The gradient (slope).
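To make the idea concrete, here is a minimal sketch of how $a$ and $b$ could be computed by hand, using the standard formulas $b = S_{xy} / S_{xx}$ and $a = \bar{y} - b\bar{x}$. The function name `least_squares_line` and the sample data are illustrative assumptions, not part of these notes.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y_hat = a + b*x.

    Illustrative helper (the name is an assumption, not a library function).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # S_xx: total squared deviation of x from its mean
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    # S_xy: how x and y deviate from their means together
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    if s_xx == 0:
        raise ValueError("all x values are identical, so the gradient is undefined")

    b = s_xy / s_xx          # gradient
    a = mean_y - b * mean_x  # y-intercept: the line passes through (mean_x, mean_y)
    return a, b


# Made-up data lying exactly on y = 1 + 2x
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y_hat = {a} + {b}x")
```

With real, messy data the dots will not sit exactly on the line, but the same two formulas still give the line with the smallest sum of squared errors.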
2. The Correlation Coefficient ($r$)
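This number measures the strength and direction of the linear relationship between the two variables, which tells you how closely the dots follow a straight line.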
- $r = 1$: Perfect positive correlation (Dots are in a perfect line going up).
- $r = -1$: Perfect negative correlation (Dots are in a perfect line going down).
- $r = 0$: No linear correlation (Often just a random cloud, though a curved pattern can also give $r \approx 0$).
The “Strength” Rule:
- $0.8 < |r| \le 1$: Strong
- $0.5 < |r| \le 0.8$: Moderate
- $0 < |r| \le 0.5$: Weak
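As a rough sketch, $r$ can be computed from the same building blocks as the regression line, $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$, and then labelled using the strength bands above. The function names `correlation_coefficient` and `strength` are illustrative assumptions, and the label "none" for $r = 0$ is added here only so every input gets a label.

```python
import math


def correlation_coefficient(xs, ys):
    """Return Pearson's r for paired data (illustrative helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when x or y never varies")

    return s_xy / math.sqrt(s_xx * s_yy)


def strength(r):
    """Label |r| using the strength bands from these notes."""
    size = abs(r)
    if size > 0.8:
        return "strong"
    if size > 0.5:
        return "moderate"
    if size > 0:
        return "weak"
    return "none"  # assumption: label used here for r exactly 0


# Made-up data lying exactly on a rising line
r = correlation_coefficient([1, 2, 3, 4], [3, 5, 7, 9])
print(r, strength(r))
```

The sign of $r$ gives the direction (up or down), while `strength` looks only at $|r|$, so $r = -0.9$ and $r = 0.9$ are both strong.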
3. Interpolation vs Extrapolation
- Interpolation: Predicting a value inside the range of your data (usually reliable).
- Extrapolation: Predicting a value outside the range of your data, like guessing the future (risky, because the trend may not continue beyond the data).
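The distinction above comes down to a simple range check on the $x$ value you are predicting from. A minimal sketch, where the function name `prediction_kind` and the sample values are illustrative assumptions:

```python
def prediction_kind(x_new, xs):
    """Say whether predicting at x_new interpolates or extrapolates.

    Illustrative helper: compares x_new with the smallest and largest
    observed x values.
    """
    if not xs:
        raise ValueError("need at least one observed x value")
    if min(xs) <= x_new <= max(xs):
        return "interpolation"
    return "extrapolation"


observed_x = [10, 12, 15, 20]  # made-up x values from a data set
print(prediction_kind(14, observed_x))  # inside 10..20
print(prediction_kind(35, observed_x))  # beyond the largest observed x
```

A check like this only flags the risk. It does not tell you how wrong an extrapolated prediction might be.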
