The Limitations of R-Squared

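R-squared (R²) measures the share of variance in the target that a model explains, relative to a baseline that always predicts the mean. Its primary flaw is that, for a least-squares fit scored on its own training data, it never decreases when features are added, even if those features are random noise. This encourages "feature bloat," where a model appears to improve simply because it has more parameters to fit with. To counter this, Adjusted R-squared applies a complexity tax: it scales the unexplained share of variance by (n - 1)/(n - p - 1), where n is the number of observations and p the number of features, so a new feature raises the score only if it improves the fit by more than chance alone would be expected to.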
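The effect is easy to reproduce. The following is a minimal sketch, assuming NumPy is available and using synthetic data; the helper name r2_scores and the choice of 10 noise columns are illustrative, not taken from any library.

```python
import numpy as np

def r2_scores(X, y):
    """Fit OLS with an intercept; return (R^2, adjusted R^2) on the training data."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    ss_res = float(resid @ resid)                 # unexplained sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

rng = np.random.default_rng(0)
n = 60
x = rng.normal(size=(n, 1))                # one genuinely useful feature
y = 2.0 * x[:, 0] + rng.normal(size=n)     # synthetic target
noise = rng.normal(size=(n, 10))           # ten features unrelated to y

r2_base, adj_base = r2_scores(x, y)
r2_bloat, adj_bloat = r2_scores(np.column_stack([x, noise]), y)

print(f"1 feature    R2={r2_base:.4f}  adjusted={adj_base:.4f}")
print(f"+10 noise    R2={r2_bloat:.4f}  adjusted={adj_bloat:.4f}")
```

Training R² for the larger model cannot fall below that of the smaller one, because the smaller model is a special case of it. The gap between R² and Adjusted R² widens as p grows, which is the penalty at work.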

The Hierarchy of Error Metrics

Regression metrics provide different lenses into model performance. Understanding their relationships is critical for debugging:

  • MAE (Mean Absolute Error): Measures the average magnitude of errors in the original units. It is highly interpretable and less sensitive to outliers than the squared-error metrics, because each error counts in proportion to its size.
  • MSE (Mean Squared Error): Penalizes large errors disproportionately by squaring them, and is expressed in squared units of the target. It is a common training loss for gradient-based optimization because it is smooth and differentiable everywhere.
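  • RMSE (Root Mean Squared Error): The square root of MSE, bringing the error back into the target variable's units. It acts as a "dramatic sibling" to MAE: RMSE is always at least as large as MAE, and the two are equal only when every error has the same magnitude. A large gap between them means a few large errors dominate, so check for outliers or a heavy-tailed error distribution.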
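  • Standard Error (SE): The residual standard error is computed like RMSE but divides the sum of squared residuals by the residual degrees of freedom (n - p - 1 for a model with p features and an intercept) instead of n. It gives a less optimistic estimate of the error spread, and the difference matters most in small datasets.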
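The four metrics above can be computed side by side to see how they relate. This is a minimal sketch, assuming NumPy; the function name error_metrics and the toy values are made up for illustration, and n_features is the number of predictors the model was fitted with.

```python
import numpy as np

def error_metrics(y_true, y_pred, n_features):
    """Return MAE, MSE, RMSE and residual standard error for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    n = resid.size
    mae = float(np.mean(np.abs(resid)))
    mse = float(np.mean(resid ** 2))
    rmse = float(np.sqrt(mse))
    # Residual standard error: divide by degrees of freedom, not by n.
    se = float(np.sqrt(np.sum(resid ** 2) / (n - n_features - 1)))
    return {"mae": mae, "mse": mse, "rmse": rmse, "se": se}

# Toy data: five errors of size 1 and one large miss of size 10.
y_true = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
y_pred = [11.0, 11.0, 15.0, 15.0, 19.0, 10.0]

metrics = error_metrics(y_true, y_pred, n_features=1)
for name, value in metrics.items():
    print(f"{name:>4}: {value:.3f}")
# mae = 2.500, mse = 17.500, rmse = 4.183, se = 5.123
```

The single large miss lifts RMSE well above MAE, and with only six observations the degrees-of-freedom correction makes SE noticeably larger than RMSE.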

Diagnostic Workflow

To move from a "good-looking" model to a performant one, use these techniques:

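  1. Residual Analysis: Plot residuals against predictions. A healthy model shows a structureless cloud scattered around zero with roughly constant spread; curvature or a funnel shape points to missed non-linearity or non-constant variance. A histogram of the residuals should be roughly bell-shaped and centered at zero.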
  2. Feature Engineering: Use polynomial features to capture non-linear relationships, but monitor Adjusted R-squared as you add terms to catch overfitting.
  3. Regularization: Use Ridge or Lasso regression to penalize complexity and improve out-of-sample performance.
  4. Cross-Validation: Never rely on a single train-test split. Use k-fold cross-validation (k = 5 or 10 are common choices) and look at both the mean and the spread of the metric across folds, so you can tell a stable result from a lucky split.
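Steps 3 and 4 can be combined in a few lines. The following is a sketch under stated assumptions: it uses NumPy only, implements Ridge through its closed-form solution instead of calling a library estimator, and runs on synthetic data with one informative feature and 20 noise features. The names ridge_fit and kfold_rmse are hypothetical helpers, and the alpha values are arbitrary.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression; centering keeps the intercept unpenalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])   # L2 penalty on the diagonal
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def kfold_rmse(X, y, alpha, k=5, seed=0):
    """Return the held-out RMSE of each of k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)  # shuffled index folds
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coef, intercept = ridge_fit(X[train], y[train], alpha)
        resid = y[test] - (X[test] @ coef + intercept)
        scores.append(np.sqrt(np.mean(resid ** 2)))
    return np.array(scores)

rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 21))               # column 0 is informative, the rest are noise
y = 3.0 * X[:, 0] + rng.normal(size=n)     # synthetic target

for alpha in (0.0, 10.0):                  # alpha = 0 reduces to ordinary least squares
    scores = kfold_rmse(X, y, alpha)
    print(f"alpha={alpha:>4}: RMSE mean={scores.mean():.3f}, std={scores.std():.3f}")
```

Reporting the standard deviation across folds alongside the mean is what exposes an unstable model. The ridge penalty depends on feature scale, so with real data the features would normally be standardized first; here every column is already on the same scale.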

Metric Selection Guide

  • Comparing models with different feature counts: Use Adjusted R².
  • Communicating with stakeholders: Use MAE.
  • Training optimization: Use MSE.
  • Reporting uncertainty: Use Standard Error.
  • General goodness-of-fit: Use R².