ISLR_ch6.0 Intro_Model_Selection

Setting:

  • In the regression setting, the standard linear model is $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \epsilon$ (a minimal least squares fit is sketched after this list).

  • In the chapters that follow, we consider some approaches for extending the linear model framework.
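
A minimal sketch of fitting this standard linear model by ordinary least squares; the synthetic data, the sizes n and p, and the coefficient values below are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Illustrative data: n observations of p predictors (assumed values).
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])        # true beta_1, ..., beta_p (assumed)
y = 3.0 + X @ beta + rng.normal(size=n)  # beta_0 = 3, plus noise epsilon

# Least squares: minimize ||y - [1, X] b||^2 over b = (beta_0, ..., beta_p).
X1 = np.column_stack([np.ones(n), X])
b_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(b_hat)  # estimated (beta_0, beta_1, ..., beta_p)
```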

Reasons for using fitting procedures other than least squares:

  • Prediction Accuracy:

    • Provided that the true relationship between the response and the predictors is approximately linear, the least squares estimates will have low bias.
    • If n $\gg$ p, the least squares estimates also tend to have low variance $\Rightarrow$ they perform well on test data.
    • If n is not much larger than p, the least squares fit can have high variance $\Rightarrow$ overfitting $\Rightarrow$ poor predictions on test data.
    • If p > n, there is no longer a unique least squares coefficient estimate: the variance is infinite, so the method cannot be used at all.

    By constraining or shrinking the estimated coefficients, we can often substantially reduce the variance at the cost of a negligible increase in bias.
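
    A hedged sketch of this trade-off, using ridge regression (one of the shrinkage methods listed below) as the constrained fit; the sample sizes, the sparse true coefficients, and the penalty alpha are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge

    rng = np.random.default_rng(1)
    n_train, n_test, p = 30, 1000, 25    # n barely exceeds p (assumed sizes)
    beta = np.zeros(p)
    beta[:5] = 1.0                       # sparse true signal (assumed)

    def make_data(n):
        X = rng.normal(size=(n, p))
        return X, X @ beta + rng.normal(size=n)

    X_tr, y_tr = make_data(n_train)
    X_te, y_te = make_data(n_test)

    for name, model in [("least squares", LinearRegression()),
                        ("ridge", Ridge(alpha=10.0))]:
        model.fit(X_tr, y_tr)
        mse = np.mean((model.predict(X_te) - y_te) ** 2)
        print(f"{name}: test MSE = {mse:.2f}")
    # Ridge's test MSE is usually far lower here: shrinking the coefficients
    # trades a small increase in bias for a large reduction in variance.
    ```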

  • Model Interpretability

    • Irrelevant variables lead to unnecessary complexity in the resulting model. By removing these variables, that is, by setting the corresponding coefficient estimates to zero, we can obtain a model that is more easily interpreted.
    • Least squares is extremely unlikely to yield any coefficient estimates that are exactly zero $\Rightarrow$ we need fitting procedures that perform feature selection (see the lasso sketch below).
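
    A sketch of this point using the lasso (a shrinkage method introduced later in the chapter) as the feature-selection mechanism; the data and the penalty alpha are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression, Lasso

    rng = np.random.default_rng(2)
    n, p = 200, 10
    X = rng.normal(size=(n, p))
    beta = np.zeros(p)
    beta[:3] = [2.0, -1.0, 0.5]          # only 3 relevant predictors (assumed)
    y = X @ beta + rng.normal(size=n)

    ols = LinearRegression().fit(X, y)
    lasso = Lasso(alpha=0.1).fit(X, y)

    # Least squares essentially never yields exact zeros; the lasso does,
    # which is what makes the fitted model easier to interpret.
    print("exact-zero OLS coefficients:  ", int(np.sum(ols.coef_ == 0.0)))
    print("exact-zero lasso coefficients:", int(np.sum(lasso.coef_ == 0.0)))
    ```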

Alternatives to least squares:

  1. Subset Selection
  2. Shrinkage
  3. Dimension Reduction
