ISLR_ch5.2 Potential Problems

Approach:

  1. Start with a data set, which we call Z, that contains n observations. We randomly select n observations from Z in order to produce a bootstrap data set, $Z^{∗1}$.

  2. The sampling is performed with replacement, which means that the same observation can occur more than once in the bootstrap data set.

    • In this example, $Z^{∗1}$ contains the third observation twice, the first observation once, and no instances of the second observation.
    • Note that if an observation is contained in $Z^{∗1}$, then both its X and Y values are included.
  3. We can use $Z^{∗1}$ to produce a new bootstrap estimate for α, which we call $\hat{\alpha}^{∗1}$. This procedure is repeated B times for some large value of B, in order to produce B different bootstrap data sets, $Z^{∗1}$, $Z^{∗2}$, . . . , $Z^{∗B}$, and B corresponding α estimates, $\hat{\alpha}^{∗1}$, $\hat{\alpha}^{∗2}$, . . . , $\hat{\alpha}^{∗B}$.

  4. We can compute the standard error of these bootstrap estimates using the formula \begin{align} SE_B(\hat{\alpha})=\sqrt{\frac{1}{B-1}\sum_{r=1}^B\left( \hat{\alpha}^{∗r}-\frac{1}{B}\sum^{B}_{r'=1}\hat{\alpha}^{∗r'} \right)^2} \end{align} This is the sample standard deviation of the B bootstrap estimates, and it serves as an estimate of the standard error of $\hat{\alpha}$ estimated from the original data set.
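
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `alpha_hat` and `bootstrap_se` are hypothetical, the data are synthetic, and the statistic is assumed to be the minimum-variance allocation $\alpha = (\sigma_Y^2 - \sigma_{XY}) / (\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY})$ from the ISLR investment example, which these notes do not define explicitly. Any other statistic of the rows of Z could be passed in its place.

```python
import numpy as np


def alpha_hat(Z):
    """Assumed statistic: plug-in estimate of alpha from an (n, 2) array
    whose columns hold the X and Y values of each observation."""
    cov = np.cov(Z[:, 0], Z[:, 1])  # 2x2 sample covariance matrix
    var_x, var_y, cov_xy = cov[0, 0], cov[1, 1], cov[0, 1]
    return (var_y - cov_xy) / (var_x + var_y - 2 * cov_xy)


def bootstrap_se(Z, statistic, B=1000, seed=0):
    """Return (standard error, array of B bootstrap estimates)."""
    rng = np.random.default_rng(seed)
    n = Z.shape[0]
    estimates = np.empty(B)
    for r in range(B):
        # Steps 1-2: draw n row indices with replacement, so a row may
        # appear several times or not at all; X and Y stay paired.
        idx = rng.integers(0, n, size=n)
        # Step 3: recompute the statistic on the bootstrap data set.
        estimates[r] = statistic(Z[idx])
    # Step 4: sample standard deviation with the 1/(B-1) factor.
    return estimates.std(ddof=1), estimates


# Usage on synthetic data (values chosen only for illustration).
data_rng = np.random.default_rng(42)
Z = data_rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.25]], size=100)
se, estimates = bootstrap_se(Z, alpha_hat, B=1000)
print(f"alpha_hat = {alpha_hat(Z):.3f}, bootstrap SE = {se:.3f}")
```

Indexing whole rows with `Z[idx]` is what keeps each observation's X and Y values together, and `ddof=1` matches the $\frac{1}{B-1}$ factor in the formula.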

