the sample mean $\bar{Y}$; and the jackknifed confidence interval is the same as the usual $t$ confidence interval.
(b) Demonstrate the results in part (a) numerically for the contrived “data” in Table 21.3; a code sketch of the jackknife computation appears after this exercise. (These results are peculiar to linear statistics like the mean.)
(c) Find jackknifed confidence intervals for the Huber M estimator of Duncan’s regression of
occupational prestige on income and education. Compare these intervals with the bootstrap
and normal-theory intervals given in Table 21.5.
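The following is a minimal sketch of the computation for part (b), in Python with NumPy; the data vector below is a placeholder rather than the actual values from Table 21.3. For the sample mean, the jackknife standard error coincides exactly with the usual $s/\sqrt{n}$:

```python
# Jackknife standard error of the sample mean -- a minimal sketch.
# The data vector is a placeholder; substitute the values from Table 21.3.
import numpy as np

y = np.array([1.0, 3.0, 4.0, 8.0, 9.0])   # placeholder "data"
n = len(y)

# Leave-one-out means theta_(-i), i = 1, ..., n
loo = np.array([np.delete(y, i).mean() for i in range(n)])

# Jackknife standard error: sqrt((n - 1)/n * sum((theta_(-i) - mean)^2))
se_jack = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))

# Usual standard error of the mean: s / sqrt(n)
se_usual = y.std(ddof=1) / np.sqrt(n)

print(se_jack, se_usual)   # identical: the mean is a linear statistic
```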
Exercise 21.3. Random versus fixed resampling in regression:
(a) Recall (from Chapter 2) Davis’s data on measured and reported weight for 101 women engaged in regular exercise. Bootstrap the least-squares regression of reported weight on measured weight, drawing r = 1,000 bootstrap samples using (1) random-X resampling
and (2) fixed-X resampling. In each case, plot a histogram (and, if you wish, a density
estimate) of the 1,000 bootstrap slopes, and calculate the bootstrap estimate of standard error
for the slope. How does the influential outlier in this regression affect random resampling?
How does it affect fixed resampling?
(b) Randomly construct a data set of 100 observations according to the regression model $Y_i = 5 + 2x_i + \varepsilon_i$, where $x_i = 1, 2, \ldots, 100$, and the errors are independent (but seriously heteroscedastic), with $\varepsilon_i \sim N(0, x_i^2)$. As in (a), bootstrap the least-squares regression of $Y$ on $x$, using (1) random resampling and (2) fixed resampling. In each case, plot the bootstrap distribution of the slope coefficient, and calculate the bootstrap estimate of standard error for this coefficient. Compare the results for random and fixed resampling. For a few of the bootstrap samples, plot the least-squares residuals against the fitted values. How do these plots differ for fixed versus random resampling? (A code sketch of both resampling schemes appears after this exercise.)
(c) Why might random resampling be preferred in these contexts, even if (as is not the case for
Davis’s data) the X values are best conceived as fixed?
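The following is a minimal sketch of the two resampling schemes, in Python with NumPy, using simulated data as in part (b); the helper ls_fit and the random seed are illustrative choices, not part of the exercise, and the histograms and residual plots are left to the reader:

```python
# Random-X versus fixed-X bootstrap of a least-squares slope -- a sketch
# using simulated heteroscedastic data as in part (b).
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(1, 101).astype(float)
y = 5 + 2 * x + rng.normal(0, x)      # error sd = x_i, i.e., variance x_i^2

def ls_fit(x, y):
    """Least-squares intercept, slope, fitted values, and residuals."""
    b1, b0 = np.polyfit(x, y, 1)
    fitted = b0 + b1 * x
    return b0, b1, fitted, y - fitted

b0, b1, fitted, resid = ls_fit(x, y)

n, r = len(x), 1000
slopes_random = np.empty(r)
slopes_fixed = np.empty(r)
for b in range(r):
    # (1) random-X resampling: resample (x, y) pairs with replacement
    idx = rng.integers(0, n, n)
    slopes_random[b] = ls_fit(x[idx], y[idx])[1]
    # (2) fixed-X resampling: keep x fixed, resample the residuals
    y_star = fitted + rng.choice(resid, n, replace=True)
    slopes_fixed[b] = ls_fit(x, y_star)[1]

print("SE(random-X):", slopes_random.std(ddof=1))
print("SE(fixed-X): ", slopes_fixed.std(ddof=1))
```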
Exercise 21.4. Bootstrap estimates of bias: The bootstrap can be used to estimate the bias of an estimator $\hat{\theta}$ of a parameter $\theta$, simply by comparing the mean of the bootstrap distribution $\bar{\theta}^*$ (which stands in for the expectation of the estimator) with the sample estimate $\hat{\theta}$ (which stands in for the parameter); that is, $\widehat{\text{bias}}^* = \bar{\theta}^* - \hat{\theta}$. (Further discussion and more sophisticated methods are described in Efron and Tibshirani, 1993, chap. 10.) Employ this approach to estimate the bias of the maximum-likelihood estimator of the variance, $\hat{\sigma}^2 = \sum (Y_i - \bar{Y})^2/n$, for a sample of n = 10 observations drawn from the normal distribution N(0, 100). Use r = 500 bootstrap replications. How close is the bootstrap bias estimate to the theoretical value $-\sigma^2/n = -100/10 = -10$?
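A minimal sketch of the computation, assuming NumPy; the seed is arbitrary. Because the bootstrap estimates the bias at the sample in hand, the result approximates $-\hat{\sigma}^2/n$ and so will vary around the theoretical $-10$:

```python
# Bootstrap estimate of the bias of the MLE of the variance -- a minimal
# sketch; the seed is arbitrary.
import numpy as np

rng = np.random.default_rng(2)
n, r = 10, 500
y = rng.normal(0, 10, n)                      # one sample from N(0, 100)

sigma2_hat = np.sum((y - y.mean()) ** 2) / n  # MLE: divisor n, not n - 1

boot = np.empty(r)
for b in range(r):
    ys = rng.choice(y, n, replace=True)       # resample the observations
    boot[b] = np.sum((ys - ys.mean()) ** 2) / n

bias_hat = boot.mean() - sigma2_hat           # bias* = theta-bar* - theta-hat
print(bias_hat)                               # compare with -sigma^2/n = -10
```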
Exercise 21.5.* Test the omnibus null hypothesis $H_0\colon \beta_1 = \beta_2 = 0$ for the Huber M estimator
in Duncan’s regression of occupational prestige on income and education.
(a) Base the test on the estimated asymptotic covariance matrix of the coefficients.
(b) Use the bootstrap approach described in Section 21.4. (One possible organization of the computation is sketched after this exercise.)
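The sketch below, in Python, shows one common way to organize a bootstrap test of this kind: compare the sample Wald statistic for $H_0$ with the bootstrap distribution of the same statistic computed from replicates centered at the sample estimates. This scheme may differ in detail from the procedure of Section 21.4. It uses statsmodels’ RLM with the Huber norm, and loading Duncan’s data via get_rdataset assumes network access:

```python
# Bootstrap omnibus test for the Huber M estimator -- a sketch of one
# common scheme, not necessarily identical to Section 21.4.
import numpy as np
import statsmodels.api as sm

duncan = sm.datasets.get_rdataset("Duncan", "carData").data
X = sm.add_constant(duncan[["income", "education"]].to_numpy())
y = duncan["prestige"].to_numpy()

def huber_coefs(X, y):
    return sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit().params

b = huber_coefs(X, y)                    # (intercept, income, education)
rng = np.random.default_rng(3)
n, r = len(y), 1000
boot = np.empty((r, 3))
for i in range(r):
    idx = rng.integers(0, n, n)          # random-X resampling of cases
    boot[i] = huber_coefs(X[idx], y[idx])

V = np.cov(boot, rowvar=False)[1:, 1:]   # bootstrap cov. of (b1, b2)
# Sample Wald statistic for H0: beta1 = beta2 = 0
W = b[1:] @ np.linalg.solve(V, b[1:])
# Null distribution: the same statistic for replicates centered at b
W_star = np.array([(bi[1:] - b[1:]) @ np.linalg.solve(V, bi[1:] - b[1:])
                   for bi in boot])
print("p-value:", np.mean(W_star >= W))
```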
Exercise 21.6. Case weights:
(a)* Show how case weights can be used to “adjust” the usual formulas for the least-squares
coefficients and their covariance matrix. How do these case-weighted formulas compare
with those for weighted-least-squares regression (discussed in Section 12.2.2)?
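As a numerical companion to the derivation (not a substitute for it), the following sketch checks that the natural case-weighted normal equations, $b = (X'WX)^{-1}X'Wy$ with $W$ a diagonal matrix of case weights, reproduce an ordinary least-squares fit to the data set in which case $i$ is replicated $w_i$ times; the simulated data are arbitrary:

```python
# Case weights in least squares -- a minimal numerical check that
# b = (X'WX)^{-1} X'Wy with integer case weights w_i reproduces OLS on
# the data set with each case replicated w_i times (as in resampling).
import numpy as np

rng = np.random.default_rng(4)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1 + 2 * X[:, 1] + rng.normal(size=n)
w = rng.integers(0, 4, n)               # integer case weights

# Case-weighted least squares: solve (X'WX) b = X'Wy
b_weighted = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# OLS on the data set with case i replicated w_i times
rep = np.repeat(np.arange(n), w)
b_replicated = np.linalg.lstsq(X[rep], y[rep], rcond=None)[0]

print(np.allclose(b_weighted, b_replicated))  # True
```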