Abstract
The internal norm measure D_I(C_k^{-1}) introduced in Gray (1989) assesses the influence that a size-k subset of cases has on the least squares parameter estimates in a linear regression. The term internal norm refers to the fact that influence is assessed by comparing the subset of interest with all size-k subsets internal to the data set. Although this approach is intuitively
appealing, it can be impractical for k > 1, since calculating D_I(C_k^{-1}) for a size-k subset I requires the covariance matrix C_k of all size-k deletion parameter estimates. Using results from finite population sampling theory, we show that the covariance matrices C_k and C_1 are nearly proportional for large sample sizes, a fact
that can be used to inexpensively approximate D_I(C_k^{-1}) for subsets containing two or more cases. Numerical evidence from several real data sets indicates that the approximation is highly accurate for large data sets, where it is most needed, and sufficiently accurate for small to moderate regression problems.
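
The near-proportionality claim can be checked by brute force on a small problem. The Python sketch below is illustrative only and is not the paper's method: it uses simulated data, takes C_k to be the sample covariance of the least squares estimates over all size-k deletions (the normalization is an assumption), and reads the second matrix as C_1, the single-case deletion covariance. The helper names `deletion_covariance` and `relative_eigenvalues` are hypothetical. If C_k were exactly proportional to C_1, all eigenvalues of C_1^{-1} C_k would be equal, so their spread indicates how far the two matrices are from proportional.

```python
from itertools import combinations

import numpy as np


def deletion_covariance(X, y, k):
    """Sample covariance of the least squares estimates over all size-k deletions.

    Assumption: C_k is taken to be the ordinary sample covariance (ddof=1)
    of the deletion estimates; the paper's normalization may differ.
    """
    n = X.shape[0]
    estimates = []
    for subset in combinations(range(n), k):
        keep = np.ones(n, dtype=bool)
        keep[list(subset)] = False
        # Refit the regression with the chosen cases removed.
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        estimates.append(beta)
    return np.cov(np.array(estimates), rowvar=False)


def relative_eigenvalues(C1, Ck):
    """Eigenvalues of C1^{-1} Ck, computed through a Cholesky factor of C1."""
    L = np.linalg.cholesky(C1)
    # M = L^{-1} Ck L^{-T} is symmetric and has the same eigenvalues as C1^{-1} Ck.
    M = np.linalg.solve(L, np.linalg.solve(L, Ck).T)
    return np.linalg.eigvalsh((M + M.T) / 2)


# Simulated data (hypothetical; not one of the data sets studied in the paper).
rng = np.random.default_rng(0)
n = 60
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

C1 = deletion_covariance(X, y, 1)  # 60 single-case deletions
C2 = deletion_covariance(X, y, 2)  # 1770 pairwise deletions
eig = relative_eigenvalues(C1, C2)

print("eigenvalues of C_1^{-1} C_2:", eig)
print("max/min ratio:", eig.max() / eig.min())
```

A max/min ratio close to 1 means a single scalar multiple of C_1 reproduces C_2 well. The brute-force loop visits every size-k subset, so it is practical only for small n and k, which is the cost the approximation is meant to avoid.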