Abstract
We provide a method for finding the optimal double sampling plan for estimating the mean value of a continuous outcome. It is assumed that the fallible and true outcome data are related by a multivariate linear regression model in which only some of the explanatory variables are sampled. We determine the conditions under which double sampling is preferable to standard sampling plans, and we apply the method to a well-known data set on air pollution.