Abstract
Data on chromium contamination concentrations from one of the EPA's toxic waste sites consist of independent and identically distributed (iid) measurements along with additional observations from the residual distribution. The residual sample is obtained by sampling from hot spots, where contamination concentrations are assumed to exceed a given threshold value. The distribution function of the concentrations is estimated with a nonparametric Bayes estimator. The Dirichlet process is used to formulate prior information about the chromium contamination, and we compare the Bayes estimator of the mean concentration level to other estimators currently considered by the EPA and other sources. The Bayes estimator of the mean generally outperforms competing estimators under various cost functions. The Bayes estimator of the distribution function is derived allowing for right-censored contamination measurements along with left-truncated hot spot data. When the prior becomes noninformative, the Bayes estimator of the distribution function reduces to the nonparametric maximum likelihood estimator, which is identical to the Kaplan-Meier estimator for concentration values observed below the residual sample threshold. Robustness of the Bayes estimator is examined with respect to misspecification of the prior, as is the estimator's sensitivity to the censoring distribution.