Abstract
In this paper, we address the problem of simulating from a data-generating process for which the observed data do not follow a regular probability distribution. One existing method for doing this is bootstrapping, but it is incapable of interpolating between observed data. For univariate or bivariate data, in which a mixture structure can easily be identified, we could instead simulate from a Gaussian mixture model. In general, though, we would have the problem of identifying and estimating the mixture model. Instead of these, we introduce a non-parametric method for simulating datasets like this: Kernel Carlo Simulation. Our algorithm begins by using kernel density estimation to build a target probability distribution. Then, an envelope function that is guaranteed to be higher than the target distribution is created. We then use simple accept–reject sampling. Our approach is more flexible than others, can simulate intelligently across gaps in the data, and requires no subjective modelling decisions. With several univariate and multivariate examples, we show that our method returns simulated datasets that, compared with the observed data, retain the covariance structures and have distributional characteristics that are remarkably similar.