Statistical Theory and Related Fields Volume 3, 2019 - Issue 2

Small area prediction of quantiles for zero-inflated data and an informative sample design

Emily Berg, Department of Statistics, Iowa State University, Ames, IA, USA (corresponding author)

Danhyang Lee, Department of Information Systems, Statistics and Management Science, University of Alabama, Tuscaloosa, AL, USA

Pages 114-128 | Received 31 Dec 2018, Accepted 07 Sep 2019, Published online: 28 Sep 2019

https://doi.org/10.1080/24754269.2019.1666243

ABSTRACT

The Conservation Effects Assessment Project (CEAP) is a survey intended to quantify soil and nutrient loss on cropland. Estimates of the quantiles of CEAP response variables are published. Previous work develops a procedure for predicting small area quantiles based on a mixed effects quantile regression model. The conditional density function of the response given covariates and area random effects is approximated with the linearly interpolated generalised Pareto distribution (LIGPD). Empirical Bayes is used for prediction and a parametric bootstrap procedure is developed for mean squared error estimation. In this work, we develop two extensions of the LIGPD-based small area quantile prediction procedure. One extension allows for zero-inflated data. The second extension accounts for an informative sample design. We apply the procedures to predict quantiles of the distribution of percolation (a CEAP response variable) in Kansas counties.

KEYWORDS:

Quantile regression
mixed effects models
bootstrap

1. Introduction

Small area estimation procedures traditionally make use of fully parametric models (Battese, Harter, & Fuller, 1988). Evidence of nonlinearity, nonconstant variances, or outliers in the data can make specifying an appropriate parametric form challenging. To address these challenges, several semiparametric small area estimation procedures have been proposed. Opsomer, Claeskens, Ranalli, Kauermann, and Breidt (2008) use penalised spline regression for small area estimation. Sinha and Rao (2009) consider outlier-robust estimation. Chambers and Tzavidis (2006) use M-quantile regression. See Rao and Molina (2015) for further background on the wide range of models used for small area estimation.

Berg and Lee (2019a) develop a small area procedure for estimating quantiles based on the semiparametric mixed effects quantile regression model of Jang and Wang (2015). The model of Jang and Wang (2015) approximates the conditional distribution of the response given a covariate and a random effect using a distribution that they term the linearly interpolated generalised Pareto density (LIGPD). The name refers to the two main aspects of the approach. First, for a fine grid of interior quantiles, the LIGPD approximates the quantile function corresponding to the distribution of the response given a covariate using linear interpolation (LI). Second, an extreme value distribution, namely the generalised Pareto distribution (GPD), is used to model the distribution of the response for quantile levels that fall below the lower bound or above the upper bound of the interior grid. We define these two aspects of the LIGPD of Jang and Wang (2015) more precisely in Section 1.2. Jang and Wang (2015) use Bayesian methods to conduct inference for the parameters of the LIGPD model. Berg and Lee (2019a) adopt the LIGPD model for small area estimation. Their interest in using the LIGPD for small area estimation stems from a survey called the Conservation Effects Assessment Project (CEAP), which is intended to measure different types of erosion. A preliminary analysis of the CEAP data indicated that finding a single parametric form to describe the distributions of all CEAP response variables of interest is difficult. As a consequence, semiparametric procedures are of interest. Further, the CEAP survey publishes estimates of the quantiles of distributions of erosion variables, which makes an estimation procedure based on quantile regression attractive.
While Jang and Wang (2015) use Bayesian methods for inference and focus on estimating the quantile regression coefficients, Berg and Lee (2019a) define a frequentist estimation procedure, an empirical Bayes predictor, and a parametric bootstrap MSE estimator. Section 1.2 defines the Berg and Lee (2019a) procedure in more detail. Berg and Lee (2019a) restrict attention to a continuous response variable and assume that the sample design is noninformative for the specified model.

We consider two extensions of the LIGPD SAE procedure developed in Berg and Lee (2019a). The first is an extension to zero-inflated data. The second is an extension to an informative sample design.

Existing small area estimation procedures for zero-inflated data utilise fully parametric models. Pfeffermann, Terryn, and Moura (2008) and Chandra and Sud (2012) consider linear mixed effects models for the non-zero component of the zero-inflated distribution. To ensure that the support of the distribution for the nonzero component is positive, Dreassi, Petrucci, and Rocco (2014) and Lyu (2018) consider gamma and lognormal distributions, respectively, for the positive component. Outside the context of small area estimation, quantile regression procedures for zero-inflated data build on the concept underlying Tobit regression. Such quantile regression procedures for zero-inflated data typically assume that the observed response variable is a truncated version of a partially observed variable with support on the real line (Buchinsky & Hahn, 1998; Powell, 1986). The partially observed variable is assumed to satisfy a standard quantile regression model. We specify a zero-inflated quantile regression model for small area estimation in the spirit of Dreassi et al. (2014) and Lyu (2018). We assume that the positive component of the model satisfies a modification of the quantile regression model of Berg and Lee (2019a). We assume a logistic mixed effects model for the probability of observing a zero.

Numerous small area procedures for an informative sample design have been developed. You and Rao (2002) use inverse selection probabilities as weights. Verret, Rao, and Hidiroglou (2015) propose an augmented model. Pfeffermann and Sverchkov (2007) exploit relationships between the sample distribution, the sample complement distribution, and the survey weights. We adapt the approach of Pfeffermann and Sverchkov (2007) to the quantile regression framework. To our knowledge, this is the first work to consider estimation of small area quantiles when the sample design is informative for the small area model.

1.1. Overview of CEAP survey data

Our interest in small area estimation for zero-inflated data under a complex sample design stems partly from a survey called the Conservation Effects Assessment Project (CEAP). The CEAP survey uses a multi-phase design. The first phase is a longitudinal survey called the National Resources Inventory (NRI) that collects information on agriculture and natural resources through visual interpretation of aerial photographs of sampled segments. The CEAP survey collects more detailed information for a subset of NRI locations through farmer interviews. Primary response variables in CEAP are measures of soil and nutrient loss that result from processing farmer interview data through a computer model called the Agricultural Policy Environmental Extender (APEX). Berg and Lee (2019a) analyze several CEAP response variables for Wisconsin. The model of Berg and Lee (2019a) is not appropriate for data with a large proportion of zeros. Their model, for example, would not be well suited to the percolation variable for Kansas, where approximately 12% of the sampled values are equal to zero. Berg and Lee (2019a) also assume that the sample design is noninformative for the specified model, an assumption that we examine more rigorously in this paper.

1.2. Overview of LIGPD small area procedure

We provide an overview of the LIGPD model and estimation procedure used in Berg and Lee (2019a). Further detail is provided in Berg and Lee (2019a) and in the supplementary document (Berg & Lee, 2019b). A sample of $n_{i}$ elements is selected from the population of $N_{i}$ elements for area i, where $i = 1, \dots, D$. Let $y_{i j}$ denote the variable of interest for unit j in area i, and assume $y_{i j}$ is observed only for sampled elements. We assume that a vector of covariates $x_{i j}$ is available for all $N_{i}$ elements in the population. Parameters of interest are quantiles of $\{y_{i j} : j = 1, \dots, N_{i}\}$.

The LIGPD model and estimator of Berg and Lee (2019a) begin with the specification of a mixed effects quantile regression model. Let $b_{i} \sim N (0, σ_{b}^{2})$ denote a normally distributed random effect for area i with mean 0 and variance $σ_{b}^{2}$. Assume the conditional distribution of $y_{i j}$ given $x_{i j}$ and $b_{i}$ is absolutely continuous. Denote the τth quantile of the conditional distribution of $y_{i j}$ given $x_{i j}$ and $b_{i}$ by $q_{i j} (τ)$. Specifically, $q_{i j} (τ)$ satisfies $P (y_{i j} \leq q_{i j} (τ) ∣ b_{i}, x_{i j}) = τ$. The model assumes that $q_{i j} (τ)$ satisfies $q_{i j} (τ) = x_{i j}^{'} β (τ) + b_{i},$ (1) and that $x_{i j}^{'} β (τ) \leq x_{i j}^{'} β (τ + δ)$ for $δ \geq 0$. The critical assumption in (1) is that the area random effect $b_{i}$ is constant across quantile levels. Because the area random effect is fixed across quantile levels, $q_{i j} (τ)$ is nondecreasing in τ for fixed $(i, j)$ as long as $x_{i j}^{'} β (τ) \leq x_{i j}^{'} β (τ + δ)$ for $δ \geq 0$.

The LIGPD of Jang and Wang (2015) defines an approximation to the density of the conditional distribution of $y_{i j}$ given $x_{i j}$ and $b_{i}$, denoted as $f_{Y} (y ∣ x_{i j}, b_{i}, θ)$. The approximation for the density derives from the assumed quantile regression model (1). The quantile function and the density function are related by $f_{Y} (q_{i j} (τ) ∣ x_{i j}, b_{i}, θ) = \lim_{h \to 0} \frac{h}{q_{i j} (τ + h) - q_{i j} (τ)} .$ (2) As explained in Jang and Wang (2015), the relationship (2) motivates the LIGPD approximation for $f_{Y} (y ∣ x_{i j}, b_{i}, θ)$ for a grid of interior quantiles. For extreme values, the conditional distribution of $y_{i j}$ given $x_{i j}$ and $b_{i}$ is assumed to have a generalised Pareto distribution. We now define the LIGPD approximation precisely. Let $0 < τ_{1} < \dots < τ_{K} < 1$ partition $(0, 1)$ into $K + 1$ evenly spaced subintervals. We use as our basis for inference the approximate density function defined in Jang and Wang (2015) by $\begin{aligned} f_{Y} (y | x_{i j}, b_{i}, θ) & = I [y < q_{i j} (τ_{1})] τ_{1} f_{ℓ} (y | ρ_{ℓ}, ξ_{ℓ}) \\ & \quad + I [y \geq q_{i j} (τ_{K})] (1 - τ_{K}) f_{u} (y | ρ_{u}, ξ_{u}) \\ & \quad + \sum_{k = 1}^{K - 1} I [q_{i j} (τ_{k}) \leq y < q_{i j} (τ_{k + 1})] \frac{τ_{k + 1} - τ_{k}}{q_{i j} (τ_{k + 1}) - q_{i j} (τ_{k})}, \end{aligned}$ (3) where the vector of fixed parameters to be estimated is $θ = (β_{K}^{'}, σ_{b}^{2}, ρ_{ℓ}, ξ_{ℓ}, ρ_{u}, ξ_{u})^{'}$, $β_{K} = (β (τ_{1})^{'}, \dots, β (τ_{K})^{'})^{'}$, and $f_{s} (y | ρ_{s}, ξ_{s})$ for $s = ℓ, u$ are densities of generalised Pareto distributions defined as in Jang and Wang (2015) and in Berg and Lee (2019a).
For interior quantiles, the LIGPD approximates the density function as a piecewise constant function on the intervals $[q_{i j} (τ_{k}), q_{i j} (τ_{k + 1}))$ for $k = 1, \dots, K - 1$, where $q_{i j} (τ_{k}) = x_{i j}^{'} β (τ_{k}) + b_{i}$. By the relationship (2), the piecewise constant approximation for the density function corresponds to an approximation for the CDF using linear interpolation. The approximation for the quantile function through linear interpolation is the inverse of the approximation for the CDF.
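
To illustrate the interior part of the approximation, the following Python sketch evaluates the piecewise constant density and the corresponding linearly interpolated CDF for a single unit. The function names, the grid `taus`, and the quantile values `q` are hypothetical and chosen only for illustration. The sketch assumes strictly increasing quantiles and does not handle the generalised Pareto tails.

```python
from bisect import bisect_right

def interior_density(y, taus, q):
    """Piecewise constant interior part of the approximate density.

    taus: increasing quantile levels tau_1 < ... < tau_K
    q:    strictly increasing quantiles q_ij(tau_1) < ... < q_ij(tau_K)
          for one unit (hypothetical inputs)
    Returns None outside [q[0], q[-1]), where the tail densities apply.
    """
    if y < q[0] or y >= q[-1]:
        return None
    k = bisect_right(q, y) - 1  # index with q[k] <= y < q[k + 1]
    return (taus[k + 1] - taus[k]) / (q[k + 1] - q[k])

def interior_cdf(y, taus, q):
    """CDF implied by the piecewise constant density: linear interpolation."""
    if y < q[0] or y >= q[-1]:
        return None
    k = bisect_right(q, y) - 1
    slope = (taus[k + 1] - taus[k]) / (q[k + 1] - q[k])
    return taus[k] + (y - q[k]) * slope

# Illustrative grid with K = 3 (assumed values, not from the CEAP data)
taus = [0.25, 0.5, 0.75]
q = [1.0, 2.0, 4.0]
density_mid = interior_density(3.0, taus, q)
cdf_mid = interior_cdf(3.0, taus, q)
```

On each interval the density equals the slope of the interpolated CDF, which is the discrete analogue of the limit relating the density to the quantile function.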

Using the LIGPD for small area estimation requires predicting the area random effect $b_{i}$. An approximation for the conditional distribution of $b_{i}$ given the data corresponding to (3) is given by $\begin{aligned} f_{b ∣ y} (b_{i} ∣ y_{i 1}, \dots, y_{i n_{i}}; θ) = \frac{\prod_{j = 1}^{n_{i}} f_{Y} (y_{i j} | x_{i j}, b_{i}, θ) f_{b} (b_{i} ∣ σ_{b}^{2})}{\int_{- \infty}^{\infty} \prod_{j = 1}^{n_{i}} f_{Y} (y_{i j} | x_{i j}, b_{i}, θ) f_{b} (b_{i} ∣ σ_{b}^{2}) d b_{i}}, \end{aligned}$ (4) where $f_{b} (b_{i} ∣ σ_{b}^{2})$ is the density function of a normal distribution with mean zero and variance $σ_{b}^{2}$, and $y_{i} = (y_{i 1}, \dots, y_{i n_{i}})^{'}$. The density function (4) allows defining a Bayes (minimum MSE) predictor of the area random effect $b_{i}$. Specifically, the Bayes predictor of $b_{i}$ (for squared error loss) is given by $\begin{aligned} E [b_{i} ∣ y_{i}; θ] = \frac{\int_{- \infty}^{\infty} b_{i} \prod_{j = 1}^{n_{i}} f_{Y} (y_{i j} | x_{i j}, b_{i}, θ) f_{b} (b_{i} ∣ σ_{b}^{2}) d b_{i}}{\int_{- \infty}^{\infty} \prod_{j = 1}^{n_{i}} f_{Y} (y_{i j} | x_{i j}, b_{i}, θ) f_{b} (b_{i} ∣ σ_{b}^{2}) d b_{i}} . \end{aligned}$ (5) With the predictor (5) of $b_{i}$, a predictor of $q_{i j} (τ)$ is ${\tilde{q}}_{i j} (τ) = x_{i j}^{'} β (τ) + E [b_{i} ∣ y_{i}; θ] .$ The set $\{{\tilde{q}}_{i j} (τ_{k}) : k = 1, \dots, K; j = 1, \dots, N_{i}\}$ defines an approximation for the distribution of the population of $y_{i j}$ for $j = 1, \dots, N_{i}$. The predictor ${\tilde{q}}_{i j} (τ)$ requires an estimate of the unknown $β (τ_{k})$ for $k = 1, \dots, K$.
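
The ratio of integrals defining the Bayes predictor of $b_{i}$ can be approximated numerically. The sketch below uses a simple Riemann sum over a grid of values of $b_{i}$. As a loudly stated assumption, a normal conditional density stands in for the LIGPD density so that the result can be checked against the known closed-form posterior mean; the function names, grid settings, and data values are hypothetical and are not the authors' implementation.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def bayes_predictor(y, cond_density, sigma_b, n_grid=4001, width=8.0):
    """Approximate E[b_i | y_i] by a Riemann sum over a grid of b values.

    cond_density(y_j, b) stands in for f_Y(y_ij | x_ij, b_i, theta).
    The grid spans +/- width * sigma_b (an assumed truncation).
    """
    step = 2.0 * width * sigma_b / (n_grid - 1)
    num = 0.0
    den = 0.0
    for m in range(n_grid):
        b = -width * sigma_b + m * step
        # Unnormalised posterior: prior density times the likelihood of the sample
        w = normal_pdf(b, 0.0, sigma_b)
        for yj in y:
            w *= cond_density(yj, b)
        num += b * w
        den += w
    return num / den  # the grid spacing cancels in the ratio

# Stand-in model (assumption): y_ij | b_i ~ N(mu + b_i, sd^2), b_i ~ N(0, sigma_b^2)
mu, sd, sigma_b = 1.0, 0.5, 1.0
y_area = [1.2, 0.8, 1.6]
b_hat = bayes_predictor(y_area, lambda yj, b: normal_pdf(yj, mu + b, sd), sigma_b)

# Closed-form posterior mean under the normal stand-in, used only as a check
n = len(y_area)
b_exact = sigma_b**2 * n / (sd**2 + n * sigma_b**2) * (sum(y_area) / n - mu)
```

Replacing the stand-in density with an implementation of the LIGPD density would give the predictor described in the text, under the same quadrature assumptions.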

Berg and Lee (2019a) define an iterative procedure to estimate $β (τ_{k})$. We summarise the critical aspects of the estimation procedure and refer the reader to Berg and Lee (2019a) and to the supplementary material (Berg & Lee, 2019b) for details. The two critical components of the estimation procedure involve (1) the use of Koenker's check function to estimate the quantile regression coefficients and (2) the use of the distribution (4) to estimate $σ_{b}^{2}$ and to predict $b_{i}$. Koenker's check function (Koenker, 2005) is defined as $ρ_{τ} (u) = u (τ - I [u < 0]) .$ (6) Koenker's check function is a standard objective function for estimating quantiles because $q_{i j} (τ) = \operatorname{arg\,min}_{a} E [ρ_{τ} (y_{i j} - a) ∣ x_{i j}, b_{i}]$. The estimation procedure of Berg and Lee (2019a) alternates between optimisation of Koenker's check function to estimate $β_{K}$ and use of the distribution (4) to estimate $σ_{b}^{2}$ and to predict $b_{i}$. The estimates of the parameters of the extreme value distribution are obtained using a procedure recommended in Jang and Wang (2015).
Note that the estimates of the parameters of the extreme value distribution are required for the LIGPD approximation but are not explicitly part of the specified quantile regression model (1). In this sense, the estimates of the extreme value distribution are less central than the estimates of $β_{K}$ and $σ_{b}^{2}$. We define the estimator of the extreme value distribution that we use for zero-inflated data precisely in Section 2.
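
The role of the check function as an objective function for quantiles can be seen in a small example. The Python sketch below minimises the average check loss over candidate values taken from the data, which recovers a sample quantile. It covers only the intercept-only case with no covariates or random effects, and the function names and data are hypothetical.

```python
def check_loss(u, tau):
    """Koenker's check function rho_tau(u) = u * (tau - I[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def quantile_by_check(y, tau):
    """Sample value minimising the summed check loss (intercept-only model)."""
    return min(sorted(y), key=lambda a: sum(check_loss(v - a, tau) for v in y))

# Illustrative data (assumed values)
y_sample = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0]
median_hat = quantile_by_check(y_sample, 0.5)
upper_hat = quantile_by_check(y_sample, 0.9)
```

In a regression setting the candidate value `a` would be replaced by a linear predictor, and the minimisation is typically carried out by linear programming rather than by enumeration.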

Given estimates $\hat{β} (τ_{k})$ and ${\hat{σ}}_{b}^{2}$, one can construct predictors of small area parameters. A predictor of $q_{i j} (τ_{k})$ is given by ${\hat{q}}_{i j} (τ_{k}) = x_{i j}^{'} \hat{β} (τ_{k}) + E [b_{i} ∣ y_{i}; \hat{θ}],$ where $\hat{β} (τ_{k})$ is the estimator of $β (τ_{k})$. The set $\{{\hat{q}}_{i j} (τ_{k}) : j = 1, \dots, N_{i}; k = 1, \dots, K\}$ approximates the distribution of $\{y_{i j} : j = 1, \dots, N_{i}\}$. We use $\{{\hat{q}}_{i j} (τ_{k}) : j = 1, \dots, N_{i}; k = 1, \dots, K\}$ to define small area predictors, as in Berg and Lee (2019a). Define a predictor of the τth population quantile for area i by ${\hat{q}}_{i} (τ) = \min \{{\hat{q}}_{i j} (τ_{k}) : {\hat{F}}_{y_{i}} ({\hat{q}}_{i j} (τ_{k})) \geq τ; \ j = 1, \dots, N_{i}; \ k = 1, \dots, K\},$ (7) where ${\hat{F}}_{y_{i}} (t) = (N_{i} K)^{- 1} \sum_{j = 1}^{N_{i}} \sum_{k = 1}^{K} I [{\hat{q}}_{i j} (τ_{k}) \leq t]$.
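
The area-level quantile predictor can be computed by pooling the unit-level predicted quantiles and inverting their empirical distribution function. The following Python sketch does this for a toy area. The function name and the numerical values are hypothetical, and `q_hat` is assumed to hold one list of $K$ predicted quantiles for each of the $N_{i}$ population units.

```python
from bisect import bisect_right

def area_quantile(q_hat, tau):
    """Smallest pooled value whose empirical CDF is at least tau.

    q_hat: list over units j of lists over k of predicted quantiles
           q_hat_ij(tau_k) for one area (hypothetical input layout).
    """
    pooled = sorted(v for unit in q_hat for v in unit)
    total = len(pooled)  # N_i * K
    for value in pooled:
        # F_hat(value) = proportion of pooled predicted quantiles <= value
        if bisect_right(pooled, value) / total >= tau:
            return value
    return pooled[-1]

# Toy area with N_i = 2 units and K = 3 quantile levels (assumed values)
q_hat_area = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
q50 = area_quantile(q_hat_area, 0.5)
q75 = area_quantile(q_hat_area, 0.75)
```

Because the pooled values are sorted, the first value that meets the threshold is the minimum required by the definition.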

1.3. Outline

We extend the LIGPD model and estimation procedure outlined in Section 1.2 to zero-inflated data and an informative sample design. In Section 2, we describe the extension to zero-inflated data. In Section 3, we describe the extension to the informative sample design. In Section 4, we illustrate the procedures using the variable percolation for Kansas.

2. Zero-Inflated model and estimation procedure

We modify the LIGPD model and estimation procedure of Section 1.2 for a case in which the support of $y_{i j}$ is $[0, \infty)$. As discussed in Section 1, the small area estimation (SAE) literature contains several examples in which small area estimates of a zero-inflated variable are of interest. For instance, $y_{i j}$ may be grape production as in Dreassi et al. (2014) or $y_{i j}$ may be sheet and rill erosion as in Lyu (2018). In Section 2.1, we describe the extension of the LIGPD model to accommodate zero-inflated data. In Section 2.2, we describe the procedure to estimate the parameters of the zero-inflated model. Section 2.3 proposes a bootstrap MSE estimator. The procedures are modifications of the estimation and bootstrap MSE estimation methods defined in Berg and Lee (2019a).

Before describing the procedures in detail, we note that the method described in Section 2 is one of many possible ways to accommodate zero-inflated, positive data. We adopt the approach described below for two main reasons. First, the approach allows us to remain within the framework of modelling quantiles. Second, the estimation procedures require only minor modifications to the procedures in Berg and Lee (2019a).

2.1. Zero-Inflated mixed effects quantile regression model

Assume the support of the response variable $y_{i j}$ is $[0, \infty)$. As in Section 1.2, assume $y_{i j}$ is observed for a sample $A_{i}$ of $n_{i}$ elements in area i. Assume a vector of covariates $(x_{i j}^{'}, z_{i j}^{'})^{'}$ is available for the full population of $N_{i}$ elements in area i. The parameters of interest are quantiles of $\{y_{i j} : j = 1, \dots, N_{i}\}$.

We specify a model with two components. One component is for the probability that $y_{i j}$ is zero. We refer to this component as the binary component. The second component is a model for the quantile of the conditional distribution given that $y_{i j} > 0$ . We first define the model for the binary component and then define the model for the positive component. Finally, we explain how these two models combine to form a model for the quantile of the conditional distribution of $y_{i j}$ given the covariates and area random effects.

First, we define the model for the binary component. Assume $P (y_{i j} = 0 ∣ u_{i}, z_{i j}) = \frac{\exp (z_{i j}^{'} γ + u_{i})}{1 + \exp (z_{i j}^{'} γ + u_{i})},$ (8) where $u_{i} \sim N (0, σ_{u}^{2})$. The model (8) is a standard mixed effects logistic regression model for $I [y_{i j} = 0]$. Note that (8) is a model for the probability of observing a zero, so that $P (y_{i j} > 0 ∣ u_{i}, z_{i j}) = 1 - P (y_{i j} = 0 ∣ u_{i}, z_{i j})$.

Next, we define the model for the positive component. Define $q_{p o s i j} (τ)$ to be the τth quantile of the conditional distribution of $y_{i j}$ given $y_{i j} > 0$. Specifically, $q_{p o s i j} (τ)$ satisfies $P (y_{i j} \leq q_{p o s i j} (τ) ∣ y_{i j} > 0, b_{i}, x_{i j}) = τ$. We define a quantile regression model for $q_{p o s i j} (τ)$ that is a modification of the model (1) to respect the restricted sample space for $y_{i j} > 0$. Define a model for $q_{p o s i j} (τ)$ by $q_{p o s i j} (τ) = x_{i j}^{'} β (τ) \exp (b_{i}),$ (9) where $x_{i j}^{'} β (τ + δ) \geq x_{i j}^{'} β (τ)$ for $δ > 0$, $x_{i j}^{'} β (τ) > 0$ for all $τ \in (0, 1)$, and $b_{i} \sim N (0, σ_{b}^{2})$.

Finally, we combine (8) and (9) to define a model for the τth quantile of the conditional distribution of $y_{i j}$ given $x_{i j}, b_{i}, z_{i j},$ and $u_{i}$. Precisely, the τth quantile of the conditional distribution of $y_{i j}$, denoted $q_{i j} (τ)$, satisfies $P (y_{i j} \leq q_{i j} (τ) ∣ x_{i j}, b_{i}, z_{i j}, u_{i}) = τ$. The models (8) and (9) induce a model for $q_{i j} (τ)$. It is the induced model for $q_{i j} (τ)$ that we would like to use for small area prediction. The key idea in deriving the induced model for $q_{i j} (τ)$ is the observation that for $τ > P (y_{i j} = 0 ∣ u_{i}, z_{i j})$, $q_{i j} (τ)$ has the same functional form as $q_{p o s i j} (τ)$ but with shifted quantile levels. To derive the model for $q_{i j} (τ)$, let $t > 0$ satisfy $P (y_{i j} \leq t ∣ b_{i}, u_{i}, x_{i j}, z_{i j}) = τ$. Observe that $\begin{aligned} τ & = P (y_{i j} = 0 ∣ b_{i}, u_{i}, x_{i j}, z_{i j}) \\ & \quad + P (y_{i j} \leq t ∣ y_{i j} > 0, b_{i}, u_{i}, x_{i j}, z_{i j}) P (y_{i j} > 0 ∣ b_{i}, u_{i}, x_{i j}, z_{i j}) \\ & = P (y_{i j} = 0 ∣ u_{i}, z_{i j}) + τ^{*} P (y_{i j} > 0 ∣ u_{i}, z_{i j}), \end{aligned}$ where $q_{p o s i j} (τ^{*}) = t$. Solving for $τ^{*}$ gives $τ^{*} = \frac{τ - P (y_{i j} = 0 ∣ u_{i}, z_{i j})}{1 - P (y_{i j} = 0 ∣ u_{i}, z_{i j})} .$ (10) Then, $q_{i j} (τ) = \begin{cases} 0 & \text{if } τ \leq P (y_{i j} = 0 ∣ u_{i}, z_{i j}) \\ q_{p o s i j} \left(\frac{τ - P (y_{i j} = 0 ∣ u_{i}, z_{i j})}{1 - P (y_{i j} = 0 ∣ u_{i}, z_{i j})}\right) & \text{if } τ > P (y_{i j} = 0 ∣ u_{i}, z_{i j}) . \end{cases}$ (11) As a remark on the model for the positive component, one can consider alternatives to the model (9) for the quantile of the conditional distribution given that $y_{i j}$ is positive. For instance, a different approach is to use a transformation of $y_{i j}$ for $y_{i j} > 0$, as in Berg and Lee (2019a). The relationship (11) holds for any $q_{p o s i j} (τ) > 0$. In the data analysis of Section 4, we consider an expansion of the model (9).
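
The induced quantile function can be sketched in a few lines of Python. The code below evaluates the logistic probability of a zero, rescales the quantile level, and applies the positive-component quantile function only when the level exceeds the probability of a zero. The function names and the illustrative positive-component model, in which $x_{i j}^{'} β (τ)$ is taken to be $1 + 2 τ$, are assumptions made for the example and are not taken from the data analysis.

```python
import math

def prob_zero(z_gamma, u):
    """Logistic probability of a zero given the linear predictor z'gamma and u_i."""
    eta = z_gamma + u
    return 1.0 / (1.0 + math.exp(-eta))  # equals exp(eta) / (1 + exp(eta))

def zero_inflated_quantile(tau, p0, q_pos):
    """Quantile of y_ij at level tau, given P(y_ij = 0) = p0 and q_pos(tau*)."""
    if tau <= p0:
        return 0.0
    tau_star = (tau - p0) / (1.0 - p0)  # shifted quantile level
    return q_pos(tau_star)

def q_pos_example(tau_star, b=0.0):
    """Hypothetical positive component: x'beta(tau) = 1 + 2 * tau, times exp(b)."""
    return (1.0 + 2.0 * tau_star) * math.exp(b)

p_half = prob_zero(0.0, 0.0)
q_low = zero_inflated_quantile(0.1, 0.2, q_pos_example)
q_mid = zero_inflated_quantile(0.6, 0.2, q_pos_example)
```

With a probability of zero equal to 0.2, the level 0.6 maps to the shifted level 0.5 of the positive component, while any level at or below 0.2 returns zero.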

To construct small area predictors according to the distribution (11), we require estimates of the model parameters. In the estimation procedure defined below, we first estimate $q_{p o s i j} (τ)$ and $P (y_{i j} = 0 ∣ u_{i}, z_{i j})$. We then predict finite population quantiles of $y_{i j}$ according to (11). Details of the estimation and prediction procedures are given in Section 2.2.

2.2. Estimation procedure for zero-inflated model

The estimation procedure consists of three main steps. We first estimate the parameters of the model for $q_{p o s i j} (τ)$ . We then estimate the probability of a zero. Finally, we combine the predictor of $q_{p o s i j} (τ)$ with the predictor of the probability of a zero to obtain predictors of population quantiles.

2.2.1. Estimator of positive component

We use the LIGPD of Jang and Wang (2015) to approximate the conditional density function for $y_{i j}$ given that $y_{i j} > 0$. The approximation is analogous to the approach outlined in Section 1.2, except that it is applied to the conditional distribution of the positive responses. Define a sequence of quantile levels by $τ_{k} = k (K + 1)^{- 1}$ for $k = 1, \dots, K$, where $K \to \infty$ as $D \to \infty$. The approximate density function for the conditional distribution of $y_{i j}$ given $y_{i j} > 0$ and $b_{i}$ is defined by $\begin{aligned} f_{Y} (y | y_{i j} > 0, x_{i j}, b_{i}, θ) & = I [y < q_{p o s i j} (τ_{1})] τ_{1} f_{ℓ} (y | ρ_{ℓ}, ξ_{ℓ}) \\ & \quad + I [y \geq q_{p o s i j} (τ_{K})] (1 - τ_{K}) f_{u} (y | ρ_{u}, ξ_{u}) \\ & \quad + \sum_{k = 1}^{K - 1} I [q_{p o s i j} (τ_{k}) \leq y < q_{p o s i j} (τ_{k + 1})] \frac{τ_{k + 1} - τ_{k}}{q_{p o s i j} (τ_{k + 1}) - q_{p o s i j} (τ_{k})}, \end{aligned}$ (12) where $θ = (β_{K}^{'}, σ_{b}^{2}, ρ_{ℓ}, ξ_{ℓ}, ρ_{u}, ξ_{u})^{'}$ is the vector of fixed parameters to be estimated, $β_{K} = (β (τ_{1})^{'}, \dots, β (τ_{K})^{'})^{'}$, $I [\cdot]$ is the indicator function that is equal to 1 if the argument is true and zero otherwise, and $f_{s} (y | ρ_{s}, ξ_{s})$ for $s = ℓ, u$ are densities of generalised Pareto distributions defined as follows. Letting $u_{i j} = 0.5 (x_{i j}^{'} β (τ_{K}) + x_{i j}^{'} β (τ_{K - 1}))$ and $ℓ_{i j} = 0.5 (x_{i j}^{'} β (τ_{1}) + x_{i j}^{'} β (τ_{2}))$, $f_{u} (y | ρ_{u}, ξ_{u}) = \frac{1 - 0.5 (τ_{K - 1} + τ_{K})}{1 - τ_{K}} g (y - u_{i j} ∣ ρ_{u}, ξ_{u}),$ (13) and $f_{ℓ} (y | ρ_{ℓ}, ξ_{ℓ}) = \frac{0.5 (τ_{1} + τ_{2})}{τ_{1}} g (- y + ℓ_{i j} ∣ ρ_{ℓ}, ξ_{ℓ}),$ (14) where $g (y ∣ ρ_{s}, ξ_{s}) = \begin{cases} ρ_{s}^{- 1} (1 + ξ_{s} y / ρ_{s})^{- (1 + 1 / ξ_{s})}, & ξ_{s} \neq 0 \\ ρ_{s}^{- 1} \exp (- y / ρ_{s}), & ξ_{s} = 0, \end{cases}$ (15) for $s = ℓ, u$, with $y > 0$ for $ξ_{s} \geq 0$ and $0 \leq y < - ρ_{s} / ξ_{s}$ for $ξ_{s} < 0$.
The function (15) is a density function of a generalised Pareto distribution. The multipliers defining (13) and (14) are derived in Jang and Wang (2015), and we summarise their motivation here so that the presentation is self-contained. We consider the density for the upper extreme value distribution, $f_{u}$, recognising that the motivation for $f_{ℓ}$ is completely analogous. By the definition of $u_{i j}$ as the midpoint of the two largest interior quantiles, the linearly interpolated CDF evaluated at $u_{i j}$ is $0.5 (τ_{K - 1} + τ_{K})$, so that $\begin{aligned} P (Y \leq y ∣ Y > u_{i j}, x_{i j}, b_{i}, u_{i j} > 0) = \frac{F_{Y} (y ∣ x_{i j}, b_{i}, y > 0) - 0.5 (τ_{K - 1} + τ_{K})}{1 - 0.5 (τ_{K - 1} + τ_{K})} . \end{aligned}$ (16) Taking derivatives of both sides with respect to y shows that the conditional density of $Y$ given $Y > u_{i j}$ is $[1 - 0.5 (τ_{K - 1} + τ_{K})]^{- 1} f_{Y} (y ∣ x_{i j}, b_{i}, y > 0)$. Under the assumption that the generalised Pareto distribution describes the conditional distribution of $y_{i j}$ for $y_{i j} > u_{i j}$, $g (y - u_{i j} ∣ ρ_{u}, ξ_{u}) = f_{Y} (y ∣ x_{i j}, b_{i}, y > 0) [1 - (τ_{K - 1} + τ_{K}) / 2]^{- 1}$. The form for $f_{u}$ follows from setting $(1 - τ_{K}) f_{u} (y ∣ ρ_{u}, ξ_{u}) = [1 - (τ_{K - 1} + τ_{K}) / 2] g (y - u_{i j} ∣ ρ_{u}, ξ_{u})$.
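
As a concrete companion to the tail densities, the Python sketch below evaluates the generalised Pareto density and the upper-tail density with its multiplier. The function names and the parameter values are hypothetical. The sketch returns zero outside the support, which is an implementation choice made here rather than something specified in the text.

```python
import math

def gpd_density(y, rho, xi):
    """Generalised Pareto density g(y | rho, xi) with scale rho and shape xi."""
    if y < 0:
        return 0.0
    if xi == 0:
        return math.exp(-y / rho) / rho
    base = 1.0 + xi * y / rho
    if base <= 0:  # beyond the upper endpoint -rho / xi when xi < 0
        return 0.0
    return base ** (-(1.0 + 1.0 / xi)) / rho

def upper_tail_density(y, u, tau_km1, tau_k, rho, xi):
    """Upper-tail density f_u: multiplier times the GPD density of y - u."""
    mult = (1.0 - 0.5 * (tau_km1 + tau_k)) / (1.0 - tau_k)
    return mult * gpd_density(y - u, rho, xi)

# Illustrative values (assumed): tau_{K-1} = 0.8, tau_K = 0.9, threshold u = 3
g_at_zero = gpd_density(0.0, 2.0, 0.5)
g_exponential = gpd_density(1.0, 1.0, 0.0)
f_u_at_threshold = upper_tail_density(3.0, 3.0, 0.8, 0.9, 2.0, 0.5)
```

The lower-tail density follows the same pattern with the reflected argument $- y + ℓ_{i j}$ and the multiplier $0.5 (τ_{1} + τ_{2}) / τ_{1}$.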

Before proceeding with the prediction and estimation procedure, we add a brief comment on the relationship between the model and the LIGPD approximation, particularly the role of the generalised Pareto distribution. The assumed model for the positive component is defined in (9). The density function (12) is an approximation that provides a tool for defining predictors and estimators. The extreme value distributions are adapted from Berg and Lee (2019a) and from Jang and Wang (2015). Conceptually, the extreme value distribution for the lower tail can be improved for the case of zero-inflated data. We retain the estimator defined in step 3 of Section 2.2.1 largely for simplicity. Based on past experiments with different estimators of the extreme value distribution, we expect the choice of the extreme value distribution to have little impact on the efficiency of the predictors.

We recognise that using the same notation $θ$ for the zero-inflated model as for the general LIGPD of Section 1.2 is a slight abuse of notation. We do so for simplicity: the $θ$ in (12), which parametrises the model for $q_{p o s i j} (τ_{k})$, is different from the $θ$ for the unconditional distribution of Section 1.2.

An important distribution used to define estimators and predictors is the conditional distribution of $b_{i}$ given the data. An expression for the conditional distribution of $b_{i}$ given the data corresponding to the LIGPD is $f_{b ∣ y_{p o s}} (b_{i} ∣ y_{p o s i}; θ) = \frac{\prod_{j \in A_{i} : y_{i j} > 0} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}, b_{i}, θ) φ (b_{i} / σ_{b})}{\int_{- \infty}^{\infty} \prod_{j \in A_{i} : y_{i j} > 0} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}, b_{i}, θ) φ (b_{i} / σ_{b}) d b_{i}},$ (17) where φ is the density function of a standard normal distribution, and $y_{p o s i} = \{y_{i j} : j \in A_{i}, y_{i j} > 0\}$. If the area has no sampled units, then the conditional density of $b_{i}$ is that of a normal distribution with mean zero and variance $σ_{b}^{2}$. One can calculate expectations with respect to (17) to obtain Bayes predictors under squared error loss. For an integrable function $h (\cdot)$, the Bayes predictor of $h (b_{i})$ for squared error loss is defined as $E [h (b_{i}) ∣ y_{p o s i}; θ] = \frac{\int_{- \infty}^{\infty} h (b_{i}) \prod_{j \in A_{i} : y_{i j} > 0} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}, b_{i}, θ) φ (b_{i} / σ_{b}) d b_{i}}{\int_{- \infty}^{\infty} \prod_{j \in A_{i} : y_{i j} > 0} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}, b_{i}, θ) φ (b_{i} / σ_{b}) d b_{i}} .$ (18) In particular, for $h (b) = \exp (b)$, we obtain the Bayes predictor of $\exp (b_{i})$. The Bayes predictor of $q_{p o s i j} (τ)$ for squared error loss corresponding to the approximate density function (12) and the model (9) is $q_{i j}^{B} (τ) = x_{i j}^{'} β (τ) E [\exp (b_{i}) ∣ y_{p o s i}; θ] .$ (19) A predictor of the form (19) will provide the basis of the small area predictors for zero-inflated data. However, the predictor (19) is unattainable because it is a function of the unknown $θ$.
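The conditional expectation in (18) is a ratio of two one-dimensional integrals, so it can be approximated on a grid of $b_{i}$ values. The sketch below uses a midpoint Riemann sum; the grid width, the number of grid points, and the `likelihood` callback (standing in for the product of $f_{Y}$ terms over the positive sampled units in area $i$) are assumptions made for illustration and need not match the Riemann sum of Berg and Lee (2019a).

```python
import math


def bayes_predictor(h, likelihood, sigma_b, n_grid=2001, width=8.0):
    """Approximate E[h(b_i) | y_pos_i; theta] in (18) by a midpoint Riemann sum.

    likelihood(b) should return the product over sampled positive units of
    f_Y(y_ij | y_ij > 0, x_ij, b, theta); it is supplied by the caller.
    """
    lower = -width * sigma_b
    step = 2.0 * width * sigma_b / n_grid
    numerator = 0.0
    denominator = 0.0
    for g in range(n_grid):
        b = lower + (g + 0.5) * step
        # The normal kernel is proportional to phi(b / sigma_b); constants cancel.
        weight = likelihood(b) * math.exp(-0.5 * (b / sigma_b) ** 2)
        numerator += h(b) * weight
        denominator += weight
    return numerator / denominator
```

For an area with no sampled units the likelihood is constant, and `bayes_predictor(math.exp, lambda b: 1.0, sigma_b)` should be close to $\exp (σ_{b}^{2} / 2)$, the mean of a lognormal variable, which gives a simple check on the grid.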

We next define an estimator of $θ$. The estimator is a modification of the iterative estimation procedure used in Berg and Lee (2019a) to account for the zero-inflated nature of the data. The iteration involves optimisation of Koenker's check function (6) and calculation of conditional moments according to (17).
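To make the role of the check function concrete, the sketch below implements $ρ_{τ} (u) = u (τ - I [u < 0])$ from (6) and finds the constant that minimises the summed check loss over a small data set. For a constant fit, the minimum is attained at one of the data values, so searching over the data is enough; the helper names are illustrative.

```python
def check_loss(u, tau):
    """Koenker's check function rho_tau(u) = u * (tau - I[u < 0]) from (6)."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(values, tau):
    """Data value a that minimises sum_j rho_tau(y_j - a) over the data values."""
    return min(values, key=lambda a: sum(check_loss(y - a, tau) for y in values))
```

For `values = [1.0, 2.0, 3.0, 4.0, 5.0]` and `tau = 0.5`, the minimiser is the sample median, which is why minimising the check loss of $y_{i j} \exp (- b_{i}) - x_{i j}^{'} β$ targets a conditional quantile.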

Begin with the initial estimator ${\hat{θ}}^{(0)}$ defined in Appendix 1. For $m = 1, 2, \dots, M$, iterate the following steps.

Step 1. Define the updated estimator of $σ_{b}^{2}$ by ${\hat{σ}}_{b}^{2 (m)} = (D - p)^{- 1} \sum_{i = 1}^{D} E [b_{i}^{2} ∣ y_{p o s i}; {\hat{θ}}^{(m - 1)}],$ (20) where p is the dimension of $x_{i j}$. Define predictors of $b_{i}$ and $\exp (b_{i})$ in the mth step by $\begin{aligned} {\hat{b}}_{i}^{(m)} & = E [b_{i} ∣ y_{p o s i}; {\hat{θ}}^{(m - 1)}], \text{ and} \\ {\hat{e}}_{b i}^{(m)} & = E [\exp (b_{i}) ∣ y_{p o s i}; {\hat{θ}}^{(m - 1)}] . \end{aligned}$ To approximate the integrals defining the conditional expectations, we use a Riemann sum, as described in Berg and Lee (2019a). The motivation for the estimator ${\hat{σ}}_{b}^{2 (m)}$ is from the EM algorithm for a linear mixed effects model with normally distributed random terms (Searle, Casella, & McCulloch, 1992, p. 300).
Step 2. We use the method of Koenker and Ng (2005) to update the estimator of $β_{K}$ while maintaining the monotonicity restriction. The motivation for the estimator of $β (τ_{k})$ is that, for known $b_{i}$, $x_{i j}^{'} β (τ) = \operatorname{argmin}_{a} E [ρ_{τ} (y_{i j} \exp (- b_{i}) - a) ∣ y_{i j} > 0, b_{i}]$, where $ρ_{τ} (u)$ is the check function defined in (6). The estimates of $β (τ_{k})$ are obtained sequentially to enforce the monotonicity condition. First, define ${\hat{β}}^{(m)} (τ_{1}) = \operatorname{argmin}_{β} \sum_{i = 1}^{D} \sum_{j \in A_{i} : y_{i j} > 0} ρ_{τ_{1}} (y_{i j} \exp (- {\hat{b}}_{i}^{(m)}) - x_{i j}^{'} β),$ (21) subject to the restriction that $x_{i j}^{'} {\hat{β}}^{(m)} (τ_{1}) > c_{0}$, where $c_{0}$ is a specified constant. For $k = 2, \dots, K$, define ${\hat{β}}^{(m)} (τ_{k}) = \operatorname{argmin}_{β} \sum_{i = 1}^{D} \sum_{j \in A_{i} : y_{i j} > 0} ρ_{τ_{k}} (y_{i j} \exp (- {\hat{b}}_{i}^{(m)}) - x_{i j}^{'} β)$ (22) subject to the restriction that $x_{i j}^{'} {\hat{β}}^{(m)} (τ_{k}) \geq x_{i j}^{'} {\hat{β}}^{(m)} (τ_{k - 1})$ for $j = 1, \dots, N_{i}$ and $i = 1, \dots, D$. To enforce the monotonicity restrictions, we implement the constrained optimisation method of Koenker and Ng (2005) using the method fn in the R function rq.
Step 3. Next, we estimate $ρ_{s}$ and $ξ_{s}$ for $s = ℓ, u$, the parameters of the generalised Pareto density. The estimators are minor modifications of the procedures used in Jang and Wang (2015) to account for the zero-inflated nature of the data. Specifically, $\begin{aligned} {\hat{ρ}}_{ℓ}^{(m)} & = 0.5 (τ_{1} + τ_{2}) \sum_{i = 1}^{D} \sum_{j \in A_{i} : y_{i j} > 0} \frac{{\hat{q}}_{i j}^{(m)} (τ_{2}) - {\hat{q}}_{i j}^{(m)} (τ_{1})}{n (τ_{2} - τ_{1})}, \\ {\hat{ρ}}_{u}^{(m)} & = [1 - 0.5 (τ_{K} + τ_{K - 1})] \sum_{i = 1}^{D} \sum_{j \in A_{i} : y_{i j} > 0} \frac{{\hat{q}}_{i j}^{(m)} (τ_{K}) - {\hat{q}}_{i j}^{(m)} (τ_{K - 1})}{n (τ_{K} - τ_{K - 1})}, \end{aligned}$ (23) where ${\hat{q}}_{i j}^{(m)} (τ_{k}) = x_{i j}^{'} {\hat{β}}^{(m)} (τ_{k}) {\hat{e}}_{b i}^{(m)}$, and $n = \sum_{i = 1}^{D} \sum_{j = 1}^{n_{i}} I [y_{i j} > 0]$. Holding ${\hat{ρ}}_{ℓ}^{(m)}$ and ${\hat{ρ}}_{u}^{(m)}$ fixed, the estimator of $ξ_{s}$ is the maximum likelihood estimator using only $\{y_{i j} < {\hat{ℓ}}_{i j}^{(m)}\}$ for $s = ℓ$ and $\{y_{i j} > {\hat{u}}_{i j}^{(m)}\}$ for $s = u$, where ${\hat{ℓ}}_{i j}^{(m)} = 0.5 (x_{i j}^{'} {\hat{β}}^{(m)} (τ_{1}) + x_{i j}^{'} {\hat{β}}^{(m)} (τ_{2})) {\hat{e}}_{b i}^{(m)}$ and ${\hat{u}}_{i j}^{(m)} = 0.5 (x_{i j}^{'} {\hat{β}}^{(m)} (τ_{K}) + x_{i j}^{'} {\hat{β}}^{(m)} (τ_{K - 1})) {\hat{e}}_{b i}^{(m)}$. Precisely, ${\hat{ξ}}_{ℓ}^{(m)} = \operatorname{argmax}_{ξ} \prod_{(i j) : 0 < y_{i j} < {\hat{ℓ}}_{i j}^{(m)}} g (- (y_{i j} - {\hat{ℓ}}_{i j}^{(m)}) ∣ {\hat{ρ}}_{ℓ}^{(m)}, ξ),$ (24) and ${\hat{ξ}}_{u}^{(m)} = \operatorname{argmax}_{ξ} \prod_{(i j) : y_{i j} > {\hat{u}}_{i j}^{(m)} > 0} g (y_{i j} - {\hat{u}}_{i j}^{(m)} ∣ {\hat{ρ}}_{u}^{(m)}, ξ) .$ (25)

Let $\hat{θ} = (({\hat{β}}_{K})^{'}, {\hat{σ}}_{b}^{2}, {\hat{ρ}}_{ℓ}, {\hat{ξ}}_{ℓ}, {\hat{ρ}}_{u}, {\hat{ξ}}_{u})^{'}$ denote the estimator of $θ$ obtained in the final step of the iteration.
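The scale estimators in (23) are simple averages once the fitted quantiles are available. The sketch below computes them from lists of fitted quantiles for the positive sampled units; the argument names are hypothetical. One reading of the formula, offered here as an interpretation rather than a statement from the source, is that it matches the tail density at the threshold to the interpolated density on the adjacent interior interval.

```python
def rho_upper(q_km1, q_k, tau_km1, tau_k):
    """rho_u in (23): q_km1[j], q_k[j] are fitted quantiles at tau_{K-1}, tau_K
    for the positive sampled units (n = number of such units)."""
    n = len(q_k)
    slope_sum = sum((hi - lo) / (tau_k - tau_km1) for lo, hi in zip(q_km1, q_k))
    return (1.0 - 0.5 * (tau_k + tau_km1)) * slope_sum / n


def rho_lower(q_1, q_2, tau_1, tau_2):
    """rho_l in (23): q_1[j], q_2[j] are fitted quantiles at tau_1, tau_2."""
    n = len(q_1)
    slope_sum = sum((hi - lo) / (tau_2 - tau_1) for lo, hi in zip(q_1, q_2))
    return 0.5 * (tau_1 + tau_2) * slope_sum / n
```

With the scale parameters fixed at these values, the shape parameters in (24) and (25) can be obtained by a one-dimensional likelihood maximisation over $ξ$.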

2.2.2. Estimator of binary component

One can use standard software to estimate the parameters of the logistic mixed effects model (8). To estimate $σ_{u}^{2}$ and $γ$, we use a Laplace approximation, as implemented in the R function glmer. Let ${\hat{σ}}_{u}^{2}$ and $\hat{γ}$ be the resulting estimates of $σ_{u}^{2}$ and $γ$. We use penalised quasi-likelihood (Breslow & Clayton, 1993), as implemented with the predict method for glmer objects, to predict $u_{i}$, and we let ${\hat{u}}_{i}$ be the resulting predictor. We then define a predictor of the probability that $y_{i j}$ is zero by ${\hat{p}}_{z} ({\hat{u}}_{i}, z_{i j}) = (1 + \exp (z_{i j}^{'} \hat{γ} + {\hat{u}}_{i}))^{- 1} \exp (z_{i j}^{'} \hat{γ} + {\hat{u}}_{i}) .$ (26)

2.2.3. Predictors of quantiles

Given estimates of parameters $θ$, $γ$, and $σ_{u}^{2}$, as well as predictors of $u_{i}$ and $\exp (b_{i})$, the next step is to construct small area predictors. The small area prediction procedure involves two main steps. First, we define an approximation for the population. The approximation for the population is similar in structure to the method of Berg and Lee (2019a), except that the unconditional distribution (11) is used to accommodate the zero-inflated nature of the data. The second step is to use the approximation for the population to define estimates of small area quantiles.

The details of the two steps of the small area prediction procedure are as follows. For $i = 1, \dots, D$, $j = 1, \dots, N_{i}$, and $k = 1, \dots, K$, define a predictor of the $τ_{k}$th conditional quantile for $y_{i j} > 0$ by ${\hat{q}}_{p o s i j} (τ_{k}) = E [\exp (b_{i}) ∣ y_{p o s i}; \hat{θ}] x_{i j}^{'} \hat{β} (τ_{k}),$ where the expectation is approximated using the Riemann sum defined in Berg and Lee (2019a). Then, define a predictor of the unconditional quantile by ${\hat{q}}_{i j} (τ) = \begin{cases} 0 & \text{if } τ \leq {\hat{p}}_{z} ({\hat{u}}_{i}, z_{i j}) \\ {\hat{q}}_{p o s i j} (\frac{τ - {\hat{p}}_{z} ({\hat{u}}_{i}, z_{i j})}{1 - {\hat{p}}_{z} ({\hat{u}}_{i}, z_{i j})}) & \text{if } τ > {\hat{p}}_{z} ({\hat{u}}_{i}, z_{i j}) . \end{cases}$ (27) The set $\{{\hat{q}}_{i j} (τ_{k}) : i = 1, \dots, D; j = 1, \dots, N_{i}; k = 1, \dots, K\}$ defines an approximation for the population. We define a predictor of the τth population quantile for area $i$ by ${\hat{q}}_{i} (τ) = \min \{{\hat{q}}_{i j} (τ_{k}) : {\hat{F}}_{y_{i}} ({\hat{q}}_{i j} (τ_{k})) \geq τ; j = 1, \dots, N_{i}; k = 1, \dots, K\},$ (28) where ${\hat{F}}_{y_{i}} (t) = (N_{i} K)^{- 1} \sum_{j = 1}^{N_{i}} \sum_{k = 1}^{K} I [{\hat{q}}_{i j} (τ_{k}) \leq t]$.
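The two prediction steps can be sketched as follows, starting from the zero probability in (26). Two details are assumptions of this sketch rather than statements from the source: the rescaled level in (27) generally falls between grid levels, so the positive-part quantile is linearly interpolated between neighbouring $τ_{k}$ and held constant outside the grid, and all function and argument names are hypothetical.

```python
import bisect
import math


def zero_probability(z, gamma_hat, u_hat):
    """Equation (26): predicted probability that y_ij equals zero."""
    eta = sum(zk * gk for zk, gk in zip(z, gamma_hat)) + u_hat
    return math.exp(eta) / (1.0 + math.exp(eta))


def positive_quantile(level, taus, q_pos):
    """q_pos at an arbitrary level by linear interpolation (an assumption)."""
    if level <= taus[0]:
        return q_pos[0]
    if level >= taus[-1]:
        return q_pos[-1]
    k = bisect.bisect_right(taus, level) - 1
    frac = (level - taus[k]) / (taus[k + 1] - taus[k])
    return q_pos[k] + frac * (q_pos[k + 1] - q_pos[k])


def unit_quantile(tau, p_zero, taus, q_pos):
    """Equation (27): unconditional quantile predictor for one unit."""
    if tau <= p_zero:
        return 0.0
    return positive_quantile((tau - p_zero) / (1.0 - p_zero), taus, q_pos)


def area_quantile(tau, unit_quantiles):
    """Equation (28): smallest pooled value whose empirical CDF is at least tau.

    unit_quantiles[j][k] holds the predicted quantile of unit j at level tau_k.
    """
    pooled = sorted(q for unit in unit_quantiles for q in unit)
    n = len(pooled)
    for rank, value in enumerate(pooled, start=1):
        if rank / n >= tau:
            return value
    return pooled[-1]
```

Pooling the $N_{i} K$ unit-level quantiles and inverting their empirical distribution function is what lets the unit-level model produce an area-level quantile without simulating individual responses.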

2.3. Bootstrap MSE estimation

We modify the parametric bootstrap MSE estimator of Berg and Lee (2019a) to account for the zero-inflated nature of the data. The main idea of the bootstrap simulation procedure is to use the probability integral transform to simulate from the conditional distribution of $y_{i j}$ given $x_{i j}$ and $b_{i}$. First, a $b_{i}^{*}$ is generated from the estimated marginal distribution of $b_{i}$. Then, linear interpolation is used to approximate the quantile function corresponding to the conditional distribution of $y_{i j}$ given $x_{i j}$ and $b_{i}^{*}$. The probability integral transform is then used to simulate a new variable, $y_{i j}^{*}$, from this linear approximation to the conditional quantile function. Finally, the estimation procedure is repeated using the original sample and the new simulated $y_{i j}^{*}$.

To define a bootstrap MSE estimator, repeat the following steps for $t = 1, \dots, T$.

Step 1. Generate a bootstrap approximation for the population. Generate $b_{i}^{* (t)} \sim N (0, {\hat{σ}}_{b}^{2})$, and define $q_{p o s i j}^{* (t)} (τ_{k}) = x_{i j}^{'} \hat{β} (τ_{k}) \exp (b_{i}^{* (t)})$. Generate $u_{i}^{* (t)} \sim N (0, {\hat{σ}}_{u}^{2})$, and define ${\hat{p}}_{z i j}^{* (t)} = \exp (z_{i j}^{'} \hat{γ} + u_{i}^{* (t)}) (\exp (z_{i j}^{'} \hat{γ} + u_{i}^{* (t)}) + 1)^{- 1}$. Define $q_{i j}^{* (t)} (τ_{k}) = \begin{cases} 0 & \text{if } τ_{k} \leq {\hat{p}}_{z i j}^{* (t)} \\ q_{p o s i j}^{* (t)} (\frac{τ_{k} - {\hat{p}}_{z i j}^{* (t)}}{1 - {\hat{p}}_{z i j}^{* (t)}}) & \text{if } τ_{k} > {\hat{p}}_{z i j}^{* (t)} . \end{cases}$ (29) Define a bootstrap version of the τth population quantile by $q_{i}^{* (t)} (τ) = \min \{q_{i j}^{* (t)} (τ_{k}) : {\hat{F}}_{y_{i}}^{* (t)} (q_{i j}^{* (t)} (τ_{k})) \geq τ; j = 1, \dots, N_{i}; k = 1, \dots, K\},$ (30) where ${\hat{F}}_{y_{i}}^{* (t)} (s) = (N_{i} K)^{- 1} \sum_{j = 1}^{N_{i}} \sum_{k = 1}^{K} I [q_{i j}^{* (t)} (τ_{k}) \leq s]$.
Step 2. Generate a bootstrap sample as follows. Generate $v_{i j}^{* (t)} \overset{i i d}{\sim} Unif (0, 1)$ for $i = 1, \dots, D$ and $j = 1, \dots, N_{i}$. Define $y_{i j}^{* (t)} = y_{i j}^{*} (\hat{θ}, b_{i}^{* (t)}, v_{i j}^{* (t)})$ by $y_{i j}^{* (t)} = \begin{cases} q_{i j}^{* (t)} (τ_{k_{i j}^{* (t)}}) + (v_{i j}^{* (t)} - τ_{k_{i j}^{* (t)}}) \frac{q_{i j}^{* (t)} (τ_{k_{i j}^{* (t)} + 1}) - q_{i j}^{* (t)} (τ_{k_{i j}^{* (t)}})}{τ_{k_{i j}^{* (t)} + 1} - τ_{k_{i j}^{* (t)}}}, & {\hat{p}}_{z i j}^{* (t)} < v_{i j}^{* (t)} < τ_{K} \\ 0, & v_{i j}^{* (t)} \leq {\hat{p}}_{z i j}^{* (t)} \\ q_{i j}^{* (t)} (τ_{K}), & v_{i j}^{* (t)} \geq τ_{K}, \end{cases}$ (31) where $k_{i j}^{* (t)} = \max \{k : τ_{k} \leq v_{i j}^{* (t)}\}$. Define the bootstrap sample to be $\{y_{i j}^{* (t)} : (i, j) \in A\}$, where A denotes the original sample. Note that the operation in the first line of (31) defines a linear interpolation of the estimated quantile function.
Step 3. Repeat the estimation procedure of Section 2 using $\{y_{i j}^{* (t)} : (i, j) \in A\}$ to obtain ${\hat{q}}_{i}^{* (t)} (τ)$. As in Berg and Lee (2019a), we simplify the estimation procedure to reduce the computational burden. Rather than estimate the quantile regression coefficients sequentially to enforce the monotonicity constraint, as in (A6)-(A7), we simultaneously minimise Koenker's check function for all quantile levels and then sort the estimates of the quantiles to obtain a nondecreasing quantile function (Chernozhukov, Fernandez-Val, & Galichon, 2009) for element $(i, j)$. A more specific definition of the rearrangement operation is given following (A3) of Appendix 2.
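The simulation of a bootstrap response in (31) is an inverse-transform draw from a piecewise linear quantile function with a point mass at zero. The sketch below implements that map for one unit. The case of a uniform draw that exceeds the zero probability but lies below $τ_{1}$ is not covered by (31) as written; returning the lowest grid quantile in that case is an assumption of this sketch, as are the function and argument names.

```python
import bisect


def simulate_response(v, p_zero, taus, q_star):
    """Map a Uniform(0, 1) draw v to a bootstrap response following (31).

    taus is the increasing grid tau_1 < ... < tau_K and q_star[k] is the
    bootstrap unconditional quantile of the unit at taus[k].
    """
    if v <= p_zero:
        return 0.0
    if v >= taus[-1]:
        return q_star[-1]
    if v < taus[0]:
        return q_star[0]  # assumption: hold the lowest grid quantile below tau_1
    k = bisect.bisect_right(taus, v) - 1  # largest k with taus[k] <= v
    slope = (q_star[k + 1] - q_star[k]) / (taus[k + 1] - taus[k])
    return q_star[k] + (v - taus[k]) * slope
```

In a full bootstrap replicate, `v` would come from a seeded generator such as `random.Random(seed).random()`, with one draw per population element.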

Define the bootstrap MSE estimator for ${\hat{q}}_{i} (τ)$ by ${\hat{M S E}}_{i} (τ) = \frac{1}{T} \sum_{t = 1}^{T} ({\hat{q}}_{i}^{* (t)} (τ) - q_{i}^{* (t)} (τ))^{2} .$ (32) The bootstrap MSE estimator is similar to bootstrap MSE estimators for small area predictors for parametric models developed in Lahiri, Maiti, Katzoff, and Parsons (2007) and in Hall and Maiti (2006). The MSE estimator (32) is an estimator of $E [({\hat{q}}_{i}^{* (t)} (τ) - q_{i}^{* (t)} (τ))^{2}]$ and does not account for a possible bias of the estimator of the leading term due to estimating $θ$. In a simulation study, Berg and Lee (2019a) evaluate the quality of an MSE estimator similar to (32) for the quantile regression model with no modification for zero-inflated data. Because the MSE estimator (32) is similar in structure to the MSE estimator of Berg and Lee (2019a), we do not present further simulation results here. Instead, we focus on an application of (32) to the data presented in Section 4.

3. Modification for an informative design

The development of Section 2 assumes that the sample design is noninformative for the quantile regression model. In this section, we consider an informative sample design. Assume all areas are included in the sample, and assume that a subset of elements is selected from area i. Let $π_{i j} = P (I_{i j} = 1 ∣ y_{i j}, x_{i j}, z_{i j}, b_{i}, u_{i})$, where $I_{i j}$ is the sample inclusion indicator for element $(i, j)$. We adapt the approach of Pfeffermann and Sverchkov (2007) to the quantile regression setting in order to modify the predictors to account for unequal selection probabilities. Pfeffermann and Sverchkov (2007) develop small area predictors for a fully parametric model under an informative sample design. Their approach exploits relationships between the sample distribution and the sample complement distribution. They construct predictors relative to the population distribution using estimates of the parameters of the sample distribution. For the fully parametric model considered in Pfeffermann and Sverchkov (2007), a closed-form expression for the small area predictor is available. For the quantile regression model, a closed-form expression relating the sample distribution to the sample complement distribution is not available. Nonetheless, the basic idea of the Pfeffermann and Sverchkov (2007) approach applies readily to the quantile regression framework. Below, we use importance sampling to simulate from the sample complement distribution.

3.1. Procedure to account for informative design

First, we introduce the definitions of the population, sample, and sample complement distributions more formally. Let $f_{p} (y_{i j} ∣ b_{i}, x_{i j}, u_{i}, z_{i j})$ be the density/mass function corresponding to the population distribution of $y_{i j}$. Let $f_{s} (y_{i j} ∣ b_{i}, x_{i j}, u_{i}, z_{i j}) = f_{p} (y_{i j} ∣ b_{i}, x_{i j}, u_{i}, z_{i j}, I_{i j} = 1)$ denote the corresponding sample distribution. From Pfeffermann and Sverchkov (2007; also see Kim & Yu, 2011 for a related result in the context of nonignorable nonresponse), the sample complement distribution is of the form $f_{c} (y_{i j} ∣ b_{i}, u_{i}, x_{i j}, z_{i j}) \propto E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ y_{i j}, x_{i j}, z_{i j}, b_{i}, u_{i}] f_{s} (y_{i j} ∣ b_{i}, x_{i j}, u_{i}, z_{i j}),$ (33) where $E_{s} [\cdot]$ denotes expectation with respect to the sample distribution, and $f_{c} (y_{i j} ∣ b_{i}, u_{i}, x_{i j}, z_{i j}) = f_{p} (y_{i j} ∣ b_{i}, x_{i j}, u_{i}, z_{i j}, I_{i j} = 0)$. (We refer the reader to Pfeffermann & Sverchkov, 2007 for further background on the concepts of the sample distribution and the sample complement distribution.)

We obtain estimates of $f_{s} (y_{i j} ∣ b_{i}, x_{i j}, u_{i}, z_{i j})$ and of $E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ y_{i j}, x_{i j}, z_{i j}, b_{i}, u_{i}]$ using the sample data. We use the quantile regression procedure defined in Section 2 to obtain an estimate of the quantiles of the distribution $f_{s} (y_{i j} ∣ b_{i}, x_{i j}, u_{i}, z_{i j})$. Let ${\hat{q}}_{i j} (τ_{k})$ for $k = 1, \dots, K$ be the estimated quantiles based on the sample for evenly spaced quantile levels, obtained using the procedure of Section 2. Denote the estimate of $E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ y_{i j}, x_{i j}, z_{i j}, b_{i}, u_{i}]$ based on the sample by ${\hat{ω}}_{i j} (y_{i j}) = {\hat{E}}_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ y_{i j}, x_{i j}, z_{i j}, b_{i}, u_{i}] .$ (34) A variety of models and procedures may be used to obtain the estimates ${\hat{ω}}_{i j} (y_{i j})$. We use a weight model similar to that of Pfeffermann and Sverchkov (2007). In this section, we first define the method to simulate from the population distribution for an arbitrary definition of ${\hat{ω}}_{i j} (y_{i j})$. We then define the procedure that we use to estimate $E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ y_{i j}, x_{i j}, z_{i j}, b_{i}, u_{i}]$.

We simulate from the population distribution using the relationship (33). Define a simulated population by sampling from $\{{\hat{q}}_{i j} (τ_{k}) : k = 1, \dots, K\}$ with probabilities proportional to ${\hat{ω}}_{i j} ({\hat{q}}_{i j} (τ_{k}))$, where ${\hat{q}}_{i j} (τ_{k})$ and ${\hat{ω}}_{i j}$ are the sample-based estimates defined above. For $r = 1, \dots, R$, let ${\tilde{q}}_{i j}^{(r)} = {\hat{q}}_{i j} (τ_{k}) \text{ with probability } \begin{cases} \frac{{\hat{ω}}_{i j} ({\hat{q}}_{i j} (τ_{k}))}{\sum_{k^{'} = 1}^{K} {\hat{ω}}_{i j} ({\hat{q}}_{i j} (τ_{k^{'}}))} & \text{if } (i, j) \notin A \\ K^{- 1} & \text{if } (i, j) \in A . \end{cases}$ (35) The set $\{{\tilde{q}}_{i j}^{(r)} : i = 1, \dots, D; j = 1, \dots, N_{i}; r = 1, \dots, R\}$ defines an approximation for the population. We define a predictor of the τth population quantile by ${\hat{q}}_{i} (τ) = \min \{{\tilde{q}}_{i j}^{(r)} : {\hat{F}}_{y_{i}}^{(R)} ({\tilde{q}}_{i j}^{(r)}) \geq τ; j = 1, \dots, N_{i}; r = 1, \dots, R\},$ (36) where ${\hat{F}}_{y_{i}}^{(R)} (t) = (N_{i} R)^{- 1} \sum_{j = 1}^{N_{i}} \sum_{r = 1}^{R} I [{\tilde{q}}_{i j}^{(r)} \leq t]$. This simulation procedure is essentially the ‘weighted bootstrap method’ defined in Section 3.2 of Smith and Gelfand (1992). The quantile regression model lends itself naturally to a procedure such as (35) to simulate from the sample complement distribution. Because the quantile estimates are already computed, one only needs to obtain the importance weight ${\hat{ω}}_{i j} ({\hat{q}}_{i j} (τ_{k}))$.
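The resampling in (35) and the quantile in (36) can be sketched for a single area as follows. The weights are passed in as plain numbers, so the sketch does not depend on any particular weight model; the function names and the use of Python's `random.Random.choices` are choices made for illustration.

```python
import random


def resample_unit(q_hat, weights, in_sample, n_draws, rng):
    """Draw n_draws values for one unit following (35).

    q_hat[k] is the estimated quantile at tau_k and weights[k] is the
    importance weight evaluated at q_hat[k]. Sampled units use equal
    probabilities 1 / K; nonsampled units use the normalised weights.
    """
    probs = [1.0] * len(q_hat) if in_sample else weights
    return rng.choices(q_hat, weights=probs, k=n_draws)


def simulated_area_quantile(tau, draws):
    """Equation (36): smallest simulated value whose empirical CDF is >= tau."""
    pooled = sorted(draws)
    n = len(pooled)
    for rank, value in enumerate(pooled, start=1):
        if rank / n >= tau:
            return value
    return pooled[-1]
```

A caller would concatenate the draws from `resample_unit` over all $N_{i}$ units in an area, with `rng = random.Random(seed)` for reproducibility, and pass the pooled list to `simulated_area_quantile`.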

Implementation of (35) and (36) requires a model for $E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ x_{i j}, z_{i j}, y_{i j}, b_{i}, u_{i}]$. We assume $E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ x_{i j}, z_{i j}, y_{i j}, b_{i}, u_{i}] = \exp (α_{0} + {\tilde{x}}_{i j}^{'} α_{1} + y_{i j} α_{2} + δ_{i}),$ (37) where $δ_{i} \sim N (0, σ_{δ}^{2})$, and ${\tilde{x}}_{i j}$ may contain elements of $x_{i j}$ or $z_{i j}$. To estimate $E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ x_{i j}, z_{i j}, y_{i j}, b_{i}, u_{i}]$, we use a working model defined by $\log (π_{i j}^{- 1} (1 - π_{i j})) = α_{0} + {\tilde{x}}_{i j}^{'} α_{1} + y_{i j} α_{2} + δ_{i} + r_{i j}, \quad i = 1, \dots, D; \; j \in A_{i},$ (38) where $δ_{i} \sim N (0, σ_{δ}^{2})$, and $r_{i j} \sim N (0, σ_{r}^{2})$. The model (38) is implicitly specified conditional on $I_{i j} = 1$ (i.e., a sample distribution model) and is defined only for sampled elements. Because we require an estimate of the mean of $π_{i j}^{- 1} (1 - π_{i j})$ with respect to the sample distribution as defined in (37), we can estimate the parameters of the model (38) using only the sample data, as in Pfeffermann and Sverchkov (2007). We estimate $α_{0}$, $α_{1}$, $α_{2}$, and $σ_{δ}^{2}$ using restricted maximum likelihood (REML) applied to the sample data. We denote the REML estimates by ${\hat{α}}_{0}$, ${\hat{α}}_{1}$, ${\hat{α}}_{2}$, and ${\hat{σ}}_{δ}^{2}$. We define the estimator of $E_{s} [π_{i j}^{- 1} (1 - π_{i j}) ∣ x_{i j}, z_{i j}, y, b_{i}, u_{i}]$ by ${\hat{ω}}_{i j} (y) = \exp ({\hat{α}}_{0} + {\tilde{x}}_{i j}^{'} {\hat{α}}_{1} + y {\hat{α}}_{2} + {\hat{δ}}_{i}),$ where ${\hat{δ}}_{i}$ is the EBLUP of $δ_{i}$. As mentioned above, other models for $π_{i j}$ are possible. We use the model (38) primarily for mathematical simplicity. The model (38) is similar to that of Pfeffermann and Sverchkov (2007), which has been vetted in the literature, and permits a computationally simple estimation procedure.
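Given fitted values of the weight model, the selection probabilities in (35) follow by normalising ${\hat{ω}}_{i j}$ over the quantile grid. The sketch below does this for one unit; all parameter values passed to it are hypothetical. Because the terms ${\hat{α}}_{0}$, ${\tilde{x}}_{i j}^{'} {\hat{α}}_{1}$, and ${\hat{δ}}_{i}$ do not vary with $k$, they cancel in the normalisation, so under this exponential form the probabilities are proportional to $\exp ({\hat{α}}_{2} {\hat{q}}_{i j} (τ_{k}))$.

```python
import math


def omega_hat(y, x_tilde, alpha0, alpha1, alpha2, delta_i):
    """Estimated weight exp(alpha0 + x_tilde' alpha1 + y * alpha2 + delta_i)."""
    linear = alpha0 + sum(a * x for a, x in zip(alpha1, x_tilde))
    return math.exp(linear + alpha2 * y + delta_i)


def selection_probabilities(q_hat, x_tilde, alpha0, alpha1, alpha2, delta_i):
    """Normalised weights over the quantile grid of one nonsampled unit, as in (35)."""
    weights = [omega_hat(q, x_tilde, alpha0, alpha1, alpha2, delta_i) for q in q_hat]
    total = sum(weights)
    return [w / total for w in weights]
```

A negative ${\hat{α}}_{2}$ shifts probability toward the smaller quantiles and a positive value toward the larger ones, which is how the resampled population departs from the sample distribution.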

3.2. Simulation study for informative sampling modification

We conduct a limited simulation study to vet the modification for the informative sample design. The aim of the simulation is to verify that the modification for informative sampling reduces the bias incurred by a predictor that ignores the survey weights when the sample design is informative for the specified model.

To focus attention on the informative sampling procedure, we do not use a zero-inflated model for the simulation. We use one of the simulation models from Berg and Lee (2019a). The simulation model is defined by $y_{i j} = β_{0} + β_{1} x_{i j} + b_{i} + e_{i j},$ (39) where $x_{i j} \overset{i i d}{\sim} N (0, 1)$, $β_{0} = - 1.5$, $β_{1} = 0.5$, $b_{i} \sim N (0, 0.5)$, $e_{i j} = (1 + 0.1 x_{i j}) (e_{i j}^{*} - 2) / 2$, and $e_{i j}^{*} \sim χ_{(2)}^{2}$. We generate $D = 60$ areas with $(N_{i}, n_{i}) = (143, 5)$ for 20 areas, $(N_{i}, n_{i}) = (286, 10)$ for 20 areas, and $(N_{i}, n_{i}) = (571, 20)$ for 20 areas. The MC sample size for each simulation is 200. The population quantile is $q_{i} (τ) = \min \{y_{i j} : F_{y_{i}} (y_{i j}) \geq τ; j = 1, \dots, N_{i}\}$, where $F_{y_{i}} (y) = N_{i}^{- 1} \sum_{j = 1}^{N_{i}} I [y_{i j} \leq y]$.

A sample is selected using systematic probability proportional to size sampling. The inclusion probability for element j in area i is $π_{i j} = \frac{n_{i} z_{i j}}{\sum_{j = 1}^{N_{i}} z_{i j}},$ (40) where $\log (z_{i j}) = - y_{i j} / 3 + β_{0} / 3 + β_{1} x_{i j} / 3 + u_{i} / 15.$ (41) Table 1 contains the average Monte Carlo (MC) MSE and average MC bias of two predictors, where the average is across areas of the same sample size. The predictor denoted ‘SRS’ is the predictor of Berg and Lee (2019a), which ignores the unequal selection probabilities. The predictor denoted ‘Inf’ uses the modification (35) to account for the informative design. The bias for the SRS procedure that ignores the weights is negative because the probability of selection increases as $y_{i j}$ decreases. Incorporating the survey weights through the procedure of Section 3.1 reduces the average MC MSE and absolute average MC bias of the predictor.
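The sampling design in (40) and (41) can be sketched as below. The size measure follows (41) and the inclusion probabilities follow (40); the systematic selection routine, which steps through cumulated inclusion probabilities from a random start in $[0, 1)$, is one standard way to implement systematic probability proportional to size sampling and is an assumption here, since the source does not give the ordering of the list or other implementation details.

```python
import math


def size_measure(y, x, u_i, beta0=-1.5, beta1=0.5):
    """Size z_ij from (41); a smaller y gives a larger size."""
    return math.exp(-y / 3.0 + beta0 / 3.0 + beta1 * x / 3.0 + u_i / 15.0)


def inclusion_probabilities(sizes, n_i):
    """Equation (40): pi_ij = n_i * z_ij / sum_j z_ij."""
    total = sum(sizes)
    return [n_i * z / total for z in sizes]


def systematic_pps(pi, start):
    """Select indices by stepping through cumulated pi with unit spacing.

    start is a draw from Uniform[0, 1); the list order is taken as given.
    """
    selected = []
    cumulative = 0.0
    target = start
    for j, p in enumerate(pi):
        cumulative += p
        while target < cumulative:
            selected.append(j)
            target += 1.0
    return selected
```

Because `size_measure` decreases in `y`, elements with small responses are over-represented in the sample, which is consistent with the negative bias reported for the predictor that ignores the weights.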

Table 1. Comparison of MC bias and MC MSE for LIGPD predictors.


4. Illustration for Kansas CEAP data

We illustrate the procedures using data collected from the 2003–2006 CEAP surveys in Kansas. We consider the response variable, percolation. Approximately 12% of the sampled values of percolation are zero for Kansas. A preliminary analysis shows that the conditional distribution of the percolation variable given the covariates that we considered violates the assumptions of simple parametric models, such as the linear mixed effects model (Battese et al., 1988) and the lognormal mixed effects model (Berg & Chandra, 2014). Therefore, the percolation variable provides a realistic candidate for demonstrating the quantile regression procedures.

We apply the procedures of Sections 2 and 3 above to obtain county level predictors of the quantiles of the percolation variable for Kansas. We use $M = 2$ steps of the iterative estimation procedure and $T = 100$ bootstrap samples. For the informative sampling modification, we use $R = 100$ to obtain a simulated approximation for the population. As a covariate, we use a rainfall erosion index (RFACT). The covariate RFACT is defined geographically, as in Wischmeier and Smith (1978, p. 11), for the full population. We obtain the RFACT from the NRI survey data. For this illustration, we treat the NRI as a population.

4.1. Model and estimators for CEAP data analysis

The rainfall factor is used as the univariate covariate in all components of the model. We consider an extension of the model (9), $q_{p o s i j} (τ) = x_{i j}^{'} β (τ) \exp (b_{i})$, for the CEAP data analysis. The extended model for the conditional quantile of $y_{i j}$ given that $y_{i j} > 0$ is $q_{p o s i j} (τ) = x_{i j}^{η} β (τ) \exp (b_{i}),$ (42) where $x_{i j}$ is the rainfall factor, and the power η is constant across quantile levels. We chose to expand the model to include the power η after exploratory work indicated a nonlinear association between $x_{i j}$ and $y_{i j}$ for $y_{i j} > 0$. We provide an overview of the estimator of η in this section and relegate details to Appendix 2.
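To make the role of the power η in (42) concrete, the sketch below evaluates the conditional quantile function for one unit. The function name, the coefficient values and the covariate value are hypothetical; the value 1.075 for η is the estimate reported in Section 4.2.

```python
import math

def positive_quantile(x, beta_tau, b_i, eta=1.0):
    """Conditional quantile q = x**eta * beta(tau) * exp(b_i), given y > 0.

    x is the scalar covariate (assumed positive), beta_tau the coefficient
    at quantile level tau, and b_i the area random effect.
    """
    return (x ** eta) * beta_tau * math.exp(b_i)

# Hypothetical coefficients at three quantile levels (illustration only).
beta = {0.25: 0.02, 0.50: 0.05, 0.75: 0.11}
x_ij, b_i = 120.0, 0.3
quantiles = {tau: positive_quantile(x_ij, b, b_i, eta=1.075)
             for tau, b in beta.items()}
```

Setting `eta=1.0` gives the form of model (9) for a single covariate, so the two fits can be compared by changing one argument.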

To estimate η, we add a step to the iterative estimation procedure defined in Section 2.2.1. After step 3 of Section 2.2.1, we implement the following step 4:

Define ${\tilde{L}}^{(m)} (η) = \int_{- \infty}^{\infty} \prod_{{j \in A_{i} : y_{i j} > 0}} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}^{η}, b_{i}, {\hat{θ}}^{(m)}) φ (b_{i} / {\hat{σ}}_{b}^{(m)}) d b_{i},$ and define ${\hat{η}}^{(m)} = {a r g m a x}_{η} {\tilde{L}}^{(m)} (η)$.

The objective function, ${\tilde{L}}^{(m)}$, has an interpretation similar to a profile likelihood. We replace $x_{i j}$ with $x_{i j}^{{\hat{η}}^{(m - 1)}}$ when implementing steps 1–3 of the procedure with estimated η. In each step m of the iteration, we restrict $x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ)$ to be nondecreasing in τ and to satisfy $x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ) > 0.001$. We use 0.001 as the lower bound because 0.001 is the smallest possible nonzero value for percolation. In the model for the probability of a zero, $z_{i j} = (1, x_{i j})^{'}$. In the model for the survey weights, ${\tilde{x}}_{i j} = (1, x_{i j})^{'}$. For the bootstrap, we use the simulation procedure defined in Section 2.2 with $q_{p o s i j}^{* (t)} (τ_{k}) = x_{i j}^{\hat{η}} \hat{β} (τ_{k})$, where $\hat{η}$ is the final estimator of η. We estimate η for each bootstrap sample, and define a bootstrap standard error for $\hat{η}$ as $\sqrt{(B - 1)^{- 1} \sum_{b = 1}^{B} ({\hat{η}}^{(b)} - \bar{η})^{2}}$, where ${\hat{η}}^{(b)}$ is the estimate of η obtained in bootstrap sample b, and $\bar{η} = B^{- 1} \sum_{b = 1}^{B} {\hat{η}}^{(b)}$.
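The bootstrap standard error defined above is the sample standard deviation of the bootstrap estimates with divisor $B - 1$. A minimal sketch, with a hypothetical function name and made-up bootstrap estimates:

```python
import math

def bootstrap_se(estimates):
    """Standard error sqrt((B-1)^-1 * sum_b (est_b - mean)^2) over B replicates."""
    B = len(estimates)
    mean = sum(estimates) / B
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / (B - 1))

# Made-up bootstrap estimates of eta, for illustration only.
eta_boot = [1.06, 1.08, 1.07, 1.09, 1.05]
se_eta = bootstrap_se(eta_boot)
```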

4.2. Results for CEAP data analysis

The rainfall factor is positively correlated with percolation. Among units with a positive value for percolation, the correlation between the rainfall factor and percolation is 0.49, and the variance of percolation tends to increase with the rainfall factor. The estimate of the slope for the rainfall factor in the model for the probability that percolation is zero is $\hat{γ} = - 0.0139$, with a standard error of 0.0035. The estimate of η is $\hat{η} = 1.075$, and the bootstrap standard error is 0.014. An approximate $t$-statistic for the null hypothesis that $η = 1$ is given by $t = \frac{\hat{η} - 1}{\sqrt{(B - 1)^{- 1} \sum_{b = 1}^{B} ({\hat{η}}^{(b)} - \bar{η})^{2}}} = 5.4,$ (43) suggesting that η differs significantly from 1.
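The value of the statistic in (43) can be checked directly from the two reported quantities. The variable names below are ours.

```python
eta_hat = 1.075   # reported estimate of eta
se_boot = 0.014   # reported bootstrap standard error
t_stat = (eta_hat - 1.0) / se_boot
print(round(t_stat, 1))  # prints 5.4
```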

In Figure 1, county level estimates of the quartiles and the median are plotted along with normal theory 95% prediction intervals. The prediction intervals are calculated for the predictors that ignore the sampling weights. The intervals are defined as ${\hat{q}}_{i} (τ) \pm 1.96 \sqrt{{\hat{M S E}}_{i} (τ)}$, where ${\hat{M S E}}_{i} (τ)$ is defined in (32), and the lower interval endpoint is truncated at zero. The solid lines correspond to the procedure that ignores the sampling weights. The estimates that account for the sample design, as described in Section 3, are depicted with a dashed line.
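The interval construction can be sketched as follows, assuming the bootstrap predictors and bootstrap population quantiles for one area are already available as lists. The function names and the numeric inputs are hypothetical.

```python
import math

def bootstrap_mse(pred_boot, true_boot):
    """Equation (32): mean squared difference between the bootstrap
    predictors and the bootstrap population quantiles over T samples."""
    T = len(pred_boot)
    return sum((p - q) ** 2 for p, q in zip(pred_boot, true_boot)) / T

def prediction_interval(q_hat, mse_hat, z=1.96):
    """Normal theory interval q_hat +/- z*sqrt(MSE), lower end truncated at 0."""
    half_width = z * math.sqrt(mse_hat)
    return max(q_hat - half_width, 0.0), q_hat + half_width

# Made-up bootstrap values for one area and one quantile level.
mse = bootstrap_mse([1.0, 3.0], [2.0, 1.0])
lower, upper = prediction_interval(1.0, 1.0)
```

The truncation only affects areas where the predictor is small relative to its estimated root MSE, as in the second call above.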

Figure 1. Black: predictors of quartiles and the median based on the zero-inflated quantile regression model. Top left: 25th percentile. Top right: median. Bottom: 75th percentile. Solid black line: predictors do not use sampling weights. Dashed black line: predictors incorporate the sampling weights through the procedure of Section 3.1. Green and red: upper and lower endpoints of 95% prediction intervals.

For this data set, the estimates that account for the informative sample design are nearly indistinguishable from the estimates that ignore the survey weights. Figure 2 shows the estimates for the informative design plotted on the horizontal axis with the corresponding estimates that ignore the sampling weights plotted on the vertical axis. The two sets of estimates nearly lie on the 45 degree line through the origin.

Figure 2. Comparison of predictors that incorporate the modification for informative sampling (x-axis) to predictors that do not use the sampling weights (y-axis). Top left: 25th percentile. Top right: median. Bottom: 75th percentile.

Figure 3 contains square roots of the estimated MSEs plotted against the sample sizes for the areas. The variation in the widths of the intervals is due partly to variation in the sample sizes. The use of the multiplicative lognormal random effect, $\exp (b_{i})$, in (42) also contributes to the variation in the estimated root MSEs. The estimated MSEs from a model with an additive normal random effect show less variation than the estimated MSEs in Figure 3. Because the additive normal model does not preserve the parameter space for the zero-inflated data, we prefer the multiplicative model (42).

Figure 3. Estimated root mean squared errors plotted against county sample sizes. Estimated mean squared errors are defined in (32).

We also compare the estimates with estimated η to the estimates with $η = 1$. The absolute differences between the predictions obtained from the model with estimated η and the predictions from the model with $η = 1$ are less than the estimated standard errors of the predictors with $η = 1$ for all but one area. We present results for estimated η because the $t$-statistic defined in (43) indicates that $η \neq 1$. For this data set, estimating η is of little practical significance.

5. Summary and future work

We develop two extensions to the mixed effects quantile regression small area procedure outlined in Section 1.2. One extension accommodates zero-inflated data. The second extension accounts for an informative sample design. To illustrate the procedures, we obtain predictors of quantiles of percolation for Kansas counties, using data from CEAP.

For this data analysis, incorporating the survey weights has only a minor effect on the estimates and estimated root mean squared errors. For this reason, we prefer the simpler predictors that do not use the sampling weights. In other applications, the effects of the sampling weights on the predictors may be important. For such situations, a mean squared error estimator that accounts for the modification for informative sampling would be desirable. Extending the bootstrap procedure of Pfeffermann and Sverchkov (2007) to estimation of quantiles is an area for future work.

For several counties, the estimated root mean squared errors are undesirably large. Expanding the model to incorporate additional covariates or spatial dependence is a possible future direction. A different approach for modelling the zero-inflated data would be to use a censored quantile regression model, as discussed in Section 1.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by National Science Foundation [MMS-000716934].

Notes on contributors

Emily Berg

Emily Berg is an assistant professor in statistics, Iowa State University.

Danhyang Lee

Danhyang Lee is an assistant professor in statistics, University of Alabama.

References

Battese, G., Harter, R., & Fuller, W. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83(401), 28–36. doi: 10.1080/01621459.1988.10478561
Berg, E., & Chandra, H. (2014). Small area prediction for a unit-level lognormal model. Computational Statistics & Data Analysis, 78, 159–175. doi: 10.1016/j.csda.2014.03.007
Berg, E., & Lee, D. (2019a). Prediction of small area quantiles for the conservation effects assessment project using a mixed effects quantile regression model. Annals of Applied Statistics, Accepted.
Berg, E., & Lee, D. (2019b). Supplement to “Small Area Prediction of Quantiles for Zero-Inflated Data and an Informative Sample Design.” Supplementary material.
Breslow, N. E., & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88(421), 9–25.
Buchinsky, M., & Hahn, J. (1998). An alternative estimator for the censored quantile regression model. Econometrica, 66, 653–671. doi: 10.2307/2998578
Chambers, R., & Tzavidis, N. (2006). M-quantile models for small area estimation. Biometrika, 93(2), 255–268. doi: 10.1093/biomet/93.2.255
Chandra, H., & Sud, U. C. (2012). Small area estimation for zero-inflated data. Communications in Statistics-Simulation and Computation, 41(5), 632–643. doi: 10.1080/03610918.2011.598991
Chernozhukov, V., Fernandez-Val, I., & Galichon, A. (2009). Improving point and interval estimators of monotone functions by rearrangement. Biometrika, 96, 559–575. doi: 10.1093/biomet/asp030
Dreassi, E., Petrucci, A., & Rocco, E. (2014). Small area estimation for semicontinuous skewed spatial data: An application to the grape wine production in Tuscany. Biometrical Journal, 56(1), 141–156. doi: 10.1002/bimj.201200271
Hall, P., & Maiti, T. (2006). Nonparametric estimation of mean-squared prediction error in nested-error regression models. The Annals of Statistics, 34, 1733–1750. doi: 10.1214/009053606000000579
Jang, W., & Wang, J. (2015). A semiparametric Bayesian approach for joint-quantile regression with clustered data. Computational Statistics & Data Analysis, 84, 99–115. doi: 10.1016/j.csda.2014.11.008
Kim, J. K., & Yu, C. L. (2011). A semiparametric estimation of mean functionals with nonignorable missing data. Journal of the American Statistical Association, 106(493), 157–165. doi: 10.1198/jasa.2011.tm10104
Koenker, R. (2005). Quantile regression. New York: Cambridge University Press. doi: 10.1017/CBO9780511754098
Koenker, R., & Ng, P. (2005). Inequality constrained quantile regression. Sankhya: The Indian Journal of Statistics, 67, 418–440.
Lahiri, S. N., Maiti, T., Katzoff, M., & Parsons, V. (2007). Resampling-based empirical prediction: An application to small area estimation. Biometrika, 94, 469–485. doi: 10.1093/biomet/asm035
Lyu, X. (2018). Empirical Bayes small area prediction of sheet and rill erosion under a zero-inflated lognormal model (Master's Thesis). Iowa State University.
Opsomer, J. D., Claeskens, G., Ranalli, M. G., Kauermann, G., & Breidt, F. J. (2008). Non-parametric small area estimation using penalized spline regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(1), 265–286. doi: 10.1111/j.1467-9868.2007.00635.x
Pfeffermann, D., & Sverchkov, M. (2007). Small-area estimation under informative probability sampling of areas and within the selected areas. Journal of the American Statistical Association, 102(480), 1427–1439. doi: 10.1198/016214507000001094
Pfeffermann, D., Terryn, B., & Moura, F. A. (2008). Small area estimation under a two-part random effects model with application to estimation of literacy in developing countries. Survey Methodology, 34(2), 235–249.
Powell, J. L. (1986). Censored regression quantiles. Journal of Econometrics, 32(1), 143–155. doi: 10.1016/0304-4076(86)90016-3
Rao, J. N., & Molina, I. (2015). Small area estimation. Hoboken, NJ: John Wiley & Sons.
Searle, S. R., Casella, G., & McCulloch, C. E. (1992). Variance components. New York: John Wiley & Sons.
Sinha, S. K., & Rao, J. N. K. (2009). Robust small area estimation. Canadian Journal of Statistics, 37(3), 381–399. doi: 10.1002/cjs.10029
Smith, A. F., & Gelfand, A. E. (1992). Bayesian statistics without tears: A sampling-resampling perspective. The American Statistician, 46(2), 84–88.
Verret, F., Rao, J. N. K., & Hidiroglou, M. A. (2015). Model-based small area estimation under informative sampling. Survey Methodology, 41(2), 333–347.
Wang, J., Fuller, W. A., & Qu, Y. (2008). Small area estimation under a restriction. Survey Methodology, 34, 29–36.
Wischmeier, W. H., & Smith, D. D. (1978). Predicting rainfall erosion losses: A guide to conservation planning. U.S. Department of Agriculture, Agriculture Handbook No. 537.
You, Y., & Rao, J. N. K. (2002). A pseudo empirical best linear unbiased prediction approach to small area estimation using survey weights. Canadian Journal of Statistics, 30(3), 431–439. doi: 10.2307/3316146

Appendices

Appendix 1: Initial Estimators

We define an initial estimator of $b = (b_{1}, \dots, b_{D})^{'}$ by ${\hat{b}}^{(0)} = {a r g m i n}_{b} \sum_{i = 1}^{D} \sum_{{j \in A_{i} : y_{i j} > 0}} ρ_{0.5} (\log (y_{i j}) - b_{i}),$ (A1) where $- \sum_{i = 1}^{D - 1} {\hat{b}}_{i}^{(0)} = {\hat{b}}_{D}^{(0)}$. Let ${\hat{V}}_{1} ({\hat{b}}_{1}^{(0)}), \dots, {\hat{V}}_{D - 1} ({\hat{b}}_{D - 1}^{(0)})$ be estimates of the variance of the asymptotic distribution of $({\hat{b}}_{1}^{(0)}, \dots, {\hat{b}}_{D - 1}^{(0)})$, estimated with the option se = "ker" in the R function summary.rq. To define an initial estimator of $σ_{b}^{2}$, define the area-level Fay-Herriot model, ${\hat{b}}_{i}^{(0)} = b_{i} + a_{i},$ (A2) where $a_{i}$ has a distribution with mean 0 and variance ${\hat{V}}_{i} ({\hat{b}}_{i}^{(0)})$, and $b_{i}$ has a distribution with mean 0 and variance $σ_{b}^{2}$ for $i = 1, \dots, D - 1$. The initial estimate of $σ_{b}^{2}$, denoted by ${\hat{σ}}_{b}^{2 (0)}$, is obtained by applying the estimation procedure of Wang, Fuller, and Qu (2008) to the area level model (A2). The preliminary estimate of $β (τ_{k})$ for $k = 1, \dots, K$ is defined by ${\hat{β}}^{(0)} (τ_{k}) = {argmin}_{β} \sum_{i = 1}^{D} \sum_{{j \in A_{i} : y_{i j} > 0}} ρ_{τ_{k}} (y_{i j} / \exp ({\hat{b}}_{i}^{(0)}) - x_{i j}^{'} β) .$ (A3) We rearrange $\{x_{i j}^{'} {\hat{β}}^{(0)} (τ_{k}) : k = 1, \dots, K\}$ for every $(i, j)$ to obtain a nondecreasing quantile function (Chernozhukov et al., 2009). The estimate ${\hat{q}}_{i j}^{(0)} (τ_{k})$ is the kth order statistic of $\{x_{i j}^{'} {\hat{β}}^{(0)} (τ_{k}) \exp ({\hat{b}}_{i}^{(0)}) : k = 1, \dots, K\}$. Given the initial estimates of the quantile function, we use the procedure in Step 3 of Section 2.2 to obtain estimates ${\hat{ρ}}_{s}^{(0)}$ and ${\hat{ξ}}_{s}^{(0)}$ for $s = ℓ, u$.
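Two ingredients of the initial estimators are simple enough to sketch: the check function $ρ_{τ}$ that appears in (A1) and (A3), and the rearrangement step, which replaces the fitted values for a unit by their order statistics. The function names and the fitted values below are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - I[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def rearrange(fitted):
    """Return the fitted quantiles for one unit in nondecreasing order.

    fitted[k] is the fitted value at the k-th quantile level; after
    sorting, entry k is the k-th order statistic of the fitted values.
    """
    return sorted(fitted)

# Made-up fitted values at five increasing quantile levels that cross.
fitted = [0.8, 1.4, 1.2, 2.0, 1.9]
monotone = rearrange(fitted)
```

Sorting leaves an already monotone set of fitted values unchanged, so the step only alters units whose fitted quantile curves cross.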

Appendix 2: Details on Estimation of the Power η for the CEAP Data Analysis

Define an initial estimator of $θ$ as in Appendix 1. Define an initial estimator of η as ${\hat{η}}^{(0)} = {a r g m a x}_{η} {\tilde{L}}^{(0)} (η),$ where ${\tilde{L}}^{(0)} (η) = \int_{- \infty}^{\infty} \prod_{{j \in A_{i} : y_{i j} > 0}} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}^{η}, b_{i}, {\hat{θ}}^{(0)}) φ (b_{i} / {\hat{σ}}_{b}^{(0)}) d b_{i} .$ For $m = 1, \dots, M$, repeat the following:

Define the updated estimator of $σ_{b}^{2}$ by ${\hat{σ}}_{b}^{2 (m)} = (D - 1)^{- 1} \sum_{i = 1}^{D} E [b_{i}^{2} ∣ y_{p o s i}; {\hat{θ}}^{(m - 1)}] .$ (A4) Define a predictor of $b_{i}$ in the mth step by ${\hat{b}}_{i}^{(m)} = E [b_{i} ∣ y_{p o s i}; {\hat{θ}}^{(m - 1)}]$. Also, define ${\hat{e}}_{b i}^{(m)} = E [\exp (b_{i}) ∣ y_{p o s i}; {\hat{θ}}^{(m - 1)}]$. The conditional expectation for estimated η is defined as $E [h (b_{i}) ∣ y_{p o s i}; {\hat{θ}}^{(m - 1)}] = \frac{\int_{- \infty}^{\infty} h (b_{i}) \prod_{{j \in A_{i} : y_{i j} > 0}} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}^{{\hat{η}}^{(m - 1)}}, b_{i}, {\hat{θ}}^{(m - 1)}) φ (b_{i} / {\hat{σ}}_{b}^{(m - 1)}) d b_{i}}{\int_{- \infty}^{\infty} \prod_{{j \in A_{i} : y_{i j} > 0}} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}^{{\hat{η}}^{(m - 1)}}, b_{i}, {\hat{θ}}^{(m - 1)}) φ (b_{i} / {\hat{σ}}_{b}^{(m - 1)}) d b_{i}} .$ (A5) To approximate the integrals defining the conditional expectations, we use the Riemann sum described in Appendix 1.
We use the method of Koenker and Ng (2005) to update the estimator of $β_{K}$ to maintain the monotonicity restriction. First, define ${\hat{β}}^{(m)} (τ_{1}) = {argmin}_{β} \sum_{i = 1}^{D} \sum_{{j \in A_{i} : y_{i j} > 0}} ρ_{τ_{1}} (y_{i j} \exp (- {\hat{b}}_{i}^{(m)}) - x_{i j}^{{\hat{η}}^{(m - 1)}} β),$ (A6) subject to the restriction that $x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{1}) > c_{0}$, where $c_{0}$ is a specified constant. For $k = 2, \dots, K$, define ${\hat{β}}^{(m)} (τ_{k}) = {argmin}_{β} \sum_{i = 1}^{D} \sum_{{j \in A_{i} : y_{i j} > 0}} ρ_{τ_{k}} (y_{i j} \exp (- {\hat{b}}_{i}^{(m)}) - x_{i j}^{{\hat{η}}^{(m - 1)}} β)$ (A7) subject to the restriction that $x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{k}) \geq x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{k - 1})$ for $j = 1, \dots, N_{i}$ and $i = 1, \dots, D$. To enforce the monotonicity restrictions, we implement the constrained optimisation method of Koenker and Ng (2005) using the method fn in the R function rq.
We modify the method of Jang and Wang (2015) to estimate $ρ_{s}$ and $ξ_{s}$ for $s = ℓ, u$. Specifically, $\begin{aligned} {\hat{ρ}}_{ℓ}^{(m)} & = 0.5 (τ_{1} + τ_{2}) \sum_{i = 1}^{D} \sum_{{j \in A_{i} : y_{i j} > 0}} \frac{{\hat{q}}_{i j}^{(m)} (τ_{2}) - {\hat{q}}_{i j}^{(m)} (τ_{1})}{n (τ_{2} - τ_{1})}, \\ {\hat{ρ}}_{u}^{(m)} & = [1 - 0.5 (τ_{K} + τ_{K - 1})] \sum_{i = 1}^{D} \sum_{{j \in A_{i} : y_{i j} > 0}} \frac{{\hat{q}}_{i j}^{(m)} (τ_{K}) - {\hat{q}}_{i j}^{(m)} (τ_{K - 1})}{n (τ_{K} - τ_{K - 1})}, \end{aligned}$ (A8) where ${\hat{q}}_{i j}^{(m)} (τ_{k}) = x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{k}) {\hat{e}}_{b i}^{(m)}$, and $n = \sum_{i = 1}^{D} \sum_{j = 1}^{n_{i}} I [y_{i j} > 0]$. Holding ${\hat{ρ}}_{ℓ}^{(m)}$ and ${\hat{ρ}}_{u}^{(m)}$ fixed, the estimator of $ξ_{s}$ is the maximum likelihood estimator using only $\{y_{i j} < {\hat{ℓ}}_{i j}^{(m)}\}$ for $s = ℓ$ and $\{y_{i j} > {\hat{u}}_{i j}^{(m)}\}$ for $s = u$, where ${\hat{ℓ}}_{i j}^{(m)} = 0.5 (x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{1}) + x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{2})) {\hat{e}}_{b i}^{(m)}$ and ${\hat{u}}_{i j}^{(m)} = 0.5 (x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{K}) + x_{i j}^{{\hat{η}}^{(m - 1)}} {\hat{β}}^{(m)} (τ_{K - 1})) {\hat{e}}_{b i}^{(m)}$. Precisely, ${\hat{ξ}}_{ℓ}^{(m)} = {argmax}_{ξ} \prod_{{(i j) : 0 < y_{i j} < {\hat{ℓ}}_{i j}^{(m)}}} g (- (y_{i j} - {\hat{ℓ}}_{i j}^{(m)}) ∣ {\hat{ρ}}_{ℓ}^{(m)}, ξ),$ (A9) and ${\hat{ξ}}_{u}^{(m)} = {argmax}_{ξ} \prod_{{(i j) : y_{i j} > {\hat{u}}_{i j}^{(m)} > 0}} g (y_{i j} - {\hat{u}}_{i j}^{(m)} ∣ {\hat{ρ}}_{u}^{(m)}, ξ) .$ (A10)
Define an updated estimator of η as ${\hat{η}}^{(m)} = {a r g m a x}_{η} {\tilde{L}}^{(m)} (η),$ where ${\tilde{L}}^{(m)} (η) = \int_{- \infty}^{\infty} \prod_{{j \in A_{i} : y_{i j} > 0}} f_{Y} (y_{i j} ∣ y_{i j} > 0, x_{i j}^{η}, b_{i}, {\hat{θ}}^{(m)}) φ (b_{i} / {\hat{σ}}_{b}^{(m)}) d b_{i} .$
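The conditional expectations in (A4) and (A5) are ratios of one-dimensional integrals over $b_{i}$, so a Riemann sum on an equally spaced grid is a natural approximation. The sketch below is one way such a sum could be coded; the grid limits, the number of grid points and the function names are our assumptions, and a normal likelihood stands in for the LIGPD density $f_{Y}$, which is not reproduced here.

```python
import math

def conditional_expectation(h, log_lik, sigma_b, half_width=6.0, n_grid=2001):
    """Riemann-sum approximation to a ratio of integrals of the form (A5).

    h        : function of the random effect b
    log_lik  : log of prod_j f_Y(y_ij | ..., b) as a function of b
    sigma_b  : current estimate of the random effect standard deviation
    """
    lower = -half_width * sigma_b
    step = 2.0 * half_width * sigma_b / (n_grid - 1)
    grid = [lower + k * step for k in range(n_grid)]
    # Log of likelihood times phi(b / sigma_b); constants cancel in the ratio.
    log_w = [log_lik(b) - 0.5 * (b / sigma_b) ** 2 for b in grid]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    # The common grid spacing cancels between numerator and denominator.
    return sum(h(b) * wk for b, wk in zip(grid, w)) / sum(w)

# Stand-in likelihood (NOT the LIGPD): y_j | b ~ N(b, 1), illustration only.
y = [1.0, 2.0]

def log_lik(b):
    return -0.5 * sum((yj - b) ** 2 for yj in y)

b_hat = conditional_expectation(lambda b: b, log_lik, sigma_b=1.0)
e_hat = conditional_expectation(math.exp, log_lik, sigma_b=1.0)
```

For this stand-in model the conditional distribution of $b_{i}$ is normal with mean 1 and variance 1/3, so the two approximations can be checked against the closed forms 1 and $\exp (1 + 1 / 6)$.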
