Estimation of Population Mean by Using a Generalized Family of Estimators Under Classical Ranked Set Sampling

Asad Ali, Department of Economics and Statistics, School of Business and Economics, University of Management and Technology, Lahore, Pakistan. Correspondence: [email protected]. ORCID: https://orcid.org/0000-0001-9032-9878

Muhammad Moeen Butt, Department of Economics and Statistics, School of Business and Economics, University of Management and Technology, Lahore, Pakistan. ORCID: https://orcid.org/0000-0003-1736-8998

Kanwal Iqbal, Department of Economics and Statistics, School of Business and Economics, University of Management and Technology, Lahore, Pakistan. ORCID: https://orcid.org/0000-0001-5186-1845

Muhammad Hanif, Department of Statistics, National College of Business Administration and Economics, Lahore, Pakistan. ORCID: https://orcid.org/0000-0002-0775-3457

Muhammad Zubair, Department of Economics and Statistics, School of Business and Economics, University of Management and Technology, Lahore, Pakistan. ORCID: https://orcid.org/0000-0001-7142-9929

ABSTRACT

Estimation of the population mean of a study variable Y loses precision when the data set shows high variation. Incorporating auxiliary information in the construction of an estimator under the ranked set sampling scheme results in efficient estimation of the population mean. In this paper, we propose an efficient generalized family of estimators for the finite population mean of the study variable under ranked set sampling, utilizing information on an auxiliary variable. The bias and mean square error (MSE) of the proposed generalized family of estimators are derived. The conditions under which the proposed family is more efficient than the competitor estimators are also derived. The estimators are compared for efficiency using a simulation study and real-life data sets. It is concluded that, as the correlation between the study and auxiliary variables increases, the proposed generalized family of estimators proves to be an efficient estimator of the population mean of the study variable.

1. Introduction

In many situations of practical interest, mainly in environmental and ecological studies, the variable of interest, say Y, is not easily observable because measurement may be expensive, time consuming, invasive or even destructive. Although data collection may be complex, ranking the potential sample units with respect to an available auxiliary variable can often be done relatively simply, at little or no additional cost. In situations where the variation in the study variable is high and the study variable is strongly correlated with the auxiliary variable, Ranked Set Sampling (RSS), proposed by McIntyre (1952), is more efficient than Simple Random Sampling (SRS) (Patil et al., 1993, 1994; Stokes, 1977).

The literature on RSS has grown rapidly, and several estimators originally conceived for SRS have been re-proposed for estimating the mean of the study variable by moving their sampling design into the RSS framework (Ali et al., 2021; Iqbal et al., 2020; Kadilar et al., 2009; Khan & Shabbir, 2016a, 2016b, 2016c; Mandowara & Mehta, 2013, 2016; Mehta & Mandowara, 2016; Pelle & Perri, 2018; Samawi & Muttlak, 1996; Singh et al., 2014; Vishwakarma et al., 2017). Motivated by these studies, and in line with the other contributions reviewed in this article, we propose an efficient generalized family of estimators by changing the sampling design of the estimator proposed by Shahzad et al. (2019).

Notations under Ranked Set Sampling Design

Let $\Omega = \{1, 2, \dots, N\}$ be a finite population of $N$ units, $Y$ the variable under study and $X$ an auxiliary variable that is highly correlated with $Y$. Let $\mu_{y}$ and $\mu_{x}$ denote the population means of $Y$ and $X$, respectively, $S_{y}^{2}$ and $S_{x}^{2}$ the variances, $C_{y}$ and $C_{x}$ the coefficients of variation, $\rho_{xy}$ the correlation coefficient between $X$ and $Y$, $\beta_{2(x)}$ and $\beta_{1(x)}$ the kurtosis and skewness, and $C_{xy} = \rho_{xy} C_{x} C_{y}$. Let $(X_{j(i)}, Y_{j[i]})$ denote the pair formed by the $i$th order statistic of $X$ and the associated element of $Y$ in the $j$th cycle. Then the ranked set sample is

$$\begin{aligned} &(X_{1(1)}, Y_{1[1]}), \dots, (X_{1(m)}, Y_{1[m]}), \\ &(X_{2(1)}, Y_{2[1]}), \dots, (X_{2(m)}, Y_{2[m]}), \\ &\qquad\qquad \vdots \\ &(X_{r(1)}, Y_{r[1]}), \dots, (X_{r(m)}, Y_{r[m]}). \end{aligned}$$
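The selection of such a sample can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `ranked_set_sample`, the toy population and the choice to draw the $m^{2}$ units of each cycle without replacement are all hypothetical. In every cycle, $m$ sets of $m$ units are drawn, each set is ranked on $X$ only, and the unit holding rank $i$ in the $i$th set is kept together with its $Y$ value.

```python
import random


def ranked_set_sample(population, m, r, rng):
    """Return r cycles, each a list of m (x, y) pairs ranked on x only."""
    sample = []
    for _ in range(r):
        # m*m distinct units per cycle, split into m sets of size m
        units = rng.sample(population, m * m)
        cycle = []
        for i in range(m):
            one_set = units[i * m:(i + 1) * m]
            ranked = sorted(one_set, key=lambda unit: unit[0])  # rank on X
            cycle.append(ranked[i])  # keep (X_{j(i)}, Y_{j[i]})
        sample.append(cycle)
    return sample


# Toy population (hypothetical): y is roughly linear in x
rng = random.Random(1)
population = [(float(x), 2.0 * x + rng.gauss(0.0, 1.0)) for x in range(1, 201)]

m, r = 5, 4
rss = ranked_set_sample(population, m, r, rng)
x_bar_rss = sum(x for cycle in rss for x, _ in cycle) / (m * r)
y_bar_rss = sum(y for cycle in rss for _, y in cycle) / (m * r)
```

Only $n = mr$ units are measured on $Y$, although $m^{2} r$ units are ranked on $X$; this is where the cost saving of RSS comes from.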

To obtain the biases and mean square errors, we use the following notation under RSS:

$$e_{0} = \frac{\bar{y}_{[rss]} - \mu_{y}}{\mu_{y}}, \qquad e_{1} = \frac{\bar{x}_{(rss)} - \mu_{x}}{\mu_{x}},$$

so that

$$\bar{y}_{[rss]} = \mu_{y}(1 + e_{0}), \qquad \bar{x}_{(rss)} = \mu_{x}(1 + e_{1}),$$

$$\bar{x}_{(i)} = \frac{1}{r}\sum_{j=1}^{r} X_{j(i)}, \qquad \bar{y}_{[i]} = \frac{1}{r}\sum_{j=1}^{r} Y_{j[i]}, \qquad i = 1, 2, \dots, m.$$

$\tau_{x(i)} = \bar{x}_{(i)} - \mu_{x}$ : Deviation of the $i$th ranked mean from the population mean $\mu_{x}$.

$\tau_{y[i]} = \bar{y}_{[i]} - \mu_{y}$ : Deviation of the $i$th ranked mean from the population mean $\mu_{y}$.

$\tau_{xy(i)} = (\bar{x}_{(i)} - \mu_{x})(\bar{y}_{[i]} - \mu_{y})$ : Cross product of the deviations.

$$\gamma = \frac{1}{n} - \frac{1}{N} = \frac{1}{mr} - \frac{1}{N} \cong \frac{1}{mr}, \qquad C_{y}^{2} = \frac{S_{y}^{2}}{\mu_{y}^{2}}, \qquad C_{x}^{2} = \frac{S_{x}^{2}}{\mu_{x}^{2}}, \qquad C_{xy} = \frac{S_{xy}}{\mu_{x}\mu_{y}} = \rho_{xy} C_{y} C_{x},$$

$$W_{x(i)}^{2} = \frac{\sum_{i=1}^{m} \tau_{x(i)}^{2}}{m^{2} r \mu_{x}^{2}}, \qquad W_{y[i]}^{2} = \frac{\sum_{i=1}^{m} \tau_{y[i]}^{2}}{m^{2} r \mu_{y}^{2}}, \qquad W_{xy(i)} = \frac{\sum_{i=1}^{m} \tau_{xy(i)}}{m r \mu_{x}\mu_{y}}.$$

To obtain the biases and mean square errors, we use the following notation under SRS:

$$\vartheta_{0} = \frac{\bar{y}_{[srs]} - \mu_{y}}{\mu_{y}}, \qquad \vartheta_{1} = \frac{\bar{x}_{(srs)} - \mu_{x}}{\mu_{x}},$$

$$\bar{y}_{[srs]} = \mu_{y}(1 + \vartheta_{0}), \qquad \bar{x}_{(srs)} = \mu_{x}(1 + \vartheta_{1}).$$

The expectations of the error terms are given below, where the primed quantities $V'_{yy}$, $V'_{xx}$ and $V'_{xy}$ are the SRS moments and the unprimed quantities are their RSS counterparts:

$$\begin{aligned} E(e_{0}) &= E(e_{1}) = 0, \\ E(e_{0}^{2}) &= V'_{yy} - W_{y[i]}^{2} = V_{yy}, \\ E(e_{1}^{2}) &= V'_{xx} - W_{x(i)}^{2} = V_{xx}, \\ E(e_{0}e_{1}) &= V'_{xy} - W_{xy(i)} = V_{xy}, \\ E(\vartheta_{0}) &= E(\vartheta_{1}) = 0, \\ E(\vartheta_{0}^{2}) &= \gamma C_{y}^{2} = V'_{yy}, \\ E(\vartheta_{1}^{2}) &= \gamma C_{x}^{2} = V'_{xx}, \\ E(\vartheta_{0}\vartheta_{1}) &= \gamma C_{xy} = V'_{xy}. \end{aligned} \tag{1.1}$$

2. Some Existing Estimators under SRS and RSS

Some well-known estimators and their mean square errors are given below.

Mandowara and Mehta (2013) proposed the following estimators:

$$\Delta_{(RSS)mm1} = \bar{y}_{[rss]} \left\{ \frac{\mu_{x} C_{x} + \beta_{2(x)}}{\bar{x}_{(rss)} C_{x} + \beta_{2(x)}} \right\}^{\delta}, \tag{2.1}$$

$$\Delta_{(RSS)mm2} = \bar{y}_{[rss]} \left\{ \frac{\mu_{x} \beta_{2(x)} + C_{x}}{\bar{x}_{(rss)} \beta_{2(x)} + C_{x}} \right\}^{\alpha}, \tag{2.2}$$

$$\Delta_{(RSS)mm4} = \bar{y}_{[rss]} \left\{ \frac{\mu_{x} + \beta_{2(x)}}{\bar{x}_{(rss)} + \beta_{2(x)}} \right\}. \tag{2.3}$$

The biases and MSEs of $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$ and $\Delta_{(RSS)mm4}$ are

$$B(\Delta_{(RSS)mm1}) = \frac{\mu_{y}}{2} \left[ \delta \phi_{2} \{ (\delta + 1) V_{xx} - 2 V_{xy} \} \right],$$

$$MSE(\Delta_{(RSS)mm1}) = \mu_{y}^{2} \left( V_{yy} + \delta^{2} \phi_{2}^{2} V_{xx} - 2 \delta \phi_{2} V_{xy} \right), \tag{2.4}$$

$$B(\Delta_{(RSS)mm2}) = \frac{\mu_{y}}{2} \left[ \alpha \phi_{1} \{ (\alpha + 1) V_{xx} - 2 V_{xy} \} \right],$$

$$MSE(\Delta_{(RSS)mm2}) = \mu_{y}^{2} \left( V_{yy} + \alpha^{2} \phi_{1}^{2} V_{xx} - 2 \alpha \phi_{1} V_{xy} \right), \tag{2.5}$$

$$B(\Delta_{(RSS)mm4}) = \mu_{y} \left( \lambda_{2}^{2} V_{xx} - \lambda_{2} V_{xy} \right),$$

$$MSE(\Delta_{(RSS)mm4}) = \mu_{y}^{2} \left( V_{yy} + \lambda_{2}^{2} V_{xx} - 2 \lambda_{2} V_{xy} \right). \tag{2.6}$$

The MSEs in (2.4) and (2.5) are minimum for

$$\delta_{opt} = \rho_{xy} \frac{C_{y}}{\phi_{2} C_{x}}, \qquad \alpha_{opt} = \rho_{xy} \frac{C_{y}}{\phi_{1} C_{x}},$$

where

$$\phi_{2} = \frac{\mu_{x} C_{x}}{\mu_{x} C_{x} + \beta_{2(x)}}, \qquad \phi_{1} = \frac{\mu_{x} \beta_{2(x)}}{\mu_{x} \beta_{2(x)} + C_{x}}, \qquad \lambda_{2} = \frac{\mu_{x}}{\mu_{x} + \beta_{2(x)}}.$$

Vishwakarma, Zeeshan and Bouza (Vishwakarma et al., 2017) developed the following exponential estimator:

$$\Delta_{(RSS)vz} = \bar{y}_{[rss]} \exp\left[ \frac{\mu_{x} - \bar{x}_{(rss)}}{\mu_{x} + \bar{x}_{(rss)}} \right]. \tag{2.7}$$

The MSE of $\Delta_{(RSS)vz}$ is

$$MSE(\Delta_{(RSS)vz}) = \mu_{y}^{2} \left[ \frac{V_{xx}}{4} + V_{yy} - V_{xy} \right]. \tag{2.8}$$
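The MSE expressions (2.4) to (2.6) and (2.8) share one quadratic form, $\mu_{y}^{2}(V_{yy} + k^{2}V_{xx} - 2kV_{xy})$, with $k$ equal to $\delta\phi_{2}$, $\alpha\phi_{1}$, $\lambda_{2}$ or $1/2$ respectively. The short Python sketch below evaluates that form; the function names and the numerical moment values are hypothetical and serve only as an illustration. Setting the derivative with respect to $k$ to zero gives $k = V_{xy}/V_{xx}$, which reduces to $\rho_{xy}C_{y}/C_{x}$ when the SRS moments $\gamma C_{x}^{2}$ and $\gamma C_{xy}$ are used.

```python
def mse_quadratic(mu_y, v_yy, v_xx, v_xy, k):
    """mu_y^2 (V_yy + k^2 V_xx - 2 k V_xy); k = delta*phi_2, alpha*phi_1 or lambda_2."""
    return mu_y ** 2 * (v_yy + k ** 2 * v_xx - 2.0 * k * v_xy)


def mse_exponential(mu_y, v_yy, v_xx, v_xy):
    """MSE of the exponential estimator, equation (2.8)."""
    return mu_y ** 2 * (v_xx / 4.0 + v_yy - v_xy)


# Hypothetical moment values, chosen only for illustration
mu_y, v_yy, v_xx, v_xy = 550.0, 0.05, 0.07, 0.04

k_best = v_xy / v_xx  # stationary point of the quadratic in k
mse_best = mse_quadratic(mu_y, v_yy, v_xx, v_xy, k_best)
mse_vz = mse_exponential(mu_y, v_yy, v_xx, v_xy)
```

With these inputs the exponential estimator corresponds to the fixed choice $k = 1/2$, so its MSE cannot fall below the value obtained at `k_best`.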

Shahzad et al. (2019) introduced the following generalized form of estimators under simple random sampling:

$$\Delta_{(SRS)sh} = \left[ w_{sh1}\, \bar{y} \left\{ \frac{a \mu_{x} + b}{\alpha (a \bar{x} + b) + (1 - \alpha)(a \mu_{x} + b)} \right\}^{g} + w_{sh2} (\mu_{x} - \bar{x}) \right] \exp\left[ \frac{\eta_{sh} (\mu_{x} - \bar{x})}{\eta_{sh} (2 \xi \mu_{x} - \phi \mu_{x} + \bar{x}) + 2 \lambda} \right], \tag{2.9}$$

where $w_{sh1}, w_{sh2}, \alpha, a, b, g, \eta_{sh}, \xi, \phi$ and $\lambda$ are all constants. The bias and MSE of $\Delta_{(SRS)sh}$ are

$$B(\Delta_{(SRS)sh}) = w_{sh1} \mu_{y} \left[ 1 - k_{1} V'_{xy} + \left( \frac{g(g+1)}{2} \alpha^{2} \Gamma^{2} + \alpha \gamma_{sh} \Gamma g + \frac{3}{2} \gamma_{sh}^{2} \right) V'_{xx} \right] + w_{sh2} \mu_{x} \gamma_{sh} V'_{xx} - \mu_{y},$$

$$MSE(\Delta_{(SRS)sh}) = \mu_{y}^{2} + w_{sh1}^{2} \Theta_{Ash} + w_{sh2}^{2} \Theta_{Bsh} + w_{sh1} w_{sh2} \Theta_{Csh} + w_{sh1} \Theta_{Dsh} + w_{sh2} \Theta_{Esh},$$

where

$$k_{1} = \alpha g \Gamma + \gamma_{sh}, \qquad \gamma_{sh} = \frac{\eta_{sh} \mu_{x}}{(b_{k} + 1) \eta_{sh} \mu_{x} + 2 \lambda}, \qquad b_{k} = 2 \xi - \phi, \qquad k_{2} = \frac{g(g+1)}{2} \alpha^{2} \Gamma^{2} + \alpha \gamma_{sh} \Gamma g + \frac{3}{2} \gamma_{sh}^{2},$$

$$\Theta_{Ash} = \mu_{y}^{2} \left[ 1 + V'_{yy} + (k_{1}^{2} + 2 k_{2}) V'_{xx} - 4 k_{1} V'_{xy} \right], \qquad \Theta_{Bsh} = \mu_{x}^{2} V'_{xx}, \qquad \Theta_{Csh} = 2 \mu_{y} \mu_{x} \left[ (k_{1} + \gamma_{sh}) V'_{xx} - V'_{xy} \right],$$

$$\Theta_{Dsh} = 2 \mu_{y}^{2} \left[ k_{1} V'_{xy} - k_{2} V'_{xx} - 1 \right], \qquad \Theta_{Esh} = -2 \mu_{y} \mu_{x} \gamma_{sh} V'_{xx}.$$

The MSE is minimum for

$$w_{sh1}^{opt} = \frac{-2 \Theta_{Bsh} \Theta_{Dsh} + \Theta_{Csh} \Theta_{Esh}}{4 \Theta_{Ash} \Theta_{Bsh} - \Theta_{Csh}^{2}}, \qquad w_{sh2}^{opt} = \frac{\Theta_{Csh} \Theta_{Dsh} - 2 \Theta_{Ash} \Theta_{Esh}}{4 \Theta_{Ash} \Theta_{Bsh} - \Theta_{Csh}^{2}}.$$

The minimum MSE of $\Delta_{(SRS)sh}$ is given by

$$MSE(\Delta_{(SRS)sh})_{min} = \mu_{y}^{2} - \frac{\Theta_{Bsh} \Theta_{Dsh}^{2} + \Theta_{Ash} \Theta_{Esh}^{2} - \Theta_{Csh} \Theta_{Dsh} \Theta_{Esh}}{4 \Theta_{Ash} \Theta_{Bsh} - \Theta_{Csh}^{2}}. \tag{2.10}$$

3. Proposed Generalized Family of Estimators under RSS

Motivated by Shahzad et al. (2019), we propose the following generalized family of estimators under ranked set sampling:

$$\Delta_{(RSS)G} = \left[ w_{1}\, \bar{y}_{[rss]} \left\{ \frac{a \mu_{x} + b}{\alpha (a \bar{x}_{(rss)} + b) + (1 - \alpha)(a \mu_{x} + b)} \right\}^{g} + w_{2} (\mu_{x} - \bar{x}_{(rss)}) \right] \exp\left[ \frac{\eta (\mu_{x} - \bar{x}_{(rss)})}{\eta (2 \xi \mu_{x} - \phi \mu_{x} + \bar{x}_{(rss)}) + 2 \lambda} \right], \tag{3.1}$$

where $w_{1}$ and $w_{2}$ are unknown constants and $\alpha, a, b, g, \eta, \xi, \phi$ and $\lambda$ are suitably chosen known constants.

Derivation of Bias and Mean Square Error

Rewriting the above estimator in terms of $e_{0}$ and $e_{1}$, up to the first order of approximation, we get

$$\Delta_{(RSS)G} = \left[ w_{1} \mu_{y} \left( 1 + e_{0} - \alpha g \Gamma e_{1} + \frac{g(g+1)}{2} \alpha^{2} \Gamma^{2} e_{1}^{2} - \alpha g \Gamma e_{0} e_{1} \right) - w_{2} \mu_{x} e_{1} \right] \left[ 1 - \nu e_{1} + \frac{3}{2} \nu^{2} e_{1}^{2} \right],$$

where

$$\Gamma = \frac{a \mu_{x}}{a \mu_{x} + b}, \qquad k_{1} = \alpha g \Gamma + \nu, \qquad k_{2} = \frac{g(g+1)}{2} \alpha^{2} \Gamma^{2} + \nu \alpha g \Gamma + \frac{3}{2} \nu^{2}, \qquad b_{k} = 2 \xi - \phi, \qquad \nu = \frac{\eta \mu_{x}}{\eta \mu_{x} (b_{k} + 1) + 2 \lambda}.$$

Expanding the product and subtracting $\mu_{y}$ from both sides,

$$\Delta_{(RSS)G} - \mu_{y} = w_{1} \mu_{y} \left( 1 + e_{0} - k_{1} e_{1} + k_{2} e_{1}^{2} - k_{1} e_{0} e_{1} \right) - w_{2} \mu_{x} \left( e_{1} - \nu e_{1}^{2} \right) - \mu_{y}. \tag{3.2}$$
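The two series expansions behind (3.2) can be written out as a worked step. The block below is a sketch that uses only the symbols already defined in this section and substitutes $\bar{x}_{(rss)} = \mu_{x}(1 + e_{1})$, dropping terms of order higher than two in the error terms.

```latex
\begin{align*}
% Ratio factor: the denominator equals (a\mu_x + b)(1 + \alpha\Gamma e_1)
\left\{ \frac{a\mu_x + b}{\alpha(a\bar{x}_{(rss)} + b) + (1-\alpha)(a\mu_x + b)} \right\}^{g}
  &= (1 + \alpha\Gamma e_1)^{-g}
   \approx 1 - \alpha g \Gamma e_1 + \frac{g(g+1)}{2}\alpha^{2}\Gamma^{2} e_1^{2}, \\[4pt]
% Exponential factor: numerator -\eta\mu_x e_1,
% denominator [\eta\mu_x(b_k + 1) + 2\lambda](1 + \nu e_1)
\exp\left[ \frac{-\nu e_1}{1 + \nu e_1} \right]
  &\approx \exp\left[ -\nu e_1 + \nu^{2} e_1^{2} \right]
   \approx 1 - \nu e_1 + \frac{3}{2}\nu^{2} e_1^{2}, \\[4pt]
% Collecting the coefficients of e_1 and e_1^2 in the product of both factors
k_1 &= \alpha g \Gamma + \nu, \qquad
k_2 = \frac{g(g+1)}{2}\alpha^{2}\Gamma^{2} + \nu\alpha g\Gamma + \frac{3}{2}\nu^{2}.
\end{align*}
```

The coefficient $3/2$ arises because $\nu^{2} e_{1}^{2}$ appears once from the expansion of $(1 + \nu e_{1})^{-1}$ and a further $\nu^{2} e_{1}^{2}/2$ comes from the second-order term of the exponential series.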

For the bias, we take expectations on both sides of (3.2). The bias of $\Delta_{(RSS)G}$ is

$$Bias(\Delta_{(RSS)G}) = w_{1} \mu_{y} \left( 1 - k_{1} V_{xy} + k_{2} V_{xx} \right) + w_{2} \mu_{x} \nu V_{xx} - \mu_{y}.$$

For the MSE, we square both sides of (3.2) and take expectations. The MSE of $\Delta_{(RSS)G}$ is

$$MSE(\Delta_{(RSS)G}) = \mu_{y}^{2} + w_{1}^{2} \Theta_{Aam} + w_{2}^{2} \Theta_{Bam} + w_{1} w_{2} \Theta_{Cam} + w_{1} \Theta_{Dam} + w_{2} \Theta_{Eam},$$

where

$$\Theta_{Aam} = \mu_{y}^{2} \left( 1 + V_{yy} + (k_{1}^{2} + 2 k_{2}) V_{xx} - 4 k_{1} V_{xy} \right), \qquad \Theta_{Bam} = \mu_{x}^{2} V_{xx}, \qquad \Theta_{Cam} = 2 \mu_{y} \mu_{x} \left( (k_{1} + \nu) V_{xx} - V_{xy} \right),$$

$$\Theta_{Dam} = 2 \mu_{y}^{2} \left( k_{1} V_{xy} - k_{2} V_{xx} - 1 \right), \qquad \Theta_{Eam} = -2 \mu_{y} \mu_{x} \nu V_{xx}.$$

Minimizing the MSE with respect to $w_{1}$ and $w_{2}$ gives the optimum values

$$w_{1}^{opt} = \frac{-2 \Theta_{Bam} \Theta_{Dam} + \Theta_{Cam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}}$$

and

$$w_{2}^{opt} = \frac{\Theta_{Cam} \Theta_{Dam} - 2 \Theta_{Aam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}}.$$

Hence, the bias and MSE at these optimum values are given by

$$Bias(\Delta_{(RSS)G})_{min} = w_{1}^{opt} \mu_{y} \left\{ 1 - k_{1} V_{xy} + k_{2} V_{xx} \right\} + w_{2}^{opt} \mu_{x} \nu V_{xx} - \mu_{y},$$

$$MSE(\Delta_{(RSS)G})_{min} = \mu_{y}^{2} - \frac{\Theta_{Bam} \Theta_{Dam}^{2} + \Theta_{Aam} \Theta_{Eam}^{2} - \Theta_{Cam} \Theta_{Dam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}}. \tag{3.3}$$
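The optimum weights and the minimum MSE in (3.3) can be checked numerically. The Python sketch below is only an illustration: the function names, the moment values and the constants ($a = 1$, $b = 0$, $\alpha = g = \eta = \xi = \phi = 1$, $\lambda = 0$, giving $\Gamma = 1$ and $\nu = 0.5$) are assumptions made for the example, not values taken from the paper. It evaluates the five $\Theta$ terms, applies the closed-form weights, and compares the resulting MSE with the MSE expression evaluated directly at those weights.

```python
def theta_terms(mu_y, mu_x, v_yy, v_xx, v_xy, k1, k2, nu):
    """Theta_A ... Theta_E of the proposed estimator."""
    t_a = mu_y ** 2 * (1 + v_yy + (k1 ** 2 + 2 * k2) * v_xx - 4 * k1 * v_xy)
    t_b = mu_x ** 2 * v_xx
    t_c = 2 * mu_y * mu_x * ((k1 + nu) * v_xx - v_xy)
    t_d = 2 * mu_y ** 2 * (k1 * v_xy - k2 * v_xx - 1)
    t_e = -2 * mu_y * mu_x * nu * v_xx
    return t_a, t_b, t_c, t_d, t_e


def mse(mu_y, w1, w2, t_a, t_b, t_c, t_d, t_e):
    """MSE as a quadratic function of the weights w1 and w2."""
    return (mu_y ** 2 + w1 ** 2 * t_a + w2 ** 2 * t_b
            + w1 * w2 * t_c + w1 * t_d + w2 * t_e)


def optimum(mu_y, t_a, t_b, t_c, t_d, t_e):
    """Closed-form optimum weights and the minimum MSE of (3.3)."""
    det = 4 * t_a * t_b - t_c ** 2
    w1 = (-2 * t_b * t_d + t_c * t_e) / det
    w2 = (t_c * t_d - 2 * t_a * t_e) / det
    mse_min = mu_y ** 2 - (t_b * t_d ** 2 + t_a * t_e ** 2 - t_c * t_d * t_e) / det
    return w1, w2, mse_min


# Hypothetical inputs, for illustration only
mu_y, mu_x = 550.0, 850.0
v_yy, v_xx, v_xy = 0.05, 0.07, 0.04
gamma_cap, nu = 1.0, 0.5            # Gamma and nu for the assumed constants
alpha, g = 1.0, 1.0
k1 = alpha * g * gamma_cap + nu
k2 = g * (g + 1) / 2 * alpha ** 2 * gamma_cap ** 2 + nu * alpha * g * gamma_cap + 1.5 * nu ** 2

thetas = theta_terms(mu_y, mu_x, v_yy, v_xx, v_xy, k1, k2, nu)
w1_opt, w2_opt, mse_min = optimum(mu_y, *thetas)
mse_at_opt = mse(mu_y, w1_opt, w2_opt, *thetas)
```

The closed forms follow from setting both partial derivatives of the quadratic to zero, which gives the linear system $2\Theta_{Aam}w_{1} + \Theta_{Cam}w_{2} = -\Theta_{Dam}$ and $\Theta_{Cam}w_{1} + 2\Theta_{Bam}w_{2} = -\Theta_{Eam}$.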

The estimators of $Bias(\Delta_{(RSS)G})_{min}$ and $MSE(\Delta_{(RSS)G})_{min}$ based on sample measurements are given as follows:

$$\widehat{Bias}(\Delta_{(RSS)G})_{min} = \hat{w}_{1}^{opt} \hat{\mu}_{y} \left\{ 1 - k_{1} \hat{V}_{xy} + k_{2} \hat{V}_{xx} \right\} + \hat{w}_{2}^{opt} \hat{\mu}_{x} \nu \hat{V}_{xx} - \hat{\mu}_{y},$$

$$\widehat{MSE}(\Delta_{(RSS)G})_{min} = \hat{\mu}_{y}^{2} - \frac{\hat{\Theta}_{Bam} \hat{\Theta}_{Dam}^{2} + \hat{\Theta}_{Aam} \hat{\Theta}_{Eam}^{2} - \hat{\Theta}_{Cam} \hat{\Theta}_{Dam} \hat{\Theta}_{Eam}}{4 \hat{\Theta}_{Aam} \hat{\Theta}_{Bam} - \hat{\Theta}_{Cam}^{2}}, \tag{3.4}$$

where

$$\hat{\Theta}_{Aam} = \hat{\mu}_{y}^{2} \left( 1 + \hat{V}_{yy} + (k_{1}^{2} + 2 k_{2}) \hat{V}_{xx} - 4 k_{1} \hat{V}_{xy} \right), \qquad \hat{\Theta}_{Bam} = \hat{\mu}_{x}^{2} \hat{V}_{xx}, \qquad \hat{\Theta}_{Cam} = 2 \hat{\mu}_{y} \hat{\mu}_{x} \left( (k_{1} + \nu) \hat{V}_{xx} - \hat{V}_{xy} \right),$$

$$\hat{\Theta}_{Dam} = 2 \hat{\mu}_{y}^{2} \left( k_{1} \hat{V}_{xy} - k_{2} \hat{V}_{xx} - 1 \right), \qquad \hat{\Theta}_{Eam} = -2 \hat{\mu}_{y} \hat{\mu}_{x} \nu \hat{V}_{xx},$$

$$\hat{w}_{1}^{opt} = \frac{-2 \hat{\Theta}_{Bam} \hat{\Theta}_{Dam} + \hat{\Theta}_{Cam} \hat{\Theta}_{Eam}}{4 \hat{\Theta}_{Aam} \hat{\Theta}_{Bam} - \hat{\Theta}_{Cam}^{2}}, \qquad \hat{w}_{2}^{opt} = \frac{\hat{\Theta}_{Cam} \hat{\Theta}_{Dam} - 2 \hat{\Theta}_{Aam} \hat{\Theta}_{Eam}}{4 \hat{\Theta}_{Aam} \hat{\Theta}_{Bam} - \hat{\Theta}_{Cam}^{2}}.$$

All of these quantities are computed from the sample observations, so the bias and MSE of the estimator can be quantified for any given sample.

4. Efficiency Comparison

We derive the theoretical conditions under which the proposed generalized family of estimators is more efficient than its competitor estimators.

By (3.3) and (2.4), $MSE(\Delta_{(RSS)G})_{min} < MSE(\Delta_{(RSS)mm1})$ if

$$\left[ \mu_{y}^{2} - \frac{\Theta_{Bam} \Theta_{Dam}^{2} + \Theta_{Aam} \Theta_{Eam}^{2} - \Theta_{Cam} \Theta_{Dam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}} \right] - \left[ \mu_{y}^{2} \left( V_{yy} + \delta^{2} \phi_{2}^{2} V_{xx} - 2 \delta \phi_{2} V_{xy} \right) \right] < 0.$$

By (3.3) and (2.5), $MSE(\Delta_{(RSS)G})_{min} < MSE(\Delta_{(RSS)mm2})$ if

$$\left[ \mu_{y}^{2} - \frac{\Theta_{Bam} \Theta_{Dam}^{2} + \Theta_{Aam} \Theta_{Eam}^{2} - \Theta_{Cam} \Theta_{Dam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}} \right] - \left[ \mu_{y}^{2} \left( V_{yy} + \alpha^{2} \phi_{1}^{2} V_{xx} - 2 \alpha \phi_{1} V_{xy} \right) \right] < 0.$$

By (3.3) and (2.6), $MSE(\Delta_{(RSS)G})_{min} < MSE(\Delta_{(RSS)mm4})$ if

$$\left[ \mu_{y}^{2} - \frac{\Theta_{Bam} \Theta_{Dam}^{2} + \Theta_{Aam} \Theta_{Eam}^{2} - \Theta_{Cam} \Theta_{Dam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}} \right] - \left[ \mu_{y}^{2} \left( V_{yy} + \lambda_{2}^{2} V_{xx} - 2 \lambda_{2} V_{xy} \right) \right] < 0.$$

By (3.3) and (2.8), $MSE(\Delta_{(RSS)G})_{min} < MSE(\Delta_{(RSS)vz})$ if

$$\left[ \mu_{y}^{2} - \frac{\Theta_{Bam} \Theta_{Dam}^{2} + \Theta_{Aam} \Theta_{Eam}^{2} - \Theta_{Cam} \Theta_{Dam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}} \right] - \left[ \mu_{y}^{2} \left( \frac{V_{xx}}{4} + V_{yy} - V_{xy} \right) \right] < 0.$$

By (3.3) and (2.10), $MSE(\Delta_{(RSS)G})_{min} < MSE(\Delta_{(SRS)sh})_{min}$ if

$$\left[ \mu_{y}^{2} - \frac{\Theta_{Bam} \Theta_{Dam}^{2} + \Theta_{Aam} \Theta_{Eam}^{2} - \Theta_{Cam} \Theta_{Dam} \Theta_{Eam}}{4 \Theta_{Aam} \Theta_{Bam} - \Theta_{Cam}^{2}} \right] - \left[ \mu_{y}^{2} - \frac{\Theta_{Bsh} \Theta_{Dsh}^{2} + \Theta_{Ash} \Theta_{Esh}^{2} - \Theta_{Csh} \Theta_{Dsh} \Theta_{Esh}}{4 \Theta_{Ash} \Theta_{Bsh} - \Theta_{Csh}^{2}} \right] < 0.$$

Note: When these conditions are satisfied, the proposed estimators perform more efficiently than their competitor estimators.
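Each condition above is a sign check on the difference of two MSE values, so it is straightforward to evaluate once the MSEs are available. The Python sketch below is illustrative only: the function names and the numbers are hypothetical, and the percent relative efficiency is taken to be 100 times the ratio of the competitor's MSE to the proposed estimator's MSE, which is a common convention assumed here rather than a formula stated in the text.

```python
def is_more_efficient(mse_proposed_min, mse_competitor):
    """True when MSE_min(proposed) - MSE(competitor) < 0."""
    return mse_proposed_min - mse_competitor < 0


def percent_relative_efficiency(mse_competitor, mse_proposed_min):
    """PRE under the assumed convention 100 * MSE(competitor) / MSE(proposed)."""
    return 100.0 * mse_competitor / mse_proposed_min


# Hypothetical MSE values, for illustration only
mse_proposed_min = 2.5
mse_competitor = 4.0

efficient = is_more_efficient(mse_proposed_min, mse_competitor)
pre = percent_relative_efficiency(mse_competitor, mse_proposed_min)
```

Under this convention, a PRE above 100 is equivalent to the corresponding inequality being satisfied.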

5. Simulation Study

Hypothetical data for the study variable (Y) and the auxiliary variable (X) are generated from a bivariate normal distribution with parameters

$$\begin{aligned} &N = 1500, \quad n = \{20, 30\}, \quad m = \{5, 10\}, \quad r = \{4, 3\}, \\ &\mu_{x} = 850, \quad \mu_{y} = 550, \quad C_{y} = 1.25, \quad C_{x} = 1.5, \\ &\rho_{xy} = \{0.4, 0.7, 0.8, 0.9, 0.99\}. \end{aligned}$$

For each value of n, 50,000 samples have been simulated to calculate the average mean square errors and percent relative efficiencies (PREs). The PREs of the proposed generalized family of estimators and of the competitor estimators from the literature are reported for the different values of n and $\rho_{xy}$.
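The structure of such a simulation can be sketched in plain Python. The sketch below is a simplified illustration, not the authors' code: it compares only the plain sample mean of Y under SRS and RSS rather than the estimators of Sections 2 and 3, it uses 2,000 replications instead of 50,000 to keep the run short, and the function names, the seed and the convention PRE = 100 × MSE(SRS) / MSE(RSS) are assumptions. The population parameters are those listed above, with $\rho_{xy} = 0.9$, $m = 5$ and $r = 4$.

```python
import math
import random


def make_population(size, mu_x, mu_y, c_x, c_y, rho, rng):
    """Bivariate normal population with the given means, CVs and correlation."""
    s_x, s_y = c_x * mu_x, c_y * mu_y
    population = []
    for _ in range(size):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu_x + s_x * z1
        y = mu_y + s_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        population.append((x, y))
    return population


def rss_mean_y(population, m, r, rng):
    """Mean of Y over a ranked set sample; ranking uses X only."""
    total = 0.0
    for _ in range(r):
        units = rng.sample(population, m * m)
        for i in range(m):
            ranked = sorted(units[i * m:(i + 1) * m], key=lambda u: u[0])
            total += ranked[i][1]
    return total / (m * r)


def srs_mean_y(population, n, rng):
    """Mean of Y over a simple random sample drawn without replacement."""
    return sum(y for _, y in rng.sample(population, n)) / n


def empirical_mse(estimates, target):
    return sum((e - target) ** 2 for e in estimates) / len(estimates)


rng = random.Random(2023)
population = make_population(1500, 850.0, 550.0, 1.5, 1.25, 0.9, rng)
true_mean_y = sum(y for _, y in population) / len(population)

replications = 2000  # reduced from 50,000 for speed
mse_srs = empirical_mse(
    [srs_mean_y(population, 20, rng) for _ in range(replications)], true_mean_y)
mse_rss = empirical_mse(
    [rss_mean_y(population, 5, 4, rng) for _ in range(replications)], true_mean_y)
pre = 100.0 * mse_srs / mse_rss
```

Replacing the two mean functions with implementations of the estimators under study, and looping over the listed values of n and $\rho_{xy}$, would give a design of the kind described above.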

When the correlation coefficient of x and y equals 0.4 and n = 20, the proposed estimator is 257.31% more efficient than $\Delta_{(SRS)sh}$, and 204.05%, 201.47%, 223.89% and 224.67% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively. At the same correlation coefficient, when the sample size is increased to n = 30, the proposed estimator is 262.10% more efficient than $\Delta_{(SRS)sh}$, and 201.59%, 198.80%, 237.06% and 232.31% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

When the correlation coefficient equals 0.7 and n = 20, the proposed estimator is 274.65% more efficient than $\Delta_{(SRS)sh}$, and 204.64%, 201.69%, 233.34% and 231.27% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively. For n = 30, it is 319.29% more efficient than $\Delta_{(SRS)sh}$, and 209.63%, 202.55%, 252.28% and 241.51% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

When the correlation coefficient equals 0.8 and n = 20, the proposed estimator is 366.52% more efficient than $\Delta_{(SRS)sh}$, and 218.39%, 209.52%, 249.15% and 241.89% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively. For n = 30, it is 394.62% more efficient than $\Delta_{(SRS)sh}$, and 190.97%, 186.10%, 248.46% and 219.85% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

When the correlation coefficient equals 0.9 and n = 20, the proposed estimator is 437.43% more efficient than $\Delta_{(SRS)sh}$, and 210.49%, 193.42%, 292.25% and 236.98% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively. For n = 30, it is 516.48% more efficient than $\Delta_{(SRS)sh}$, and 185.20%, 178.34%, 318.10% and 258.32% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

When the correlation coefficient equals 0.99 and n = 20, the proposed estimator is 614.42% more efficient than $\Delta_{(SRS)sh}$, and 278.19%, 269.29%, 367.86% and 331.27% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively. For n = 30, it is 680.86% more efficient than $\Delta_{(SRS)sh}$, and 227.26%, 216.05%, 351.55% and 305.36% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

The simulated results show that, when the sample size increases, the efficiency of the proposed estimator under the RSS design also increases relative to the estimator under the SRS design. The results also reveal that, as $\rho_{xy}$ increases, the proposed estimator under RSS performs more efficiently than its competitor under SRS (i.e. $\Delta_{(SRS)sh}$). Therefore, we may say that, as the correlation coefficient of x and y increases, the use of RSS becomes more appropriate than SRS.

6. Real-Life Applications

To assess the performance of the estimators, we use the following real-life data sets. The populations are described below.

Population I [source: James et al. (2013)]

The summary statistics are given below.

Y: Acceleration of automobiles

X: Engine horsepower of automobiles

Objective: To estimate the population mean of the acceleration of automobiles.

$$\begin{aligned} &N = 392, \quad n = 30, \quad m = 10, \quad r = 3, \\ &\mu_{x} = 104.4694, \quad \mu_{y} = 15.5413, \quad S_{y} = 2.7589, \quad S_{x} = 38.4912, \\ &C_{x} = 0.3684, \quad C_{y} = 0.1775, \quad C_{xy} = -0.0451, \\ &\beta_{2(x)} = 0.6541, \quad \beta_{1(x)} = 1.079, \quad \rho_{xy} = 0.9091. \end{aligned}$$

Population II [source: Multiple Indicator Cluster Survey (MICS, 2018–19)]

The summary statistics are given below.

Y: Body Mass Index (BMI)

X: Weight

Objective: To estimate the population mean of Body Mass Index (BMI).

$$\begin{aligned} &N = 39118, \quad n = 30, \quad m = 10, \quad r = 3, \\ &\mu_{x} = 12.1883, \quad \mu_{y} = 16.8151, \quad S_{y} = 10.8438, \quad S_{x} = 10.7911, \\ &C_{x} = 0.8854, \quad C_{y} = 0.6449, \quad C_{xy} = 0.4877, \\ &\beta_{2(x)} = 54.4802, \quad \beta_{1(x)} = 7.0622, \quad \rho_{xy} = 0.5542. \end{aligned}$$

Population III [source: Multiple Indicator Cluster Survey (MICS, 2018–19)]

The summary statistics are given below.

Y: Weight

X: Height

Objective: To estimate the population mean of Weight.

$$\begin{aligned} &N = 39118, \quad n = 30, \quad m = 10, \quad r = 3, \\ &\mu_{x} = 94.6221, \quad \mu_{y} = 12.1883, \quad S_{y} = 10.7911, \quad S_{x} = 101.0391, \\ &C_{x} = 1.0678, \quad C_{y} = 0.8853, \quad C_{xy} = 0.7016, \\ &\beta_{2(x)} = 74.6241, \quad \beta_{1(x)} = 8.658, \quad \rho_{xy} = 0.7421. \end{aligned}$$

Population IV [source: Daly et al. (2001)]

The summary statistics are given below.

Y: Body Mass Index (BMI) of Crohn’s disease patients

X: Weight of Crohn’s disease patients

Objective: To estimate the population mean of Body Mass Index (BMI) of Crohn’s disease patients.

$$\begin{aligned} &N = 117, \quad n = 20, \quad m = 5, \quad r = 4, \\ &\mu_{x} = 69.0256, \quad \mu_{y} = 26.0624, \quad S_{y} = 4.9888, \quad S_{x} = 14.2438, \\ &C_{x} = 0.2063, \quad C_{y} = 0.1914, \quad C_{xy} = 0.0325, \\ &\beta_{2(x)} = 0.7746, \quad \beta_{1(x)} = 0.6571, \quad \rho_{xy} = 0.8222. \end{aligned}$$

Population V [source: Husby et al. (2005)]

The summary statistics are given below.

Y: Body Mass Index (BMI)

X: Thigh Circumference

Objective: To estimate the population mean of Body Mass Index (BMI).

$$\begin{aligned} &N = 36, \quad n = 8, \quad m = 4, \quad r = 2, \\ &\mu_{x} = 49.3806, \quad \mu_{y} = 25.678, \quad S_{y} = 3.8198, \quad S_{x} = 3.7599, \\ &C_{x} = 0.0761, \quad C_{y} = 0.1488, \quad C_{xy} = 0.0066, \\ &\beta_{2(x)} = -0.6159, \quad \beta_{1(x)} = -0.0607, \quad \rho_{xy} = 0.9848. \end{aligned}$$
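The reported coefficients of variation can be cross-checked against the reported means and standard deviations through $C_{x} = S_{x}/\mu_{x}$ and $C_{y} = S_{y}/\mu_{y}$, the definitions given in the notation section. The Python sketch below performs this arithmetic for the five populations; the dictionary layout and variable names are hypothetical, and the numbers are copied from the summary statistics above.

```python
# (mu, S, reported C) for X and Y in each population, copied from the text
populations = {
    "I":   {"x": (104.4694, 38.4912, 0.3684),  "y": (15.5413, 2.7589, 0.1775)},
    "II":  {"x": (12.1883, 10.7911, 0.8854),   "y": (16.8151, 10.8438, 0.6449)},
    "III": {"x": (94.6221, 101.0391, 1.0678),  "y": (12.1883, 10.7911, 0.8853)},
    "IV":  {"x": (69.0256, 14.2438, 0.2063),   "y": (26.0624, 4.9888, 0.1914)},
    "V":   {"x": (49.3806, 3.7599, 0.0761),    "y": (25.678, 3.8198, 0.1488)},
}

gaps = {}
for name, variables in populations.items():
    for label, (mu, s, reported_cv) in variables.items():
        computed_cv = s / mu  # C = S / mu
        gaps[(name, label)] = abs(computed_cv - reported_cv)

largest_gap = max(gaps.values())
```

The recomputed values agree with the reported ones up to rounding in the fourth decimal place.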

The percent relative efficiencies (PREs) of the proposed generalized family of estimators and of the competitor estimators from the literature are presented in Tables 6 to 10 for the different real-life populations.

Table 6 shows that, for Population I, the proposed estimator is 363.74% more efficient than $\Delta_{(SRS)sh}$, and 150.18%, 150.18%, 155.15% and 153.49% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

Table 7 shows that, for Population II, the proposed estimator is 418.12% more efficient than $\Delta_{(SRS)sh}$, and 140.35%, 140.35%, 177.79% and 151.89% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

Table 8 shows that, for Population III, the proposed estimator is 149.93% more efficient than $\Delta_{(SRS)sh}$, and 118.81%, 118.81%, 127.66% and 121.52% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

Table 9 shows that, for Population IV, the proposed estimator is 221.91% more efficient than $\Delta_{(SRS)sh}$, and 107.11%, 107.11%, 100.32% and 120.72% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

Table 10 shows that, for Population V, the proposed estimator is 410.47% more efficient than $\Delta_{(SRS)sh}$, and 104.32%, 104.32%, 124.13% and 186.79% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.

Table 6. PRE of Estimators for Population I

Table 7. PRE of Estimators for Population II

Table 8. PRE of Estimators for Population III

Table 9. PRE of Estimators for Population IV

Table 10. PRE of Estimators for Population V

7. Conclusion

In this study, motivated by Shahzad et al. (2019), we proposed a generalized family of estimators under RSS to estimate the finite population mean. The biases and MSEs of the proposed estimators were derived up to the first order of approximation. The efficiency conditions for the proposed generalized estimator were also derived. The MSEs of all estimators were computed using a simulation study and real-life data sets, and it was shown that the proposed generalized family of estimators is more efficient than the competitor estimators under SRS and RSS. It may be concluded that, with an increase in sample size and $\rho_{xy}$, the proposed estimator under RSS performs more efficiently than its competitor estimator under SRS (i.e. $\Delta_{(SRS)sh}$).

Public interest statement

Estimation of population parameters with minimum mean square error is a very important issue in survey sampling. Researchers have proposed different sampling designs and estimators to deal with this issue. In this study, we proposed a generalized family of estimators for estimating the population mean under classical ranked set sampling. Mathematical comparison, a simulation study and real-life applications have been used to compare efficiency.

Acknowledgements

The authors wish to thank the DG-Bureau of Statistics Punjab for providing the data from the Multiple Indicator Cluster Survey (MICS) for the year 2018–19.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The authors received no direct funding for this research.

References

Ali, A., Butt, M. M., Azad, M. D., Ahmed, Z., & Hanif, M. (2021). Stratified Extreme-cum-Median Ranked Set sampling. Pakistan Journal of Statistics, 37(3), 215–9.
Google Scholar
Daly, M. J., Rioux, J. D., Schaffner, S. F.,Hudson, T. J., & Lander, E. S. (2001). High-resolution haplotype structure in the human genome. Nature Genetics, 29(2), 229–232. https://doi.org/https://doi.org/10.1038/ng1001-229
PubMed Web of Science ®Google Scholar
Husby, C. E., Stasny, E. A., & Wolfe, D. A. (2005). An application of ranked set sampling for mean and median estimation using USDA crop production data. Journal of Agricultural, Biological, and Environmental Statistics, 10(3), 354–373. https://doi.org/https://doi.org/10.1198/108571105X58234
Web of Science ®Google Scholar
Iqbal, K., Moeen, M., Ali, A., & Iqabl, A. (2020). Mixture regression cum ratio estimators of population mean under stratified random sampling. Journal of Statistical Computation and Simulation, 90(5), 854–868. https://doi.org/https://doi.org/10.1080/00949655.2019.1710149
Web of Science ®Google Scholar
James, G., Witten, D., Hastie, T.,& Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112, pp. 18). springer.
Google Scholar
Kadilar, C., Unyazici, Y., & Cingi, H. (2009). Ratio estimator for the population mean using ranked set sampling. Statistical Papers, 50(2), 301–309. https://doi.org/https://doi.org/10.1007/s00362-007-0079-y
Web of Science ®Google Scholar
Khan, L., & Shabbir, J. (2016a). A class of Hartley-Ross type unbiased estimators for population mean using ranked set sampling. Hacettepe Journal of Mathematics and Statistics, 45(3), 917–928.
Web of Science ®Google Scholar
Khan, L., & Shabbir, J. (2016b). An efficient class of estimators for the finite population mean in ranked set sampling. Open Journal of Statistics, 6(3), 426–435. https://doi.org/https://doi.org/10.4236/ojs.2016.63038
Google Scholar
Khan, L., & Shabbir, J. (2016c). Hartley-Ross type unbiased estimators using ranked set sampling and stratified ranked set sampling. The North Carolina Journal of Mathematics and Statistics, 2, 10–22.
Google Scholar
Mandowara, V. L., & Mehta, N. (2013). Efficient generalized ratio-product type estimators for finite population mean with ranked set sampling. Austrian Journal of Statistics, 42(3), 137–148. https://doi.org/https://doi.org/10.17713/ajs.v42i3.147
Google Scholar
Mandowara, V. L., & Mehta, N. (2016). On the improvement of product method of estimation in ranked set sampling. Chilean Journal of Statistics (Chjs), 7(1), 43–53.
Web of Science ®Google Scholar
McIntyre, G. A. (1952). A method for unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research, 3(4), 385–390. https://doi.org/https://doi.org/10.1071/AR9520385
Google Scholar
Mehta, N., & Mandowara, V. L. (2016). A modified ratio-cum-product estimator of finite population mean using ranked set sampling. Communications in Statistics-Theory and Methods, 45(2), 267–276. https://doi.org/10.1080/03610926.2013.830748
Multiple Indicator Cluster Survey (MICS, 2018-19). Unpublished Survey, Bureau of Statistics Punjab. [http://bos.gop.pk/mics]
Patil, G. P., Sinha, A. K., & Taillie, C. (1993). Relative precision of ranked set sampling: A comparison with the regression estimator. Environmetrics, 4(4), 399–412. https://doi.org/10.1002/env.3170040404
Patil, G. P., Sinha, A. K., & Taillie, C. (1994). Ranked set sampling. Handbook of Statistics, 12, 167–200.
Pelle, E., & Perri, P. F. (2018). Improving mean estimation in ranked set sampling using the Rao regression-type estimator. Brazilian Journal of Probability and Statistics, 32(3), 467–496. https://doi.org/10.1214/17-BJPS350
Samawi, H. M., & Muttlak, H. A. (1996). Estimation of ratio using rank set sampling. Biometrical Journal, 38(6), 753–764. https://doi.org/10.1002/bimj.4710380616
Shahzad, U., Hanif, M., Koyuncu, N., Luengo, A. V. G., & Khan, N. (2019). An efficient generalized family of estimators for mean estimation under simple random sampling. Investigación Operacional, 40(1), 28–45.
Singh, H. P., Tailor, R., & Singh, S. (2014). General procedure for estimating the population mean using ranked set sampling. Journal of Statistical Computation and Simulation, 84(5), 931–945. https://doi.org/10.1080/00949655.2012.733395
Stokes, L. S. (1977). Ranked set sampling with concomitant variables. Communications in Statistics-Theory and Methods, 6(12), 1207–1211. https://doi.org/10.1080/03610927708827563
Vishwakarma, G. K., Zeeshan, S. M., & Bouza, C. N. (2017). Ratio and product type exponential estimators for population mean using ranked set sampling. Investigación Operacional, 38(3), 266–271.