Research Article

An improved generalized class of estimators for population variance using auxiliary variables

Article: 1454579 | Received 12 Jul 2017, Accepted 16 Mar 2018, Published online: 05 Apr 2018

Abstract

This paper proposes an improved generalized class of estimators for estimating the population variance using auxiliary variables under simple random sampling without replacement. The expression for the mean square error of the proposed estimator is obtained up to the first order of approximation. We derive the conditions on the parameters under which the proposed estimator performs better than the usual estimator and other existing estimators. An empirical study and a simulation study are also carried out in support of the theoretical results.

Public Interest Statement

Variance estimation plays an important role in survey sampling. It serves two purposes: an analytic one, such as constructing confidence intervals or performing hypothesis tests, and a descriptive one, such as evaluating the efficiency of survey designs or providing estimates for planning surveys. Several authors have worked on estimators of the population variance that use auxiliary information. Motivated by their work, we propose an improved generalized class of estimators of the population variance using auxiliary information. The expression for the mean square error is derived up to the first order of approximation. We compare the proposed estimator with other existing estimators. The efficiencies of the estimators are validated using real population data-sets and a simulation study.

1. Introduction

It is well established in the sampling literature that the use of auxiliary information at the estimation stage of a sample survey improves the efficiency of estimators of population parameters such as the population mean, variance, median and correlation coefficient. The ratio, regression and product methods of estimation are good examples in this context. Several authors, including Ahmed, Raman, and Hossain (2000), Gupta and Shabbir (2008), Singh and Solanki (2009–2010, 2013), Yadav and Kadilar (2013) and Solanki and Singh (2013), have proposed estimators of the population variance using information on an auxiliary variable under different sampling schemes. Asghar, Sanaullah, and Hanif (2014) proposed generalized exponential-type estimators using information on an auxiliary variable for estimating the population variance $S_y^2$. Singh, Chauhan, Sawan, and Smarandache (2009) and Adichwal, Sharma, and Singh (2015) proposed estimators of the population variance using two auxiliary variables and suggested using two auxiliary variables whenever they are available. In this paper, we also propose an improved estimator of the population variance $S_y^2$ based on simple random sampling without replacement, utilizing information on two auxiliary variables X and Z.

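Suppose a simple random sample (SRS) of size n is drawn from a population of N units. Let $Y_i$, $X_i$ and $Z_i$ denote the values of the study variable Y and the auxiliary variables X and Z for the ith unit of the population (i = 1, 2, 3, …, N), and let $y_i$, $x_i$ and $z_i$ denote the corresponding values for the ith unit in the sample (i = 1, 2, 3, …, n). From the sample observations we have

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i,$$

$$s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2, \quad s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \quad \text{and} \quad s_z^2 = \frac{1}{n-1}\sum_{i=1}^{n} (z_i - \bar{z})^2.$$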

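Let us define

$$\epsilon_y = \frac{s_y^2 - S_y^2}{S_y^2}, \quad \epsilon_x = \frac{s_x^2 - S_x^2}{S_x^2}, \quad \epsilon_z = \frac{s_z^2 - S_z^2}{S_z^2}$$

such that $E(\epsilon_x) = E(\epsilon_y) = E(\epsilon_z) = 0$ and

$$E(\epsilon_x^2) = \frac{1}{n}(\partial_{040} - 1), \quad E(\epsilon_y^2) = \frac{1}{n}(\partial_{400} - 1), \quad E(\epsilon_z^2) = \frac{1}{n}(\partial_{004} - 1),$$

$$E(\epsilon_x \epsilon_y) = \frac{1}{n}(\partial_{220} - 1), \quad E(\epsilon_y \epsilon_z) = \frac{1}{n}(\partial_{202} - 1), \quad E(\epsilon_z \epsilon_x) = \frac{1}{n}(\partial_{022} - 1),$$

where

$$\partial_{pqr} = \frac{\mu_{pqr}}{\mu_{200}^{p/2}\,\mu_{020}^{q/2}\,\mu_{002}^{r/2}}, \quad \mu_{pqr} = \frac{1}{N}\sum_{i=1}^{N} (y_i - \bar{Y})^p (x_i - \bar{X})^q (z_i - \bar{Z})^r,$$

p, q and r being non-negative integers.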
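The moment ratios defined above can be computed directly from population values. The following is a minimal sketch in Python, assuming the population is held in three equal-length lists; the function names (`central_moment`, `moment_ratio`, `sample_variance`) and the toy data are illustrative and do not come from the paper.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean of a sequence."""
    return sum(values) / len(values)


def central_moment(y, x, z, p: int, q: int, r: int) -> float:
    """mu_pqr = (1/N) * sum (y_i - Ybar)^p (x_i - Xbar)^q (z_i - Zbar)^r."""
    yb, xb, zb = mean(y), mean(x), mean(z)
    total = sum((yi - yb) ** p * (xi - xb) ** q * (zi - zb) ** r
                for yi, xi, zi in zip(y, x, z))
    return total / len(y)


def moment_ratio(y, x, z, p: int, q: int, r: int) -> float:
    """partial_pqr = mu_pqr / (mu_200^(p/2) * mu_020^(q/2) * mu_002^(r/2))."""
    scale = (central_moment(y, x, z, 2, 0, 0) ** (p / 2)
             * central_moment(y, x, z, 0, 2, 0) ** (q / 2)
             * central_moment(y, x, z, 0, 0, 2) ** (r / 2))
    return central_moment(y, x, z, p, q, r) / scale


def sample_variance(values: Sequence[float]) -> float:
    """s^2 with divisor n - 1, as in the sample definitions above."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


# Toy population (hypothetical values, for illustration only).
Y = [2.0, 4.0, 6.0, 8.0, 10.0]
X = [1.0, 3.0, 2.0, 5.0, 4.0]
Z = [3.0, 1.0, 4.0, 1.0, 5.0]

d400 = moment_ratio(Y, X, Z, 4, 0, 0)
d220 = moment_ratio(Y, X, Z, 2, 2, 0)
```

By construction `moment_ratio(Y, X, Z, 2, 0, 0)` equals 1, which is a quick sanity check on the normalisation.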

2. Estimators in literature

To estimate the population variance of the study variable Y using information on two auxiliary variables X and Z, Singh et al. (2009) proposed a general class of exponential estimators given by

$$t_1 = s_y^2\left[k \exp\left(\frac{S_x^2 - s_x^2}{S_x^2 + s_x^2}\right) + (1-k)\exp\left(\frac{s_z^2 - S_z^2}{s_z^2 + S_z^2}\right)\right] \tag{2.1}$$

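Adichwal et al. (2015) proposed the following two generalized classes of estimators of the population variance using two auxiliary variables:

$$t_N = w s_y^2 + (1-w)\, s_y^2 \left(\frac{s_{tX}^2}{S_x^2}\right) \exp\left[\alpha\,\frac{s_{tZ}^2 - S_z^2}{s_{tZ}^2 + S_z^2}\right] \tag{2.2}$$

$$t_M = \delta s_y^2 \left(\frac{S_x^2}{s_x^2}\right)^{\beta} \exp\left[\gamma\,\frac{S_z^2 - s_z^2}{S_z^2 + s_z^2}\right] + (1-\delta)\, s_y^2 \left(\frac{s_{tX}^2}{S_x^2}\right)^{\beta} \exp\left[\gamma\,\frac{s_{tZ}^2 - S_z^2}{s_{tZ}^2 + S_z^2}\right] \tag{2.3}$$

The quantities $s_{tX}^2$ and $s_{tZ}^2$, and the constant $g$ appearing in the MSE expressions below, are as defined in Adichwal et al. (2015).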

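The mean square error (MSE) expressions of the estimators $t_1$, $t_N$ and $t_M$ are, respectively, given by

$$\mathrm{MSE}(t_1) = \frac{S_y^4}{n}\left[(\partial_{400}-1) + \frac{k_0^2}{4}(\partial_{040}-1) + \frac{(1-k_0)^2}{4}(\partial_{004}-1) - k_0(\partial_{220}-1) + (1-k_0)(\partial_{202}-1) - \frac{k_0(1-k_0)}{2}(\partial_{022}-1)\right] \tag{2.4}$$

where

$$k = \frac{\partial_{004} + 2(\partial_{220} + \partial_{202}) + \partial_{022} - 6}{\partial_{040} + \partial_{004} + 2\partial_{022} - 4} = k_0,$$

$$\mathrm{MSE}(t_N) = \frac{1}{n}S_y^4\left[\frac{\alpha^2}{4}(w_0-1)^2 g^2(\partial_{004}-1) + (\partial_{400}-1) + (w_0-1)^2 g^2(\partial_{040}-1) + 2g(w_0-1)(\partial_{220}-1) + \alpha g(w_0-1)(\partial_{202}-1) + \alpha g^2(w_0-1)^2(\partial_{022}-1)\right] \tag{2.5}$$

where

$$w_0 = 1 - \frac{2g(\partial_{220}-1) + \alpha g(\partial_{202}-1)}{\frac{\alpha^2 g^2}{2}(\partial_{004}-1) + 2g^2(\partial_{040}-1) + 2\alpha g^2(\partial_{022}-1)},$$

$$\mathrm{MSE}(t_M) = \frac{S_y^4}{n}\left[\beta^2(\delta_0 g - g - \delta_0)^2(\partial_{040}-1) + (\partial_{400}-1) + \frac{\gamma^2}{4}(\delta_0 g - g - \delta_0)^2(\partial_{004}-1) + 2\beta(\delta_0 g - g - \delta_0)(\partial_{220}-1) + \gamma(\delta_0 g - g - \delta_0)(\partial_{202}-1) + \gamma\beta(\delta_0 g - g - \delta_0)^2(\partial_{022}-1)\right] \tag{2.6}$$

where

$$\delta_0 = \frac{4\beta^2 g(\partial_{040}-1) + \gamma^2 g(\partial_{004}-1) - 4\beta(\partial_{220}-1) - 2\gamma(\partial_{202}-1) + 4\gamma\beta g(\partial_{022}-1)}{(g-1)\left[4\beta^2(\partial_{040}-1) + \gamma^2(\partial_{004}-1) + 4\gamma\beta(\partial_{022}-1)\right]}.$$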
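As a numerical illustration of Equation (2.4) and the optimum $k_0$, the sketch below evaluates MSE($t_1$) in Python and checks that moving $k$ away from $k_0$ does not reduce it. The dictionary layout and the function names `k_opt` and `mse_t1` are assumptions made for this example; the moment values, $n$ and $S_y^2$ are those reported for Population I in Section 5.

```python
def k_opt(d: dict) -> float:
    """Optimum k0 for the estimator t1 (expression following Equation 2.4)."""
    num = d["004"] + 2.0 * (d["220"] + d["202"]) + d["022"] - 6.0
    den = d["040"] + d["004"] + 2.0 * d["022"] - 4.0
    return num / den


def mse_t1(k: float, n: int, sy2: float, d: dict) -> float:
    """First-order MSE of t1 for a given k, Equation (2.4)."""
    bracket = ((d["400"] - 1.0)
               + k ** 2 / 4.0 * (d["040"] - 1.0)
               + (1.0 - k) ** 2 / 4.0 * (d["004"] - 1.0)
               - k * (d["220"] - 1.0)
               + (1.0 - k) * (d["202"] - 1.0)
               - k * (1.0 - k) / 2.0 * (d["022"] - 1.0))
    return sy2 ** 2 / n * bracket


# Moment ratios partial_pqr reported for Population I (Section 5).
POP1 = {"400": 7.7685, "040": 9.9860, "004": 9.9851,
        "220": 8.3107, "202": 8.1715, "022": 9.6631}
N_SAMPLE = 25
SY2 = 37199578.0

k0 = k_opt(POP1)
mse_at_k0 = mse_t1(k0, N_SAMPLE, SY2, POP1)

if __name__ == "__main__":
    print("k0 =", k0)
    for shift in (-0.5, -0.1, 0.0, 0.1, 0.5):
        print(shift, mse_t1(k0 + shift, N_SAMPLE, SY2, POP1))
```

Because (2.4) is quadratic in $k$ with a positive leading coefficient for these values, the printed MSE is smallest at a shift of zero.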

3. Proposed estimators

In this paper, we propose a generalized exponential-type estimator of the population variance of the study variable Y, based on simple random sampling without replacement and using information on two auxiliary variables X and Z, given by

$$t = \lambda s_y^2 \exp\left[\eta\,\frac{S_x^2 - s_x^2}{S_x^2 + (a-1)s_x^2} + \psi\,\frac{S_z^2 - s_z^2}{S_z^2 + (b-1)s_z^2}\right] \tag{3.1}$$

where λ, η, ψ, a and b are suitably chosen constants, to be determined such that the MSE of t is minimum.

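Expressing Equation (3.1) in terms of the ε's and expanding up to the first order of approximation, we have

$$t = \lambda S_y^2 (1+\epsilon_y) \exp\left[\frac{-\eta\,\epsilon_x}{a + (a-1)\epsilon_x} + \frac{-\psi\,\epsilon_z}{b + (b-1)\epsilon_z}\right] \tag{3.2}$$

or,

$$t - S_y^2 = (\lambda - 1)S_y^2 + \lambda S_y^2\left[\epsilon_y - \frac{\psi\,\epsilon_z}{b} - \frac{\eta\,\epsilon_x}{a} - \frac{\psi\,\epsilon_y\epsilon_z}{b} - \frac{\eta\,\epsilon_x\epsilon_y}{a}\right] \tag{3.3}$$

or,

$$t - S_y^2 = (\lambda - 1)S_y^2 + \lambda S_y^2\left[\epsilon_y - \psi^*\epsilon_z - \eta^*\epsilon_x - \psi^*\epsilon_y\epsilon_z - \eta^*\epsilon_x\epsilon_y\right] \tag{3.4}$$

where $\eta^* = \eta/a$ and $\psi^* = \psi/b$.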

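Squaring both sides of Equation (3.4) and taking expectations, we get the MSE of the estimator t as

$$\mathrm{MSE}(t) = (\lambda-1)^2 S_y^4 + \frac{\lambda^2}{n} S_y^4\left[(\partial_{400}-1) + \psi^{*2}(\partial_{004}-1) + \eta^{*2}(\partial_{040}-1) - 2\psi^*(\partial_{202}-1) - 2\eta^*(\partial_{220}-1) + 2\eta^*\psi^*(\partial_{022}-1)\right] \tag{3.5}$$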

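Partially differentiating Equation (3.5) with respect to λ, η* and ψ* and equating the derivatives to zero, we get the optimum values of λ, η* and ψ* as

$$\lambda_{opt} = \frac{1}{1 + \frac{1}{n}\left[(\partial_{400}-1) + \psi^{*2}(\partial_{004}-1) + \eta^{*2}(\partial_{040}-1) - 2\psi^*(\partial_{202}-1) - 2\eta^*(\partial_{220}-1) + 2\eta^*\psi^*(\partial_{022}-1)\right]},$$

$$\eta^*_{opt} = \frac{(\partial_{220}-1) - \psi^*(\partial_{022}-1)}{(\partial_{040}-1)},$$

$$\psi^*_{opt} = \frac{(\partial_{202}-1)(\partial_{040}-1) - (\partial_{220}-1)(\partial_{022}-1)}{(\partial_{040}-1)(\partial_{004}-1) - (\partial_{022}-1)^2}.$$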

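Substituting the optimum values $\lambda_{opt}$, $\eta^*_{opt}$ and $\psi^*_{opt}$ in Equation (3.5), we obtain the minimum MSE of the estimator t as

$$\mathrm{MSE}(t)_{\min} = (\lambda_{opt}-1)^2 S_y^4 + \frac{\lambda_{opt}^2}{n} S_y^4\left[(\partial_{400}-1) + \psi_{opt}^{*2}(\partial_{004}-1) + \eta_{opt}^{*2}(\partial_{040}-1) - 2\psi^*_{opt}(\partial_{202}-1) - 2\eta^*_{opt}(\partial_{220}-1) + 2\eta^*_{opt}\psi^*_{opt}(\partial_{022}-1)\right] \tag{3.6}$$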
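The closed-form optimum can be checked numerically. The sketch below, in Python, codes Equation (3.5) and the optimum values of λ, η* and ψ*, then confirms that small perturbations around the optimum never lower the MSE. The moment values, $n$ and $S_y^2$ used here are hypothetical and chosen only so that the arithmetic is easy to follow; the function names are likewise assumptions.

```python
from itertools import product


def q_bracket(eta: float, psi: float, d: dict) -> float:
    """Bracketed term of Equation (3.5) for given eta* and psi*."""
    return ((d["400"] - 1.0)
            + psi ** 2 * (d["004"] - 1.0)
            + eta ** 2 * (d["040"] - 1.0)
            - 2.0 * psi * (d["202"] - 1.0)
            - 2.0 * eta * (d["220"] - 1.0)
            + 2.0 * eta * psi * (d["022"] - 1.0))


def mse_t(lam: float, eta: float, psi: float, n: int, sy2: float, d: dict) -> float:
    """MSE of the proposed estimator t, Equation (3.5)."""
    return ((lam - 1.0) ** 2 * sy2 ** 2
            + lam ** 2 / n * sy2 ** 2 * q_bracket(eta, psi, d))


def optimum(n: int, d: dict):
    """Closed-form (lambda_opt, eta*_opt, psi*_opt) from Section 3."""
    a, c = d["040"] - 1.0, d["004"] - 1.0
    dxy, dyz, dzx = d["220"] - 1.0, d["202"] - 1.0, d["022"] - 1.0
    psi = (dyz * a - dxy * dzx) / (a * c - dzx ** 2)
    eta = (dxy - psi * dzx) / a
    lam = 1.0 / (1.0 + q_bracket(eta, psi, d) / n)
    return lam, eta, psi


# Hypothetical moment ratios, sample size and S_y^2 (not from the paper).
D = {"400": 3.0, "040": 3.0, "004": 3.0, "220": 2.0, "202": 1.8, "022": 1.5}
N_SAMPLE, SY2 = 30, 4.0

lam_opt, eta_opt, psi_opt = optimum(N_SAMPLE, D)
best = mse_t(lam_opt, eta_opt, psi_opt, N_SAMPLE, SY2, D)

# Perturb each constant by -0.05, 0 or +0.05 and record the smallest MSE seen.
smallest_perturbed = min(
    mse_t(lam_opt + dl, eta_opt + de, psi_opt + dp, N_SAMPLE, SY2, D)
    for dl, de, dp in product((-0.05, 0.0, 0.05), repeat=3)
)
```

For these inputs `smallest_perturbed` coincides with `best` up to rounding, as expected when the closed-form values are a minimum of (3.5).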

4. Efficiency comparison

In this section, we compare the minimum MSE of the proposed estimator t with the variance of the usual estimator $s_y^2$ and the MSEs of other existing estimators.

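The variance of the usual estimator $s_y^2$ under SRSWOR is given by

$$V(s_y^2) = \frac{S_y^4}{n}(\partial_{400}-1) \tag{4.1}$$

$$V(s_y^2) - \mathrm{MSE}(t)_{\min} = \frac{1-\lambda_{opt}^2}{n} S_y^4(\partial_{400}-1) - (\lambda_{opt}-1)^2 S_y^4 - \frac{\lambda_{opt}^2}{n} S_y^4\left[\psi_{opt}^{*2}(\partial_{004}-1) + \eta_{opt}^{*2}(\partial_{040}-1) - 2\psi^*_{opt}(\partial_{202}-1) - 2\eta^*_{opt}(\partial_{220}-1) + 2\eta^*_{opt}\psi^*_{opt}(\partial_{022}-1)\right] > 0 \tag{4.2}$$

$$\mathrm{MSE}(t_N) - \mathrm{MSE}(t)_{\min} = \frac{1}{n}S_y^4\left[\frac{\alpha^2}{4}(w_0-1)^2 g^2(\partial_{004}-1) + (\partial_{400}-1) + (w_0-1)^2 g^2(\partial_{040}-1) + 2g(w_0-1)(\partial_{220}-1) + \alpha g(w_0-1)(\partial_{202}-1) + \alpha g^2(w_0-1)^2(\partial_{022}-1)\right] - \left\{(\lambda_{opt}-1)^2 S_y^4 + \frac{\lambda_{opt}^2}{n} S_y^4\left[(\partial_{400}-1) + \psi_{opt}^{*2}(\partial_{004}-1) + \eta_{opt}^{*2}(\partial_{040}-1) - 2\psi^*_{opt}(\partial_{202}-1) - 2\eta^*_{opt}(\partial_{220}-1) + 2\eta^*_{opt}\psi^*_{opt}(\partial_{022}-1)\right]\right\} > 0 \tag{4.3}$$

$$\mathrm{MSE}(t_M) - \mathrm{MSE}(t)_{\min} = \frac{S_y^4}{n}\left[\beta^2(\delta_0 g - g - \delta_0)^2(\partial_{040}-1) + (\partial_{400}-1) + \frac{\gamma^2}{4}(\delta_0 g - g - \delta_0)^2(\partial_{004}-1) + 2\beta(\delta_0 g - g - \delta_0)(\partial_{220}-1) + \gamma(\delta_0 g - g - \delta_0)(\partial_{202}-1) + \gamma\beta(\delta_0 g - g - \delta_0)^2(\partial_{022}-1)\right] - \left\{(\lambda_{opt}-1)^2 S_y^4 + \frac{\lambda_{opt}^2}{n} S_y^4\left[(\partial_{400}-1) + \psi_{opt}^{*2}(\partial_{004}-1) + \eta_{opt}^{*2}(\partial_{040}-1) - 2\psi^*_{opt}(\partial_{202}-1) - 2\eta^*_{opt}(\partial_{220}-1) + 2\eta^*_{opt}\psi^*_{opt}(\partial_{022}-1)\right]\right\} > 0 \tag{4.4}$$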

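When conditions (4.2) to (4.4) are satisfied, the proposed estimator t is more efficient than $s_y^2$, $t_N$ and $t_M$, respectively. The comparison with $t_1$ follows in the same way from $\mathrm{MSE}(t_1) - \mathrm{MSE}(t)_{\min} > 0$, with $\mathrm{MSE}(t_1)$ as given in (2.4).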

5. Empirical study

To illustrate the performance of the various estimators of $S_y^2$, we consider the following data sets.

Population I [Source: Sarjinder Singh (2003, p. 1116)].

y: Fish caught in year 1995, x: Fish caught in year 1993, z: Fish caught in year 1994,

N = 69, n = 25.

∂400 = 7.7685, ∂040 = 9.9860, ∂004 = 9.9851, ∂220 = 8.3107, ∂202 = 8.1715, ∂022 = 9.6631, $S_y^2$ = 37199578.

Population II [Source: Murthy (1967, p. 399)].

y: Area under wheat in 1964, x: Area under wheat in 1963, z: Cultivated area in 1961,

N = 34, n = 15.

∂400 = 3.7879, ∂040 = 2.9123, ∂004 = 2.8082, ∂220 = 3.1046, ∂202 = 2.9790, ∂022 = 2.7379, $S_y^2$ = 22564.55704.
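The comparison for these two populations can be recomputed from the moment values listed above. The following Python sketch evaluates Equations (4.1), (2.4) at $k_0$ and (3.6), and forms the percent relative efficiency (PRE) of $t_1$ and $t$ with respect to $s_y^2$. It is a recomputation under stated assumptions, not the authors' code: the function names and data layout are illustrative, and $t_N$ and $t_M$ are left out because the constants $g$, α, β and γ they need are not specified in this text.

```python
def variance_usual(n, sy2, d):
    """V(s_y^2), Equation (4.1)."""
    return sy2 ** 2 / n * (d["400"] - 1.0)


def mse_t1_min(n, sy2, d):
    """MSE(t1) at the optimum k0, Equation (2.4)."""
    k = ((d["004"] + 2.0 * (d["220"] + d["202"]) + d["022"] - 6.0)
         / (d["040"] + d["004"] + 2.0 * d["022"] - 4.0))
    bracket = ((d["400"] - 1.0)
               + k ** 2 / 4.0 * (d["040"] - 1.0)
               + (1.0 - k) ** 2 / 4.0 * (d["004"] - 1.0)
               - k * (d["220"] - 1.0)
               + (1.0 - k) * (d["202"] - 1.0)
               - k * (1.0 - k) / 2.0 * (d["022"] - 1.0))
    return sy2 ** 2 / n * bracket


def mse_t_min(n, sy2, d):
    """Minimum MSE of the proposed estimator t, Equation (3.6)."""
    a, c = d["040"] - 1.0, d["004"] - 1.0
    dxy, dyz, dzx = d["220"] - 1.0, d["202"] - 1.0, d["022"] - 1.0
    psi = (dyz * a - dxy * dzx) / (a * c - dzx ** 2)
    eta = (dxy - psi * dzx) / a
    q = ((d["400"] - 1.0) + psi ** 2 * c + eta ** 2 * a
         - 2.0 * psi * dyz - 2.0 * eta * dxy + 2.0 * eta * psi * dzx)
    lam = 1.0 / (1.0 + q / n)
    return (lam - 1.0) ** 2 * sy2 ** 2 + lam ** 2 / n * sy2 ** 2 * q


# n, S_y^2 and moment ratios as listed for Populations I and II.
POPULATIONS = {
    "I": {"n": 25, "sy2": 37199578.0,
          "d": {"400": 7.7685, "040": 9.9860, "004": 9.9851,
                "220": 8.3107, "202": 8.1715, "022": 9.6631}},
    "II": {"n": 15, "sy2": 22564.55704,
           "d": {"400": 3.7879, "040": 2.9123, "004": 2.8082,
                 "220": 3.1046, "202": 2.9790, "022": 2.7379}},
}


def pre_table():
    """PRE of t1 and t relative to s_y^2 for each population."""
    table = {}
    for name, p in POPULATIONS.items():
        base = variance_usual(p["n"], p["sy2"], p["d"])
        table[name] = {
            "sy2": 100.0,
            "t1": 100.0 * base / mse_t1_min(p["n"], p["sy2"], p["d"]),
            "t": 100.0 * base / mse_t_min(p["n"], p["sy2"], p["d"]),
        }
    return table


PRE = pre_table()

if __name__ == "__main__":
    for name, row in PRE.items():
        print(name, {k: round(v, 2) for k, v in row.items()})
```

A PRE above 100 means the estimator has a smaller first-order MSE than $s_y^2$ for that population.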

From Table 1, it is clear that the MSE of the estimator t is smaller than that of the estimators $s_y^2$, $t_N$, $t_M$ and $t_1$. In terms of percent relative efficiency (PRE), the PRE of the proposed estimator t is higher than the PREs of the estimators $s_y^2$, $t_N$, $t_M$ and $t_1$ for both data-sets. So, the estimator t is more efficient than the estimator $s_y^2$ and the other existing estimators.

Table 1. PRE of various estimators with respect to $s_y^2$

6. Simulation study

This section presents the computational procedure used to compare the proposed estimator with the other existing estimators. The simulation study is based on the algorithm proposed by Reddy, Rao, and Boiroju (2010) and illustrates the performance of the various estimators of $S_y^2$. The following algorithm explains the simulation procedure used in this paper.

Step-1: Generate two independent random variables, X from $N(\mu, \sigma^2)$ and $X_1$ from $N(\mu_1, \sigma_1^2)$, using the Box-Muller method (Johnson, 1987).

Step-2: Set $Y = \rho X + \sqrt{1-\rho^2}\,X_1$, where ρ = 0.4, 0.6, 0.8 (so that 0 < ρ < 1).

Step-3: Return the pair (Y, X).

Step-4: For Population I, set μ = 5, σ = 3, μ1 = 5 and σ1 = 3 in Step-1 and repeat Steps 1 to 3 1000 times. In this population the variables Y and X have the same variance, since $\mathrm{Var}(Y) = \rho^2\sigma^2 + (1-\rho^2)\sigma_1^2 = \sigma^2$ when σ = σ1.

Step-5: Similarly, generate Population II with μ = 3, σ = 2, μ1 = 5 and σ1 = 3 in Step-1 and repeat Steps 1 to 3 1000 times. In this population the variables Y and X have different variances.

Step-6: From the population of size N = 1000, draw 2000 simple random samples $(y_i, x_i)$, $i = 1, 2, \ldots, n$, without replacement, of size n = 40, 50 and 60.

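Step-7: The average mean squared error (MSE) of an estimator t is defined by

$$\text{Average MSE}(t) = \frac{1}{2000}\sum_{k=1}^{2000}\left(t_k - S_y^2\right)^2,$$

where $t_k$ is the value of the estimator computed from the kth sample.

The PRE of an estimator t with respect to the usual estimator $s_y^2$ is defined by

$$\mathrm{PRE}(t) = \frac{\mathrm{MSE}(s_y^2)}{\mathrm{MSE}(t)} \times 100.$$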
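Steps 1 to 7 can be sketched in Python as follows. This is an illustrative implementation under several assumptions, not the authors' code: because the generated pairs contain only Y and X, the comparator used here is the single-auxiliary exponential estimator obtained from (2.1) with k = 1, standing in for the two-auxiliary estimators; $S_y^2$ and $S_x^2$ are computed with divisor N − 1; and the function names, seed and default arguments are hypothetical.

```python
import math
import random


def box_muller(rng, mu, sigma):
    """One N(mu, sigma^2) draw via the Box-Muller transform (Step-1)."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is defined
    u2 = rng.random()
    return mu + sigma * math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def make_population(rng, size, rho, mu, sigma, mu1, sigma1):
    """Steps 1 to 5: build a list of (Y, X) pairs."""
    population = []
    for _ in range(size):
        x = box_muller(rng, mu, sigma)
        x1 = box_muller(rng, mu1, sigma1)
        y = rho * x + math.sqrt(1.0 - rho ** 2) * x1  # Step-2
        population.append((y, x))
    return population


def variance(values):
    """Variance with divisor len(values) - 1."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def simulate_pre(rho, n, mu=5.0, sigma=3.0, mu1=5.0, sigma1=3.0,
                 pop_size=1000, n_samples=2000, seed=1):
    """Steps 6 and 7: PRE of the stand-in exponential estimator vs s_y^2."""
    rng = random.Random(seed)
    population = make_population(rng, pop_size, rho, mu, sigma, mu1, sigma1)
    big_sy2 = variance([y for y, _ in population])
    big_sx2 = variance([x for _, x in population])
    se_usual = 0.0
    se_exp = 0.0
    for _ in range(n_samples):
        sample = rng.sample(population, n)  # SRS without replacement
        sy2 = variance([y for y, _ in sample])
        sx2 = variance([x for _, x in sample])
        # Stand-in comparator: Equation (2.1) with k = 1 (assumption).
        t_exp = sy2 * math.exp((big_sx2 - sx2) / (big_sx2 + sx2))
        se_usual += (sy2 - big_sy2) ** 2
        se_exp += (t_exp - big_sy2) ** 2
    average_mse_usual = se_usual / n_samples
    average_mse_exp = se_exp / n_samples
    return 100.0 * average_mse_usual / average_mse_exp


if __name__ == "__main__":
    # Population I settings; rho and n as in Step-2 and Step-6.
    for rho in (0.4, 0.6, 0.8):
        for n in (40, 50, 60):
            print(rho, n, round(simulate_pre(rho, n), 2))
```

Passing `mu=3.0, sigma=2.0` reproduces the Population II settings of Step-5. The seed is fixed so that repeated runs give identical output.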

The obtained results of the simulation study are as follows.

From Tables 2 and 3 we observe that, as the sample size and the value of the correlation coefficient increase, the average PREs of the estimators also increase for both Populations I and II. The average PRE of the estimator t is the highest in all cases, indicating that the proposed estimator t is more efficient than the usual estimator $s_y^2$ and the other existing estimators $t_N$, $t_M$ and $t_1$.

Table 2. Comparison of proposed estimator t with other existing estimators for Population-I

Table 3. Comparison of proposed estimator t with other existing estimators for Population-II

7. Conclusion

This paper proposes an improved generalized class of estimators of the population variance based on simple random sampling without replacement, using information on two auxiliary variables. The performance of the proposed estimator is verified using two real population data-sets and a simulation study. Tables 1 to 3 show that the proposed estimator t is more efficient than the usual estimator and the other existing estimators. Hence, it is recommended for use in practice.

Funding

The authors received no direct funding for this research.

Additional information

Notes on contributors

Rajesh Singh

Nitesh K. Adichwal is a research fellow at the Banaras Hindu University in Varanasi. His research interests are in survey sampling and demography. He completed his MSc in Statistics at Purvanchal University, Jaunpur, in 2010, and his MSc in Population Studies at the International Institute for Population Sciences, Mumbai, in 2014. He has published research papers on survey sampling and demography in various national and international journals.

References

  • Adichwal, N. K., Sharma, P., & Singh, R. (2015). Generalized class of estimators for population variance using information on two auxiliary variables. International Journal of Applied and Computational Mathematics, 1–11.
  • Ahmed, M. S., Raman, M. S., & Hossain, M. I. (2000). Some competitive estimators of finite population variance multivariate auxiliary information. Information and Management Sciences, 11(1), 49–54.
  • Asghar, A., Sanaullah, A., & Hanif, M. (2014). Generalized exponential-type estimator for population variance in survey sampling. Revista Colombiana de Estadística, 37(1), 211–222.
  • Gupta, S., & Shabbir, J. (2008). Variance estimation in simple random sampling using auxiliary information. Hacettepe Journal of Mathematics and Statistics, 37, 57–67.
  • Johnson, M. E. (1987). Multivariate statistical simulation. New York, NY: John Wiley & Sons.
  • Murthy, M. N. (1967). Sampling theory and methods. Calcutta: Statistical Publishing Society.
  • Reddy, M. K., Rao, K. R., & Boiroju, N. K. (2010). Comparison of ratio estimators using Monte Carlo simulation. International Journal of Agriculture and Statistical Sciences, 6(2), 517–527.
  • Singh, H. P., & Solanki, R. S. (2013). A new procedure for variance estimation in simple random sampling using auxiliary information. Statistical Papers, 54(2), 479–497. doi:10.1007/s00362-012-0445-2
  • Singh, H. P., & Solanki, R. S. (2009–2010). Estimation of finite population variance using auxiliary information in presence of random non-response. Gujarat Statistical Review, 36 & 37 (1 & 2), 46–58.
  • Singh, R., Chauhan, P., Sawan, N., & Smarandache, F. (2009). Improved exponential estimator for population variance using two auxiliary variables. Studies in Statistical Inference, Sample Techniques and Demography, 36–44.
  • Singh, S. (2003). Advanced sampling theory with applications: How Michael "selected" Amy (Vol. 2). Springer Science & Business Media. doi:10.1007/978-94-007-0789-4
  • Solanki, R. S., & Singh, H. P. (2013). An Improved class of estimators for the population variance. Model Assisted Statistics and Applications, 8(3), 229–238.
  • Yadav, S. K., & Kadilar, C. (2013). Improved exponential type ratio estimator of population variance. Revista Colombiana de Estadística, 36(1), 145–152.