Research Article

The generalized weighted Lindley distribution: Properties, estimation, and applications

P.L. Ramos & F. Louzada
Article: 1256022 | Received 27 Jul 2016, Accepted 25 Oct 2016, Published online: 28 Nov 2016

Abstract

In this study, a three-parameter lifetime distribution, the generalized weighted Lindley (GWL) distribution, is proposed. The GWL distribution is a useful generalization of the weighted Lindley distribution that accommodates increasing, decreasing, decreasing-increasing-decreasing, bathtub, and unimodal hazard rates, making it a flexible model for reliability data. A comprehensive account of the mathematical properties of this distribution is presented. Different estimation procedures are discussed: maximum likelihood, method of moments, ordinary and weighted least-squares, percentile, maximum product of spacings, and minimum distance estimators. The estimators are compared by extensive numerical simulations. Finally, two data-sets are analyzed for illustrative purposes, showing that the GWL distribution outperforms several other three-parameter lifetime distributions.

Public Interest Statement

We have proposed and presented a probability distribution called the generalized weighted Lindley (GWL) distribution. This distribution is a useful generalization of the weighted Lindley (WL) distribution that accommodates increasing, decreasing, decreasing-increasing-decreasing, bathtub, and unimodal hazard rates. A comprehensive account of the mathematical properties of this distribution was presented. Different estimation procedures were proposed and compared by extensive numerical simulations. We believe that the new distribution will allow users to describe different data-sets with better predictive performance than other commonly used distributions.

1. Introduction

In recent years, several new extensions of the exponential distribution have been introduced in the literature for describing real problems. Ghitany, Atieh, and Nadarajah (2008) investigated different properties of the Lindley distribution and showed that in many cases the Lindley distribution outperforms the exponential distribution. Since then, many generalizations of the Lindley distribution have been introduced, such as the generalized Lindley (Zakerzadeh & Dolati, 2009), extended Lindley (Bakouch, Al-Zahrani, Al-Shomrani, Marchi, & Louzada, 2012), exponential Poisson Lindley (Barreto-Souza & Bakouch, 2013), and Power Lindley (Ghitany, Al-Mutairi, Balakrishnan, & Al-Enezi, 2013) distributions.

Ghitany, Alqallaf, Al-Mutairi, and Husain (2011) introduced a new class of weighted Lindley (WL) distributions, adding more flexibility to the Lindley distribution. Let $T$ be a random variable with a WL distribution. Then its probability density function (p.d.f.) is given by

$$f(t\mid\lambda,\phi)=\frac{\lambda^{\phi+1}}{(\lambda+\phi)\Gamma(\phi)}\,t^{\phi-1}(1+t)\,e^{-\lambda t},\tag{1}$$

for all $t>0$, $\phi>0$ and $\lambda>0$, where $\Gamma(\phi)=\int_0^\infty e^{-x}x^{\phi-1}\,dx$ is the gamma function. One of its peculiarities is that the hazard function can have an increasing ($\phi\geq 1$) or bathtub ($0<\phi<1$) shape. Different properties and estimation methods for this model were presented by Mazucheli, Louzada, and Ghitany (2013), Ali (2015), Wang and Wang (in press), and Al-Mutairi, Ghitany, and Kundu (2015).

In this study, a new lifetime distribution family is proposed which is a direct generalization of the WL distribution. The p.d.f. is given by

$$f(t\mid\phi,\lambda,\alpha)=\frac{\alpha\lambda^{\alpha\phi}}{(\lambda+\phi)\Gamma(\phi)}\,t^{\alpha\phi-1}\left(\lambda+(\lambda t)^{\alpha}\right)e^{-(\lambda t)^{\alpha}},\tag{2}$$

for all $t>0$, $\phi>0$, $\lambda>0$ and $\alpha>0$. Important probability distributions can be obtained from the GWL distribution, such as the WL distribution ($\alpha=1$), the Power Lindley distribution ($\phi=1$) and the Lindley distribution ($\phi=1$ and $\alpha=1$). Due to this relationship, this model could also be named the weighted power Lindley or generalized power Lindley distribution.
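The special cases listed above can be checked numerically. The following is a minimal sketch, assuming only the densities (1) and (2) as stated; the function names and the evaluation points are illustrative choices, not part of the original work.

```python
import math

def gwl_pdf(t, phi, lam, alpha):
    """Density (2) of the GWL distribution, evaluated for t > 0."""
    z = (lam * t) ** alpha
    log_const = (math.log(alpha) + alpha * phi * math.log(lam)
                 - math.log(lam + phi) - math.lgamma(phi))
    return math.exp(log_const + (alpha * phi - 1.0) * math.log(t) - z) * (lam + z)

def wl_pdf(t, phi, lam):
    """Density (1) of the weighted Lindley distribution."""
    log_const = (phi + 1.0) * math.log(lam) - math.log(lam + phi) - math.lgamma(phi)
    return math.exp(log_const + (phi - 1.0) * math.log(t) - lam * t) * (1.0 + t)

def lindley_pdf(t, lam):
    """Lindley density, i.e. density (1) with phi = 1."""
    return lam ** 2 / (lam + 1.0) * (1.0 + t) * math.exp(-lam * t)

# alpha = 1 should reproduce the WL density; phi = alpha = 1 the Lindley density.
print(gwl_pdf(1.3, 2.0, 0.5, 1.0), wl_pdf(1.3, 2.0, 0.5))
print(gwl_pdf(0.7, 1.0, 1.5, 1.0), lindley_pdf(0.7, 1.5))
```

Working on the log scale for the normalizing constant is a design choice that avoids overflow in $\Gamma(\phi)$ for large $\phi$.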

Torabi, Falahati-Naeini, and Montazeri (2014) discussed a class of distributions with four parameters which is a generalization of the proposed model. That class includes the generalized WL, the generalized gamma (GG), the gamma and the Weibull distributions, among others. The main difference in this study is that the proposed three-parameter distribution has a simple structure with fewer computational issues. In this way, the behavior of the p.d.f. and the hazard function can be studied. The model has different forms of hazard function, such as increasing, decreasing, bathtub, unimodal, or decreasing-increasing-decreasing shapes, making the GWL distribution a flexible model for reliability data. Moreover, a comprehensive account of the mathematical properties of the new distribution is provided.

The inferential procedures for the parameters of the GWL distribution are presented considering different methods: maximum likelihood estimation (MLE), the method of moments (ME), ordinary least-squares estimation (OLSE), weighted least-squares estimation (WLSE), maximum product of spacings (MPS), Cramer-von Mises type minimum distance (CME), Anderson–Darling (ADE) and right-tail Anderson–Darling (RADE). The performance of these estimation procedures is compared using extensive numerical simulations. Finally, two data-sets are analyzed for illustrative purposes, showing that the GWL distribution outperforms several common three-parameter lifetime distributions such as the GG distribution (Stacy, 1962), the generalized Weibull (GW) distribution (Mudholkar, Srivastava, & Kollia, 1996), the generalized exponential-Poisson (GEP) distribution (Barreto-Souza & Cribari-Neto, 2009), and the exponentiated Weibull (EW) distribution (Mudholkar, Srivastava, & Freimer, 1995).

This paper is organized as follows. Section 2 provides a comprehensive account of the mathematical properties of the new distribution. Section 3 presents the eight estimation methods considered. In Section 4, a simulation study is presented in order to identify the most efficient procedure. Section 5 illustrates the proposed methodology on two real data-sets. Section 6 summarizes the present work.

2. Generalized weighted Lindley distribution

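The GWL distribution (2) can be expressed as a two-component mixture

$$f(t\mid\phi,\lambda,\alpha)=p\,f_1(t\mid\phi,\lambda,\alpha)+(1-p)\,f_2(t\mid\phi,\lambda,\alpha),$$

where $p=\lambda/(\lambda+\phi)$ and $T_j\sim\mathrm{GG}(\phi+j-1,\lambda,\alpha)$ for $j=1,2$, i.e. $f_j(t\mid\phi,\lambda,\alpha)$ is the p.d.f. of a GG distribution, given by

$$f_j(t\mid\phi,\lambda,\alpha)=\frac{\alpha}{\Gamma(\phi+j-1)}\,\lambda^{\alpha(\phi+j-1)}\,t^{\alpha(\phi+j-1)-1}\,e^{-(\lambda t)^{\alpha}}.\tag{3}$$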
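The mixture representation suggests a simple way to draw pseudo-random values. The sketch below is one possible approach, not necessarily the generator used by the authors: it relies on the fact that if $T_j$ has density (3), then $(\lambda T_j)^{\alpha}$ follows a gamma distribution with shape $\phi+j-1$ and unit scale. The function names, the seed and the parameter values are illustrative assumptions.

```python
import math
import random

def gwl_sample(n, phi, lam, alpha, rng):
    """Draw n values from GWL(phi, lam, alpha) through the two-component GG mixture."""
    p = lam / (lam + phi)
    out = []
    for _ in range(n):
        # Pick the mixture component: shape phi with probability p, else phi + 1.
        shape = phi if rng.random() < p else phi + 1.0
        y = rng.gammavariate(shape, 1.0)      # Y = (lam * T)^alpha ~ Gamma(shape, 1)
        out.append(y ** (1.0 / alpha) / lam)  # invert the transformation to obtain T
    return out

def gwl_mean(phi, lam, alpha):
    """Theoretical mean of the GWL distribution (first raw moment)."""
    a = 1.0 / alpha + phi
    return (a + lam) * math.gamma(a) / (lam * (lam + phi) * math.gamma(phi))

draws = gwl_sample(20000, 2.0, 0.5, 1.0, random.Random(2015))
sample_mean = sum(draws) / len(draws)
print(sample_mean, gwl_mean(2.0, 0.5, 1.0))
```

Comparing the sample mean with the theoretical mean gives a quick sanity check on the generator.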

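The behavior of the p.d.f. (2) when $t\to 0$ and $t\to\infty$ is, respectively, given by

$$f(0)=\begin{cases}\infty, & \text{if } \alpha\phi<1,\\[2pt] \dfrac{\alpha\lambda^{2}}{(\lambda+\phi)\Gamma(\phi)}, & \text{if } \alpha\phi=1,\\[2pt] 0, & \text{if } \alpha\phi>1,\end{cases}\qquad f(\infty)=0.$$

Figure 1 gives examples of the shapes of the density function for different values of $\phi$, $\lambda$ and $\alpha$.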

Figure 1. Density function shapes for GWL distribution considering different values of ϕ,λ and α.


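The cumulative distribution function of the GWL distribution is given by

$$F(t\mid\phi,\lambda,\alpha)=\frac{\gamma\left[\phi,(\lambda t)^{\alpha}\right](\lambda+\phi)-(\lambda t)^{\alpha\phi}e^{-(\lambda t)^{\alpha}}}{(\lambda+\phi)\Gamma(\phi)},\tag{4}$$

where $\gamma[y,x]=\int_0^x w^{y-1}e^{-w}\,dw$ is the lower incomplete gamma function.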
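As an illustration, Equation (4) can be evaluated with a power-series expansion of the lower incomplete gamma function. This is a minimal sketch under the assumption that $(\lambda t)^{\alpha}$ is moderate (the series converges slowly for very large arguments); the helper names are hypothetical, and a library routine for the incomplete gamma function would normally be preferred.

```python
import math

def reg_lower_gamma(a, x, tol=1e-16, max_terms=100000):
    """Regularized lower incomplete gamma, gamma[a, x] / Gamma(a), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)   # next term: x^n / (a (a+1) ... (a+n))
        total += term
        if term < tol * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def gwl_cdf(t, phi, lam, alpha):
    """Cumulative distribution function (4) of the GWL distribution."""
    if t <= 0.0:
        return 0.0
    y = (lam * t) ** alpha
    # Second term of (4): y^phi e^{-y} / ((lam + phi) Gamma(phi)).
    tail = math.exp(phi * math.log(y) - y - math.lgamma(phi)) / (lam + phi)
    return reg_lower_gamma(phi, y) - tail

# With phi = alpha = 1 the result should match the Lindley c.d.f.
lam, t = 1.5, 0.7
lindley_cdf = 1.0 - (1.0 + lam * t / (lam + 1.0)) * math.exp(-lam * t)
print(gwl_cdf(t, 1.0, lam, 1.0), lindley_cdf)
```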

2.1. Moments

Many important features and properties of a distribution can be obtained through its moments such as mean, variance, kurtosis, and skewness. In this section, important moment functions such as the moment-generating function, r-th moment, r-th central moment, among others are presented.

Theorem 2.1

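For the random variable $T$ with GWL distribution, the moment-generating function is given by

$$M_X(t)=\sum_{r=0}^{\infty}\frac{t^{r}}{\lambda^{r}r!}\,\frac{\left(\frac{r}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{r}{\alpha}+\phi\right)}{(\lambda+\phi)\Gamma(\phi)}.\tag{5}$$

Proof

Note that the moment-generating function of the GG distribution (3) is given by

$$M_{X,j}(t)=\sum_{r=0}^{\infty}\frac{t^{r}}{r!}\,\frac{\Gamma\left(\frac{r}{\alpha}+\phi+j-1\right)}{\lambda^{r}\Gamma(\phi+j-1)}.$$

Since the GWL distribution (2) can be expressed as a two-component mixture, we have

$$\begin{aligned}
M_X(t)&=E\left[e^{tX}\right]=\int_0^{\infty}e^{tx}f(x\mid\phi,\lambda,\alpha)\,dx=p\,M_{X,1}(t)+(1-p)\,M_{X,2}(t)\\
&=\frac{\lambda}{\lambda+\phi}\sum_{r=0}^{\infty}\frac{t^{r}}{r!}\,\frac{\Gamma\left(\frac{r}{\alpha}+\phi\right)}{\lambda^{r}\Gamma(\phi)}+\frac{\phi}{\lambda+\phi}\sum_{r=0}^{\infty}\frac{t^{r}}{r!}\,\frac{\Gamma\left(\frac{r}{\alpha}+\phi+1\right)}{\lambda^{r}\Gamma(\phi+1)}\\
&=\frac{1}{\lambda+\phi}\sum_{r=0}^{\infty}\frac{t^{r}}{r!}\,\frac{\lambda\,\Gamma\left(\frac{r}{\alpha}+\phi\right)}{\lambda^{r}\Gamma(\phi)}+\frac{1}{\lambda+\phi}\sum_{r=0}^{\infty}\frac{t^{r}}{r!}\,\frac{\left(\frac{r}{\alpha}+\phi\right)\Gamma\left(\frac{r}{\alpha}+\phi\right)}{\lambda^{r}\Gamma(\phi)}\\
&=\sum_{r=0}^{\infty}\frac{t^{r}}{\lambda^{r}r!}\,\frac{\left(\frac{r}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{r}{\alpha}+\phi\right)}{(\lambda+\phi)\Gamma(\phi)}.
\end{aligned}$$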

Corollary 2.2

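For the random variable $T$ with GWL distribution, the $r$-th moment is given by

$$\mu_r=E[T^{r}]=\frac{\left(\frac{r}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{r}{\alpha}+\phi\right)}{(\lambda+\phi)\lambda^{r}\Gamma(\phi)}.\tag{6}$$

Proof

Note that $\mu_r=M_X^{(r)}(0)=\left.\dfrac{d^{r}M_X(t)}{dt^{r}}\right|_{t=0}$ and the result follows.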

Corollary 2.3

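For the random variable $T$ with GWL distribution, the $r$-th central moment is given by

$$M_r=E\left[(T-\mu)^{r}\right]=\sum_{i=0}^{r}\binom{r}{i}(-\mu)^{r-i}E[T^{i}]=\sum_{i=0}^{r}\binom{r}{i}\left(-\frac{\left(\frac{1}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{1}{\alpha}+\phi\right)}{\lambda(\lambda+\phi)\Gamma(\phi)}\right)^{r-i}\frac{\left(\frac{i}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{i}{\alpha}+\phi\right)}{(\lambda+\phi)\lambda^{i}\Gamma(\phi)}.\tag{7}$$

Corollary 2.4

A random variable $T$ with GWL distribution has mean and variance given, respectively, by

$$\mu=\frac{\left(\frac{1}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{1}{\alpha}+\phi\right)}{\lambda(\lambda+\phi)\Gamma(\phi)},\tag{8}$$

$$\sigma^{2}=\frac{(\lambda+\phi)\Gamma(\phi)\left(\frac{2}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{2}{\alpha}+\phi\right)-\left(\frac{1}{\alpha}+\phi+\lambda\right)^{2}\Gamma\left(\frac{1}{\alpha}+\phi\right)^{2}}{\lambda^{2}(\lambda+\phi)^{2}\Gamma(\phi)^{2}}.\tag{9}$$

Proof

From (6) with $r=1$ we have $\mu_1=\mu$. The second result follows from (7) with $r=2$, i.e. $\sigma^{2}=\mu_2-\mu^{2}$, after some algebraic manipulation.

Another moment function that is easily obtained for the GWL distribution, and that plays an important role in information theory, is given by

$$E[\log T]=\frac{\psi(\phi)-\alpha\log\lambda+(\lambda+\phi)^{-1}}{\alpha}.\tag{10}$$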
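The moment formulas lend themselves to direct computation. The sketch below evaluates the raw moments (6) and derives the mean and variance from them; the function names are illustrative, and the check against the known Lindley mean and variance ($\phi=\alpha=1$) is included only as a sanity test.

```python
import math

def gwl_moment(r, phi, lam, alpha):
    """r-th raw moment, Equation (6), computed on the log scale for stability."""
    log_val = (math.lgamma(r / alpha + phi) - math.lgamma(phi)
               - r * math.log(lam) - math.log(lam + phi))
    return (r / alpha + phi + lam) * math.exp(log_val)

def gwl_mean_var(phi, lam, alpha):
    """Mean and variance obtained from the first two raw moments."""
    m1 = gwl_moment(1, phi, lam, alpha)
    m2 = gwl_moment(2, phi, lam, alpha)
    return m1, m2 - m1 ** 2

# Lindley special case: mean (lam + 2) / (lam (lam + 1)),
# variance (lam^2 + 4 lam + 2) / (lam^2 (lam + 1)^2).
lam = 1.5
mean, var = gwl_mean_var(1.0, lam, 1.0)
print(mean, (lam + 2.0) / (lam * (lam + 1.0)))
print(var, (lam ** 2 + 4.0 * lam + 2.0) / (lam ** 2 * (lam + 1.0) ** 2))
```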

2.2. Survival properties

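In this section, we present the survival, hazard, and mean residual life (MRL) functions of the GWL distribution. The survival function of $T$ is given by

$$S(t\mid\phi,\lambda,\alpha)=\frac{\Gamma\left(\phi,(\lambda t)^{\alpha}\right)(\lambda+\phi)+(\lambda t)^{\alpha\phi}e^{-(\lambda t)^{\alpha}}}{(\lambda+\phi)\Gamma(\phi)},\tag{11}$$

where $\Gamma(x,y)=\int_y^{\infty}w^{x-1}e^{-w}\,dw$ is the upper incomplete gamma function. The hazard function is given by

$$h(t\mid\phi,\lambda,\alpha)=\frac{f(t\mid\phi,\lambda,\alpha)}{S(t\mid\phi,\lambda,\alpha)}=\frac{\alpha\lambda^{\alpha\phi}t^{\alpha\phi-1}\left(\lambda+(\lambda t)^{\alpha}\right)e^{-(\lambda t)^{\alpha}}}{\Gamma\left(\phi,(\lambda t)^{\alpha}\right)(\lambda+\phi)+(\lambda t)^{\alpha\phi}e^{-(\lambda t)^{\alpha}}}.\tag{12}$$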

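The behavior of the hazard function (12) when $t\to 0$ and $t\to\infty$ is, respectively, given by

$$h(0)=\begin{cases}\infty, & \text{if } \alpha\phi<1,\\[2pt] \dfrac{\alpha\lambda^{2}}{(\lambda+\phi)\Gamma(\phi)}, & \text{if } \alpha\phi=1,\\[2pt] 0, & \text{if } \alpha\phi>1,\end{cases}\qquad\text{and}\qquad h(\infty)=\begin{cases}0, & \text{if } \alpha<1,\\ \lambda, & \text{if } \alpha=1,\\ \infty, & \text{if } \alpha>1.\end{cases}$$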

Theorem 2.5

The hazard rate function h(t) of the GWL distribution is increasing, decreasing, bathtub, unimodal, or decreasing-increasing-decreasing shaped.

Proof

The theorem proposed by Glaser (1980) is not easily applied to the GWL distribution. Since the hazard rate function (12) is complex, we consider the following cases:

(1) Let $\alpha=1$; then the GWL distribution reduces to the WL distribution. Ghitany et al. (2011) proved that the hazard function is bathtub-shaped (increasing) if $0<\phi<1$ ($\phi\geq 1$), for all $\lambda>0$.

(2) Let $\phi=1$; then the GWL distribution reduces to the Power Lindley (PL) distribution. Considering $\beta=\lambda^{\alpha}$, Ghitany et al. (2013) proved that the hazard function is

  • increasing when $\alpha\geq 1$, $\beta>0$;

  • decreasing when $0<\alpha\leq\frac{1}{2}$, $\beta>0$, or $\frac{1}{2}<\alpha<1$, $\beta\geq(2\alpha-1)^{2}(4\alpha(1-\alpha))^{-1}$;

  • decreasing-increasing-decreasing when $\frac{1}{2}<\alpha<1$, $0<\beta<(2\alpha-1)^{2}(4\alpha(1-\alpha))^{-1}$.

(3) Let $\alpha=2$ and $\lambda=1$; from Glaser’s theorem (Glaser, 1980), the hazard rate function is decreasing (unimodal) for $0<\phi<1$ ($\phi>1$).

These properties make the GWL distribution a flexible model for reliability data. Figure 2 gives examples of the shapes of the hazard function for different values of $\phi$, $\lambda$ and $\alpha$.
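The bathtub case can be illustrated numerically. The following sketch evaluates the hazard function (12) on a grid for $\alpha=1$, $\phi=0.5$, $\lambda=1$ and locates its minimum. The grid, the parameter values and the helper names are assumptions made for illustration, and the incomplete gamma function is computed by a power series that is adequate only for moderate arguments.

```python
import math

def reg_lower_gamma(a, x, tol=1e-16, max_terms=100000):
    """Regularized lower incomplete gamma function by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def gwl_hazard(t, phi, lam, alpha):
    """Hazard function (12): density (2) divided by survival function (11)."""
    y = (lam * t) ** alpha
    kernel = math.exp(phi * math.log(y) - y - math.lgamma(phi))  # y^phi e^{-y} / Gamma(phi)
    pdf = alpha * (lam + y) * kernel / (t * (lam + phi))
    surv = (1.0 - reg_lower_gamma(phi, y)) + kernel / (lam + phi)
    return pdf / surv

grid = [0.05 * k for k in range(1, 201)]          # t from 0.05 to 10
hz = [gwl_hazard(t, 0.5, 1.0, 1.0) for t in grid]
i_min = min(range(len(hz)), key=hz.__getitem__)
# A bathtub shape shows up as an interior minimum below both end values.
print(grid[i_min], hz[i_min], hz[0], hz[-1])
```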

Figure 2. Hazard function shapes for GWL distribution and considering different values of ϕ,λ and α.


The MRL has been widely used in survival analysis and represents the expected additional lifetime given that a component has survived until time t.

Proposition 2.6

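The MRL function $r(t\mid\phi,\lambda,\alpha)$ of the GWL distribution is given by

$$r(t\mid\phi,\lambda,\alpha)=\frac{\left(\phi+\frac{1}{\alpha}+\lambda\right)\Gamma\left(\phi+\frac{1}{\alpha},(\lambda t)^{\alpha}\right)-\lambda t(\lambda+\phi)\Gamma\left(\phi,(\lambda t)^{\alpha}\right)}{\lambda\left[(\lambda+\phi)\Gamma\left(\phi,(\lambda t)^{\alpha}\right)+(\lambda t)^{\alpha\phi}e^{-(\lambda t)^{\alpha}}\right]}.\tag{13}$$

Proof

Note that

$$\begin{aligned}
r(t\mid\phi,\lambda,\alpha)&=\frac{1}{S(t)}\int_t^{\infty}y\,f(y\mid\phi,\lambda,\alpha)\,dy-t\\
&=\frac{1}{S(t)}\left[p\int_t^{\infty}y\,f_1(y\mid\phi,\lambda,\alpha)\,dy+(1-p)\int_t^{\infty}y\,f_2(y\mid\phi,\lambda,\alpha)\,dy\right]-t\\
&=\frac{\left(\phi+\frac{1}{\alpha}+\lambda\right)\Gamma\left(\phi+\frac{1}{\alpha},(\lambda t)^{\alpha}\right)-\lambda t(\lambda+\phi)\Gamma\left(\phi,(\lambda t)^{\alpha}\right)}{\lambda\left[(\lambda+\phi)\Gamma\left(\phi,(\lambda t)^{\alpha}\right)+(\lambda t)^{\alpha\phi}e^{-(\lambda t)^{\alpha}}\right]}.
\end{aligned}$$

The behavior of the MRL function when $t\to 0$ and $t\to\infty$ is, respectively, given by

$$r(0)=\frac{\left(\frac{1}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{1}{\alpha}+\phi\right)}{\lambda(\lambda+\phi)\Gamma(\phi)}=\mu\qquad\text{and}\qquad r(\infty)=\begin{cases}\infty, & \text{if } \alpha<1,\\[2pt] \dfrac{1}{\lambda}, & \text{if } \alpha=1,\\[2pt] 0, & \text{if } \alpha>1.\end{cases}$$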

2.3. Entropy

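In information theory, entropy plays a central role as a measure of the uncertainty associated with a random variable. Shannon’s entropy is one of the most important metrics in information theory. For the GWL distribution, Shannon’s entropy can be obtained by solving

$$H_S(\phi,\lambda,\alpha)=-\int_0^{\infty}\log\left(\frac{\alpha\lambda^{\alpha\phi}t^{\alpha\phi-1}\left(\lambda+(\lambda t)^{\alpha}\right)e^{-(\lambda t)^{\alpha}}}{(\lambda+\phi)\Gamma(\phi)}\right)f(t\mid\phi,\lambda,\alpha)\,dt.\tag{14}$$

Proposition 2.7

A random variable $T$ with GWL distribution has Shannon’s entropy given by

$$H_S(\phi,\lambda,\alpha)=\log(\lambda+\phi)+\log\Gamma(\phi)-\log\alpha-\log\lambda+\frac{\phi(1+\phi+\lambda)}{\lambda+\phi}-\frac{\psi(\phi)(\alpha\phi-1)}{\alpha}-\frac{\alpha\phi-1}{\alpha(\lambda+\phi)}-\frac{\eta(\phi,\lambda)}{(\lambda+\phi)\Gamma(\phi)},\tag{15}$$

where

$$\eta(\phi,\lambda)=\int_0^{\infty}(\lambda+y)\log(\lambda+y)\,y^{\phi-1}e^{-y}\,dy=\int_0^{1}(\lambda-\log u)\log(\lambda-\log u)(-\log u)^{\phi-1}\,du.$$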

Proof

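From Equation (14), we have

$$H_S(\phi,\lambda,\alpha)=-\log\alpha-\alpha\phi\log\lambda+\log(\lambda+\phi)+\log\Gamma(\phi)+\lambda^{\alpha}E[T^{\alpha}]-(\alpha\phi-1)E[\log T]-E\left[\log\left(\lambda+(\lambda T)^{\alpha}\right)\right].\tag{16}$$

Note that

$$E\left[\log\left(\lambda+(\lambda T)^{\alpha}\right)\right]=\int_0^{\infty}\log\left(\lambda+(\lambda t)^{\alpha}\right)\frac{\alpha\lambda^{\alpha\phi}t^{\alpha\phi-1}\left(\lambda+(\lambda t)^{\alpha}\right)e^{-(\lambda t)^{\alpha}}}{(\lambda+\phi)\Gamma(\phi)}\,dt,$$

and, using the change of variable $y=(\lambda t)^{\alpha}$ and after some algebra,

$$E\left[\log\left(\lambda+(\lambda T)^{\alpha}\right)\right]=\frac{1}{(\lambda+\phi)\Gamma(\phi)}\int_0^{\infty}(\lambda+y)\log(\lambda+y)\,y^{\phi-1}e^{-y}\,dy=\frac{\eta(\phi,\lambda)}{(\lambda+\phi)\Gamma(\phi)}.$$

From Equations (6) and (10), we can easily obtain $E[T^{\alpha}]$ and $E[\log T]$, and the result follows.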

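Another popular entropy measure was proposed by Renyi (1961). Some recent applications of the Renyi entropy can be seen in Popescu and Aiordachioaie (2013). If $T$ has the probability density function (2), then the Renyi entropy is defined by

$$H_R(\rho)=\frac{1}{1-\rho}\log\int_0^{\infty}f^{\rho}(x)\,dx.\tag{17}$$

Proposition 2.8

A random variable $T$ with GWL distribution has the Renyi entropy given by

$$H_R(\rho)=\frac{(\rho-1)(\log\alpha+\log\lambda)-\rho\left(\log(\lambda+\phi)+\log\Gamma(\phi)\right)+\log\delta(\rho,\phi,\lambda,\alpha)}{1-\rho},\tag{18}$$

where $\delta(\rho,\phi,\lambda,\alpha)=\int_0^{\infty}y^{(\alpha\rho\phi-\rho+1-\alpha)/\alpha}(\lambda+y)^{\rho}e^{-\rho y}\,dy$.

Proof

Using the change of variable $y=(\lambda t)^{\alpha}$, the Renyi entropy is given by

$$\begin{aligned}
H_R(\rho)&=\frac{1}{1-\rho}\log\left(\frac{\alpha^{\rho}\lambda^{\rho}}{(\lambda+\phi)^{\rho}\Gamma(\phi)^{\rho}}\int_0^{\infty}(\lambda t)^{\rho(\alpha\phi-1)}\left(\lambda+(\lambda t)^{\alpha}\right)^{\rho}e^{-\rho(\lambda t)^{\alpha}}\,dt\right)\\
&=\frac{1}{1-\rho}\log\left(\frac{\alpha^{\rho-1}\lambda^{\rho-1}}{(\lambda+\phi)^{\rho}\Gamma(\phi)^{\rho}}\int_0^{\infty}y^{(\alpha\rho\phi-\rho+1-\alpha)/\alpha}(\lambda+y)^{\rho}e^{-\rho y}\,dy\right)\\
&=\frac{1}{1-\rho}\log\left(\frac{\alpha^{\rho-1}\lambda^{\rho-1}}{(\lambda+\phi)^{\rho}\Gamma(\phi)^{\rho}}\,\delta(\rho,\phi,\lambda,\alpha)\right)
\end{aligned}$$

and with some algebra the proof is completed.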

2.4. Lorenz curves

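The Lorenz curve (Bonferroni, 1930) is a well-known measure used in reliability, income inequality, life testing and renewal theory. The Lorenz curve for a non-negative random variable $T$ is given through the consecutive plot of

$$L_F(t)=\frac{\int_0^{t}x f(x)\,dx}{\int_0^{\infty}x f(x)\,dx}=\frac{1}{\mu}\int_0^{t}x f(x)\,dx.$$

Proposition 2.9

The Lorenz curve for the GWL distribution is

$$L(p)=\frac{\left(\frac{1}{\alpha}+\phi+\lambda\right)\gamma\left[\phi+\frac{1}{\alpha},\left(\lambda F^{-1}(p)\right)^{\alpha}\right]-\left(\lambda F^{-1}(p)\right)^{\alpha\phi+1}e^{-\left(\lambda F^{-1}(p)\right)^{\alpha}}}{\left(\frac{1}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{1}{\alpha}+\phi\right)},$$

where $F^{-1}(p)=t_p$.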

3. Methods of estimation

In this section, we present eight different estimation methods for the parameters ϕ,λ and α of the GWL distribution.

3.1. Maximum likelihood estimation

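The maximum likelihood method has been widely used due to its good asymptotic properties. The estimates are obtained by maximizing the likelihood function. Let $T_1,\ldots,T_n$ be a random sample where $T\sim\mathrm{GWL}(\phi,\lambda,\alpha)$; the likelihood function is given by

$$L(\phi,\lambda,\alpha;t)=\frac{\alpha^{n}\lambda^{n\alpha\phi}}{\left((\lambda+\phi)\Gamma(\phi)\right)^{n}}\left\{\prod_{i=1}^{n}t_i^{\alpha\phi-1}\right\}\left\{\prod_{i=1}^{n}\left(\lambda+(\lambda t_i)^{\alpha}\right)\right\}\exp\left\{-\lambda^{\alpha}\sum_{i=1}^{n}t_i^{\alpha}\right\}.\tag{19}$$

The log-likelihood function $l(\phi,\lambda,\alpha;t)=\log L(\phi,\lambda,\alpha;t)$ is given by

$$l(\phi,\lambda,\alpha;t)=n\log\alpha+n\alpha\phi\log\lambda-n\log(\lambda+\phi)-n\log\Gamma(\phi)+(\alpha\phi-1)\sum_{i=1}^{n}\log(t_i)+\sum_{i=1}^{n}\log\left(\lambda+(\lambda t_i)^{\alpha}\right)-\lambda^{\alpha}\sum_{i=1}^{n}t_i^{\alpha}.\tag{20}$$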
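For readers who want to experiment, the log-likelihood (20) is straightforward to code and can then be passed to any general-purpose optimizer. The sketch below only evaluates the function and checks it against the sum of log-densities; the data values are made up for illustration and the function names are hypothetical.

```python
import math

def gwl_loglik(data, phi, lam, alpha):
    """Log-likelihood (20) of the GWL distribution for a sample of positive values."""
    n = len(data)
    sum_log_t = sum(math.log(t) for t in data)
    sum_log_mix = sum(math.log(lam + (lam * t) ** alpha) for t in data)
    sum_pow = sum(t ** alpha for t in data)
    return (n * math.log(alpha) + n * alpha * phi * math.log(lam)
            - n * math.log(lam + phi) - n * math.lgamma(phi)
            + (alpha * phi - 1.0) * sum_log_t + sum_log_mix
            - lam ** alpha * sum_pow)

def gwl_logpdf(t, phi, lam, alpha):
    """Logarithm of the density (2), used here only as a cross-check."""
    z = (lam * t) ** alpha
    return (math.log(alpha) + alpha * phi * math.log(lam) - math.log(lam + phi)
            - math.lgamma(phi) + (alpha * phi - 1.0) * math.log(t)
            + math.log(lam + z) - z)

toy_data = [0.5, 1.2, 3.4, 0.9, 2.1]   # made-up values, not from the paper
ll = gwl_loglik(toy_data, 2.0, 0.5, 1.3)
ll_check = sum(gwl_logpdf(t, 2.0, 0.5, 1.3) for t in toy_data)
print(ll, ll_check)
```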

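From the expressions $\frac{\partial}{\partial\phi}l(\phi,\lambda,\alpha;t)=0$, $\frac{\partial}{\partial\lambda}l(\phi,\lambda,\alpha;t)=0$ and $\frac{\partial}{\partial\alpha}l(\phi,\lambda,\alpha;t)=0$, the likelihood equations are

$$n\hat\alpha\log(\hat\lambda)+\hat\alpha\sum_{i=1}^{n}\log(t_i)=\frac{n}{\hat\lambda+\hat\phi}+n\psi(\hat\phi),\tag{21}$$

$$\frac{n\hat\alpha\hat\phi}{\hat\lambda}+\sum_{i=1}^{n}\frac{1+\hat\alpha\hat\lambda^{\hat\alpha-1}t_i^{\hat\alpha}}{\hat\lambda+(\hat\lambda t_i)^{\hat\alpha}}=\hat\alpha\hat\lambda^{\hat\alpha-1}\sum_{i=1}^{n}t_i^{\hat\alpha}+\frac{n}{\hat\lambda+\hat\phi}\tag{22}$$

and

$$\frac{n}{\hat\alpha}+n\hat\phi\log(\hat\lambda)+\hat\phi\sum_{i=1}^{n}\log(t_i)+\sum_{i=1}^{n}\frac{(\hat\lambda t_i)^{\hat\alpha}\log(\hat\lambda t_i)}{\hat\lambda+(\hat\lambda t_i)^{\hat\alpha}}=\hat\lambda^{\hat\alpha}\sum_{i=1}^{n}t_i^{\hat\alpha}\log(\hat\lambda t_i),\tag{23}$$

where $\psi(k)=\frac{\partial}{\partial k}\log\Gamma(k)=\frac{\Gamma'(k)}{\Gamma(k)}$. Numerical methods such as Newton-Raphson are required to solve this nonlinear system. Note that from (21) and (23), after some algebra, we have

$$\hat\alpha_{MLE}=\frac{1}{n\log(\hat\lambda)+\sum_{i=1}^{n}\log(t_i)}\left(\frac{n}{\hat\lambda+\hat\phi}+n\psi(\hat\phi)\right),\tag{24}$$

$$\hat\phi_{MLE}=\frac{\hat\lambda^{\hat\alpha}\sum_{i=1}^{n}t_i^{\hat\alpha}\log(\hat\lambda t_i)-\sum_{i=1}^{n}\dfrac{(\hat\lambda t_i)^{\hat\alpha}\log(\hat\lambda t_i)}{\hat\lambda+(\hat\lambda t_i)^{\hat\alpha}}-\dfrac{n}{\hat\alpha}}{n\log(\hat\lambda)+\sum_{i=1}^{n}\log(t_i)}.\tag{25}$$

Under mild conditions, the maximum likelihood estimates (MLEs) are asymptotically normally distributed with a joint multivariate normal distribution given by

$$(\hat\phi_{MLE},\hat\lambda_{MLE},\hat\alpha_{MLE})\sim N_3\left[(\phi,\lambda,\alpha),I^{-1}(\phi,\lambda,\alpha)\right]\quad\text{as } n\to\infty,$$

where $I(\phi,\lambda,\alpha)$ is the Fisher information matrix, given by

$$I(\phi,\lambda,\alpha)=\begin{pmatrix}I_{\phi,\phi}(\phi,\lambda,\alpha) & I_{\phi,\lambda}(\phi,\lambda,\alpha) & I_{\phi,\alpha}(\phi,\lambda,\alpha)\\ I_{\phi,\lambda}(\phi,\lambda,\alpha) & I_{\lambda,\lambda}(\phi,\lambda,\alpha) & I_{\lambda,\alpha}(\phi,\lambda,\alpha)\\ I_{\phi,\alpha}(\phi,\lambda,\alpha) & I_{\lambda,\alpha}(\phi,\lambda,\alpha) & I_{\alpha,\alpha}(\phi,\lambda,\alpha)\end{pmatrix},\tag{26}$$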

and the elements of the matrix are given in Appendix 2.

3.2. Moments estimators

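The method of moments is one of the oldest methods used for estimating parameters in statistical models. The moments estimators (MEs) of the GWL distribution can be obtained by equating the first three sample moments $\bar{x}=\frac{1}{n}\sum_{i=1}^{n}t_i$, $\frac{1}{n}\sum_{i=1}^{n}t_i^{2}$ and $\frac{1}{n}\sum_{i=1}^{n}t_i^{3}$ with the theoretical moments

$$\frac{1}{n}\sum_{i=1}^{n}t_i=\frac{\left(\frac{1}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{1}{\alpha}+\phi\right)}{(\lambda+\phi)\lambda\Gamma(\phi)},\quad \frac{1}{n}\sum_{i=1}^{n}t_i^{2}=\frac{\left(\frac{2}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{2}{\alpha}+\phi\right)}{(\lambda+\phi)\lambda^{2}\Gamma(\phi)}\quad\text{and}\quad \frac{1}{n}\sum_{i=1}^{n}t_i^{3}=\frac{\left(\frac{3}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{3}{\alpha}+\phi\right)}{(\lambda+\phi)\lambda^{3}\Gamma(\phi)}.$$

Therefore, the MEs $\hat\phi_{ME}$, $\hat\lambda_{ME}$ and $\hat\alpha_{ME}$ can be obtained by solving the non-linear equations

$$\frac{\left(\frac{j}{\alpha}+\phi+\lambda\right)\Gamma\left(\frac{j}{\alpha}+\phi\right)}{(\lambda+\phi)\lambda^{j}\Gamma(\phi)}-\frac{1}{n}\sum_{i=1}^{n}t_i^{j}=0,\quad j=1,2,3.$$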

3.3. Ordinary and weighted least-square estimate

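Let $t_{(1)},t_{(2)},\ldots,t_{(n)}$ be the order statistics (the same notation is assumed in the next subsections) of a random sample of size $n$ from $F(t\mid\phi,\lambda,\alpha)$. The least-squares estimators $\hat\phi_{LSE}$, $\hat\lambda_{LSE}$ and $\hat\alpha_{LSE}$ can be obtained by minimizing

$$V(\phi,\lambda,\alpha)=\sum_{i=1}^{n}\left[F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-\frac{i}{n+1}\right]^{2}$$

with respect to $\phi$, $\lambda$ and $\alpha$. Equivalently, the estimates can be obtained by solving the non-linear equations

$$\sum_{i=1}^{n}\left[F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-\frac{i}{n+1}\right]\Delta_j\left(t_{(i)}\mid\phi,\lambda,\alpha\right)=0,\quad j=1,2,3,$$

where

$$\Delta_1\left(t_{(i)}\mid\phi,\lambda,\alpha\right)=\frac{\partial}{\partial\phi}F\left(t_{(i)}\mid\phi,\lambda,\alpha\right),\quad \Delta_2\left(t_{(i)}\mid\phi,\lambda,\alpha\right)=\frac{\partial}{\partial\lambda}F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)\quad\text{and}\quad \Delta_3\left(t_{(i)}\mid\phi,\lambda,\alpha\right)=\frac{\partial}{\partial\alpha}F\left(t_{(i)}\mid\phi,\lambda,\alpha\right).\tag{27}$$

Note that computing $\Delta_j$ for $j=1,2,3$ involves partial derivatives of the lower incomplete gamma function. However, these can be obtained numerically with high precision.

The weighted least-squares estimates (WLSEs), $\hat\phi_{WLSE}$, $\hat\lambda_{WLSE}$ and $\hat\alpha_{WLSE}$, can be obtained by minimizing

$$W(\phi,\lambda,\alpha)=\sum_{i=1}^{n}\frac{(n+1)^{2}(n+2)}{i(n-i+1)}\left[F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-\frac{i}{n+1}\right]^{2}.$$

These estimates can also be obtained by solving the non-linear equations

$$\sum_{i=1}^{n}\frac{(n+1)^{2}(n+2)}{i(n-i+1)}\left[F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-\frac{i}{n+1}\right]\Delta_j\left(t_{(i)}\mid\phi,\lambda,\alpha\right)=0,\quad j=1,2,3,$$

where $\Delta_1(\cdot\mid\phi,\lambda,\alpha)$, $\Delta_2(\cdot\mid\phi,\lambda,\alpha)$ and $\Delta_3(\cdot\mid\phi,\lambda,\alpha)$ are given in (27).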
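The two objective functions $V$ and $W$ depend on the data only through the fitted c.d.f. values at the order statistics. The sketch below therefore takes a list `u` with `u[i-1]` standing for $F(t_{(i)}\mid\phi,\lambda,\alpha)$, already evaluated by the reader's own c.d.f. routine; this interface and the function names are assumptions made for illustration.

```python
def lse_objective(u):
    """Ordinary least-squares objective V, given sorted fitted c.d.f. values u."""
    n = len(u)
    return sum((u[i - 1] - i / (n + 1)) ** 2 for i in range(1, n + 1))

def wlse_objective(u):
    """Weighted least-squares objective W, with weights (n+1)^2 (n+2) / (i (n-i+1))."""
    n = len(u)
    return sum((n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
               * (u[i - 1] - i / (n + 1)) ** 2
               for i in range(1, n + 1))

# Both objectives vanish when the fitted values sit exactly at i / (n + 1).
perfect = [i / 5 for i in range(1, 5)]
print(lse_objective(perfect), wlse_objective(perfect))
print(lse_objective([0.5, 0.5]), wlse_objective([0.5, 0.5]))
```

In practice either function would be minimized over $(\phi,\lambda,\alpha)$ by a numerical optimizer, recomputing `u` at each trial parameter value.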

3.4. Method of maximum product of spacings

The MPS method is a powerful alternative to MLE for the estimation of unknown parameters of continuous univariate distributions. Proposed by Cheng and Amin (1979, 1983), this method was also independently developed by Ranneby (1984) as an approximation to the Kullback–Leibler information measure. Cheng and Amin (1983) proved desirable properties of the MPS estimators such as asymptotic efficiency and invariance and, more importantly, that their consistency holds under more general conditions than for MLEs.

Let $D_i(\phi,\lambda,\alpha)=F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-F\left(t_{(i-1)}\mid\phi,\lambda,\alpha\right)$, for $i=1,2,\ldots,n+1$, be the uniform spacings of a random sample from the GWL distribution, where $F(t_{(0)}\mid\phi,\lambda,\alpha)=0$ and $F(t_{(n+1)}\mid\phi,\lambda,\alpha)=1$. Clearly $\sum_{i=1}^{n+1}D_i(\phi,\lambda,\alpha)=1$. The MPS estimates $\hat\phi_{MPS}$, $\hat\lambda_{MPS}$ and $\hat\alpha_{MPS}$ are obtained by maximizing the geometric mean of the spacings

$$G(\phi,\lambda,\alpha)=\left[\prod_{i=1}^{n+1}D_i(\phi,\lambda,\alpha)\right]^{\frac{1}{n+1}}\tag{28}$$

with respect to $\phi$, $\lambda$ and $\alpha$, or, equivalently, by maximizing the logarithm of the geometric mean of the sample spacings

$$H(\phi,\lambda,\alpha)=\frac{1}{n+1}\sum_{i=1}^{n+1}\log D_i(\phi,\lambda,\alpha).\tag{29}$$

The estimates $\hat\phi_{MPS}$, $\hat\lambda_{MPS}$ and $\hat\alpha_{MPS}$ of the parameters $\phi$, $\lambda$ and $\alpha$ can be obtained by solving the nonlinear equations

$$\frac{1}{n+1}\sum_{i=1}^{n+1}\frac{1}{D_i(\phi,\lambda,\alpha)}\left[\Delta_j\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-\Delta_j\left(t_{(i-1)}\mid\phi,\lambda,\alpha\right)\right]=0,\quad j=1,2,3,\tag{30}$$

where $\Delta_1(\cdot\mid\phi,\lambda,\alpha)$, $\Delta_2(\cdot\mid\phi,\lambda,\alpha)$ and $\Delta_3(\cdot\mid\phi,\lambda,\alpha)$ are given in (27). Note that if $t_{(i+k)}=t_{(i+k-1)}=\cdots=t_{(i)}$ then $D_{i+k}(\phi,\lambda,\alpha)=D_{i+k-1}(\phi,\lambda,\alpha)=\cdots=D_{i+1}(\phi,\lambda,\alpha)=0$. Therefore, the MPS estimators are sensitive to closely spaced observations, especially ties. When the ties are due to multiple observations, a zero spacing $D_i(\phi,\lambda,\alpha)$ should be replaced by the corresponding likelihood $f(t_{(i)}\mid\phi,\lambda,\alpha)$, since $t_{(i)}=t_{(i-1)}$.

Under mild conditions for the GWL distribution, the MPS estimators are asymptotically normally distributed with a joint trivariate normal distribution given by

$$(\hat\phi_{MPS},\hat\lambda_{MPS},\hat\alpha_{MPS})\sim N_3\left[(\phi,\lambda,\alpha),I^{-1}(\phi,\lambda,\alpha)\right]\quad\text{as } n\to\infty.$$
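The MPS objective (29) can be sketched in a few lines once the fitted c.d.f. values are available. As an assumption of this illustration, the function receives the sorted values `u[i-1]` $=F(t_{(i)}\mid\phi,\lambda,\alpha)$ and returns minus infinity when a spacing is zero; it does not implement the replacement of tied spacings by the density described above.

```python
import math

def mps_objective(u):
    """Log geometric mean of spacings, Equation (29), from sorted c.d.f. values u."""
    edges = [0.0] + list(u) + [1.0]          # F(t_(0)) = 0 and F(t_(n+1)) = 1
    spacings = [b - a for a, b in zip(edges, edges[1:])]
    if min(spacings) <= 0.0:
        return float("-inf")                 # ties (or unsorted input) give a zero spacing
    return sum(math.log(d) for d in spacings) / len(spacings)

# Equal spacings maximize the criterion: here H = log(1/3).
print(mps_objective([1.0 / 3.0, 2.0 / 3.0]), -math.log(3.0))
print(mps_objective([0.2, 0.9]))
print(mps_objective([0.5, 0.5]))
```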

3.5. The Cramer-von Mises minimum distance estimators

The Cramer-von Mises estimator is a type of minimum distance estimator (also called a maximum goodness-of-fit estimator) and is based on the difference between the estimate of the cumulative distribution function and the empirical distribution function (Luceño, 2006).

Macdonald (1971) motivated the choice of the CME estimators by providing empirical evidence that the bias of the estimator is smaller than that of the other minimum distance estimators. The Cramer-von Mises estimates $\hat\phi_{CME}$, $\hat\lambda_{CME}$ and $\hat\alpha_{CME}$ of the parameters $\phi$, $\lambda$ and $\alpha$ are obtained by minimizing

$$C(\phi,\lambda,\alpha)=\frac{1}{12n}+\sum_{i=1}^{n}\left(F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-\frac{2i-1}{2n}\right)^{2},\tag{31}$$

with respect to $\phi$, $\lambda$ and $\alpha$. These estimates can also be obtained by solving the nonlinear equations

$$\sum_{i=1}^{n}\left(F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)-\frac{2i-1}{2n}\right)\Delta_j\left(t_{(i)}\mid\phi,\lambda,\alpha\right)=0,\quad j=1,2,3,$$

where $\Delta_1(\cdot\mid\phi,\lambda,\alpha)$, $\Delta_2(\cdot\mid\phi,\lambda,\alpha)$ and $\Delta_3(\cdot\mid\phi,\lambda,\alpha)$ are given in (27).
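A short sketch of the criterion (31) follows, again written in terms of the sorted fitted c.d.f. values `u[i-1]` $=F(t_{(i)}\mid\phi,\lambda,\alpha)$; this interface is an assumption made for illustration rather than the authors' implementation.

```python
def cme_objective(u):
    """Cramer-von Mises criterion (31) from sorted fitted c.d.f. values u."""
    n = len(u)
    return 1.0 / (12 * n) + sum((u[i - 1] - (2 * i - 1) / (2 * n)) ** 2
                                for i in range(1, n + 1))

# The minimum possible value, 1 / (12 n), is attained at u_i = (2i - 1) / (2n).
print(cme_objective([0.25, 0.75]), 1.0 / 24.0)
print(cme_objective([0.5, 0.5]))
```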

3.6. The Anderson–Darling and Right-tail Anderson–Darling estimators

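Another type of minimum distance estimator is based on the Anderson–Darling statistic and is known as the ADE estimator. The ADE estimates $\hat\phi_{ADE}$, $\hat\lambda_{ADE}$ and $\hat\alpha_{ADE}$ of the parameters $\phi$, $\lambda$ and $\alpha$ are obtained by minimizing, with respect to $\phi$, $\lambda$ and $\alpha$, the function

$$A(\phi,\lambda,\alpha)=-n-\frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\log F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)+\log S\left(t_{(n+1-i)}\mid\phi,\lambda,\alpha\right)\right].\tag{32}$$

These estimates can also be obtained by solving the nonlinear equations

$$\sum_{i=1}^{n}(2i-1)\left[\frac{\Delta_j\left(t_{(i)}\mid\phi,\lambda,\alpha\right)}{F\left(t_{(i)}\mid\phi,\lambda,\alpha\right)}-\frac{\Delta_j\left(t_{(n+1-i)}\mid\phi,\lambda,\alpha\right)}{S\left(t_{(n+1-i)}\mid\phi,\lambda,\alpha\right)}\right]=0,\quad j=1,2,3.$$

The right-tail ADE (RADE) estimates $\hat\phi_{RADE}$, $\hat\lambda_{RADE}$ and $\hat\alpha_{RADE}$ of the parameters $\phi$, $\lambda$ and $\alpha$ are obtained by minimizing the function

$$R(\phi,\lambda,\alpha)=\frac{n}{2}-2\sum_{i=1}^{n}F\left(t_{i:n}\mid\phi,\lambda,\alpha\right)-\frac{1}{n}\sum_{i=1}^{n}(2i-1)\log S\left(t_{n+1-i:n}\mid\phi,\lambda,\alpha\right)\tag{33}$$

with respect to $\phi$, $\lambda$ and $\alpha$. These estimates can also be obtained by solving the nonlinear equations

$$-2\sum_{i=1}^{n}\Delta_j\left(t_{i:n}\mid\phi,\lambda,\alpha\right)+\frac{1}{n}\sum_{i=1}^{n}(2i-1)\frac{\Delta_j\left(t_{n+1-i:n}\mid\phi,\lambda,\alpha\right)}{S\left(t_{n+1-i:n}\mid\phi,\lambda,\alpha\right)}=0,\quad j=1,2,3,$$

where $\Delta_1(\cdot\mid\phi,\lambda,\alpha)$, $\Delta_2(\cdot\mid\phi,\lambda,\alpha)$ and $\Delta_3(\cdot\mid\phi,\lambda,\alpha)$ are given in (27).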
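The criteria (32) and (33) can be sketched in the same style. The functions below assume a list `u` of sorted fitted c.d.f. values strictly between 0 and 1, with `u[i-1]` $=F(t_{(i)}\mid\phi,\lambda,\alpha)$, so that $S(t_{(n+1-i)})$ corresponds to `1 - u[n-i]`; the names and interface are illustrative.

```python
import math

def ade_objective(u):
    """Anderson-Darling criterion (32) from sorted fitted c.d.f. values u."""
    n = len(u)
    s = sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
            for i in range(1, n + 1))
    return -n - s / n

def rade_objective(u):
    """Right-tail Anderson-Darling criterion (33) from sorted fitted c.d.f. values u."""
    n = len(u)
    s = sum((2 * i - 1) * math.log(1.0 - u[n - i]) for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum(u) - s / n

# Single observation with F = 0.5: A = -1 + 2 log 2 and R = -1/2 + log 2.
print(ade_objective([0.5]), -1.0 + 2.0 * math.log(2.0))
print(rade_objective([0.5]), -0.5 + math.log(2.0))
```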

4. Simulation study

In this section, an intensive simulation study is presented to compare the efficiency of the estimation procedures for parameters of the GWL distribution. The following procedure was adopted:

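(1) Generate pseudo-random values from the GWL($\phi,\lambda,\alpha$) distribution with sample size $n$.

(2) Using the values obtained in step 1, calculate $\hat\phi$, $\hat\lambda$ and $\hat\alpha$ via 1-MLE, 2-MPS, 3-ADE, 4-RADE, 5-LSE, 6-WLSE, 7-ME, 8-CME.

(3) Repeat steps 1 and 2 $N$ times.

(4) Using $\hat\theta=(\hat\phi,\hat\lambda,\hat\alpha)$ and $\theta=(\phi,\lambda,\alpha)$, compute the mean relative estimates (MRE) $\frac{1}{N}\sum_{j=1}^{N}\hat\theta_{i,j}/\theta_i$ and the mean square errors (MSE) $\frac{1}{N}\sum_{j=1}^{N}(\hat\theta_{i,j}-\theta_i)^{2}$, for $i=1,2,3$.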
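Step 4 of the procedure above can be sketched as follows. The two helpers take the list of estimates of a single parameter across the $N$ replications and its true value; the names and the toy numbers are illustrative assumptions.

```python
def mean_relative_estimate(estimates, true_value):
    """MRE: average of estimate / true value over the N replications."""
    return sum(e / true_value for e in estimates) / len(estimates)

def mean_square_error(estimates, true_value):
    """MSE: average squared deviation of the estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

# Toy example: two replications estimating a parameter whose true value is 2.
toy_estimates = [1.0, 3.0]
print(mean_relative_estimate(toy_estimates, 2.0))  # 1.0
print(mean_square_error(toy_estimates, 2.0))       # 1.0
```

An MRE of one combined with a non-zero MSE, as in this toy example, shows why both summaries are reported: the first measures bias, the second also captures variability.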

Under this approach, the most efficient estimation method will have MREs closer to one and MSEs closer to zero. The results were computed using the software R, with seed 2015 to generate the pseudo-random values. The initial values were the same values used to generate the random samples. The values chosen for this procedure were $N=10{,}000$ and $n=(50,60,\ldots,250)$. For reasons of space, we present the results only for $\theta=(2,0.5,0.1)$. However, the results are similar for other choices of $\theta$.

Figure 3. Proportion of failure from N simulated samples, considering different values of n, using the following estimation methods: 1-MLE, 2-MPS, 3-ADE, 4-RADE, 5-LSE, 6-WLSE, 7-ME, 8-CME.

Figure 4. MREs and MSEs of the estimates of ϕ=0.5, λ=0.7 and α=1.5 for N simulated samples, considering different values of n, obtained using the following estimation methods: 1-MLE, 2-MPS, 3-ADE, 4-RADE.

Figure 5. (left panel) the TTT-plot, (middle panel) the fitted survival superimposed to the empirical survival function and (right panel) the hazard function adjusted by GWL distribution.

Figure 6. (left panel) the TTT-plot, (middle panel) the fitted survival superimposed to the empirical survival function and (right panel) the hazard function adjusted by GWL distribution.

For this comparison to be meaningful, the estimation procedures need to be performed under the same conditions. However, for some particular samples and estimation methods, the numerical techniques do not work well in finding the parameter estimates. Therefore, a rate study is presented to verify the frequency of convergence of the numerical solutions. This is carried out by counting the number of times each estimation method fails to find the numerical solution. Figure 3 presents the proportion of failures for each method.

From Figure 3, the MLE, LSE, WLSE, ME, and CME estimators fail to find the parameter estimates for a significant number of samples. Therefore, these methods are not recommended for estimating the GWL parameters. Hereafter, we consider the MPS, ADE and RADE estimators due to their better computational stability. The MLE is considered only for illustrative purposes, since it is the most widely used estimation method. Figure 4 presents the MREs and MSEs for the estimates of $\phi$, $\lambda$ and $\alpha$ using the MLE, MPS, ADE and RADE with $N$ simulated samples, $\theta=(2,0.5,0.1)$ and different values of $n$. The horizontal lines in both figures correspond to MREs and MSEs being one and zero, respectively.

From these results, the MSEs of the MLE, MPS, ADE, and RADE estimators tend to zero for large $n$ and, as expected, the MREs tend to one, i.e. the estimates are consistent and asymptotically unbiased for the parameters. For small sample sizes, the MLE has the largest MSEs. The MPS has smaller MSEs, with MREs closer to one, for almost all values of $n$. Additionally, the MPS, ADE, and RADE estimators were the only methods able to find $\hat\phi$, $\hat\lambda$ and $\hat\alpha$ for all the $2\times10^{6}$ generated samples. Therefore, combining all results with the good properties of the MPS method, such as consistency, asymptotic efficiency, normality and invariance, we conclude that the MPS estimators are highly competitive with maximum likelihood for estimating the parameters of the GWL distribution.

5. Application

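In this section, we compare the GWL distribution with other three-parameter lifetime distributions considering two data-sets, the first with a bathtub hazard rate and the other with an increasing hazard function. The following lifetime distributions were considered. The GG distribution, with p.d.f. given by

$$f(t)=\alpha\,\Gamma(\phi)^{-1}\beta^{\alpha\phi}t^{\alpha\phi-1}e^{-(\beta t)^{\alpha}},$$

where $\beta>0$, $\phi>0$ and $\alpha>0$. The GW distribution, where the p.d.f. is

$$f(t)=(\alpha\phi)^{-1}(t/\phi)^{1/\alpha-1}\left(1-\lambda(t/\phi)^{1/\alpha}\right)^{1/\lambda-1},$$

where $\lambda\in\mathbb{R}$, $\phi>0$ and $\alpha>0$. The GEP distribution, with p.d.f. given by

$$f(t)=\frac{\alpha\beta\phi}{(1-e^{-\phi})^{\alpha}}\,e^{-\phi-\beta t+\phi\exp(-\beta t)}\left(1-e^{-\phi+\phi\exp(-\beta t)}\right)^{\alpha-1},$$

where $\beta>0$, $\phi>0$ and $\alpha>0$. The EW distribution, with p.d.f.

$$f(t)=\alpha\phi\beta^{-1}(t/\beta)^{\alpha-1}\exp\left(-(t/\beta)^{\alpha}\right)\left(1-\exp\left(-(t/\beta)^{\alpha}\right)\right)^{\phi-1},$$

where $\beta>0$, $\phi>0$ and $\alpha>0$.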

The TTT-plot (total time on test) is considered in order to verify the behavior of the empirical hazard function (Barlow & Campo, 1975). The TTT-plot is obtained by plotting $[r/n,\,G(r/n)]$, where

$$G(r/n)=\frac{\sum_{i=1}^{r}t_{(i)}+(n-r)\,t_{(r)}}{\sum_{i=1}^{n}t_i},\quad r=1,\ldots,n,$$

and $t_{(i)}$, $i=1,\ldots,n$, are the order statistics. If the curve is concave (convex), the hazard function is increasing (decreasing). On the other hand, when it starts convex and then becomes concave (concave and then convex), the hazard function has a bathtub (inverse bathtub) shape.
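The TTT-plot coordinates are simple to compute from a sample. The sketch below returns the points $[r/n, G(r/n)]$ for plotting; the function name and the three toy observations are illustrative and are not taken from the data-sets analyzed here.

```python
def ttt_points(data):
    """Return the TTT-plot coordinates (r/n, G(r/n)) for r = 1, ..., n."""
    t = sorted(data)
    n = len(t)
    total = sum(t)
    points = []
    running = 0.0
    for r in range(1, n + 1):
        running += t[r - 1]   # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * t[r - 1]) / total))
    return points

pts = ttt_points([3.0, 1.0, 2.0])
print(pts)  # G values: 0.5, 5/6 and 1.0
```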

The goodness of fit is checked using the Kolmogorov–Smirnov (KS) test. This procedure is based on the KS statistic $D_n=\sup_t\left|F_n(t)-F(t;\phi,\lambda,\alpha)\right|$, where $\sup_t$ is the supremum of the set of distances, $F_n(t)$ is the empirical distribution function and $F(t;\phi,\lambda,\alpha)$ is the c.d.f. A hypothesis test is conducted at the 5% level of significance to test whether or not the data come from $F(t;\phi,\lambda,\alpha)$. The null hypothesis is rejected if the returned p-value is smaller than 0.05.
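For a continuous fitted c.d.f., the supremum in $D_n$ is attained at a data point, just before or just after a jump of the empirical distribution function. A minimal sketch, assuming a list `u` of fitted c.d.f. values at the sorted observations (the p-value computation is not covered here):

```python
def ks_statistic(u):
    """KS statistic D_n from fitted c.d.f. values u at the sorted observations."""
    n = len(u)
    return max(max(i / n - u[i - 1], u[i - 1] - (i - 1) / n)
               for i in range(1, n + 1))

print(ks_statistic([0.25, 0.75]))  # 0.25
print(ks_statistic([0.5]))         # 0.5
```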

For model selection, the following discrimination criteria are adopted: the AIC (Akaike information criterion) and the AICc (corrected Akaike information criterion), computed, respectively, by $AIC=-2l(\hat\theta;t)+2k$ and $AICc=AIC+2k(k+1)(n-k-1)^{-1}$, where $k$ is the number of parameters to be fitted and $\hat\theta$ is the estimate of $\theta$. Among a set of candidate models for $t$, the best one is the one with the minimum values.
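Both criteria follow directly from the maximized log-likelihood. The sketch below uses made-up inputs (a log-likelihood of -100, k = 3 parameters and n = 50 observations) purely to illustrate the arithmetic.

```python
def aic(loglik, k):
    """Akaike information criterion: -2 l + 2 k."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Corrected AIC: AIC + 2 k (k + 1) / (n - k - 1)."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

print(aic(-100.0, 3))       # 206.0
print(aicc(-100.0, 3, 50))  # 206.0 + 24/46
```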

5.1. Lifetimes data

Aarset (1987) presents a data-set (see Table 1) of the lifetimes, in hours, of 50 devices on test.

Table 1. Lifetimes data (in hours) related to a device on test

Figure 5 shows (left panel) the TTT-plot, (middle panel) the fitted survival functions superimposed on the empirical survival function and (right panel) the hazard function fitted by the GWL distribution. Table 2 presents the AIC and AICc values and the p-values from the KS test for all fitted distributions considering the Aarset data-set.

Table 2. Results of AIC and AICc criteria and the p-value from the KS test for all fitted distributions considering the Aarset dataset

Table 3. MPS estimates, standard-error and 95% CI for ϕ,λ and α

Comparing the empirical survival function with the fitted distributions, it can be observed that the GWL distribution provides the best fit. This result is confirmed by the AIC and AICc (see Table 2), since the GWL distribution has the minimum values and its p-value from the KS test is greater than 0.05. It should be emphasized that, at a significance level of 5%, the other models are not able to fit these data. Table 3 displays the MPS estimates, standard errors, and confidence intervals (CI) for ϕ,λ and α of the GWL distribution.

5.2. Average flows data

The study of average flows has proved to be of high importance for protecting and maintaining aquatic resources in streams and rivers (Reiser, Wesche, & Estes, 1989). In this section, we consider a real data-set of the average flows (m3/s) of the Cantareira system during January at São Paulo city in Brazil. It is worth mentioning that the Cantareira system provides water to 9 million people in the São Paulo metropolitan area. The data-set, available in Table 4, was obtained from the National Water Agency and covers 1930 to 2012.

Table 4. January average flows (m3/s) of the Cantareira system

In this section, we consider the ML estimator, showing that either MPS or MLE can be used successfully in applications. Figure 6 shows (left panel) the TTT-plot, (middle panel) the fitted survival functions superimposed on the empirical survival function and (right panel) the hazard function fitted by the GWL distribution. Table 5 presents the AIC and AICc values and the p-values from the KS test for all fitted distributions considering the data-set of the January average flows (m3/s) of the Cantareira system.

Table 5. Results of AIC and AICc criteria and the p-value from the KS test for all fitted distributions considering the data-set related to the January average flows (m3/s) of the Cantareira system

Table 6. ML estimates, standard-error and 95% CI for ϕ,λ and α

From the empirical survival function and the fitted distributions, it can be observed that the GWL distribution provides the best fit. This result is confirmed by the AIC and AICc, since the GWL distribution has the minimum values and its p-value from the KS test is greater than 0.05. Table 6 displays the ML estimates, standard errors, and the CI for ϕ,λ and α of the GWL distribution.

6. Concluding remarks

To summarize, we have proposed a three-parameter lifetime distribution. The GWL distribution is a straightforward generalization of the WL distribution proposed by Ghitany et al. (2011), which accommodates increasing, decreasing, decreasing-increasing-decreasing, bathtub, and unimodal hazard rates, making the GWL distribution a flexible model for reliability data. The mathematical properties of this distribution are also discussed.

The estimation procedures for the parameters of the GWL distribution are derived considering eight estimation methods. Since it is not feasible to compare these methods theoretically, we have presented an extensive simulation study in order to identify the most efficient procedure. We observed that the MLE, ME, LSE, WLSE, and CME estimators fail to find the parameter estimates for a significant number of samples. The simulations showed that the MPS (maximum product of spacings) method is the most efficient for estimating the parameters of the GWL distribution in comparison to its competitors. Finally, two data-sets were analyzed for illustrative purposes, showing that the GWL distribution outperforms several common three-parameter lifetime distributions.

Acknowledgements

We are grateful to the Editorial Board and the reviewers for their valuable comments and suggestions, which have improved the manuscript.

Additional information

Funding

The research was partially supported by CNPq, FAPESP, and CAPES of Brazil.

Notes on contributors

P.L. Ramos

P.L. Ramos holds a BSc degree in Statistics and an MSc in Applied and Computational Mathematics from the São Paulo State University, Brazil. He is currently reading for his PhD in Statistics at the Institute for Mathematical Science and Computing, University of São Paulo (USP), Brazil. His main research interests are in survival analysis, Bayesian inference, classical inference, and probability distribution theory.

F. Louzada

F. Louzada is a professor of Statistics at the Institute for Mathematical Science and Computing, University of São Paulo (USP), Brazil. He received his PhD degree in Statistics from the University of Oxford, UK, his MSc degree in Computational Mathematics from USP, Brazil, and his BSc degree in Statistics from UFSCar, Brazil. His main research interests are in survival analysis, data mining, Bayesian inference, classical inference, and probability distribution theory.

References

  • Aarset, M. V. (1987). How to identify a bathtub hazard rate. IEEE Transactions on Reliability, 36, 106–108.
  • Ali, S. (2015). On the Bayesian estimation of the weighted Lindley distribution. Journal of Statistical Computation and Simulation, 85, 855–880.
  • Al-Mutairi, D., Ghitany, M., & Kundu, D. (2015). Inferences on stress-strength reliability from weighted Lindley distributions. Communications in Statistics-Theory and Methods, 44, 4096–4113.
  • Bakouch, H. S., Al-Zahrani, B. M., Al-Shomrani, A. A., Marchi, V. A., & Louzada, F. (2012). An extended Lindley distribution. Journal of the Korean Statistical Society, 41, 75–85.
  • Barlow, R. E., & Campo, R. A. (1975). Total time on test processes and applications to failure data analysis (Technical report). Berkeley, CA: DTIC Document.
  • Barreto-Souza, W., & Bakouch, H. S. (2013). A new lifetime model with decreasing failure rate. Statistics, 47, 465–476.
  • Barreto-Souza, W., & Cribari-Neto, F. (2009). A generalization of the exponential-Poisson distribution. Statistics & Probability Letters, 79, 2493–2500.
  • Bonferroni, C. (1930). Elementi di statistica generale. Firenze: Seeber.
  • Cheng, R., & Amin, N. (1979). Maximum product of spacings estimation with application to the lognormal distribution (Mathematical Report 79-1). Cardiff: University of Wales IST.
  • Cheng, R., & Amin, N. (1983). Estimating parameters in continuous univariate distributions with a shifted origin. Journal of the Royal Statistical Society. Series B (Methodological), 45, 394–403.
  • Ghitany, M., Al-Mutairi, D., Balakrishnan, N., & Al-Enezi, L. (2013). Power Lindley distribution and associated inference. Computational Statistics & Data Analysis, 64, 20–33.
  • Ghitany, M., Alqallaf, F., Al-Mutairi, D., & Husain, H. (2011). A two-parameter weighted Lindley distribution and its applications to survival data. Mathematics and Computers in Simulation, 81, 1190–1201.
  • Ghitany, M., Atieh, B., & Nadarajah, S. (2008). Lindley distribution and its application. Mathematics and Computers in Simulation, 78, 493–506.
  • Glaser, R. E. (1980). Bathtub and related failure rate characterizations. Journal of the American Statistical Association, 75, 667–672.
  • Luceño, A. (2006). Fitting the generalized Pareto distribution to data using maximum goodness-of-fit estimators. Computational Statistics & Data Analysis, 51, 904–917.
  • Macdonald, P. (1971). An estimation procedure for mixtures of distribution. Journal of the Royal Statistical Society. Series B (Methodological), 33, 326–329.
  • Mazucheli, J., Louzada, F., & Ghitany, M. (2013). Comparison of estimation methods for the parameters of the weighted Lindley distribution. Applied Mathematics and Computation, 220, 463–471.
  • Mudholkar, G. S., Srivastava, D. K., & Freimer, M. (1995). The exponentiated Weibull family: A reanalysis of the bus-motor-failure data. Technometrics, 37, 436–445.
  • Mudholkar, G. S., Srivastava, D. K., & Kollia, G. D. (1996). A generalization of the Weibull distribution with application to the analysis of survival data. Journal of the American Statistical Association, 91, 1575–1583.
  • Popescu, T. D., & Aiordachioaie, D. (2013). Signal segmentation in time-frequency plane using Renyi entropy-application in seismic signal processing. In 2013 Conference on Control and Fault-Tolerant Systems (SysTol) (pp. 312–317). Nice: IEEE.
  • Ranneby, B. (1984). The maximum spacing method. An estimation method related to the maximum likelihood method. Scandinavian Journal of Statistics, 11, 93–112.
  • Reiser, D. W., Wesche, T. A., & Estes, C. (1989). Status of instream flow legislation and practices in North America. Fisheries, 14, 22–29.
  • Renyi, A. (1961). On measures of entropy and information. In Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1, 547–561.
  • Stacy, E. W. (1962). A generalization of the gamma distribution. The Annals of Mathematical Statistics, 33, 1187–1192.
  • Torabi, H., Falahati-Naeini, M., & Montazeri, N. (2014). An extended generalized Lindley distribution and its applications to lifetime data. Journal of Statistical Research of Iran, 11, 203–222.
  • Wang, M., & Wang, W. (in press). Bias-corrected maximum likelihood estimation of the parameters of the weighted Lindley distribution. Communications in Statistics-Simulation and Computation, 46, 530–545.
  • Zakerzadeh, H., & Dolati, A. (2009). Generalized Lindley distribution. Journal of Mathematical Extension, 3, 1–17.

Appendix 1

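The elements of the Fisher information matrix are

$$
\begin{aligned}
I_{\phi,\phi} &= -E\left[\frac{\partial^2 l(\theta;t)}{\partial\phi^2}\right] = -\frac{1}{(\lambda+\phi)^2} + \psi'(\phi), \\[4pt]
I_{\phi,\lambda} &= -E\left[\frac{\partial^2 l(\theta;t)}{\partial\phi\,\partial\lambda}\right] = -\frac{\alpha}{\lambda} - \frac{1}{(\lambda+\phi)^2}, \\[4pt]
I_{\phi,\alpha} &= -E\left[\frac{\partial^2 l(\theta;t)}{\partial\phi\,\partial\alpha}\right] = \frac{-\alpha\log(\lambda) - \psi(\phi) + \alpha\log(\lambda) - (\lambda+\phi)^{-1}}{\alpha}, \\[4pt]
I_{\lambda,\lambda} &= -E\left[\frac{\partial^2 l(\theta;t)}{\partial\lambda^2}\right] = \frac{\alpha\phi}{\lambda^2} + (\alpha-1)\lambda^{\alpha-2}\left(\psi(\phi) - \alpha\log(\lambda) + (\lambda+\phi)^{-1}\right) \\
&\quad + E\left[\frac{\alpha T^{\alpha}\lambda^{\alpha-2}\left((\alpha-2)\lambda - (\lambda T)^{\alpha}\right)}{\lambda + (\lambda T)^{\alpha}}\right] - \frac{1}{(\lambda+\phi)^2}, \\[4pt]
I_{\alpha,\alpha} &= -E\left[\frac{\partial^2 l(\theta;t)}{\partial\alpha^2}\right] = \frac{\phi(\lambda+\phi+1)\left(\psi(\phi)^2 + \psi'(\phi)\right)}{\alpha^2(\lambda+\phi)} + \frac{1}{\alpha^2} + \frac{2(\lambda+2\phi+1)\psi(\phi) + 2}{\alpha^2(\lambda+\phi)} \\
&\quad - E\left[\frac{\lambda(\lambda T)^{\alpha}\log(\lambda T)^2}{\lambda + (\lambda T)^{\alpha}}\right], \\[4pt]
I_{\alpha,\lambda} &= -E\left[\frac{\partial^2 l(\theta;t)}{\partial\alpha\,\partial\lambda}\right] = -\frac{\phi}{\lambda} + \frac{\lambda\left(1+\phi\psi(\phi)\right) + \phi\left(1+(\phi+1)\psi(\phi+1)\right)}{\lambda(\lambda+\phi)} \\
&\quad - E\left[\frac{\left(1+\alpha\lambda^{\alpha-1}T^{\alpha}\right)(\lambda T)^{\alpha}\log(\lambda T)}{\left(\lambda + (\lambda T)^{\alpha}\right)^2}\right] + \frac{\left(\phi+\lambda+1-\frac{1}{\alpha}\right)\Gamma\left(\phi+1-\frac{1}{\alpha}\right)}{(\lambda+\phi)\,\Gamma(\phi)} \\
&\quad - E\left[\frac{\alpha\lambda^{\alpha-1}T^{\alpha}\log(\lambda T) + (\lambda T)^{\alpha-1}}{\lambda + (\lambda T)^{\alpha}}\right].
\end{aligned}
$$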
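Several of the entries above retain expectations with no closed form, so a numerical cross-check can be useful. The Python sketch below approximates the whole matrix by Monte Carlo: it averages a central-difference Hessian of the log-density over simulated values. It is a rough sketch under stated assumptions rather than the authors' procedure: it assumes the GWL density f(t) = α λ^(αφ) t^(αφ−1) (λ + (λt)^α) exp(−(λt)^α) / ((λ + φ) Γ(φ)) from the earlier sections, and the function names (`gwl_logpdf`, `gwl_sample`, `fisher_information_mc`), the step size `h`, and the sample size are hypothetical choices made here.

```python
import numpy as np
from scipy.special import gammaln


def gwl_logpdf(t, theta):
    """Log of the assumed GWL density; theta = (phi, lambda, alpha)."""
    phi, lam, alpha = theta
    y = (lam * t) ** alpha
    return (np.log(alpha) + alpha * phi * np.log(lam) - np.log(lam + phi)
            - gammaln(phi) + (alpha * phi - 1.0) * np.log(t)
            + np.log(lam + y) - y)


def gwl_sample(n, theta, rng):
    """Sample via the gamma-mixture representation of (lambda*T)^alpha."""
    phi, lam, alpha = theta
    shape = np.where(rng.random(n) < lam / (lam + phi), phi, phi + 1.0)
    return rng.gamma(shape) ** (1.0 / alpha) / lam


def fisher_information_mc(theta, n=20000, h=1e-4, seed=0):
    """Monte Carlo estimate of -E[Hessian of log f] in the order (phi, lambda, alpha)."""
    theta = np.asarray(theta, dtype=float)
    t = gwl_sample(n, theta, np.random.default_rng(seed))
    k = theta.size
    info = np.empty((k, k))
    for i in range(k):
        for j in range(i, k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = h
            ej[j] = h
            # Central difference for the (i, j) second partial derivative.
            second = (gwl_logpdf(t, theta + ei + ej)
                      - gwl_logpdf(t, theta + ei - ej)
                      - gwl_logpdf(t, theta - ei + ej)
                      + gwl_logpdf(t, theta - ei - ej)) / (4.0 * h * h)
            info[i, j] = info[j, i] = -np.mean(second)
    return info


if __name__ == "__main__":
    print(fisher_information_mc((2.0, 1.5, 1.2)))
```

The entries that do not depend on T, such as the (φ, φ) entry, are reproduced up to finite-difference error only, while the others also carry Monte Carlo error that shrinks as `n` grows.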