
Some results on quantile-based Shannon doubly truncated entropy

Pages 59-70 | Received 11 Apr 2018, Accepted 20 Feb 2019, Published online: 15 Mar 2019

ABSTRACT

Sunoj et al. [(2009). Characterization of life distributions using conditional expectations of doubly (interval) truncated random variables. Communications in Statistics – Theory and Methods, 38(9), 1441–1452] introduced the concept of Shannon doubly truncated entropy in the literature. Quantile functions are equivalent alternatives to distribution functions in the modelling and analysis of statistical data. In this paper, we introduce a quantile version of the Shannon interval entropy for doubly truncated random variables and investigate it for various types of univariate distribution functions. We characterise certain specific lifetime distributions using the proposed measure. We also discuss an interesting practical example based on quantile data analysis.

1. Introduction

Let X be a non-negative absolutely continuous random variable representing the lifetime of a component, with cumulative distribution function (CDF) $F(t)=P(X\le t)$ and survival function $\bar F(t)=P(X>t)=1-F(t)$. In the modelling and analysis of lifetime data, the average amount of uncertainty associated with a non-negative continuous random variable X is given by the differential entropy function $H(X)=-\int_0^{\infty}f(x)\log f(x)\,dx$, the continuous counterpart of the Shannon (Citation1948) entropy in the discrete case, where f(x) is the probability density function (pdf) of the random variable X. While the concept of entropy has found increased application, little attention has yet been given to the practical problems of estimating entropy. Gong, Yang, Gupta, and Nearing (Citation2014) discussed a method for computing robust and accurate estimates of entropy that accounts for several important characteristics of hydrological data sets. This entropy, however, is not applicable to a system that has survived for some units of time, that is, to a used item. The residual lifetime of the system when it is still operating at time t is $X_t=(X-t\mid X>t)$, which has the probability density $f(x;t)=f(x)/\bar F(t)$, $x\ge t>0$. Ebrahimi (Citation1996) proposed the entropy of the residual lifetime $X_t$ as (1) $H(X;t)=-\int_t^{\infty}\frac{f(x)}{\bar F(t)}\log\frac{f(x)}{\bar F(t)}\,dx,\quad t>0.$ In some practical situations, uncertainty is related to the past lifetime rather than the future. In this situation, the random variable $(t-X\mid X\le t)$, known as the inactivity time, is suitable to describe the time elapsed between the failure of a system and the time when it is found to be 'down'. Based on this idea, Di Crescenzo and Longobardi (Citation2002, Citation2004) considered the entropy of the inactivity time, given as (2) $\bar H(X;t)=-\int_0^{t}\frac{f(x)}{F(t)}\log\frac{f(x)}{F(t)}\,dx.$ In many situations, we only have information between two points, and in this case statistical measures are studied under the condition of doubly truncated random variables. Doubly truncated measures are applicable to engineering systems whose observations are measured after the system starts operating and before it fails. If the random variable X denotes the lifetime of a unit, then the random variable $X_{t_1,t_2}=(X-t_1\mid t_1\le X\le t_2)$ is called the doubly truncated (interval) residual lifetime, which as $t_2\to\infty$ tends to the residual lifetime random variable $X_t$. Also, we can use the doubly truncated past lifetime random variable $(t_2-X\mid t_1\le X\le t_2)$, which for $t_1=0$ tends to the past lifetime random variable. Another extension of Shannon entropy is based on a doubly truncated (interval) random variable, as follows: (3) $H(X;t_1,t_2)=-\int_{t_1}^{t_2}\frac{f(x)}{F(t_2)-F(t_1)}\log\frac{f(x)}{F(t_2)-F(t_1)}\,dx.$ Given that a system has survived up to time $t_1$ and has been found to be down at time $t_2$, $H(X;t_1,t_2)$ measures the uncertainty about its lifetime between $t_1$ and $t_2$. Different aspects and properties of $H(X;t_1,t_2)$ have been studied by Sunoj, Sankaran, and Maya (Citation2009) and Misagh and Yari (Citation2011, Citation2012). For various results on doubly truncated random variables, we refer to Sankaran and Sunoj (Citation2004), Khorashadizadeh, Rezaei Roknabadi, and Mohtashami Borzadaran (Citation2013), Kayal and Moharana (Citation2016), and Kundu (Citation2017).
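For concreteness, the interval entropy (3) is easy to evaluate by quadrature. The following is a minimal numerical sketch, assuming (purely for illustration) an exponential lifetime and an arbitrary window $(t_1,t_2)$; the parameter values are not from the paper.

```python
# Numerical evaluation of the interval entropy (3) for an assumed
# exponential lifetime; all parameter values are illustrative.
import numpy as np
from scipy import integrate, stats

lam, t1, t2 = 1.0, 0.5, 2.0
X = stats.expon(scale=1 / lam)
mass = X.cdf(t2) - X.cdf(t1)               # F(t2) - F(t1)

def integrand(x):
    g = X.pdf(x) / mass                    # doubly truncated density on (t1, t2)
    return -g * np.log(g)

H_interval, _ = integrate.quad(integrand, t1, t2)
print(H_interval)
```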

All the theoretical investigations and applications using these information measures are based on the distribution function. A probability distribution can be specified either in terms of its distribution function or by its quantile function. Although both convey the same information about the distribution, with different interpretations, the concepts and methodologies based on distribution functions are traditional; when the traditional approach is difficult or fails to yield the desired results, a quantile-based study can be carried out. As Gilchrist (Citation2000) discussed, there are many properties of quantile functions that are not shared by distribution functions, which makes the former attractive in certain practical situations. For inference purposes, quantile-based statistics are often more robust than those based on moments in the distribution function approach. Furthermore, there exist many simple quantile functions that serve very well in empirical model building but for which the distribution functions are not in tractable form; we refer to van Staden and Loots (Citation2009), Hankin and Lee (Citation2006) and Nair, Sankaran, and Vinesh Kumar (Citation2012). In many cases, the QF is more convenient, as it is less influenced by extreme observations and thus provides a straightforward analysis with a limited amount of information, whereas conventional tools of analysis using distribution functions are difficult to apply. An alternative approach is therefore to use the quantile function (QF), defined by (4) $Q(u)=F^{-1}(u)=\inf\{x\mid F(x)\ge u\},\quad 0\le u\le 1.$ When F is continuous, we have from (4) that $F(Q(u))=u$, where $F(Q(u))$ represents the composite function. Defining the density quantile function by $f(Q(u))$ and the quantile density function by $q(u)=Q'(u)$, where the prime denotes differentiation, we have $q(u)\,f(Q(u))=1$; see Nair, Sankaran, and Balakrishnan (Citation2013). Several researchers have studied information-theoretic measures based on the quantile function. Sunoj and Sankaran (Citation2012) considered the quantile version of Shannon entropy and its residual form, defined as (5) $\breve{H}=\int_0^1\log q(p)\,dp$ and (6) $\breve{H}(u)=\log(1-u)+\frac{1}{1-u}\int_u^1\log q(p)\,dp,$ respectively. Sunoj, Sankaran, and Nanda (Citation2013) considered the quantile past entropy, defined as (7) $\bar{\breve{H}}(u)=\log u+\frac{1}{u}\int_0^u\log q(p)\,dp.$ Readers can refer to Nanda, Sankaran, and Sunoj (Citation2014), Baratpour and Khammar (Citation2018), Sankaran and Sunoj (Citation2017), Qiu (Citation2018), and Kumar (Citation2018) for more works along this line.
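As a quick sanity check of (5), a sketch under the assumption of an exponential distribution, whose quantile density is $q(p)=1/(\lambda(1-p))$: the quantile entropy should reproduce the classical differential entropy $1-\log\lambda$ of the exponential law.

```python
# Quantile entropy (5) from the quantile density q(p) = 1/(lam*(1-p));
# for the exponential law the closed form is 1 - log(lam).
import numpy as np
from scipy import integrate

lam = 2.0
log_q = lambda p: -np.log(lam) - np.log(1.0 - p)   # log q(p)

H, _ = integrate.quad(log_q, 0.0, 1.0)
print(H, 1.0 - np.log(lam))                        # both approximately 0.3069
```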

Motivated by the usefulness of the quantile function and the interval entropy, in the present note we introduce a quantile version of the Shannon interval entropy, derive some new characterisations of certain probability distributions, and study its important properties. The proposed measure has several advantages. First, measures for doubly truncated random variables arise naturally in applications: in quasar surveys, an investigator assumes that the apparent magnitude is doubly truncated, and the times to progression for patients with a certain disease who received chemotherapy, experienced tumour progression and subsequently died are doubly truncated. Second, quantile functions (QFs) have several properties that are not shared by distribution functions; applying these properties gives some new results and better insight into the properties of the measure that are difficult to obtain in the conventional approach. The use of QFs in place of F thus provides new models, alternative methodology, easier algebraic manipulations, and methods of analysis in certain cases, as well as some new results that are difficult to derive using the distribution function.

The paper is organised as follows. In Section 2, we consider the quantile version of the Shannon interval entropy. In Section 3, the quantile interval entropy is derived for some specific distributions. In Section 4, we study characterisation results concerning the quantile interval entropy (QIE) and characterise a few specific lifetime distributions. Finally, conclusions are given along with comments.

2. Quantile interval entropy

Consider a doubly truncated random variable $(X\mid Q(u_1)\le X\le Q(u_2))$, which represents the lifetime of a unit that fails between $Q(u_1)$ and $Q(u_2)$, where $(u_1,u_2)\in D=\{(u_1,u_2):Q(u_1)<Q(u_2)\}$. Corresponding to (3), a measure of uncertainty for the doubly truncated random variable in terms of the quantile function (4) is defined as (8) $\breve{H}(u_1,u_2)=-\int_{u_1}^{u_2}\frac{f(Q(p))}{u_2-u_1}\log\frac{f(Q(p))}{u_2-u_1}\,dQ(p)=\log(u_2-u_1)+\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log q(p)\,dp.$ The important quantile measures useful in reliability analysis are the hazard quantile function and the reversed hazard quantile function, defined as $A(u)=[(1-u)q(u)]^{-1}$ and $\bar A(u)=[u\,q(u)]^{-1}$, respectively, corresponding to the hazard rate $a(x)=f(x)/\bar F(x)$ and the reversed hazard rate $\bar a(x)=f(x)/F(x)$ of X. For doubly truncated random variables, Ruiz and Navarro (Citation1996) defined the generalised hazard functions (GHF) $h_1(t_1,t_2)=f(t_1)/(F(t_2)-F(t_1))$ and $h_2(t_1,t_2)=f(t_2)/(F(t_2)-F(t_1))$. Thus, the generalised hazard quantile functions (GHQF) are defined as (9) $\breve{h}_1(u_1,u_2)=\frac{1}{(u_2-u_1)\,q(u_1)}\quad\text{and}\quad\breve{h}_2(u_1,u_2)=\frac{1}{(u_2-u_1)\,q(u_2)},$ respectively. Equation (8) can be rewritten as (10) $\breve{H}(u_1,u_2)=1+\log(u_2-u_1)-\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log A(p)\,dp+\frac{1}{u_2-u_1}\big[(1-u_2)\log(1-u_2)-(1-u_1)\log(1-u_1)\big],$ (11) $\breve{H}(u_1,u_2)=1+\log(u_2-u_1)-\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log\bar A(p)\,dp-\frac{1}{u_2-u_1}\big[u_2\log u_2-u_1\log u_1\big],$ where (10) and (11) express the quantile interval entropy in terms of the hazard quantile function and the reversed hazard quantile function, respectively. Using (6), (7) and (8), the quantile entropy (5) can be decomposed as (12) $\breve{H}=u_1\,\bar{\breve{H}}(u_1)+(u_2-u_1)\,\breve{H}(u_1,u_2)+(1-u_2)\,\breve{H}(u_2)-\big[u_1\log u_1+(u_2-u_1)\log(u_2-u_1)+(1-u_2)\log(1-u_2)\big].$ The identity (12) can be interpreted by decomposing the uncertainty about the failure time of an item into four parts:

  1. The uncertainty about the failure time in $(0,u_1)$ given that the item has failed before $u_1$,

  2. The uncertainty about the failure time in the interval $(u_1,u_2)$ given that the item has failed after $u_1$ but before $u_2$,

  3. The uncertainty about the failure time in $(u_2,1)$ given that it has failed after $u_2$,

  4. The uncertainty about whether the item has failed before $u_1$, in between $u_1$ and $u_2$, or after $u_2$.

    Differentiating $\breve{H}(u_1,u_2)$ with respect to $u_1$ and $u_2$, we have (13) $\frac{\partial\breve{H}(u_1,u_2)}{\partial u_1}=\frac{1}{u_2-u_1}\big[\log\breve{h}_1(u_1,u_2)+\breve{H}(u_1,u_2)-1\big]$ and (14) $\frac{\partial\breve{H}(u_1,u_2)}{\partial u_2}=-\frac{1}{u_2-u_1}\big[\log\breve{h}_2(u_1,u_2)+\breve{H}(u_1,u_2)-1\big].$

When $\breve{H}(u_1,u_2)$ is increasing in $u_1$ and $u_2$, (13) and (14) together imply $1-\log\breve{h}_1(u_1,u_2)\le\breve{H}(u_1,u_2)\le 1-\log\breve{h}_2(u_1,u_2).$
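The decomposition (12) is easy to verify numerically. The sketch below again assumes the exponential quantile density $q(p)=1/(\lambda(1-p))$; $\lambda$, $u_1$ and $u_2$ are arbitrary illustrative values.

```python
# Numerical check of the decomposition (12) for an assumed exponential model.
import numpy as np
from scipy import integrate

lam, u1, u2 = 1.5, 0.3, 0.8
log_q = lambda p: -np.log(lam) - np.log(1.0 - p)
I = lambda a, b: integrate.quad(log_q, a, b)[0]     # int_a^b log q(p) dp

H_total = I(0.0, 1.0)                               # (5)
H_past  = np.log(u1) + I(0.0, u1) / u1              # (7) at u1
H_int   = np.log(u2 - u1) + I(u1, u2) / (u2 - u1)   # (8)
H_res   = np.log(1 - u2) + I(u2, 1.0) / (1 - u2)    # (6) at u2

rhs = (u1 * H_past + (u2 - u1) * H_int + (1 - u2) * H_res
       - (u1 * np.log(u1) + (u2 - u1) * np.log(u2 - u1)
          + (1 - u2) * np.log(1 - u2)))
print(H_total, rhs)                                 # the two sides agree
```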

Nair and Rajesh (Citation2000) gave some applications of the geometric vitality function. Sunoj et al. (Citation2009) discussed a few properties of this measure and showed that it determines the distribution function uniquely. Next, we define the quantile-based geometric vitality function.

Definition 2.1

Let X be a non-negative random variable; then the geometric vitality quantile function (GVQF) for a doubly truncated random variable is defined by (15) $G(u_1,u_2)=E(\log X\mid Q(u_1)<X<Q(u_2))=\frac{1}{u_2-u_1}\int_{u_1}^{u_2}f(Q(p))\,q(p)\log Q(p)\,dp=\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log Q(p)\,dp.$

This gives the geometric mean life of a doubly truncated random variable between the points $u_1$ and $u_2$. Relationships between the geometric vitality quantile function (15) for doubly truncated random variables and the generalised hazard quantile function (9) are given in Table 1,

Table 1. Relationship between GVQF and GHQF.

where (16) $R(u_1,u_2)=E\left(\frac{1}{X}\,\middle|\,Q(u_1)<X<Q(u_2)\right)=\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\frac{dp}{Q(p)}.$

2.1. Relationship between $\breve{H}(u_1,u_2)$ and the quantile conditional measure of uncertainty

Based on the residual life distribution, Sankaran and Gupta (Citation1999) introduced a new measure of uncertainty, known as the conditional measure of uncertainty, defined as $M(X;t)=E(-\log f(X)\mid X>t)=-\frac{1}{\bar F(t)}\int_t^{\infty}f(x)\log f(x)\,dx$. The doubly truncated version was considered by Sunoj et al. (Citation2009): $M(X;t_1,t_2)=E(-\log f(X)\mid t_1<X<t_2)=-\frac{1}{F(t_2)-F(t_1)}\int_{t_1}^{t_2}f(x)\log f(x)\,dx$. Using (4), the quantile-based conditional measure of uncertainty for the doubly truncated random variable is defined as (17) $\breve{M}(u_1,u_2)=-\frac{1}{u_2-u_1}\int_{u_1}^{u_2}f(Q(u))\log f(Q(u))\,dQ(u)=\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log q(u)\,du.$ Using (17) in (8), we obtain (18) $\breve{M}(u_1,u_2)=\breve{H}(u_1,u_2)-\log(u_2-u_1).$ Differentiation of (18) with respect to $u_1$ and $u_2$, respectively, provides the relationships with the GHQF: $\partial\breve{M}(u_1,u_2)/\partial u_1=\partial\breve{H}(u_1,u_2)/\partial u_1+q(u_1)\,\breve{h}_1(u_1,u_2)$ and $\partial\breve{M}(u_1,u_2)/\partial u_2=\partial\breve{H}(u_1,u_2)/\partial u_2-q(u_2)\,\breve{h}_2(u_1,u_2)$. The various relationships between the quantile conditional measure of uncertainty (17) and the GHQF (9) for some commonly used probability models are given in Table 2.

Table 2. Relation between $\breve{M}(u_1,u_2)$ and the GHQF for various distributions.
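Relation (18) can likewise be confirmed numerically, computing $\breve{M}(u_1,u_2)$ once from the distribution-function definition and once from the quantile form (17). A sketch, again assuming an exponential model with illustrative parameter values:

```python
# Check of (17)-(18): M computed from f agrees with the quantile form.
import numpy as np
from scipy import integrate, stats

lam, u1, u2 = 2.0, 0.25, 0.9
X = stats.expon(scale=1 / lam)
t1, t2 = X.ppf(u1), X.ppf(u2)                      # t_i = Q(u_i)

# Distribution-function form: E(-log f(X) | t1 < X < t2); here
# F(t2) - F(t1) = u2 - u1 by construction.
M_df = integrate.quad(lambda x: -X.pdf(x) * np.log(X.pdf(x)), t1, t2)[0] / (u2 - u1)

# Quantile form (17): average of log q(p), with q(p) = 1/(lam*(1-p)).
log_q = lambda p: -np.log(lam) - np.log(1.0 - p)
M_q = integrate.quad(log_q, u1, u2)[0] / (u2 - u1)

print(M_df, M_q)     # agree; H(u1,u2) = M(u1,u2) + log(u2-u1) by (18)
```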

3. Quantile interval entropy for various univariate distributions

In reliability theory, while studying the lifetime of a component or a system, a flexible model used in the literature is the generalised Pareto distribution (GPD), with survival function $\bar F(x)=\left(\frac{b}{ax+b}\right)^{\frac{1}{a}+1},\ x>0,\ b>0,\ a>-1.$ It plays an important role in extreme value theory and other branches of statistics. As a family of distributions, the GPD includes the exponential distribution when $a\to 0$ and the Pareto type-II (Lomax) distribution for $a>0$, which is used in the investigation of city populations, occurrence of natural resources, insurance risk, sizes of human settlements, reliability modelling and business failure; it has been an important model in many socio-economic studies. The GPD becomes the power distribution for $-1<a<0$. Next, let us discuss some examples giving the expression for the quantile interval entropy for some commonly used univariate distributions.

Example 3.1

Let X be a random variable following the GPD, with quantile function and quantile density function given, respectively, by $Q(p)=\frac{b}{a}\big((1-p)^{-a/(a+1)}-1\big)$ and $q(p)=\frac{b}{a+1}(1-p)^{-(2a+1)/(a+1)}$. Hence the quantile interval entropy (8) for the GPD is $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log q(p)\,dp=\log(u_2-u_1)+\log\frac{b}{a+1}-\frac{2a+1}{a+1}\cdot\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log(1-p)\,dp,$ which gives (19) $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{b}{a+1}-\frac{2a+1}{a+1}\cdot\frac{1}{u_2-u_1}\big[(1-u_2)(1-\log(1-u_2))-(1-u_1)(1-\log(1-u_1))\big].$ When $u_1=u$ and $u_2=1$, (19) reduces to $\breve{H}(u)=\log\frac{b}{a+1}+\frac{2a+1}{a+1}-\frac{a}{a+1}\log(1-u)$, the quantile residual entropy (6) for the GPD. Also, when $u_1=0$ and $u_2=u$, (19) reduces to $\bar{\breve{H}}(u)=\log\frac{b}{a+1}+\frac{2a+1}{a+1}+\log u+\frac{2a+1}{a+1}\cdot\frac{1-u}{u}\log(1-u)$, the quantile past entropy (7) for the GPD.
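The closed form (19) can be checked against direct quadrature of $\log q(p)$; the same pattern verifies the expressions in Examples 3.2–3.9 below. A sketch with arbitrary illustrative values of $a$, $b$, $u_1$, $u_2$:

```python
# Check of the GPD closed form (19) against numerical integration of log q.
import numpy as np
from scipy import integrate

a, b, u1, u2 = 0.5, 2.0, 0.2, 0.7
c = (2 * a + 1) / (a + 1)
log_q = lambda p: np.log(b / (a + 1)) - c * np.log(1.0 - p)

d = u2 - u1
H_num = np.log(d) + integrate.quad(log_q, u1, u2)[0] / d

term = lambda u: (1 - u) * (1 - np.log(1 - u))
H_closed = np.log(d) + np.log(b / (a + 1)) - (c / d) * (term(u2) - term(u1))
print(H_num, H_closed)        # the two values agree
```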

Example 3.2

If the random variable X has the Pareto-II distribution, the quantile function and quantile density function are given, respectively, by $Q(p)=a\big((1-p)^{-1/b}-1\big)$ and $q(p)=\frac{a}{b}(1-p)^{-(1+b)/b}$. Then the quantile interval entropy (8) becomes $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{a}{b}-\frac{1+b}{b}\cdot\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log(1-p)\,dp=\log(u_2-u_1)+\log\frac{a}{b}-\frac{1+b}{b}\cdot\frac{1}{u_2-u_1}\big[(1-u_2)(1-\log(1-u_2))-(1-u_1)(1-\log(1-u_1))\big].$

Example 3.3

If a random variable X follows the rescaled beta distribution, the quantile and quantile density functions are given, respectively, by $Q(p)=R\big(1-(1-p)^{1/c}\big)$ and $q(p)=\frac{R}{c}(1-p)^{(1-c)/c}$. Then the quantile interval entropy for the rescaled beta distribution is $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{R}{c}+\frac{1-c}{c}\cdot\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log(1-p)\,dp,$ which gives $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{R}{c}+\frac{1-c}{c}\cdot\frac{1}{u_2-u_1}\big[(1-u_2)(1-\log(1-u_2))-(1-u_1)(1-\log(1-u_1))\big].$

Example 3.4

If a random variable X has the half logistic distribution, with quantile and quantile density functions $Q(p)=\sigma\log\frac{1+p}{1-p}$ and $q(p)=\frac{2\sigma}{(1-p)(1+p)}$, respectively, then the quantile interval entropy (8) for the half logistic distribution is $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log 2\sigma-\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log\big((1-p)(1+p)\big)\,dp.$ After some algebraic simplifications, we obtain (20) $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log 2\sigma-\frac{1}{u_2-u_1}\big[(1-u_2)(1-\log(1-u_2))-(1-u_1)(1-\log(1-u_1))\big]-\frac{1}{u_2-u_1}\big[(1+u_2)(\log(1+u_2)-1)-(1+u_1)(\log(1+u_1)-1)\big].$ Putting $u_1=u$ and $u_2=1$ in (20), we get the quantile residual entropy for the half logistic distribution, $\breve{H}(u)=2+\log 2\sigma-\frac{2\log 2}{1-u}+\frac{1+u}{1-u}\log(1+u)$, and for $u_1=0$ and $u_2=u$ we get the quantile past entropy for the half logistic distribution, $\bar{\breve{H}}(u)=2+\log u+\log 2\sigma-\frac{1+u}{u}\log(1+u)+\frac{1-u}{u}\log(1-u)$.

Example 3.5

If a random variable X has the log logistic distribution, the quantile function and quantile density function are given, respectively, by $Q(p)=\frac{1}{a}\left(\frac{p}{1-p}\right)^{1/b}$ and $q(p)=\frac{1}{ab}\left(\frac{p}{1-p}\right)^{1/b}\big(p(1-p)\big)^{-1}$. Then the quantile interval entropy for the log logistic distribution is $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{1}{ab}+\frac{1-b}{b}\cdot\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log p\,dp-\frac{1+b}{b}\cdot\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log(1-p)\,dp,$ which gives (21) $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{1}{ab}+\frac{1-b}{b}\cdot\frac{1}{u_2-u_1}\big[u_2(\log u_2-1)-u_1(\log u_1-1)\big]-\frac{1+b}{b}\cdot\frac{1}{u_2-u_1}\big[(1-u_2)(1-\log(1-u_2))-(1-u_1)(1-\log(1-u_1))\big].$ Substituting $u_1=u$ and $u_2=1$ in (21), we get the quantile residual entropy for the log logistic distribution, $\breve{H}(u)=2+\log\frac{1}{ab}+\frac{b-1}{b}\cdot\frac{u}{1-u}\log u-\frac{1}{b}\log(1-u)$, whereas the quantile past entropy, obtained by taking $u_1=0$ and $u_2=u$, is $\bar{\breve{H}}(u)=2+\log\frac{1}{ab}+\frac{1}{b}\log u+\frac{1+b}{b}\cdot\frac{1-u}{u}\log(1-u)$.

Example 3.6

Let X be a random variable having the exponential geometric distribution, with quantile and quantile density functions given, respectively, by $Q(p)=\frac{1}{\lambda}\log\frac{1-qp}{1-p}$ and $q(p)=\frac{1}{\lambda}\cdot\frac{1-q}{(1-qp)(1-p)}$. Then the quantile interval entropy for the exponential geometric distribution is $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{1-q}{\lambda}-\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log\big((1-qp)(1-p)\big)\,dp,$ which gives (22) $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{1-q}{\lambda}-\frac{1}{q(u_2-u_1)}\big[(1-qu_2)(1-\log(1-qu_2))-(1-qu_1)(1-\log(1-qu_1))\big]-\frac{1}{u_2-u_1}\big[(1-u_2)(1-\log(1-u_2))-(1-u_1)(1-\log(1-u_1))\big].$ In particular, putting $u_1=u$ and $u_2=1$ in (22), the quantile residual entropy for the exponential geometric distribution is $\breve{H}(u)=2+\log\frac{1-q}{\lambda}+\frac{(1-q)\log(1-q)-(1-qu)\log(1-qu)}{q(1-u)}$, and taking $u_1=0$ and $u_2=u$, the quantile past entropy is $\bar{\breve{H}}(u)=2+\log u+\log\frac{1-q}{\lambda}+\frac{1-qu}{qu}\log(1-qu)+\frac{1-u}{u}\log(1-u)$.

Example 3.7

If X is a random variable following the linear hazard rate distribution, the quantile function and quantile density function are given, respectively, by $Q(p)=\frac{1}{a+b}\log\frac{a+bp}{a(1+p)}$ and $q(p)=\frac{a+b}{b-a}\cdot\frac{1}{(a+bp)(1+p)}$. Hence the quantile interval entropy $\breve{H}(u_1,u_2)$ is $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log\frac{b+a}{b-a}-\frac{1}{u_2-u_1}\left[\int_{u_1}^{u_2}\log(1+p)\,dp+\int_{u_1}^{u_2}\log(a+bp)\,dp\right].$ After some algebraic simplifications, we have $\breve{H}(u_1,u_2)=2+\log(u_2-u_1)+\log\frac{b+a}{b-a}+\frac{1+u_1}{u_2-u_1}\log(1+u_1)-\frac{1+u_2}{u_2-u_1}\log(1+u_2)+\frac{1}{b(u_2-u_1)}\big[(a+bu_1)\log(a+bu_1)-(a+bu_2)\log(a+bu_2)\big].$

Example 3.8

Let X be a random variable following the Davies distribution (Hankin & Lee, Citation2006), which does not have closed-form expressions for its distribution and density functions; its quantile function and quantile density function are given, respectively, by $Q(p)=c\,p^{\lambda_1}(1-p)^{-\lambda_2}$ and $q(p)=c\,p^{\lambda_1-1}(1-p)^{-(\lambda_2+1)}\big(\lambda_2 p+\lambda_1(1-p)\big)$. Hence the quantile interval entropy (8) for the Davies distribution is $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log c+\frac{\lambda_1-1}{u_2-u_1}\int_{u_1}^{u_2}\log p\,dp-\frac{\lambda_2+1}{u_2-u_1}\int_{u_1}^{u_2}\log(1-p)\,dp+\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log\big(\lambda_2 p+\lambda_1(1-p)\big)\,dp.$ We get, after some algebraic calculations, (23) $\breve{H}(u_1,u_2)=(\lambda_2-\lambda_1+1)+\log c+\log(u_2-u_1)+\frac{\lambda_1-1}{u_2-u_1}\big[u_2\log u_2-u_1\log u_1\big]+\frac{\lambda_2+1}{u_2-u_1}\big[(1-u_2)\log(1-u_2)-(1-u_1)\log(1-u_1)\big]+\frac{1}{(\lambda_2-\lambda_1)(u_2-u_1)}\big[(\lambda_2 u_2+\lambda_1(1-u_2))\log(\lambda_2 u_2+\lambda_1(1-u_2))-(\lambda_2 u_1+\lambda_1(1-u_1))\log(\lambda_2 u_1+\lambda_1(1-u_1))\big].$ If we substitute $u_1=u$ and $u_2=1$, then (23) reduces to $\breve{H}(u)=(\lambda_2-\lambda_1+1)+\log c+\frac{\lambda_2\log\lambda_2}{(1-u)(\lambda_2-\lambda_1)}-\lambda_2\log(1-u)-\frac{(\lambda_1-1)\,u\log u}{1-u}-\frac{\lambda_1(1-u)+\lambda_2 u}{(1-u)(\lambda_2-\lambda_1)}\log\big(\lambda_1(1-u)+\lambda_2 u\big)$, the quantile residual entropy (6) for the Davies distribution. When we put $u_1=0$ and $u_2=u$, then (23) reduces to $\bar{\breve{H}}(u)=(\lambda_2-\lambda_1+1)+\log c+\lambda_1\log u+\frac{(\lambda_2+1)(1-u)}{u}\log(1-u)+\frac{\lambda_2 u+\lambda_1(1-u)}{(\lambda_2-\lambda_1)u}\log\big(\lambda_2 u+\lambda_1(1-u)\big)-\frac{\lambda_1\log\lambda_1}{(\lambda_2-\lambda_1)u}$, the quantile past entropy (7) for the Davies distribution.

Example 3.9

Let X be a random variable following Govindarajulu's distribution, which does not have closed-form expressions for its distribution and density functions; its quantile function and quantile density function are given, respectively, by $Q(u)=a\big\{(b+1)u^{b}-bu^{b+1}\big\}$ and $q(u)=ab(b+1)(1-u)u^{b-1}$, $0\le u\le 1$, $a,b>0$. Thus the quantile-based interval entropy (8) for Govindarajulu's distribution is (24) $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log ab(b+1)+\frac{1}{u_2-u_1}\int_{u_1}^{u_2}\log(1-p)\,dp+\frac{b-1}{u_2-u_1}\int_{u_1}^{u_2}\log p\,dp,$ which gives $\breve{H}(u_1,u_2)=\log(u_2-u_1)+\log ab(b+1)-b+\frac{1}{u_2-u_1}\big[(1-u_1)\log(1-u_1)-(1-u_2)\log(1-u_2)\big]+\frac{b-1}{u_2-u_1}\big[u_2\log u_2-u_1\log u_1\big].$

Table 3 provides the relationships between the quantile interval entropy $\breve{H}(u_1,u_2)$, the quantile conditional expectation (25) $m(u_1,u_2)=E(X\mid Q(u_1)<X<Q(u_2))=\frac{1}{u_2-u_1}\int_{u_1}^{u_2}Q(u)\,du,$ and the generalised hazard quantile functions $\breve{h}_i(u_1,u_2)$, $i=1,2$, for some commonly used distributions.

Table 3. Relationships between $\breve{h}_i(u_1,u_2)$, $i=1,2$, and $\breve{H}(u_1,u_2)$.

4. Characterisation results

In the literature, the problem of characterising probability distributions has been investigated by many researchers. The standard practice in modelling statistical data is either to derive the appropriate model based on the physical properties of the system or to choose a flexible family of distributions and then find a member of the family that is appropriate to the data. In both situations, it is useful to have characterisation theorems that explain the distribution through important measures or indices. In this section, we discuss some characterisation theorems for lifetime distributions, using important concepts such as the GHQF, the GVQF and the quantile-based conditional Shannon measure of uncertainty.

Theorem 4.1

Let X be a random variable defined on $(0,\infty)$ with quantile function Q(u). Then the relationship (26) $G(u_1,u_2)=\frac{1}{k}\big[(1+CQ(u_1))\,\breve{h}_1(u_1,u_2)\log Q(u_1)-(1+CQ(u_2))\,\breve{h}_2(u_1,u_2)\log Q(u_2)+R(u_1,u_2)+C\big],$ where k and C are constants, holds for all $(u_1,u_2)\in D$ if and only if:

  1. $C=0$: X has the exponential distribution with quantile function $Q(u)=-\frac{1}{\lambda}\log(1-u)$,

  2. $C>0$: X has the Pareto distribution with quantile function $Q(u)=a(1-u)^{-1/b}$, and

  3. $C<0$: X has the finite range distribution with quantile function $Q(u)=a\big(1-(1-u)^{1/b}\big)$.

Proof.

The 'if' part is straightforward from Table 1. To prove the converse, let us assume that (26) holds. Using (15), (9) and (16) in (26), we obtain (27) $\int_{u_1}^{u_2}f(Q(p))\,q(p)\log Q(p)\,dp=\frac{1}{k}\left[(1+CQ(u_1))\,f(Q(u_1))\log Q(u_1)-(1+CQ(u_2))\,f(Q(u_2))\log Q(u_2)+\int_{u_1}^{u_2}\frac{dp}{Q(p)}+C(u_2-u_1)\right].$ Differentiating (27) with respect to $u_i$, $i=1,2$, we get, after some algebraic calculations, $\frac{f'(Q(u_i))}{f(Q(u_i))}=-\frac{k+C}{1+CQ(u_i)}$, $i=1,2$, that is, $f'(Q(u))/f(Q(u))=-(k+C)/(1+CQ(u))$, which gives the required result.

Theorem 4.2

For a non-negative random variable X, the relation (28) $\breve{M}(u_1,u_2)-\lambda\,m(u_1,u_2)=k\quad(k\ \text{a constant})$ holds for all $(u_1,u_2)\in D$ if and only if X follows the exponential distribution with quantile function $Q(u)=-\frac{1}{\lambda}\log(1-u)$.

Proof.

The 'if' part is straightforward from Table 2. To prove the converse, let us assume that (28) holds. Then, using (17) and (25), (28) becomes (29) $\int_{u_1}^{u_2}\log q(p)\,dp-\lambda\int_{u_1}^{u_2}Q(p)\,dp=k(u_2-u_1).$ Differentiating (29) with respect to $u_i$, $i=1,2$, we get, after some algebraic calculations, $f(Q(u_i))=Ke^{-\lambda Q(u_i)}$, $i=1,2$, with $K>0$ a constant, that is, $f(Q(u))=Ke^{-\lambda Q(u)}$, which characterises the exponential distribution.
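Theorem 4.2 is also easy to probe numerically: for the exponential quantile function, $\breve{M}(u_1,u_2)-\lambda\,m(u_1,u_2)$ should return the same constant ($-\log\lambda$) for every window $(u_1,u_2)$. A sketch with illustrative values:

```python
# Numerical illustration of Theorem 4.2 for Q(u) = -(1/lam) * log(1-u).
import numpy as np
from scipy import integrate

lam = 3.0
Q     = lambda p: -np.log(1.0 - p) / lam
log_q = lambda p: -np.log(lam) - np.log(1.0 - p)

def constant(u1, u2):
    d = u2 - u1
    M = integrate.quad(log_q, u1, u2)[0] / d        # (17)
    m = integrate.quad(Q, u1, u2)[0] / d            # (25)
    return M - lam * m                              # left side of (28)

print(constant(0.1, 0.4), constant(0.3, 0.95))      # both ~ -log(3)
```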

Theorem 4.3

Let X be a non-negative random variable with quantile function Q(u), and let k and $c>0$ be constants. A relationship of the form (30) $\breve{M}(u_1,u_2)-(c+1)\,G(u_1,u_2)=k$ holds for all $0<u_1<u_2<1$ with $Q(u_1)<Q(u_2)$ if and only if X follows the Pareto-I distribution with quantile function $Q(u)=a(1-u)^{-1/c}$, $a>0$.

Proof.

The 'if' part is straightforward. To prove the converse, let us assume that (30) holds. Then, using (15) and (17), we have (31) $\int_{u_1}^{u_2}\log q(p)\,dp-(c+1)\int_{u_1}^{u_2}\log Q(p)\,dp=k(u_2-u_1).$ Differentiating (31) with respect to $u_i$, $i=1,2$, we get, after some algebraic calculations, $f(Q(u_i))=k_1\,(Q(u_i))^{-(c+1)}$, $i=1,2$, with $k_1$ a constant, that is, $f(Q(u))=k_1(Q(u))^{-(c+1)}$, which gives the required result.

Next, we state the characterisation of the power distribution. The proof is similar to that of Theorem 4.3 and hence omitted.

Theorem 4.4

Let X be a non-negative random variable with quantile function Q(u), and let k and $c>1$ be constants. A relationship of the form $\breve{M}(u_1,u_2)+(c-1)\,G(u_1,u_2)=k$ holds for all $0<u_1<u_2<1$ with $Q(u_1)<Q(u_2)$ if and only if X follows the power distribution with quantile function $Q(u)=au^{1/c}$.

Theorem 4.5

Let X be a random variable defined on $(0,\infty)$ with quantile function Q(u). Then X follows the one-parameter log-exponential family, with probability density function of the form $f(x)=x^{\theta}C(x)/A(\theta)$, if and only if (32) $\breve{M}(u_1,u_2)=\log A(\theta)-\theta\,G(u_1,u_2)-m_c(u_1,u_2),$ where $m_c(u_1,u_2)=E[\log C(X)\mid Q(u_1)<X<Q(u_2)]$, for all $(u_1,u_2)\in D$.

Proof.

The 'if' part is straightforward from Table 1. To prove the converse, let us assume that (32) holds. Using (15) and (17) in (32), we have (33) $-\int_{u_1}^{u_2}\log f(Q(p))\,dp=(u_2-u_1)\log A(\theta)-\theta\int_{u_1}^{u_2}\log Q(p)\,dp-\int_{u_1}^{u_2}\log C(Q(p))\,dp.$ Differentiating (33) with respect to $u_i$, $i=1,2$, we get, after some algebraic calculations, $f(Q(u_i))=\frac{(Q(u_i))^{\theta}\,C(Q(u_i))}{A(\theta)}$, $i=1,2$, that is, $f(Q(u))=(Q(u))^{\theta}C(Q(u))/A(\theta)$, which gives the required result.

We conclude this section by characterising exponential distribution. The proof is similar to that of Theorem 4.5 and hence omitted.

Theorem 4.6

Let X be a random variable on $(0,\infty)$ having an absolutely continuous quantile function Q(u). Then a relationship of the form $\breve{M}(u_1,u_2)=-\log b(\theta)-m(u_1,u_2)\log\theta-m_a(u_1,u_2)$, where $m_a(u_1,u_2)=E[\log a(X)\mid Q(u_1)<X<Q(u_2)]$, holds for all $(u_1,u_2)\in D$ if and only if X follows the one-parameter exponential family with probability density function of the form $f(x)=a(x)\,b(\theta)\,\theta^{x}$.

4.1. Exploratory data analysis using the Q-Q plot

The quantile-quantile (Q-Q) plot is a diagnostic tool widely used to assess distributional similarities and differences between two independent univariate samples. It is also a popular device for checking the appropriateness of a specified probability distribution for a given univariate data set. The advantages of the Q-Q plot are that the sample sizes do not need to be equal and that many distributional aspects can be tested simultaneously: shifts in location, shifts in scale, changes in symmetry, and the presence of outliers can all be detected from this plot. The Q-Q plot is similar to a probability plot; for a probability plot, the quantiles of one of the data samples are replaced with the quantiles of a theoretical distribution.

Example 4.1

Consider the rainfall data from seeded clouds and non-seeded clouds given below (numbers in parentheses indicate the number of repetitions of the value):

Rainfall from control clouds: 1, 4.9(2), 11.5, 17.3, 21.7, 24.4, 26.1, 26.3, 28.6, 29, 36.6, 41.1, 47.3, 68.5, 81.2, 87.0, 95, 147.8, 163, 243.3, 321.2, 345.5, 372.4, 830.1, 1202.6.

Rainfall from seeded clouds: 4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4, 115.3, 118.3, 119, 129.6, 198.6, 200.7, 242.5, 255.0, 274.7(2), 302.8, 334.1, 430.0, 489.1, 703.4, 978, 1656, 1697.8, 2745.6. (Source: Simpson, Olsen, & Eden, Citation1975)

The conclusions of our data analysis are as follows:

Comparison of location (mean, median): seeded rainfall has a greater location than non-seeded rainfall.

Comparison of scale: the interquartile range indicates that the variability of seeded rainfall is greater than that of non-seeded rainfall.

Side-by-side box plots: the reader can draw these plots using Table 4; non-seeded rainfall has a skewed distribution, whereas seeded rainfall is symmetric.

Table 4. Numerical summary of rainfall for seeded clouds and non-seeded clouds.

Identification of probability laws: both the seeded and the non-seeded rainfall data indicate a fit by the exponential distribution.

The Q-Q plot is a graphical technique for determining whether two data sets come from populations with a common distribution: it plots the quantiles of the first data set against the quantiles of the second. In general, for computing a normal probability plot, the standard normal table can be used to approximate the normal quantiles: we look up the Z value corresponding to $p_i$ for $i=1,2,\ldots,n$ and then plot the ordered data against the corresponding Z values. Table 5 displays the quantiles for the rainfall data from seeded clouds together with the corresponding normal quantiles approximated from a standard normal table; Table 6 gives the same for non-seeded clouds.

Table 5. Quantile plot table of seeded rainfall.

Table 6. Quantile plot table of non-seeded rainfall.
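The following is a sketch of the constructions described above for the rainfall data. The plotting positions $p_i=(i-0.5)/n$ and scipy's norm.ppf are one common choice for approximating the normal quantiles; they are assumptions here and need not match the convention used to build Tables 5 and 6.

```python
# Two-sample Q-Q plot of the rainfall data, plus a normal probability plot
# of the seeded sample; plotting positions are an assumed convention.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

control = np.sort([1, 4.9, 4.9, 11.5, 17.3, 21.7, 24.4, 26.1, 26.3, 28.6,
                   29, 36.6, 41.1, 47.3, 68.5, 81.2, 87.0, 95, 147.8, 163,
                   243.3, 321.2, 345.5, 372.4, 830.1, 1202.6])
seeded = np.sort([4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4, 115.3, 118.3,
                  119, 129.6, 198.6, 200.7, 242.5, 255.0, 274.7, 274.7,
                  302.8, 334.1, 430.0, 489.1, 703.4, 978, 1656, 1697.8,
                  2745.6])

n = len(control)                          # both samples have n = 26
p = (np.arange(1, n + 1) - 0.5) / n       # plotting positions p_i
z = stats.norm.ppf(p)                     # approximate normal quantiles

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.plot(control, seeded, "o")            # two-sample Q-Q plot
ax1.set(xlabel="control quantiles", ylabel="seeded quantiles")
ax2.plot(z, seeded, "o")                  # ordered data against Z values
ax2.set(xlabel="normal quantile z", ylabel="ordered seeded rainfall")
plt.tight_layout()
plt.show()
```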

5. Conclusion

Recently, there has been great interest in the study of information measures based on quantile functions, namely quantile entropy. When a system has a lifetime between two time points $(t_1,t_2)$, the interval entropy plays an important role in the fields of reliability theory and survival analysis. The present work introduced an alternative approach to the interval entropy measure using quantile functions. The proposed measures may help information theorists and reliability analysts to study the various characteristics of a system when it fails between two time instants. The results presented here generalise the related existing results on quantile entropy for residual and past lifetime random variables.

Acknowledgments

The authors would like to express their gratitude to the reviewers and the editor-in-chief for their valuable comments, which have considerably improved the earlier version of the article.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The first author wishes to acknowledge the Science and Engineering Research Board (SERB), Government of India, for the financial assistance (Ref. No. ECR/2017/001987) for carrying out this research work.

Notes on contributors

Vikas Kumar

Vikas Kumar obtained his M.Sc. and M.Phil. degrees in Applied Mathematics from IIT Roorkee and ISM University, Dhanbad, in 2005 and 2007, respectively. He received his Ph.D. degree in Mathematics from the University of Delhi. Currently, he is an Assistant Professor in Mathematics at UIET, M. D. University, Rohtak, India. His research interests are information theory and its applications and mathematical modelling. He has published research articles in reputed international journals of mathematics and statistical sciences.

Gulshan Taneja

Dr. Gulshan Taneja is working as a Professor in Mathematics at M. D. University, Rohtak. He has about 25 years of experience teaching mathematics and statistics at both UG and PG levels. Dr. Taneja has published more than sixty research papers in the fields of information theory and reliability theory in journals of international repute and is a member of various national and international societies.

Samsher Chhoker

Samsher Chhoker obtained his M.Sc. and M.Phil. degrees in Mathematics from M. D. University, Rohtak, in 2015 and 2017, respectively. He is pursuing his Ph.D. degree in Mathematics at M. D. University, Rohtak. Currently, he is an Assistant Professor in Mathematics at Government PG Nehru College, Jhajjar, India. His research interests are information theory and mathematical modelling.

References

  • Baratpour, S., & Khammar, A. H. (2018). A quantile-based generalized dynamic cumulative measure of entropy. Communications in Statistics – Theory and Methods, 47(13), 3104–3117. doi: 10.1080/03610926.2017.1348520
  • Di Crescenzo, A., & Longobardi, M. (2002). Entropy-based measure of uncertainty in past lifetime distributions. Journal of Applied Probability, 39, 434–440. doi: 10.1239/jap/1025131441
  • Di Crescenzo, A., & Longobardi, M. (2004). A measure of discrimination between past lifetime distributions. Statistics & Probability Letters, 67, 173–182. doi: 10.1016/j.spl.2003.11.019
  • Ebrahimi, N. (1996). How to measure uncertainty in the residual life distributions. Sankhya Series A, 58, 48–57.
  • Gilchrist, W. (2000). Statistical modelling with quantile functions. Boca Raton, FL: Chapman and Hall/CRC.
  • Gong, W., Yang, D., Gupta, H. V., & Nearing, G. (2014). Estimating information entropy for hydrological data: One-dimensional case. Water Resources Research, 50, 5003–5018. doi: 10.1002/2014WR015874
  • Hankin, R. K. S., & Lee, A. (2006). A new family of non-negative distributions. Australian and New Zealand Journal of Statistics, 48, 67–78. doi: 10.1111/j.1467-842X.2006.00426.x
  • Kayal, S., & Moharana, R. (2016). Some Results on a doubly truncated generalized discrimination measure. Applications of Mathematics, 61, 585–605. doi: 10.1007/s10492-016-0148-4
  • Khorashadizadeh, M., Rezaei Roknabadi, A. H., & Mohtashami Borzadaran, G. R. (2013). Doubly truncated (interval) cumulative residual and past entropy. Statistics & Probability Letters, 83, 1464–1471. doi: 10.1016/j.spl.2013.01.033
  • Kumar, V. R. (2018). A quantile approach of Tsallis entropy for order statistics. Physica A: Statistical Mechanics and its Applications, 503, 916–928. doi: 10.1016/j.physa.2018.03.025
  • Kundu, C. (2017). On weighted measure of inaccuracy for doubly truncated random variables. Communications in Statistics – Theory and Methods, 46, 3135–3147. doi: 10.1080/03610926.2015.1056365
  • Misagh, F., & Yari, G. (2011). On weighted interval entropy. Statistics & Probability Letters, 29, 167–176.
  • Misagh, F., & Yari, G. H. (2012). Interval entropy and Informative Distance. Entropy, 14, 480–490. doi: 10.3390/e14030480
  • Nair, K. R. M., & Rajesh, G. (2000). Geometric vitality function and its applications to reliability. IAPQR Transactions, 25, 1–8.
  • Nair, N. U., Sankaran, P. G., & Balakrishnan, N. (2013). Quantile-based reliability analysis. Statistics for industry and technology. New York, NY: Springer Science+Business Media.
  • Nair, N. U., Sankaran, P. G., & Vinesh Kumar, B. (2012). Modeling lifetimes by quantile functions using Parzen's score function. Statistics-A Journal of Theoretical and Applied Statistics, 46(6), 799–811.
  • Nanda, A. K., Sankaran, P. G., & Sunoj, S. M. (2014). Renyi's residual entropy: A quantile approach. Statistics & Probability Letters, 85, 114–121. doi: 10.1016/j.spl.2013.11.016
  • Qiu, G. (2018). Further results on the residual quantile entropy. Communications in Statistics – Theory and Methods, 47(13), 3092–3103. doi: 10.1080/03610926.2017.1348519
  • Ruiz, J. M., & Navarro, J. (1996). Characterizations based on conditional expectations of the doubled truncated distribution. Annals of the Institute of Statistical Mathematics, 48(3), 563–572. doi: 10.1007/BF00050855
  • Sankaran, P. G., & Gupta, R. P. (1999). Characterization of lifetime distributions using measure of uncertainty. Calcutta Statistical Association Bulletin, 49, 195–196. doi: 10.1177/0008068319990303
  • Sankaran, P. G., & Sunoj, S. M. (2004). Identification of models using failure rate and mean residual life of doubly truncated random variables. Statistical Papers, 45, 97–109. doi: 10.1007/BF02778272
  • Sankaran, P. G., & Sunoj, S. M. (2017). Quantile based cumulative entropies. Communications in Statistics – Theory and Methods, 46(2), 805–814. doi: 10.1080/03610926.2015.1006779
  • Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x
  • Simpson, J., Olsen, A., & Eden, J. (1975). A Bayesian analysis of a multiplicative treatment effect in weather modification. Technometrics, 17, 161–166. doi: 10.2307/1268346
  • Sunoj, S. M., & Sankaran, P. G. (2012). Quantile based entropy function. Statistics & Probability Letters, 82, 1049–1053. doi: 10.1016/j.spl.2012.02.005
  • Sunoj, S. M., Sankaran, P. G., & Maya, S. S. (2009). Characterization of life distributions using conditional expectations of doubly (interval) truncated random variables. Communications in Statistics – Theory and Methods, 38(9), 1441–1452. doi: 10.1080/03610920802455001
  • Sunoj, S. M., Sankaran, P. G., & Nanda, A. K. (2013). Quantile based entropy function in past lifetime. Statistics & Probability Letters, 83, 366–372. doi: 10.1016/j.spl.2012.09.016
  • van Staden, P. J., & Loots, M. T. (2009). L-moment estimation for the generalized lambda distribution. Third annual ASEARC conference, Newcastle.
