Statistics
A Journal of Theoretical and Applied Statistics
Research Article

On classes of consistent tests for the Type I Pareto distribution based on a characterization involving order statistics

Received 15 Mar 2023, Accepted 14 Apr 2024, Published online: 02 May 2024

Abstract

We propose new classes of goodness-of-fit tests for the Pareto Type I distribution. These tests are based on a characterization of the Pareto distribution involving order statistics. We derive the limiting null distribution of the tests and also show that the tests are consistent against fixed alternatives. The finite-sample performance of the newly proposed tests is evaluated and compared to that of some existing tests, and the new tests are found to be competitive in terms of power. The paper concludes with an application to a real-world data set, namely the earnings of the 22 highest paid participants in the inaugural season of LIV golf.

Mathematics Subject Classifications:

1. Introduction

Many real-world phenomena exhibit measurements with heavy-tailed behaviour and, as such, lend themselves to being modelled using the Pareto distribution. First developed by the economist and sociologist Vilfredo Pareto [Citation1], the heavy-tailed Pareto distribution was initially used to model the unequal distribution of wealth in a population, but has found application in a number of other scenarios. Examples of situations modelled using this distribution include studies involving insurance claim premiums [Citation2–4], studies involving medical insurance claims [Citation5], and studies investigating gestation duration [Citation6,Citation7], to name a few (see [Citation8]).

Due to its popularity, this distribution has enjoyed the attention of numerous researchers, resulting in a number of different versions of the Pareto distribution, including the Types I, II, III, IV, and Generalised Pareto distributions. However, in this paper we will focus only on the Type I Pareto distribution, which has cumulative distribution function (CDF) and probability density function (PDF) respectively given by \[ F_{\beta,\sigma}(x)=\begin{cases}1-\left(\dfrac{x}{\sigma}\right)^{-\beta}, & x\ge\sigma,\\ 0, & x<\sigma,\end{cases}\qquad\text{and}\qquad f_{\beta,\sigma}(x)=\begin{cases}\beta\sigma^{\beta}x^{-\beta-1}, & x\ge\sigma,\\ 0, & x<\sigma,\end{cases} \] where $\sigma>0$ and $\beta>0$ denote respectively the scale and the shape parameters. The Type I version of the Pareto distribution with $\sigma=1$ has a number of practical applications, as shown in a variety of old and new research works. In the earlier works, for example, Fisk [Citation9] and Steindl [Citation10] cited several examples of economic data which follow the Type I Pareto distribution, whereas Berger and Mandelbrot [Citation11] proposed using the Type I Pareto distribution in studies of error clusters in communication circuits. It has also been shown to be useful in applications where service times and queuing systems are modelled, as discussed in Harris [Citation12]. More recent applications include using the Type I Pareto distribution to model the wealth distribution in the Forbes 400 list [Citation13], and modelling the city size distribution in the United States [Citation14].

Hence, under these considerations, it is important to determine whether a realized data set, $x_1,\dots,x_n$, from a non-negative random variable $X$ with distribution function $F(x)$, is well described by a Type I Pareto distribution with parameters $\beta$ and $\sigma$, denoted here by $P(\beta,\sigma)$. In this paper we will therefore consider using a goodness-of-fit (GOF) test to evaluate the following hypotheses regarding these data: \[ \begin{aligned} H_0&: X\ \text{follows a}\ P(\beta,\sigma)\ \text{distribution; i.e.,}\ \exists\,\beta,\sigma>0\ \text{such that}\ F(x)=F_{\beta,\sigma}(x),\ \forall x\in[\sigma,\infty),\quad\text{and}\\ H_1&: X\ \text{does not follow a}\ P(\beta,\sigma)\ \text{distribution; i.e.,}\ \nexists\,\beta,\sigma>0\ \text{such that}\ F(x)=F_{\beta,\sigma}(x),\ \forall x\in[\sigma,\infty). \end{aligned}\tag{1} \] Before discussing the existing goodness-of-fit tests for the Pareto distribution, we briefly introduce the Pareto Types II, III, and IV distributions for completeness. If $X\sim P(\beta,\sigma)$, then the random variable $Z=X+\mu-\sigma$ has a Pareto Type II distribution, with CDF \[ G_{\beta,\sigma,\mu}(x)=1-\left[1+\left(\frac{x-\mu}{\sigma}\right)\right]^{-\beta},\quad x\ge\mu, \tag{2} \] where $\mu\in\mathbb{R}$ is a location parameter. Setting $\mu=0$ in (2) yields a special case of the Pareto Type II distribution, sometimes called the Lomax distribution [Citation15], which often appears in a reparameterised form with $\beta=1/\xi$ and $\sigma=\delta/\xi$ (see [Citation16] as well as Remark 1 of [Citation17]). The Pareto Type IV distribution includes a location parameter, $\mu\in\mathbb{R}$, scale parameter, $\sigma>0$, inequality parameter, $\gamma>0$, and shape parameter, $\beta>0$, and has CDF \[ H_{\mu,\sigma,\gamma,\beta}(x)=1-\left[1+\left(\frac{x-\mu}{\sigma}\right)^{1/\gamma}\right]^{-\beta},\quad x\ge\mu. \tag{3} \] Setting $\beta=1$ in (3) results in the CDF of the Pareto Type III distribution. Note that the CDF of the Pareto Type I distribution can also be recovered from (3) by setting $\mu=\sigma$ and $\gamma=1$, while setting $\gamma=1$ alone in (3) gives the CDF of the Pareto Type II distribution. For a full discussion of interesting properties of these distributions, as well as the relationships between the Pareto distribution and other distributions, the interested reader is referred to the monograph by Arnold [Citation8].

Several tests have been suggested to check the goodness-of-fit of Pareto distributions; the most commonly used formal goodness-of-fit tests for Pareto distributions are those based on the empirical distribution function (EDF), such as the Kolmogorov–Smirnov (KS) test, the Cramér–von Mises (CvM) test, and the Anderson–Darling (AD) test. These tests compare the empirical distribution of the data with the hypothesized theoretical Pareto distribution and assess the likelihood that the data were generated by a Pareto distribution. The results of these tests can be used to determine whether the Pareto distribution is a good fit for the data, or whether another distribution may be more appropriate. Goodness-of-fit tests for the Pareto distribution have been discussed in Beirlant et al. [Citation18], Gulati and Shapiro [Citation19], Martynov [Citation20], Rizzo [Citation21], and Falk et al. [Citation16], among others. In Chu et al. [Citation22] a review of established tests for the Generalized Pareto, Pareto Type I and Pareto Type II distributions is provided, whereas goodness-of-fit tests based on a variety of different characterizations of the Pareto distribution can be found in Obradović et al. [Citation23], Obradović [Citation24], Volkova [Citation25], and Milošević and Obradović [Citation26]. Ndwandwe et al. [Citation27] provide an extensive review of the existing goodness-of-fit tests for the Pareto Type I distribution, focussing on the myriad characterizations of this distribution. Although tests specifically developed for the Pareto Types II, III, and IV distributions can potentially be used to test for the Type I distribution (by exploiting relationships between these distributions), these tests will not be considered in the Monte Carlo study presented in this paper. In what follows we will refer to the Pareto Type I distribution as just the Pareto distribution.

In this paper we propose new classes of tests for the Pareto distribution. These tests are based on a characterization of the Pareto distribution involving order statistics. In Section 2 we consider the case of the Type I Pareto distribution with unit scale parameter: we present the characterization, introduce the new test statistics, derive the limiting null distribution of the tests, and show that they are consistent against fixed alternatives. Section 3 is devoted to a discussion of the general Type I Pareto distribution. In Section 4 we compare the powers of our newly proposed tests with some existing tests (in the case of the general Type I Pareto distribution), while Section 5 illustrates the use of the tests in order to test the hypothesis that the 2022 season's earnings of LIV golfers (exceeding some known threshold) follow a Pareto distribution. The paper concludes in Section 6.

2. The Type I Pareto distribution with unit scale parameter

In this section, we study the case of a Type I Pareto distribution with unit scale parameter, that is, P(β,1), β>0.

2.1. The test statistic

Consider the following characterization of the Pareto distribution $P(\beta,1)$, discussed in Allison et al. [Citation28].

Characterisation. Let $X_1,\dots,X_n$ be independent copies of a non-negative random variable $X$ with common density function $f$ and cumulative distribution function $F$. Let $m$ be an integer such that $2\le m\le n$. Then the random variables $X_1^{1/m}$ and $X_{(1)}=\min\{X_1,\dots,X_m\}$ have the same distribution if and only if $F(x)=F_{\beta}(x)=F_{\beta,1}(x)$, $x\in\mathbb{R}$, for some $\beta>0$.
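The characterization lends itself to a quick numerical sanity check. The sketch below (our own illustration in Python; the paper's computations were done in R) simulates Pareto $P(\beta,1)$ data and compares the empirical distributions of $X_1^{1/m}$ and $\min\{X_1,\dots,X_m\}$, both of which should follow $P(m\beta,1)$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
beta, m, n = 2.0, 3, 50_000

# Pareto(beta, 1) variates via inverse transform: X = U**(-1/beta)
u = rng.random((n, m))
x = u ** (-1.0 / beta)

a = x[:, 0] ** (1.0 / m)           # X_1^{1/m}
b = x.min(axis=1)                  # min(X_1, ..., X_m)

# Under P(beta, 1), both follow P(m*beta, 1); a two-sample KS test
# should not detect a difference between the two samples.
ks = stats.ks_2samp(a, b)
print(ks.statistic)
```

Under an alternative distribution the two empirical laws drift apart, which is exactly the discrepancy the tests below exploit.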

From this characterization we have the following theorem.

Theorem 2.1

Let $X_1,\dots,X_n$ be independent copies of a non-negative random variable $X$ with common density function $f$ and cumulative distribution function $F$. Let $m$ be an integer such that $2\le m\le n$. Then the random variables $X_1^{1/m}$ and $\min\{X_1,\dots,X_m\}$ have the same distribution if and only if, for any $t\in\mathbb{R}$, \[ E\left\{\frac{1}{m}\exp\left(itX^{1/m}\right)-[1-F(X)]^{m-1}\exp(itX)\right\}=0. \tag{4} \]

Proof.

Let $2\le m\le n$. It is well known that $X_{(1)}=\min\{X_1,\dots,X_m\}$ has density function \[ \tilde f(x)=m[1-F(x)]^{m-1}f(x),\quad x\in\mathbb{R}. \] It is then clear that the random variables $X_1^{1/m}$ and $X_{(1)}$ have the same distribution if and only if they have the same characteristic functions, that is, if and only if for any $t\in\mathbb{R}$, \[ \int_{\mathbb{R}}\exp\left(itx^{1/m}\right)f(x)\,dx=m\int_{\mathbb{R}}\exp(itx)[1-F(x)]^{m-1}f(x)\,dx. \] It is easy to see that the above equality is equivalent to \[ \int_{\mathbb{R}}\left\{\frac{1}{m}\exp\left(itx^{1/m}\right)-\exp(itx)[1-F(x)]^{m-1}\right\}f(x)\,dx=0, \] which can be written as \[ E\left\{\frac{1}{m}\exp\left(itX^{1/m}\right)-[1-F(X)]^{m-1}\exp(itX)\right\}=0. \]

Let $w(\cdot)$ be any continuous function satisfying \[ w(t)>0\ \ \forall t\in\mathbb{R},\qquad \lim_{t\to\pm\infty}w(t)=0,\qquad 0<\int_{\mathbb{R}}w(t)\,dt<\infty,\qquad\text{and}\qquad \int_{\mathbb{R}}\zeta(tx)w(x)\,dx=0,\ \ \forall t\in\mathbb{R}, \tag{5} \] for any real-valued odd function $\zeta$.

From the characterization and Theorem 2.1 we have that (4) characterizes the $P(\beta,1)$ distribution. Thus, suitable normalizations of empirical versions of the expectation in (4) can be used as a basis for the construction of tests for that particular Pareto distribution. To this end, we propose the following test statistic \[ T_{m,n,w}=\int_{\mathbb{R}}\left|S_{m,n,\hat\beta_n}(t)\right|^2 w(t)\,dt, \tag{6} \] where for all $t\in\mathbb{R}$, \[ S_{m,n,\beta}(t)=\frac{1}{\sqrt n}\sum_{j=1}^{n}\left[\frac{1}{m}\exp\left(itX_j^{1/m}\right)-X_j^{-\beta(m-1)}\exp(itX_j)\right], \] and $\hat\beta_n=n/\sum_{j=1}^n\log(X_j)$ is the maximum likelihood estimator of $\beta$.

Proposition 2.1

Let $2\le m\le n$. Then \[ \int_{\mathbb{R}}\left|S_{m,n,\hat\beta_n}(t)\right|^2 w(t)\,dt=\int_{\mathbb{R}}\left|S^{*}_{m,n,\hat\beta_n}(t)\right|^2 w(t)\,dt, \] where for all $t\in\mathbb{R}$, \[ S^{*}_{m,n,\beta}(t)=\frac{1}{\sqrt n}\sum_{j=1}^{n}\left[\frac{1}{m}\nu\left(X_j^{1/m};t\right)-X_j^{-\beta(m-1)}\nu(X_j;t)\right], \tag{7} \] and $\nu$ is the function defined on $\mathbb{R}\times\mathbb{R}$ by \[ \nu(x;t)=\cos(tx)+\sin(tx). \tag{8} \]

Proof.

Denote by $\bar z$ the conjugate of any complex number $z$. One has the following equalities: \[ \int_{\mathbb{R}}\left|S_{m,n,\hat\beta_n}(t)\right|^2w(t)\,dt=\int_{\mathbb{R}}S_{m,n,\hat\beta_n}(t)\overline{S_{m,n,\hat\beta_n}(t)}\,w(t)\,dt \] \[ =\frac{1}{n}\sum_{j=1}^n\sum_{k=1}^n\int_{\mathbb{R}}\left\{\frac{1}{m^2}\exp\left[it\left(X_j^{1/m}-X_k^{1/m}\right)\right]-\frac{1}{m}X_k^{-\hat\beta_n(m-1)}\exp\left[it\left(X_j^{1/m}-X_k\right)\right]-\frac{1}{m}X_j^{-\hat\beta_n(m-1)}\exp\left[it\left(X_k^{1/m}-X_j\right)\right]+(X_jX_k)^{-\hat\beta_n(m-1)}\exp\left[it(X_j-X_k)\right]\right\}w(t)\,dt. \] By (5), since $x\mapsto\sin(x)$ is an odd function, one has that \[ \int_{\mathbb{R}}\left|S_{m,n,\hat\beta_n}(t)\right|^2w(t)\,dt=\frac{1}{n}\sum_{j=1}^n\sum_{k=1}^n\int_{\mathbb{R}}\left\{\frac{1}{m^2}\cos\left[t\left(X_j^{1/m}-X_k^{1/m}\right)\right]-\frac{1}{m}X_k^{-\hat\beta_n(m-1)}\cos\left[t\left(X_j^{1/m}-X_k\right)\right]-\frac{1}{m}X_j^{-\hat\beta_n(m-1)}\cos\left[t\left(X_k^{1/m}-X_j\right)\right]+(X_jX_k)^{-\hat\beta_n(m-1)}\cos\left[t(X_j-X_k)\right]\right\}w(t)\,dt. \] Using the identity $\cos(a-b)=\cos(a)\cos(b)+\sin(a)\sin(b)$ and the fact that the function $x\mapsto\cos(x)\sin(x)$ is odd, one finally has: \[ \int_{\mathbb{R}}\left|S_{m,n,\hat\beta_n}(t)\right|^2w(t)\,dt=\int_{\mathbb{R}}\left|S^{*}_{m,n,\hat\beta_n}(t)\right|^2w(t)\,dt. \]

For practical applications (and for the Monte Carlo study in Section 4) we will choose $w(t)=e^{-a|t|}$ and $w(t)=e^{-at^2}$, which lead to the following calculable forms of the test statistic: \[ T^{(1)}_{n,m,a}=\frac{1}{n}\sum_{j=1}^n\sum_{k=1}^n\left[\frac{1}{m^2}\frac{2a}{a^2+\left(X_j^{1/m}-X_k^{1/m}\right)^2}-\frac{1}{m}X_k^{-\hat\beta_n(m-1)}\frac{2a}{a^2+\left(X_j^{1/m}-X_k\right)^2}-\frac{1}{m}X_j^{-\hat\beta_n(m-1)}\frac{2a}{a^2+\left(X_k^{1/m}-X_j\right)^2}+X_j^{-\hat\beta_n(m-1)}X_k^{-\hat\beta_n(m-1)}\frac{2a}{a^2+(X_j-X_k)^2}\right] \] and \[ T^{(2)}_{n,m,a}=\frac{1}{n}\sqrt{\frac{\pi}{a}}\sum_{j=1}^n\sum_{k=1}^n\left[\frac{1}{m^2}\exp\left(-\frac{\left(X_j^{1/m}-X_k^{1/m}\right)^2}{4a}\right)-\frac{1}{m}X_k^{-\hat\beta_n(m-1)}\exp\left(-\frac{\left(X_j^{1/m}-X_k\right)^2}{4a}\right)-\frac{1}{m}X_j^{-\hat\beta_n(m-1)}\exp\left(-\frac{\left(X_k^{1/m}-X_j\right)^2}{4a}\right)+X_j^{-\hat\beta_n(m-1)}X_k^{-\hat\beta_n(m-1)}\exp\left(-\frac{(X_j-X_k)^2}{4a}\right)\right], \] respectively.
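The double sum for $T^{(1)}_{n,m,a}$ vectorizes naturally. The following Python sketch (ours, for illustration only; the function name `pareto_T1` is not from the paper) computes the statistic from a sample assumed to lie in $[1,\infty)$, using the closed form $\int_{\mathbb{R}}e^{-a|t|}\cos(bt)\,dt=2a/(a^2+b^2)$:

```python
import numpy as np

def pareto_T1(x, m=2, a=1.0):
    """Illustrative sketch of T^(1)_{n,m,a} with weight w(t) = exp(-a|t|),
    for data with unit scale parameter (support [1, inf))."""
    x = np.asarray(x, dtype=float)
    n = x.size
    beta_hat = n / np.log(x).sum()           # MLE of beta under P(beta, 1)
    xm = x ** (1.0 / m)                      # X_j^(1/m)
    p = x ** (-beta_hat * (m - 1))           # X_j^(-beta_hat (m-1))

    def L(d):                                # 2a / (a^2 + d^2), the weight integral
        return 2.0 * a / (a**2 + d**2)

    d1 = xm[:, None] - xm[None, :]           # X_j^(1/m) - X_k^(1/m)
    d2 = xm[:, None] - x[None, :]            # X_j^(1/m) - X_k
    d3 = x[:, None] - x[None, :]             # X_j - X_k
    T = (L(d1) / m**2
         - (p[None, :] * L(d2)) / m          # term with X_k^(-beta_hat(m-1))
         - (p[:, None] * L(d2.T)) / m        # symmetric term with X_j^(-beta_hat(m-1))
         + p[:, None] * p[None, :] * L(d3))
    return T.sum() / n

# toy usage on simulated Pareto(2, 1) data
rng = np.random.default_rng(0)
x = rng.random(200) ** (-0.5)
t1 = pareto_T1(x, m=2, a=1.0)
print(t1)
```

Since the statistic equals a weighted integral of $|S^{*}_{m,n,\hat\beta_n}|^2$, it is nonnegative by construction, which provides a useful check on any implementation.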

2.2. Large sample properties

In this section we study the asymptotic properties of the newly proposed tests under the null hypothesis as well as under fixed alternatives. It is well known that under $H_0$, the maximum likelihood estimator of $\beta$ is $\hat\beta_n=n/\sum_{i=1}^n\log(X_i)$ and $E(\hat\beta_n)=n\beta/(n-1)$.

From this, under $H_0$, one can deduce the following equalities: \[ \sqrt n\left(\hat\beta_n-\beta\right)=\sqrt n\left(\frac{n}{\sum_{j=1}^n\log(X_j)}-\beta\right)=\frac{1}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{\beta}-\log(X_j)\right]\left(\beta^2+\frac{\beta n}{\sum_{j=1}^n\log(X_j)}-\beta^2\right) \] \[ =\frac{\beta^2}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{\beta}-\log(X_j)\right]+\frac{1}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{\beta}-\log(X_j)\right]\left(\frac{\beta n}{\sum_{j=1}^n\log(X_j)}-\beta^2\right). \tag{9} \] Now, recall that under $H_0$, for any $j=1,\dots,n$, $\log(X_j)$ follows a gamma distribution with parameters $1$ and $\beta$, $\sum_{j=1}^n\log(X_j)$ follows a gamma distribution with parameters $n$ and $\beta$, and $1/\sum_{j=1}^n\log(X_j)$ follows an inverse gamma distribution with parameters $n$ and $\beta$. From this, by the Strong Law of Large Numbers (SLLN) and the Slutsky theorem, one has the almost sure convergence \[ \frac{\beta n}{\sum_{j=1}^n\log(X_j)}-\beta^2\longrightarrow0. \] Next, by the Central Limit Theorem, the following convergence in distribution holds \[ \frac{1}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{\beta}-\log(X_j)\right]\longrightarrow N, \] where $N$ is a zero-mean Gaussian random variable with variance $1/\beta^2$.

Collecting these two convergence results, one sees that the second term in (Equation9) is oP(1).

Denote by $C=C(\mathbb{R},\mathbb{R})$ the set of real-valued continuous functions defined on $\mathbb{R}$. Define on $C$ the metric \[ \rho(x,y)=\sum_{j=1}^{\infty}2^{-j}\frac{\rho_j(x,y)}{1+\rho_j(x,y)},\qquad \rho_j(x,y)=\sup_{|w|\le j}|x(w)-y(w)|,\quad j\ge1. \] It is well known (see, for example, [Citation29]) that, endowed with $\rho$, $C$ is a separable Fréchet space, and that convergence in this metric corresponds to uniform convergence on all compact sets. That is, for all $x,y\in C$, $\rho(x,y)=0\iff\rho_j(x,y)=0$ for all $j\ge1$, and for random elements $x_n$ and $y_n$ of $C$, $\rho(x_n,y_n)\xrightarrow{P}0\iff\rho_j(x_n,y_n)\xrightarrow{P}0$ for all $j\ge1$.

Proposition 2.2

Let $2\le m\le n$. Under $H_0$, in $C$, as $n$ tends to infinity, \[ S^{*}_{m,n,\hat\beta_n}(\cdot)=\tilde S_{m,n,\beta}(\cdot)+o_P(1), \] where for all $t\in\mathbb{R}$, \[ \tilde S_{m,n,\beta}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left\{\frac{1}{m}\nu\left(X_j^{1/m};t\right)-X_j^{-\beta(m-1)}\nu(X_j;t)+\left[\frac{1}{\beta}-\log(X_j)\right]\varphi(t)\right\}, \tag{10} \] with $\varphi$ standing for the function defined for any $t\in\mathbb{R}$ by \[ \varphi(t)=(m-1)\beta^4\int_1^{\infty}x^{-\beta m-2}\nu(x;t)\,dx. \tag{11} \]

Proof.

Write for all $t\in\mathbb{R}$, \[ S^{*}_{m,n,\hat\beta_n}(t)=S^{*}_{m,n,\beta}(t)+\hat S_{m,n}(t), \] where \[ \hat S_{m,n}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left(X_j^{-\beta(m-1)}-X_j^{-\hat\beta_n(m-1)}\right)\nu(X_j;t). \] Now, from a first-order Taylor expansion, one has \[ X_j^{-\beta(m-1)}-X_j^{-\hat\beta_n(m-1)}=(\hat\beta_n-\beta)(m-1)\beta X_j^{-\beta(m-1)-1}+o_P(1). \] Then, under $H_0$, for all $t\in\mathbb{R}$, one has the equalities: \[ \hat S_{m,n}(t)=(\hat\beta_n-\beta)(m-1)\beta\frac{1}{\sqrt n}\sum_{j=1}^nX_j^{-\beta(m-1)-1}\nu(X_j;t)+o_P(1)=\sqrt n(\hat\beta_n-\beta)\left[\frac{(m-1)\beta}{n}\sum_{j=1}^nX_j^{-\beta(m-1)-1}\nu(X_j;t)\right]+o_P(1). \] For all $t\in\mathbb{R}$, define $\varphi_n(t)$ as $\beta^2$ times the term in the brackets. By the law of large numbers, it is easy to see that $\varphi_n(t)$ converges point-wise to $\varphi(t)$ and that it is equicontinuous on every compact subset of $\mathbb{R}$. Therefore, it converges uniformly to $\varphi(t)$ on any compact subset $\Theta$ of $\mathbb{R}$. This result could also be obtained by applying Proposition 1 of Csörgő [Citation30].

As a consequence of the above convergence, uniformly in $t\in\Theta$, \[ \hat S_{m,n}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{\beta}-\log(X_j)\right]\varphi(t)+o_P(1), \] and uniformly in $t\in\Theta$, \[ S^{*}_{m,n,\hat\beta_n}(t)=S^{*}_{m,n,\beta}(t)+\hat S_{m,n}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left\{\frac{1}{m}\nu\left(X_j^{1/m};t\right)-X_j^{-\beta(m-1)}\nu(X_j;t)+\left[\frac{1}{\beta}-\log(X_j)\right]\varphi(t)\right\}+o_P(1)=\tilde S_{m,n,\beta}(t)+o_P(1). \] As this holds for arbitrary compact $\Theta$, one can conclude that as $n$ tends to infinity, in probability, \[ \rho\left(\tilde S_{m,n,\beta},S^{*}_{m,n,\hat\beta_n}\right)\longrightarrow0, \] which establishes the proposition.

Now, let $k$ be the real-valued function defined for any $(x,t)\in\mathbb{R}\times\mathbb{R}$ by \[ k(x,t)=\frac{1}{m}\nu\left(x^{1/m};t\right)-x^{-\beta(m-1)}\nu(x;t)+\left[\frac{1}{\beta}-\log(x)\right]\varphi(t). \] Also consider the function $K$ defined for any $t\in\mathbb{R}$ by \[ K(t)=\int_{\mathbb{R}}k(x,t)\,dF(x), \] and $F_n$ the empirical cumulative distribution function of $X_1,X_2,\dots,X_n$. Then one sees that \[ \varpi_n(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left[k(X_j,t)-K(t)\right]=\int_{\mathbb{R}}k(x,t)\,d\left\{\sqrt n\left[F_n(x)-F(x)\right]\right\}. \] Under suitable conditions (see [Citation30]), $\varpi_n$ converges weakly to a zero-mean Gaussian process with covariance kernel \[ E[\varpi_n(t)\varpi_n(s)]=\int_{\mathbb{R}}k(x,t)k(x,s)\,dF(x)-K(t)K(s). \]

Theorem 2.2

Let $m\in\{2,\dots,n\}$ be fixed. If $\beta>1/m$, then under $H_0$, $\tilde S_{m,n,\beta}(\cdot)$ converges weakly in $C$ to a zero-mean Gaussian process $S_m(\cdot)$ with covariance kernel $\Gamma_m$ defined for any $s,t\in\mathbb{R}$ by \[ \Gamma_m(s,t)=\beta\int_1^{\infty}\Bigg\{\frac{1}{m^2}\nu\left(x^{1/m};t\right)\nu\left(x^{1/m};s\right)+x^{-2\beta(m-1)}\nu(x;t)\nu(x;s)+\left(\frac{1}{\beta}-\log(x)\right)^2\varphi(t)\varphi(s)-\frac{1}{m}x^{-\beta(m-1)}\nu\left(x^{1/m};t\right)\nu(x;s)+\frac{1}{m}\left(\frac{1}{\beta}-\log(x)\right)\nu\left(x^{1/m};t\right)\varphi(s)-\frac{1}{m}x^{-\beta(m-1)}\nu(x;t)\nu\left(x^{1/m};s\right)-x^{-\beta(m-1)}\left(\frac{1}{\beta}-\log(x)\right)\nu(x;t)\varphi(s)+\frac{1}{m}\left(\frac{1}{\beta}-\log(x)\right)\varphi(t)\nu\left(x^{1/m};s\right)-x^{-\beta(m-1)}\left(\frac{1}{\beta}-\log(x)\right)\varphi(t)\nu(x;s)\Bigg\}x^{-\beta-1}\,dx. \tag{12} \]

Proof.

We prove this result by showing that S~m,n,β() is tight and its finite-dimensional distributions converge to those of any zero-mean Gaussian process with covariance kernel Γm. For this, we check the conditions (i), (i)* and (ii)* of Csörgö [Citation30].

As $k(x,t)$ is bounded with respect to $t$ on any compact subset of $\mathbb{R}$, so is $|k(x,t)|^{2+\delta}$, for any $\delta>0$. Thus, one can find $t_0\in\Theta$ such that for any $x\ge1$, \[ \sup_{t\in\Theta}|k(x,t)|^{2+\delta}=|k(x,t_0)|^{2+\delta}. \] Consequently, \[ \int_{\mathbb{R}}\sup_{t\in\Theta}|k(x,t)|^{2+\delta}\,dF(x)=\int_{\mathbb{R}}|k(x,t_0)|^{2+\delta}\,dF(x)\le \mathrm{Cst}\int_{\mathbb{R}}\left[1+x^{-(2+\delta)\beta(m-1)}+\left(\frac{1}{\beta}+\log(x)\right)^{2+\delta}\right]dF(x)<\infty. \] This establishes the convergence of the finite-dimensional distributions of $\tilde S_{m,n,\beta}$ and point (i)* of Csörgő [Citation30].

It remains to show (ii)*. One can write: \[ |k(x,t)-k(x,s)|\le\frac{1}{m}\left|\nu\left(x^{1/m};t\right)-\nu\left(x^{1/m};s\right)\right|+x^{-\beta(m-1)}|\nu(x;t)-\nu(x;s)|+\left|\frac{1}{\beta}-\log(x)\right||\varphi(t)-\varphi(s)|. \] By a first-order Taylor expansion of the function $t\mapsto\nu(x;t)=\cos(xt)+\sin(xt)$, one has: \[ |k(x,t)-k(x,s)|\le\frac{x^{1/m}}{m}|t-s|+x^{-\beta(m-1)+1}|t-s|+\mathrm{Cst}\left|\frac{1}{\beta}-\log(x)\right||t-s|. \] Then, one easily sees that for any $s,t\in\Theta$, one can find $\alpha\in(0,1]$ such that \[ |k(x,t)-k(x,s)|\le|t-s|^{\alpha}\,\mathrm{Cst}\left(\frac{x^{1/m}}{m}+x^{-\beta(m-1)+1}+\left|\frac{1}{\beta}-\log(x)\right|\right)=|t-s|^{\alpha}M(x,v(t,s)), \] where $v$ is any $\Theta$-valued function defined on $\Theta\times\Theta$, and for any $x\ge1$ and $t\in\Theta$, $M$ stands for the function defined as \[ M(x,t)=\mathrm{Cst}\left(\frac{x^{1/m}}{m}+x^{-\beta(m-1)+1}+\left|\frac{1}{\beta}-\log(x)\right|\right). \] Since $M(x,t)$ does not depend on $t$, $\sup_{t\in\Theta}M^2(x;t)=M^2(x;t)$ and by our assumptions \[ \int_{\mathbb{R}}\sup_{t\in\Theta}M^2(x;t)\,dF(x)=\int_{\mathbb{R}}M^2(x;t)\,dF(x)<\infty. \] From all this, by Csörgő [Citation30], one can conclude that \[ \int_{\mathbb{R}}k(x,t)\,d\left\{\sqrt n\left[F_n(x)-F(x)\right]\right\} \] converges weakly to a zero-mean Gaussian process with covariance kernel \[ \int_{\mathbb{R}}k(x,t)k(x,s)\,dF(x)-K(s)K(t). \] From the equality \[ \int_{\mathbb{R}}k(x,t)\,d\left\{\sqrt n\left[F_n(x)-F(x)\right]\right\}=\tilde S_{m,n,\beta}(t)-\sqrt nK(t), \tag{13} \] since under $H_0$, $K(t)=0$, $\tilde S_{m,n,\beta}(\cdot)$ converges weakly to the Gaussian process invoked in the theorem. This establishes the theorem.

Theorem 2.3

Let $m\in\{2,\dots,n\}$ be fixed. Assume that under $H_1$, $E[\log(X_1)]<\infty$. Then under $H_1$, for any $t\in\mathbb{R}$, $S_{m,n,\hat\beta_n}(t)$ has the same asymptotic behaviour as $\sqrt n\,Q(t)$, where \[ Q(t)=E\left[\frac{1}{m}\exp\left(itX_1^{1/m}\right)-[1-F(X_1)]^{m-1}\exp(itX_1)\right]+E\left\{\left[X_1^{-(m-1)/E[\log(X_1)]}-[1-F(X_1)]^{m-1}\right]\exp(itX_1)\right\},\quad t\in\mathbb{R}. \]

Proof.

Define, for any $t\in\mathbb{R}$, \[ \dot S_{m,n,F}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{m}\exp\left(itX_j^{1/m}\right)-[1-F(X_j)]^{m-1}\exp(itX_j)\right]. \] Now, adding and subtracting, one has \[ \frac{S_{m,n,\hat\beta_n}(t)}{\sqrt n}=\frac{\dot S_{m,n,F}(t)}{\sqrt n}+\frac{1}{n}\sum_{j=1}^n\left\{X_j^{-\hat\beta_n(m-1)}-[1-F(X_j)]^{m-1}\right\}\exp(itX_j)+o_P(1). \] Then, since $\hat\beta_n\to1/E[\log(X_1)]$ almost surely, by the SLLN the first and second terms on the right-hand side of the above equation converge point-wise respectively to \[ Q_1(t)=E\left[\frac{1}{m}\exp\left(itX_1^{1/m}\right)-[1-F(X_1)]^{m-1}\exp(itX_1)\right] \] and \[ Q_2(t)=E\left\{\left[X_1^{-(m-1)/E[\log(X_1)]}-[1-F(X_1)]^{m-1}\right]\exp(itX_1)\right\}. \] This establishes the theorem.

Theorem 2.4

Let $w$ be any function satisfying (5), and let $m\in\{2,\dots,n\}$ be fixed.

(i)

If $\beta>1/m$, then, under $H_0$, as $n$ tends to infinity, in distribution, \[ T_{m,n,w}\longrightarrow\int_{\mathbb{R}}S_m^2(t)w(t)\,dt, \] where $S_m$ is the Gaussian process invoked in Theorem 2.2.

(ii)

Under $H_1$, if $E[\log(X_1)]<\infty$, then as $n$ tends to infinity, $T_{m,n,w}\to\infty$ in probability.

Proof.

For Part (i), first observe that since the $X_i$'s are i.i.d., under $H_0$, one has by simple computations \[ E\left[\tilde S^2_{m,n,\beta}(t)\right]=E\left\{\frac{1}{m}\nu\left(X_1^{1/m};t\right)-X_1^{-\beta(m-1)}\nu(X_1;t)+\left[\frac{1}{\beta}-\log(X_1)\right]\varphi(t)\right\}^2. \] Denote by $B_r\subset\mathbb{R}$ a ball of radius $r$, and by $\bar B_r$ its complement in $\mathbb{R}$. Integrating both sides of the above equality with respect to $w(t)\,dt$ on $\bar B_r$, one has: \[ \int_{\bar B_r}E\left[\tilde S^2_{m,n,\beta}(t)\right]w(t)\,dt=\int_{\bar B_r}E\left\{\frac{1}{m}\nu\left(X_1^{1/m};t\right)-X_1^{-\beta(m-1)}\nu(X_1;t)+\left[\frac{1}{\beta}-\log(X_1)\right]\varphi(t)\right\}^2w(t)\,dt. \] Since the functions $t\mapsto\nu(x,t)$ and $t\mapsto\varphi(t)$ are bounded and $w(t)\to0$ as $t$ tends to infinity, it is easy to see that as $r$ tends to infinity, the right-hand side of the last equality converges to 0. From an adaptation of Theorem 2.3 of Bilodeau and Lafaye de Micheaux [Citation29] with $f(x)=x^2$ and $\alpha=1$, one has that, under $H_0$, as $n$ tends to infinity, \[ T_{m,n,w}\longrightarrow\int_{\mathbb{R}}S_m^2(t)w(t)\,dt. \] For the proof of the second part, it follows easily from Theorem 2.3 that for large values of $n$, \[ T_{m,n,w}\approx n\int_{\mathbb{R}}|Q(t)|^2w(t)\,dt. \] Under $H_1$, there exists a $t_1\in\mathbb{R}$ such that $Q_1(t_1)\ne0$. The quantity $mQ_2(t)$ is the difference between the characteristic function of $X_{(1)}=\min\{X_1,\dots,X_m\}$ under $H_0$ (with $F=F_{1/E[\log(X_1)]}$) and its characteristic function under $H_1$. Since $F\ne F_{1/E[\log(X_1)]}$ under $H_1$, $Q_2(t_2)\ne0$ for some $t_2\in\mathbb{R}$. This means that under $H_1$, there is some $t_0\in\mathbb{R}$ for which $Q(t_0)\ne0$. By the continuity of $|Q|$, $\int_{\mathbb{R}}|Q(t)|^2w(t)\,dt>0$. Whence, under $H_1$, as $n$ tends to infinity, in probability, $T_{m,n,w}\to\infty$.

Now, assume that $w(\cdot)$ is the density function (with respect to the Lebesgue measure) of some positive measure $\mu$ with support $\mathbb{R}$. Let $L^2=L^2(\mu)$ be the collection of functions $g$ defined on $\mathbb{R}$ such that $\int_{\mathbb{R}}g^2(t)\,d\mu(t)<\infty$. For $h_1,h_2,h\in L^2$, \[ \langle h_1,h_2\rangle=\int_{\mathbb{R}}h_1(t)h_2(t)\,d\mu(t)\qquad\text{and}\qquad \|h\|_{L^2}=\langle h,h\rangle^{1/2} \] respectively stand for the usual inner product and norm on $L^2$.

From our assumptions, it is easy to prove that the function $\Gamma_m(s,t)$ defined by (12) is a positive semidefinite kernel. Consequently, the integral operator $\Gamma_m$ defined on $L^2$ by \[ \Gamma_mh(t)=\int_{\mathbb{R}}\Gamma_m(s,t)h(s)\,d\mu(s),\quad t\in\mathbb{R}, \tag{14} \] admits eigenvalues $\xi_1,\xi_2,\dots$, sorted so that $\xi_1\ge\xi_2\ge\cdots\ge0$, and eigenfunctions $f_1,f_2,\dots$, which form an orthonormal basis for $L^2$.

Corollary 2.1

Under the conditions of Theorem 2.4, under $H_0$, $T_{m,n,w}$ asymptotically has the same distribution as $\sum_{j\ge1}\xi_j\chi^2_j$, where the $\xi_j$ are the eigenvalues of $\Gamma_m$ and the $\chi^2_j$, $j\ge1$, are i.i.d. random variables following a chi-squared distribution with one degree of freedom.

Proof.

The Gaussian process $S_m(\cdot)$ defined in Theorem 2.2 can be viewed as a random element of $L^2$. Its Karhunen–Loève representation is given by \[ S_m(t)=\sum_{j=1}^{\infty}G_jf_j(t),\quad t\in\mathbb{R}, \] where for all $j\ge1$, $G_j=\langle S_m(\cdot),f_j\rangle$ are independent zero-mean Gaussian random variables with variances $\xi_j$. Thus, $\|S_m(\cdot)\|^2_{L^2}=\sum_{j=1}^{\infty}G_j^2$. Recalling that $E(G_j^2)=\xi_j\ge0$, $j\ge1$, for nil $\xi_j$'s the corresponding $G_j$'s are nil in probability. For positive $\xi_j$'s, one can observe that $Z_j=G_j/\sqrt{\xi_j}$, $j\ge1$, are i.i.d. standard Gaussian random variables. Thus, \[ \int_{\mathbb{R}}S_m^2(t)\,d\mu(t)=\|S_m(\cdot)\|^2_{L^2}=\sum_{j=1}^{\infty}\xi_jZ_j^2. \]

One can approximate the distribution of $\sum_{j=1}^{\infty}\xi_j\chi_j^2$ by that of $\sum_{j=1}^{J}\xi_j\chi_j^2$ for any integer $J$ large enough. Since the $\xi_j$'s are unknown, they can be estimated by the $\hat\xi_j$'s from the integral equations \[ \hat\Gamma_{m,n}\hat f_j=\hat\xi_j\hat f_j,\quad j\ge1, \] where $\hat\Gamma_{m,n}$ is any consistent estimator of $\Gamma_m$. A natural estimator of $\Gamma_m(s,t)$ can be obtained by taking the empirical counterpart of the expression given in (12), in which $\beta$ is replaced by $\hat\beta_n$. Some indications on the computation of the cumulative distribution function of $\sum_{j=1}^{J}\hat\xi_j\chi_j^2$ can be found in Ngatchou-Wandji [Citation31] or in Fan et al. [Citation32]. We will not pursue this further in this paper.
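The eigenvalue approximation described above can be sketched numerically with a Nyström-type discretization. In the illustration below (ours; for brevity a simple Gaussian kernel stands in for the far more involved kernel $\Gamma_m$ of Theorem 2.2, so the numbers are not those of the paper), the operator is discretized on a grid, its eigenvalues are extracted, and the limit law $\sum_j\xi_j\chi^2_j$ is simulated:

```python
import numpy as np

rng = np.random.default_rng(0)

a = 1.0
t = np.linspace(-5, 5, 400)                   # quadrature grid
dt = t[1] - t[0]
w = np.exp(-a * t**2)                         # weight w(t) = exp(-a t^2)
# placeholder positive semidefinite kernel in place of Gamma_m(s, t)
gamma = np.exp(-0.5 * (t[:, None] - t[None, :])**2)

# symmetrized Nystrom discretization of h -> int Gamma(s, .) h(s) w(s) ds
sqw = np.sqrt(w * dt)
A = sqw[:, None] * gamma * sqw[None, :]
xi = np.linalg.eigvalsh(A)[::-1]              # eigenvalues, largest first
xi = np.clip(xi, 0.0, None)

# Monte Carlo approximation of sum_{j<=J} xi_j chi2_j and its 95% quantile
J = 50
z = rng.standard_normal((100_000, J))
limit_sample = (z**2 * xi[:J]).sum(axis=1)
crit = np.quantile(limit_sample, 0.95)        # approximate 5% critical value
print(crit)
```

In practice one would replace the placeholder kernel by the empirical counterpart of (12) with $\beta$ replaced by $\hat\beta_n$, as described in the text.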

3. The case of the general Type I Pareto distribution

In this section, we indicate how to treat the case where the observations follow a more general $P(\beta,\sigma)$ distribution. That is, we consider testing the following more general hypotheses: \[ H_0:\ \exists\,\beta,\sigma>0\ \text{such that}\ F(x)=F_{\beta,\sigma}(x),\ \forall x\in\mathbb{R}, \] and \[ H_1:\ \nexists\,\beta,\sigma>0\ \text{such that}\ F(x)=F_{\beta,\sigma}(x),\ \forall x\in\mathbb{R}, \] where we recall that \[ F_{\beta,\sigma}(x)=\begin{cases}1-\left(\dfrac{x}{\sigma}\right)^{-\beta}, & x\ge\sigma,\\ 0, & x<\sigma.\end{cases} \]

Remark 3.1

The above testing problem can be related to that of the preceding sections by using the fact that a non-negative random variable X follows a P(β,σ) distribution if and only if the scaled random variable X/σ follows a P(β,1) distribution. This can be seen easily by observing that:

  • if $X\sim P(\beta,\sigma)$ then for any $x\ge1$, $P(X/\sigma\le x)=P(X\le\sigma x)=1-x^{-\beta}$;

  • if $X/\sigma\sim P(\beta,1)$ then for any $x\ge\sigma$, $P(X\le x)=P(X/\sigma\le x/\sigma)=1-(x/\sigma)^{-\beta}$.

Now, let $Y_1,\dots,Y_n$ be an independent and identically distributed sample following a $P(\beta,\sigma)$ distribution. In the case where $\sigma$ is known, by Remark 3.1, the current testing problem reduces to the one studied in Section 2, by considering the scaled observations $X_j=Y_j/\sigma$, $j=1,\dots,n$.

With this, all the results obtained for $\sigma=1$ can be established analogously.

The null distribution of the test statistic can be computed along the same lines as at the end of Section 2.

Also, the consistency of the test can be handled by establishing a result similar to Theorem 2.3 and another similar to the second part of Theorem 2.4.

The case where $\sigma$ is unknown is more interesting and is the one encountered in practice. Here, it is natural to consider the scaled observations $Y_j/\hat\sigma_n$, $j=1,\dots,n$, where $\hat\sigma_n$ is a consistent estimator of $\sigma$. In the sequel, we use the maximum likelihood estimators of the parameters $\sigma$ and $\beta$ given by \[ \hat\sigma_n=\min\{Y_1,\dots,Y_n\}\qquad\text{and}\qquad\hat\beta_n=\frac{n}{\sum_{j=1}^n\log\left(Y_j/\hat\sigma_n\right)}. \] Letting $X_j=Y_j/\sigma$ and observing that $F_{\beta,\sigma}(\sigma)=0$ and $f_{\beta,\sigma}(\sigma)=\beta/\sigma$, from the Bahadur representation of sample quantiles, one can write \[ \sqrt n(\hat\sigma_n-\sigma)=\sqrt n\,\frac{F_n(\sigma)}{f_{\beta,\sigma}(\sigma)}+o_P(1)=\frac{\sigma}{\beta}\frac{1}{\sqrt n}\sum_{j=1}^nI(X_j\le1)+o_P(1), \] where we recall that $I(\cdot)$ is the indicator function and here $F_n(\cdot)$ is the empirical cumulative distribution function of $Y_1,\dots,Y_n$.
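The two maximum likelihood estimators above are straightforward to compute. A minimal Python sketch (ours; the helper name `pareto_mle` is not from the paper) with a toy check on simulated data:

```python
import numpy as np

def pareto_mle(y):
    """MLEs for P(beta, sigma):
    sigma_hat = min(Y_1, ..., Y_n),
    beta_hat  = n / sum_j log(Y_j / sigma_hat)."""
    y = np.asarray(y, dtype=float)
    sigma_hat = y.min()
    beta_hat = y.size / np.log(y / sigma_hat).sum()
    return beta_hat, sigma_hat

# toy check on simulated P(beta=3, sigma=2) data (inverse transform)
rng = np.random.default_rng(0)
y = 2.0 * rng.random(100_000) ** (-1.0 / 3.0)
beta_hat, sigma_hat = pareto_mle(y)
print(beta_hat, sigma_hat)
```

Note that $\hat\sigma_n\ge\sigma$ always, and it converges at the fast rate $O_P(n^{-1})$, while $\hat\beta_n$ converges at the usual $O_P(n^{-1/2})$ rate.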

Next, by a Taylor expansion (the delta method), one has \[ \sqrt n\left[\log(\hat\sigma_n)-\log(\sigma)\right]=\frac{1}{\beta}\frac{1}{\sqrt n}\sum_{j=1}^nI(X_j\le1)+o_P(1). \] Also, by the Bahadur representation of $\hat\sigma_n$, its consistency and the Slutsky theorem, the Bahadur representation of $\hat\beta_n$ is given by: \[ \sqrt n(\hat\beta_n-\beta)=\frac{n\beta}{\sum_{j=1}^n\log\left(Y_j/\hat\sigma_n\right)}\left\{\sqrt n\left[\frac{1}{\beta}-\frac{1}{n}\sum_{j=1}^n\log\left(Y_j/\hat\sigma_n\right)\right]\right\}=\frac{n\beta}{n[\log(\sigma)-\log(\hat\sigma_n)]+\sum_{j=1}^n\log(X_j)}\left\{\sqrt n\left[\frac{1}{\beta}-\frac{1}{n}\sum_{j=1}^n\log(X_j)-\log(\sigma)+\log(\hat\sigma_n)\right]\right\}=\frac{\beta^2}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{\beta}-\log(X_j)-\frac{1}{\beta}I(X_j\le1)\right]+o_P(1). \tag{15} \] Now, the test statistic is \[ T_{m,n,w}=\int_{\mathbb{R}}\left|W_{m,n,\hat\beta_n,\hat\sigma_n}(t)\right|^2w(t)\,dt, \tag{16} \] where for all $t\in\mathbb{R}$, \[ W_{m,n,\beta,\sigma}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{m}\exp\left(it(Y_j/\sigma)^{1/m}\right)-(Y_j/\sigma)^{-\beta(m-1)}\exp\left(it\,Y_j/\sigma\right)\right]. \] Then, it is easy to prove, along the same lines as for Proposition 2.1, that given an integer $m\in[2,n]$ one has \[ \int_{\mathbb{R}}\left|W_{m,n,\hat\beta_n,\hat\sigma_n}(t)\right|^2w(t)\,dt=\int_{\mathbb{R}}\left|W^{*}_{m,n,\hat\beta_n,\hat\sigma_n}(t)\right|^2w(t)\,dt, \] where for all $t\in\mathbb{R}$, \[ W^{*}_{m,n,\beta,\sigma}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left\{\frac{1}{m}\nu\left((Y_j/\sigma)^{1/m};t\right)-(Y_j/\sigma)^{-\beta(m-1)}\nu(Y_j/\sigma;t)\right\}, \tag{17} \] and $\nu(\cdot)$ is the function defined in Proposition 2.1.

For studying the asymptotic distribution of $W^{*}_{m,n,\hat\beta_n,\hat\sigma_n}(t)$, one has to establish the corresponding versions of Proposition 2.2 and Theorem 2.2.

Proposition 3.1

Let $2\le m\le n$. Under $H_0$, in $C$, as $n$ tends to infinity, \[ W^{*}_{m,n,\hat\beta_n,\hat\sigma_n}(\cdot)=\tilde W_{m,n,\beta,\sigma}(\cdot)+o_P(1), \] where for all $t\in\mathbb{R}$, \[ \tilde W_{m,n,\beta,\sigma}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left\{\frac{1}{m}\nu\left(X_j^{1/m};t\right)-X_j^{-\beta(m-1)}\nu(X_j;t)+\left[\frac{1}{\beta}-\log(X_j)-\frac{1}{\beta}I(X_j\le1)\right]\varphi(t)+I(X_j\le1)\psi(t)\right\}, \tag{18} \] with $\varphi$ standing for the function defined by (11), and the function $\psi$ defined for any $t\in\mathbb{R}$ by \[ \psi(t)=\frac{t}{m^2}\int_1^{\infty}x^{1/m-\beta-1}\vartheta\left(x^{1/m};t\right)dx, \tag{19} \] where $\vartheta(x;t)$ stands for the function $\vartheta(x,t)=\cos(xt)-\sin(xt)$.

Proof.

Write for all $t\in\mathbb{R}$, \[ W^{*}_{m,n,\hat\beta_n,\hat\sigma_n}(t)=W^{*}_{m,n,\beta,\sigma}(t)+\hat W_{m,n}(t), \] where \[ \hat W_{m,n}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left[\frac{1}{m}\left\{\nu\left[(Y_j/\hat\sigma_n)^{1/m};t\right]-\nu\left(X_j^{1/m};t\right)\right\}+\left(X_j^{-\beta(m-1)}-X_j^{-\hat\beta_n(m-1)}\right)\nu(X_j;t)\right]. \] Now, from a first-order Taylor expansion, one has \[ \nu\left[(Y_j/\hat\sigma_n)^{1/m};t\right]-\nu\left(X_j^{1/m};t\right)=(\hat\sigma_n-\sigma)\frac{t}{m\sigma}X_j^{1/m}\vartheta\left(X_j^{1/m};t\right)+o_P(1) \] and \[ X_j^{-\beta(m-1)}-X_j^{-\hat\beta_n(m-1)}=(\hat\beta_n-\beta)(m-1)\beta X_j^{-\beta(m-1)-1}+o_P(1). \] Then, under $H_0$, for all $t\in\mathbb{R}$, one has the equalities: \[ \hat W_{m,n}(t)=(\hat\beta_n-\beta)(m-1)\beta\frac{1}{\sqrt n}\sum_{j=1}^nX_j^{-\beta(m-1)-1}\nu(X_j;t)+(\hat\sigma_n-\sigma)\frac{t}{m^2\sigma}\frac{1}{\sqrt n}\sum_{j=1}^nX_j^{1/m}\vartheta\left(X_j^{1/m};t\right)+o_P(1) \] \[ =\sqrt n(\hat\beta_n-\beta)\left[\frac{(m-1)\beta}{n}\sum_{j=1}^nX_j^{-\beta(m-1)-1}\nu(X_j;t)\right]+\sqrt n(\hat\sigma_n-\sigma)\left[\frac{t}{m^2\sigma}\frac{1}{n}\sum_{j=1}^nX_j^{1/m}\vartheta\left(X_j^{1/m};t\right)\right]+o_P(1). \] Multiply the terms in the brackets respectively by $\beta^2$ and $\sigma/\beta$, and define the results for all $t\in\mathbb{R}$ by $\psi_{1,n}(t)$ and $\psi_{2,n}(t)$.

By the law of large numbers, it is easy to see that ψ1,n(t) converges point-wise to φ(t) and that it is equicontinuous on every compact subset of R. Therefore, it converges uniformly to φ(t) on any compact subset Θ of R. By the same argument, ψ2,n(t) converges uniformly to ψ(t).

As a consequence of the above convergence, uniformly in $t\in\Theta$, \[ \hat W_{m,n}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left\{\left[\frac{1}{\beta}-\log(X_j)-\frac{1}{\beta}I(X_j\le1)\right]\varphi(t)+I(X_j\le1)\psi(t)\right\}+o_P(1), \] and uniformly in $t\in\Theta$, \[ W^{*}_{m,n,\hat\beta_n,\hat\sigma_n}(t)=W^{*}_{m,n,\beta,\sigma}(t)+\hat W_{m,n}(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left\{\frac{1}{m}\nu\left(X_j^{1/m};t\right)-X_j^{-\beta(m-1)}\nu(X_j;t)\right\}+\frac{1}{\sqrt n}\sum_{j=1}^n\left\{\left[\frac{1}{\beta}-\log(X_j)-\frac{1}{\beta}I(X_j\le1)\right]\varphi(t)+I(X_j\le1)\psi(t)\right\}+o_P(1)=\tilde W_{m,n,\beta,\sigma}(t)+o_P(1). \] As this holds for arbitrary compact $\Theta$, one can conclude that as $n$ tends to infinity, in probability, \[ \rho\left(\tilde W_{m,n,\beta,\sigma},W^{*}_{m,n,\hat\beta_n,\hat\sigma_n}\right)\longrightarrow0, \] which establishes the proposition.

The corresponding result to Theorem 2.2 is the following:

Theorem 3.1

Let $m\in\{2,\dots,n\}$ be fixed. If $\beta>1/m$, then under $H_0$, $\tilde W_{m,n,\beta,\sigma}(\cdot)$ converges weakly in $C$ to a zero-mean Gaussian process $W_m(\cdot)$ with covariance kernel $\Lambda_m$ defined, writing for brevity \[ g_t(x)=\left[\frac{1}{\beta}-\log(x)-\frac{1}{\beta}I(x\le1)\right]\varphi(t)+I(x\le1)\psi(t), \] for any $s,t\in\mathbb{R}$ by \[ \Lambda_m(s,t)=\beta\int_1^{\infty}\Bigg\{\frac{1}{m^2}\nu\left(x^{1/m};t\right)\nu\left(x^{1/m};s\right)+x^{-2\beta(m-1)}\nu(x;t)\nu(x;s)+g_t(x)g_s(x)-\frac{1}{m}x^{-\beta(m-1)}\left[\nu\left(x^{1/m};t\right)\nu(x;s)+\nu(x;t)\nu\left(x^{1/m};s\right)\right]+\frac{1}{m}\left[\nu\left(x^{1/m};t\right)g_s(x)+g_t(x)\nu\left(x^{1/m};s\right)\right]-x^{-\beta(m-1)}\left[\nu(x;t)g_s(x)+g_t(x)\nu(x;s)\right]\Bigg\}x^{-\beta-1}\,dx. \tag{20} \]

Proof.

This result can be proved with the same techniques as in the proof of Theorem 2.2, using, in the places of the functions $k(x,t)$ and $K(t)$, the functions \[ h(x,t)=\frac{1}{m}\nu\left(x^{1/m};t\right)-x^{-\beta(m-1)}\nu(x;t)+\left[\frac{1}{\beta}-\log(x)-\frac{1}{\beta}I(x\le1)\right]\varphi(t)+I(x\le1)\psi(t) \] and \[ H(t)=\int_{\mathbb{R}}h(x,t)\,dF(x). \] With these, under conditions (i), (i)* and (ii)* of Csörgő [Citation30], one has that \[ \pi_n(t)=\frac{1}{\sqrt n}\sum_{j=1}^n\left[h(X_j,t)-H(t)\right] \] converges weakly to a zero-mean Gaussian process with covariance kernel \[ E[\pi_n(t)\pi_n(s)]=\int_{\mathbb{R}}h(x,t)h(x,s)\,dF(x)-H(t)H(s). \] We now check these conditions under $H_0$. For this, we follow the same lines as in the proof of Theorem 2.2.

It is clear that $h(x,t)$ is bounded with respect to $t$ on any compact subset of $\mathbb{R}$. So, in any compact set $\Theta$ there is some $t_0$ such that for any $\delta>0$, \[ \int_{\mathbb{R}}\sup_{t\in\Theta}|h(x,t)|^{2+\delta}\,dF(x)=\int_{\mathbb{R}}|h(x,t_0)|^{2+\delta}\,dF(x)<\infty, \] which shows the convergence of the finite-dimensional distributions of $\tilde W_{m,n,\beta,\sigma}$ and checks point (i)* of Csörgő [Citation30].

For checking (ii)*, one can write: \[ |h(x,t)-h(x,s)|\le\frac{1}{m}\left|\nu\left(x^{1/m};t\right)-\nu\left(x^{1/m};s\right)\right|+x^{-\beta(m-1)}|\nu(x;t)-\nu(x;s)|+\left|\frac{1}{\beta}-\log(x)-\frac{1}{\beta}I(x\le1)\right||\varphi(t)-\varphi(s)|+|\psi(t)-\psi(s)|. \] By a first-order Taylor expansion of the functions $t\mapsto\nu(x;t)=\cos(xt)+\sin(xt)$ and $t\mapsto\vartheta(x;t)=\cos(xt)-\sin(xt)$, one has: \[ |h(x,t)-h(x,s)|\le\frac{x^{1/m}}{m}|t-s|+x^{-\beta(m-1)+1}|t-s|+\mathrm{Cst}\left(\left|\frac{1}{\beta}-\log(x)-\frac{1}{\beta}I(x\le1)\right|+1\right)|t-s|. \] Then, one easily sees that for any $s,t\in\Theta$, one can find $\alpha\in(0,1]$ such that \[ |h(x,t)-h(x,s)|\le|t-s|^{\alpha}\,\mathrm{Cst}\left(\frac{x^{1/m}}{m}+x^{-\beta(m-1)+1}+\left|\frac{1}{\beta}-\log(x)-\frac{1}{\beta}I(x\le1)\right|+1\right)=|t-s|^{\alpha}M(x,v(t,s)), \] where $v$ is any $\Theta$-valued function defined on $\Theta\times\Theta$, and for any $x\ge1$ and $t\in\Theta$, $M$ is the function defined as \[ M(x,t)=\mathrm{Cst}\left(\frac{x^{1/m}}{m}+x^{-\beta(m-1)+1}+\left|\frac{1}{\beta}-\log(x)-\frac{1}{\beta}I(x\le1)\right|+1\right). \] As $M(x,t)$ does not depend on $t$, $\sup_{t\in\Theta}M^2(x;t)=M^2(x;t)$ and by our assumptions \[ \int_{\mathbb{R}}\sup_{t\in\Theta}M^2(x;t)\,dF(x)=\int_{\mathbb{R}}M^2(x;t)\,dF(x)<\infty. \] From these and by Csörgő [Citation30], it results that $\pi_n(t)$ converges weakly to a zero-mean Gaussian process with covariance kernel \[ \int_{\mathbb{R}}h(x,t)h(x,s)\,dF(x)-H(s)H(t). \]

Given that under $H_0$, $H(t)=0$, by the equality \[ \pi_n(t)=\tilde W_{m,n,\beta,\sigma}(t)-\sqrt nH(t), \tag{21} \] one can conclude that $\tilde W_{m,n,\beta,\sigma}(\cdot)$ converges weakly to the Gaussian process invoked in the theorem. This establishes Theorem 3.1.

For the consistency of the test and the approximation of the quantiles of its null distribution in this case where σ is unknown, results similar to Theorems 2.3 and 2.4 can be stated and proved with the same techniques and arguments.

In what follows, we briefly comment on the application of our method to the Pareto Type II distribution. Recall that if a random variable $Z$ follows a Pareto Type II distribution, its CDF is given by (2). It readily follows that the random variable $Y=Z-\mu+\sigma$ follows a Type I Pareto distribution, $P(\beta,\sigma)$. Given i.i.d. observations $Z_1,\dots,Z_n$, testing the null hypothesis of a Pareto Type II distribution is tantamount to testing whether $Z_1-\mu+\sigma,\dots,Z_n-\mu+\sigma$ follow a $P(\beta,\sigma)$ distribution. Thus, it is enough to apply the above results derived for the Pareto Type I distribution. In the case where the parameters are unknown, which is the one encountered in practice, one must replace the unknown parameters by consistent estimators, $\hat\beta_n$, $\hat\sigma_n$ and $\hat\mu_n$. However, in this case, the asymptotics would be difficult to treat, one reason being that the Bahadur representations of the estimators may not be easy to handle. We therefore do not pursue this matter further, as the focus of the paper is on the Pareto Type I distribution.

4. Monte Carlo simulation study and results

This section contains the results of a Monte Carlo study in which the finite-sample performance of the newly proposed tests $T^{(1)}_{n,m,a}$ and $T^{(2)}_{n,m,a}$ is compared to that of the following existing tests for the hypothesis in (1), i.e., for the general Type I Pareto distribution:

  • The traditional Kolmogorov–Smirnov ($KS_n$) and Cramér–von Mises ($CV_n$) tests.

  • Two tests proposed by Zhang [Citation33] based on the likelihood ratio, with test statistics given by \[ ZA_n=-\sum_{j=1}^n\left[\frac{\log\left\{1-X_{(j)}^{-\hat\beta_n}\right\}}{n-j+\frac12}+\frac{\log\left\{X_{(j)}^{-\hat\beta_n}\right\}}{j-\frac12}\right] \] and \[ ZB_n=\sum_{j=1}^n\left[\log\left(\frac{\left(1-X_{(j)}^{-\hat\beta_n}\right)^{-1}-1}{\left(n-\frac12\right)\left(j-\frac34\right)^{-1}-1}\right)\right]^2, \] where $X_{(1)}<X_{(2)}<\cdots<X_{(n)}$ denote the order statistics of $X_1,X_2,\dots,X_n$.

  • A test based on entropy utilizing the Kullback–Leibler divergence measure (see, e.g., [Citation34]). The test statistic is given by \[ KL_{n,m}=-H_{n,m}-\log\left(\hat\beta_n\right)+\left(\hat\beta_n+1\right)\frac{1}{n}\sum_{j=1}^n\log(X_j), \] where \[ H_{n,m}=\frac{1}{n}\sum_{j=1}^n\log\left\{\frac{n}{2m}\left(X_{(j+m)}-X_{(j-m)}\right)\right\} \] is an estimator for the entropy, with $X_{(j)}=X_{(1)}$ for $j<1$, $X_{(j)}=X_{(n)}$ for $j>n$, and $m$ a window width subject to $m\le n/2$.

    We implement the test for m = 1 and m = 10.

  • A test based on the empirical characteristic function proposed by Meintanis [Citation35]. The test is a weighted $L^2$ distance between the empirical characteristic function of transformed data and the characteristic function of the standard uniform distribution. Based on the transformation $\hat U_j=F_{\hat\beta_n}(X_j)$, $j=1,\dots,n$, the test statistic is \[ M_{n,a}=\frac{1}{n}\sum_{j,k=1}^n\frac{2a}{(\hat U_j-\hat U_k)^2+a^2}+2n\left[2\tan^{-1}\left(\frac1a\right)-a\log\left(1+\frac{1}{a^2}\right)\right]-4\sum_{j=1}^n\left[\tan^{-1}\left(\frac{\hat U_j}{a}\right)+\tan^{-1}\left(\frac{1-\hat U_j}{a}\right)\right]. \] The value of the tuning parameter is set to $a=0.5$ and $a=1$ in order to obtain the Monte Carlo results presented.

  • A test based on the Mellin transform proposed by Meintanis [Citation36]. The test statistic is given by \[ G_{n,a}=\frac{1}{n}\left[\left(\hat\beta_n+1\right)^2\sum_{j,k=1}^nI_w^{(0)}(X_jX_k)+\sum_{j,k=1}^nI_w^{(2)}(X_jX_k)+2\left(\hat\beta_n+1\right)\sum_{j,k=1}^nI_w^{(1)}(X_jX_k)\right]+\hat\beta_n\left[n\hat\beta_nI_w^{(0)}(1)-2\left(\hat\beta_n+1\right)\sum_{j=1}^nI_w^{(0)}(X_j)-2\sum_{j=1}^nI_w^{(1)}(X_j)\right], \] where \[ I_w^{(m)}(x)=\int_0^{\infty}(t-1)^mx^{-t}w(t)\,dt,\quad m=0,1,2. \] Choosing $w(t)=e^{-at}$, one has \[ I_a^{(0)}(x)=(a+\log x)^{-1},\qquad I_a^{(1)}(x)=\frac{1-a-\log x}{(a+\log x)^2},\qquad I_a^{(2)}(x)=\frac{2-2a+a^2+2(a-1)\log x+\log^2x}{(a+\log x)^3}. \] We present results for $a=0.5$ and $a=2$.

  • A test proposed by Allison et al. [Citation28]. The test statistic measures the difference between the empirical distribution of $X_1^{1/m}$ and the V-empirical distribution of $\min\{X_1,\dots,X_m\}$, defined as \[ \Delta_{n,m}(x)=\frac{1}{n}\sum_{j=1}^nI\left\{X_j^{1/m}\le x\right\}-\frac{1}{n^m}\sum_{j_1,\dots,j_m=1}^nI\left\{\min\left(X_{j_1},\dots,X_{j_m}\right)\le x\right\}. \] Based on $\Delta_{n,m}$, the authors propose the following test statistic \[ A1_{n,m}=\int_1^{\infty}\Delta_{n,m}(x)\,dF_n(x). \] We show results for $m=2$.
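As a concrete illustration of the last competitor, the V-empirical CDF of the minimum admits the closed form $1-(1-F_n(x))^m$, which avoids the $m$-fold sum. A Python sketch of $A1_{n,m}$ along these lines (our own illustration; the helper name `A1_stat` is not from the cited paper):

```python
import numpy as np

def A1_stat(x, m=2):
    """Sketch of A1_{n,m} = int Delta_{n,m}(x) dF_n(x), using the identity
    (1/n^m) sum I{min(X_{j1},...,X_{jm}) <= t} = 1 - (1 - F_n(t))^m."""
    x = np.asarray(x, dtype=float)
    xm = x ** (1.0 / m)
    # evaluate both empirical CDFs at every observation t = X_i
    F_root = (xm[None, :] <= x[:, None]).mean(axis=1)   # ECDF of X_j^(1/m)
    Fn = (x[None, :] <= x[:, None]).mean(axis=1)        # ECDF of X_j
    F_min = 1.0 - (1.0 - Fn) ** m                       # V-ECDF of the minimum
    return (F_root - F_min).mean()

rng = np.random.default_rng(0)
x = rng.random(500) ** (-1.0 / 2.0)       # simulated P(2, 1) sample
print(A1_stat(x, m=2))
```

Since both terms of $\Delta_{n,m}$ are empirical CDFs, the statistic is bounded in absolute value by 1 and should be close to 0 for Pareto data.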

4.1. Simulation settings

A Monte Carlo study is carried out to examine the empirical power performance of the tests discussed in the previous sections against various fixed alternative distributions; these include the distributions listed in Table 1 along with two Pareto mixture distributions. Note that the alternatives listed in Table 1 natively have support $(0,\infty)$, so we shifted these distributions by $\sigma = 1$ unit to ensure that the simulated data have the same support as the Pareto distribution.

Table 1. Summary of various choices of the alternative distributions.

The first of the two mixture distributions used in this study places mixture probability 1−p on the Pareto distribution with parameter θ = 3 (P(3)) and probability p on the lognormal distribution with parameters μ = −2.69 and θ = 2 (LN(−2.69, 2)). The second mixture similarly places probability 1−p on the P(3) distribution and probability p on the Weibull distribution with parameters λ = 0.5 and θ = 0.25 (W(0.5, 0.25)). These parameter configurations are chosen so that the two distributions in each mixture share the same expected value. Random variates from many of these distributions can be obtained using, for example, the R package PoweR [Citation37].

The results are given in Tables 2–5 and display the estimated powers (the percentage of times the null hypothesis is rejected in MC = 20,000 independent Monte Carlo replications) calculated for sample sizes n = 20 and n = 30. Note that the results for the test statistic $T^{(1)}_{n,m,a}$ are omitted from the simulation results as they were found to be similar to those of $T^{(2)}_{n,m,a}$. The 'warp-speed' bootstrap method [Citation38] is employed in order to simultaneously calculate the bootstrap critical values and Monte Carlo approximations of the power. In addition, all results are calculated by estimating the parameters using either maximum likelihood estimation (MLE),
$$\hat\sigma_{MLE} = \min(X_1,\ldots,X_n) \quad\text{and}\quad \hat\beta_{MLE} = \frac{n}{\sum_{i=1}^{n}\log(X_i/\hat\sigma_{MLE})},$$
or method of moments estimation (MME),
$$\hat\sigma_{MME} = \frac{\bar X_n(\hat\beta_{MME} - 1)}{\hat\beta_{MME}} \quad\text{and}\quad \hat\beta_{MME} = \frac{n\bar X_n - \min(X_1,\ldots,X_n)}{n\left(\bar X_n - \min(X_1,\ldots,X_n)\right)}.$$
The results obtained when MLE is employed are given in Tables 2 and 4, whereas the results associated with MME are given in Tables 3 and 5. A significance level of 5% is used throughout the study and all calculations were executed using R v4.2.2 [Citation39]. All the code used in these simulations can be found at https://github.com/LSantanaZA/ParetoGOFUsingOrderStatistics-2023.

Table 2. Estimated powers for sample size n = 20 based on MLE.

Table 3. Estimated powers for sample size n = 20 based on MME.

Table 4. Estimated powers for sample size n = 30 based on MLE.

Table 5. Estimated powers for sample size n = 30 based on MME.
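The two estimation schemes can be sketched as follows (an illustrative Python translation; the authors' own code, in R, is in the repository linked above):

```python
import numpy as np

def pareto_mle(x):
    """Maximum likelihood estimates (sigma_hat, beta_hat) for Pareto Type I."""
    x = np.asarray(x, dtype=float)
    sigma_hat = x.min()
    beta_hat = len(x) / np.log(x / sigma_hat).sum()
    return sigma_hat, beta_hat

def pareto_mme(x):
    """Method-of-moments-type estimates: beta_hat from the sample mean and
    sample minimum, then sigma_hat = mean * (beta_hat - 1) / beta_hat."""
    x = np.asarray(x, dtype=float)
    n, xbar, xmin = len(x), x.mean(), x.min()
    beta_hat = (n * xbar - xmin) / (n * (xbar - xmin))
    sigma_hat = xbar * (beta_hat - 1.0) / beta_hat
    return sigma_hat, beta_hat

# illustrative check: simulate Pareto Type I with sigma = 2, beta = 3
rng = np.random.default_rng(5)
sample = 2.0 * (rng.pareto(3.0, size=2000) + 1.0)
s_mle, b_mle = pareto_mle(sample)
s_mme, b_mme = pareto_mme(sample)
```

With a large sample, both pairs of estimates should land close to (2, 3).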

The empirical power results for the two test statistics developed in this paper make use of specific configurations of the tuning parameters m and a, as these combinations were found to produce high powers in a preliminary exploratory simulation. Specifically, for $T^{(2)}_{n,m,a}$, the parameter settings used are the pairings (m, a) = (2, 0.5), (3, 0.5), and (4, 0.5).

4.2. Simulation results

We begin by discussing the results obtained when the parameters are estimated via MLE for n = 20 and n = 30, respectively, and then move on to the results obtained under MME estimation. The two highest values within each row (alternative distribution) are highlighted to ease comparison. From Tables 2–5 it is clear that all the tests closely maintain the nominal significance level of 5%.

When considering the estimated powers under MLE estimation, the test $KL_{n,10}$ performs best for the majority of the alternatives given in Table 1, closely followed by $T^{(2)}_{n,2,0.5}$. However, while the $KL_{n,10}$ test performs well among these non-mixture alternatives, it performs remarkably poorly against the mixture distributions, displaying almost no power for this sequence of alternatives. When considering the two mixture alternatives, the statistic $T^{(2)}_{n,4,0.5}$ produces some of the highest estimated powers across all mixing proportions used. In particular, it performs best for the Pareto-lognormal mixture alternative and is tied (with $Z_{A,n}$ and $Z_{B,n}$) for the best-performing test statistic for the Pareto-Weibull mixture.

Turning our attention to the performance of the tests under the MME estimation scheme, we find that the $G_{n,2}$ test statistic generally produces the best power performance among the non-mixture alternatives, followed, once again, by $T^{(2)}_{n,2,0.5}$. In stark contrast to the powers obtained under MLE estimation, the $KL_{n,10}$ test is not at all competitive under MME estimation for the alternatives in Table 1. For the mixture alternatives, the test proposed in this paper produces some of the highest estimated powers for almost all parameter configurations. In particular, our test with m = 4 and a = 0.5 performs best for both mixture alternatives under almost all mixing proportions considered.

The proposed test produces higher estimated powers for the alternatives in Table 1 when using MME estimation than under MLE estimation. Conversely, the tests calculated under MLE have much higher powers than their MME counterparts for the mixture alternatives. As is common in these kinds of analyses, we conclude that no single test has the best overall power performance, but our proposed test is competitive against the majority of other tests in the literature when either MLE or MME estimation is used; this holds for both mixture and non-mixture alternatives.

Finally, it is clear from Tables 2–5 that, for the test statistic $T^{(2)}_{n,m,a}$, the choice of the parameters m and a can have a pronounced impact on the estimated powers. This fluctuation in power performance is briefly explored in Table 6 for the 'Benini' and 'log-Weibull' alternatives using a variety of configurations of the parameters m and a. These particular alternatives were selected because they can be considered local alternatives: setting the parameter θ to zero recovers the Pareto distribution, with deviations from the Pareto occurring when θ > 0. The powers are obtained using a sample size of n = 20, with the parameters σ and β estimated using MME. From Table 6 it is clear that, in general, the highest powers are associated with smaller values of both m and a; the powers taper off as m and a increase. This tendency is observed in all six alternatives considered here and was also found to hold for many of the other alternatives in the main simulation study. There were a few exceptions, but we report only these results as they are representative of the common trend observed.

Table 6. Estimated powers of the test statistic Tn,m,a(2) for varying choices of a and m (using MME estimation and sample size n = 20) for 6 alternatives.

5. Practical data application

To further investigate the behaviour of the test statistics studied in this paper, we now apply them to a practical data set concerning the earnings for the 2022 inaugural season of LIV golf. LIV golf is a new golf series backed by the Saudi Arabian sovereign wealth fund which aims to be an alternative to the PGA Tour by attracting star players and providing larger paydays for winners. The data were obtained from www.spotrac.com [Citation40] and comprise the player earnings in the 2022 season. The data are shown in Table 7 and a box-plot of these values is provided in Figure 1.

Figure 1. Box-plot of the LIV golf 2022 earnings. The dashed line represents the threshold of $3,500,000.


Table 7. Data set: LIV golf earnings data set (accessed 2023-09-11).

It is clear from Figure 1 that the data values are right-skewed, with some extremely large outliers. The Pareto distribution is heavy tailed and is often used to model income above a specified threshold, so it is sensible to consider this distribution as a possible model for the earnings above a threshold. We therefore focus our attention on the LIV golf earnings above the threshold of $3,500,000, indicated in Figure 1 with a dashed grey line. The values above the threshold are extracted from the data set, scaled by dividing them by the known threshold value of $3,500,000, and their empirical distribution is plotted in Figure 2. Overlaid on this empirical distribution plot are two fitted Pareto distributions: one where the parameter is estimated using MLE (producing the estimate $\hat\beta_n = 1.781$) and the other where the parameter is estimated using MME (producing the estimate $\hat\beta_n = 1.932$). From these figures it seems that the Pareto distribution might be a good fit for the data; to test this assertion more formally, we now apply all of the tests discussed in this paper to the above-threshold, scaled LIV golf earnings data, with the results reported in Table 8. The estimated p-values in Table 8 were obtained using a parametric bootstrap with B = 10,000 bootstrap replications from a Pareto distribution with parameter estimated using either the MLE or the MME. From these p-values it is clear that none of the tests rejects the null hypothesis that the 2022 season's earnings of LIV golfers exceeding $3,500,000 follow a Pareto distribution.
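The parametric bootstrap used for these p-values follows the usual recipe: estimate the parameter, compute the test statistic, redraw B samples from the fitted Pareto distribution, re-estimate and recompute the statistic on each, and report the proportion of bootstrap statistics at least as large as the observed one. A minimal sketch (ours, in Python, using the classical KS statistic with MLE and a small B for illustration; the paper uses B = 10,000 and all tests under both MLE and MME):

```python
import numpy as np

def ks_pareto(x, beta_hat):
    """One-sample KS distance to the fitted Pareto Type I CDF (sigma = 1)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f = 1.0 - x ** (-beta_hat)
    j = np.arange(1, n + 1)
    return max(np.max(j / n - f), np.max(f - (j - 1) / n))

def bootstrap_pvalue(x, b=500, seed=0):
    """Parametric-bootstrap p-value for the KS test of the Pareto null,
    re-estimating beta by MLE in every bootstrap sample."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    beta_hat = n / np.log(x).sum()
    stat = ks_pareto(x, beta_hat)
    boot = np.empty(b)
    for i in range(b):
        xb = (1.0 - rng.random(n)) ** (-1.0 / beta_hat)  # inverse-CDF Pareto draws
        boot[i] = ks_pareto(xb, n / np.log(xb).sum())    # re-estimate each time
    return np.mean(boot >= stat)

# illustrative use on simulated data from the null model
rng = np.random.default_rng(6)
pval = bootstrap_pvalue(rng.pareto(2.0, size=30) + 1.0)
```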

Figure 2. Empirical distribution function of the scaled above-threshold LIV golf 2022 earnings with two Pareto distribution overlays (one where the parameter is estimated via MLE and the other where it is estimated via MME).


Table 8. p-values for the LIV golf earnings data set (accessed 2023-09-11).

6. Concluding remarks

In this paper, we propose two new classes of goodness-of-fit tests for the Pareto Type I distribution based on a characterization involving order statistics. We also derive the limiting null distribution of these test statistics and show that the tests are consistent. A Monte Carlo simulation study demonstrates the finite-sample performance of these tests under a variety of alternative distributions and, through the inclusion of other tests for the Pareto distribution, shows that our tests are competitive. Finally, the choice of the two tuning parameters appearing in these tests was briefly explored, leading to the recommendation that, when the tests are implemented in a practical setting, a = 0.5 be used together with m = 2 or m = 3.

Acknowledgments

The work of the third and fourth authors is based on research supported by the National Research Foundation (NRF). Any opinion, finding and conclusion or recommendation expressed in this material is that of the authors and the NRF does not accept any liability in this regard.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • Pareto V. The new theories of economics. J Polit Econ. 1897;4:485–502. doi: 10.1086/250454
  • Beirlant J, de Wet T, Goegebeur Y. Pricing risk when distributions are fat tailed. J Appl Probab. 2004;41A:157–175.
  • Brazauskas V. Robust parametric modeling of the proportional reinsurance premium when claims are approximately Pareto-distributed. Proc Business Economic Stats Sect. 2000;A:144–149.
  • Mert M, Saykan Y. On a bonus-malus system where the claim frequency distribution is geometric and the claim severity distribution is Pareto. Hacet J Math Stat. 2005;34:75–81.
  • Zisheng O, Chi X. Generalized Pareto distribution fit to medical insurance claims data. Appl Math J Chinese Universities. 2006;21:21–29. doi: 10.1007/s11766-996-0018-z
  • Keiding N, Hansen OKH, Sorensen DN, et al. The current duration approach to estimating time to pregnancy. Scand J Stat. 2012;39:185–204. doi: 10.1111/sjos.2012.39.issue-2
  • Keiding N, Kvist K, Hartvig H, et al. Estimating time to pregnancy from current durations in a cross-sectional sample. Biostatistics. 2002;3:565–578.
  • Arnold B. Pareto distributions. Boca Raton, FL: CRC Press, Taylor and Francis Group; 2015.
  • Fisk P. The graduation of income distributions. Econometrica. 1961;29:171–185. doi: 10.2307/1909287
  • Steindl J. Random processes and the growth of firms. Madison, WI: Hafner Publishing; 1965.
  • Berger J, Mandelbrot B. A new model for error clustering in telephone circuits. IBM J Res Dev. 1963;7:224–236. doi: 10.1147/rd.73.0224
  • Harris CM. The Pareto distribution as a queue service discipline. Oper Res. 1968;16:307–313. doi: 10.1287/opre.16.2.307
  • Klass O, Biham O, Levy M, et al. The Forbes 400 and the Pareto wealth distribution. Econ Lett. 2006;90:290–295. doi: 10.1016/j.econlet.2005.08.020
  • Ioannides Y, Skouras S. US city size distribution: Robustly Pareto, but only in the tail. J Urban Econ. 2013;73:18–29. doi: 10.1016/j.jue.2012.06.005
  • Lomax K. Business failures: another example of the analysis of failure data. J Am Stat Assoc. 1954;49:847–852. doi: 10.1080/01621459.1954.10501239
  • Falk M, Guillou A, Toulemonde G. A LAN based Neyman smooth test for Pareto distributions. J Stat Plan Inference. 2008;138(10):2867–2886. doi: 10.1016/j.jspi.2007.10.007
  • Charpentier A, Flachaire E. Pareto models for risk management. Working Papers hal-02423805, HAL. 2019. Available from: https://ideas.repec.org/p/hal/wpaper/hal-02423805.html.
  • Beirlant J, de Wet T, Goegebeur Y. A goodness-of-fit statistic for the Pareto-type behavior. J Comput Appl Math. 2006;186:99–116. doi: 10.1016/j.cam.2005.01.036
  • Gulati S, Shapiro S. Goodness-of-fit tests for Pareto distribution. In: Statistical models and methods for biomedical and technical systems. Boston, MA: Springer; 2008. p. 259–274.
  • Martynov G. Cramér-von Mises test for the Weibull and Pareto distributions. In: Proceedings of Dobrushin International Conference Moscow, Moscow, Russia, 2009. p. 117–122.
  • Rizzo ML. New goodness-of-fit tests for Pareto distributions. Astin Bull. 2009;39:691–715. doi: 10.2143/AST.39.2.2044654
  • Chu J, Dickin S, Nadarajah S. A review of goodness of fit tests for Pareto distributions. J Comput Appl Math. 2019;361:13–41. doi: 10.1016/j.cam.2019.04.018
  • Obradović M, Jovanović M, Milošević B. Goodness-of-fit tests for Pareto distribution based on a characterization and their asymptotics. Statistics. 2015;49:1026–1041. doi: 10.1080/02331888.2014.919297
  • Obradović M. On asymptotic efficiency of goodness of fit tests for Pareto distribution based on characterizations. Filomat. 2015;29:2311–2324. doi: 10.2298/FIL1510311O
  • Volkova K. Goodness-of-fit tests for Pareto distribution based on its characterization. Stat Methods Appl. 2016;25:351–373. doi: 10.1007/s10260-015-0330-y
  • Milošević B, Obradović M. Two-dimensional Kolmogorov-type goodness-of-fit tests based on characterizations and their asymptotic efficiencies. J Nonparametr Stat. 2016;28:413–427. doi: 10.1080/10485252.2016.1163358
  • Ndwandwe L, Allison J, Santana L, et al. Testing for the Pareto type I distribution: A comparative study. 2022. arXiv preprint arXiv:2211.10088.
  • Allison J, Milošević B, Obradović M, et al. Distribution-free goodness-of-fit tests for the Pareto distribution based on a characterization. Comput Stat. 2022;37:403–418. doi: 10.1007/s00180-021-01126-y
  • Bilodeau M, Lafaye de Micheaux P. A multivariate empirical characteristic function test of independence with normal marginals. J Multivar Anal. 2005;9(2):345–369. doi: 10.1016/j.jmva.2004.08.011
  • Csörgö S. Kernel-transform empirical processes. J Multivar Anal. 1983;13:517–533. doi: 10.1016/0047-259X(83)90037-4
  • Ngatchou-Wandji J. Testing for symmetry in multivariate distributions. Stat Methodol. 2009;6(3):230–250. doi: 10.1016/j.stamet.2008.09.003
  • Fan Y, Lafaye de Micheaux P, Penev S, et al. Multivariate nonparametric test of independence. J Multivar Anal. 2017;153:189–210. doi: 10.1016/j.jmva.2016.09.014
  • Zhang J. Powerful goodness-of-fit tests based on the likelihood ratio. J R Stat Soc B Stat Methodol. 2002;64(2):281–294. doi: 10.1111/1467-9868.00337
  • Ahrari V, Baratpour S, Habibirad A, et al. Goodness of fit tests for Rayleigh distribution based on quantiles. Commun Stat Simul Comput. 2022;51(2):341–357. doi: 10.1080/03610918.2019.1651336
  • Meintanis S. A unified approach of testing for discrete and continuous Pareto laws. Stat Papers. 2009;50(3):569–580. doi: 10.1007/s00362-007-0103-2
  • Meintanis S. Goodness-of-fit tests and minimum distance estimation via optimal transformation to uniformity. J Stat Plan Inference. 2009;139(2):100–108. doi: 10.1016/j.jspi.2008.03.037
  • Lafaye de Micheaux P, Tran VA. PoweR: A reproducible research tool to ease Monte Carlo power simulation studies for goodness-of-fit tests in R. J Stat Softw. 2016;69(3):1–42. doi: 10.18637/jss.v069.i03
  • Giacomini R, Politis DN, White H. A warp-speed method for conducting Monte Carlo experiments involving bootstrap estimators. Econ Theory. 2013;29(3):567–589. doi: 10.1017/S0266466612000655
  • R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. 2021. Available from: https://www.R-project.org/.
  • www.spotrac.com. LIV golf results by year. [accessed 2023 September 11]. Available from: https://www.spotrac.com/liv/rankings/year/2022/.