
New extreme value theory for maxima of maxima

Pages 232-252 | Received 08 Mar 2020, Accepted 01 Nov 2020, Published online: 20 Dec 2020

Abstract

Although advanced statistical models have been proposed to fit complex data better, the advances of science and technology have generated ever more complex data, e.g., Big Data, for which existing probability theory and statistical models reach their limitations. This work establishes probability foundations for studying extreme values of data generated from a mixture process whose mixture pattern depends on the sample length and the data generating sources. In particular, we show that the limit distribution, termed the accelerated max-stable distribution, of the maxima of maxima of sequences of random variables with the above mixture pattern is a product of three types of extreme value distributions. As a result, our theoretical results are more general than classical extreme value theory and are applicable to research problems related to Big Data. Examples are provided to give intuition for the new distribution family. We also establish mixing conditions for a sequence of random variables to have the limit distributions. Results for the associated independent sequence and for maxima over arbitrary intervals are also developed. We use simulations to demonstrate the advantages of our newly established maxima of maxima extreme value theory.

1. Introduction

Rigorous risk analysis helps decision making and prevents great failures. Extreme value theory has been a powerful tool in risk analysis and is widely applied in finance, insurance, health, climate, and environmental studies. In classical extreme value theory, the sequence of data is assumed to have the same marginal distribution, and the limit distribution of the maxima, if it exists, is of one of the extreme value types. Galambos (1978), de Haan (1993), Beirlant et al. (2004), de Haan and Ferreira (2006), Leadbetter et al. (2012) and Resnick (2013), among many monographs, introduce the theoretical results of classical extreme value theory. Mikosch et al. (1997), Embrechts et al. (1999), McNeil and Frey (2000), Coles (2001), Finkenstädt and Rootzén (2004), Castillo et al. (2005), Salvadori et al. (2007) and Dey and Yan (2016) present many applications of extreme value methods in science, engineering, nature, finance, insurance and climate. For example, in financial applications, extreme value theory is one of the tools for calculating Value-at-Risk (VaR) and Expected Shortfall (ES) (e.g., Rocco, 2014; Tsay, 2005). Chavez-Demoulin et al. (2016) offer an extreme value theory (EVT)-based statistical approach for modelling operational risk and losses, taking into account the dependence of the parameters on covariates and time. Zhang and Smith (2010) propose the multivariate maxima of moving maxima (M4) processes, apply them to model jumps in returns in multivariate financial time series, and predict extreme co-movements in price returns. Daouia et al. (2018) use extreme expectiles to measure VaR and marginal expected shortfall.
In the statistical inference of maximum likelihood estimation (MLE), the properties of maximum likelihood estimators of the parameters of the generalised extreme value (GEV) distribution were discussed by Smith (1985), who showed that the classical properties of the MLE hold when the shape parameter $\xi > -1/2$, but not when $\xi \le -1/2$. Bücher and Segers (2017) give a general result on the asymptotic normality of the maximum likelihood estimator for parametric models whose support may depend on the parameters.

In the age of Big Data, the advances of science and technology have been changing data generating processes in increasingly complex ways. As a result, the data structures and dependence structures of collected data can differ greatly from the existing assumptions of many commonly used models. In the literature, advanced statistical models and machine learning approaches have been proposed to fit such complex data or to learn the underlying structures better; for example, the support vector machine, deep learning methods, and the random forest method are now well recognised and widely used in data analysis. In extreme value analysis of more complex data, the identical-marginal-distribution assumption and the extreme value distributions derived from it can be very restrictive and lacking in data-fitting power. Although statistical models, e.g., Heffernan et al. (2007), Naveau et al. (2011), Tang et al. (2013), Malinowski et al. (2015), Zhang and Zhu (2016) and Idowu and Zhang (2017), have been proposed to model extreme values observed from different data sources with different populations and max-domains of attraction, their probability foundations have not been established.

The definition of the classical maximum domain of attraction cannot be applied directly to extreme values of data drawn from different populations mixed together. Note that we are not dealing with mixtures of distributions that may jointly belong to the maximum domain of attraction of a classical extreme value distribution. Rather, we are dealing with maxima of maxima, in which the maximum from each population has its own limiting extreme value distribution, norming and centering constants, and convergence rate. In many real-world applications, the risks one is exposed to come from different sources, and the risk at a given time is decided by the dominant one, i.e., not by the sum of all risks. Consider a specific example: suppose a patient suffers from two severe diseases. The risk that the patient dies within a certain time may be best described by the maximum, not the sum, of the two risk variables.

This work extends the definition of the maximum domain of attraction to maxima of maxima of sequences of random variables in which the mixing patterns change along with the sample size. The accelerated max-stable distribution (accelerated extreme value distribution) is expressed as a product of the classical extreme value distributions for the maxima of maxima resulting from different distributions. Some basic properties and theoretical results are provided. The classical extreme value distributions are special cases of our newly established family of accelerated max-stable distributions. The results obtained can be applied to more complex data, e.g., Big Data. The new results also establish the probability foundation of previously proposed statistical models in extreme time series modelling, including Heffernan et al. (2007), which introduces a scheme where the maxima are taken over random variables with different distributions, and Zhang and Zhu (2016), which models intra-daily maxima of high-frequency financial data.

The structure of this paper is as follows. In Section 2, (1) we give a brief review of the classical extreme value theory; (2) we define our maxima of maxima of sequences of random variables; (3) we use examples to demonstrate the characteristics of the maxima of maxima; (4) we establish the convergence of maxima of maxima to the accelerated max-stable distributions; (5) we illustrate density functions of the new family of accelerated max-stable distributions and evaluate moments and tail equivalence. Simulations are used to demonstrate the advantages of the accelerated max-stable distribution family in terms of the estimation accuracy of high quantiles at different levels. We also apply this new accelerated max-stable distribution to the high quantiles of the daily maxima of 330 stock returns of S&P 500 companies. In Section 3, the convergence of joint probability for general thresholds and approximation errors are developed. In Section 4, theoretical results for weakly dependent sequences are derived. Section 6 concludes. Additional figures and technical proofs are included in the Appendix.

2. Accelerated max-stable distribution for independent sequences

2.1. A brief review of classical univariate extreme value theory

In classical extreme value theory, the central result is the Fisher-Tippett theorem, which specifies the form of the limit distribution of centered and normalised maxima. Let $X_1, X_2, \ldots, X_n$ be a sequence of independent and identically distributed (i.i.d.) non-degenerate random variables (rvs) with common distribution function $F$, and let $M_n = \max(X_1, \ldots, X_n)$ be the sample maximum. The Fisher-Tippett theorem states: if for some norming constants $a_n > 0$ and centering constants $b_n$ we have
(1) $P(a_n(M_n - b_n) \le x) \xrightarrow{w} H(x)$
for some non-degenerate $H$, where $\xrightarrow{w}$ stands for convergence in distribution, then $H$ belongs to one of the following three types of cumulative distribution functions (cdf's):
(2)
Fréchet: $\Phi_\alpha(x) = \begin{cases} 0, & x \le 0, \\ \exp\{-x^{-\alpha}\}, & x > 0, \end{cases} \quad \alpha > 0$;
Weibull: $\Psi_\alpha(x) = \begin{cases} \exp\{-(-x)^{\alpha}\}, & x \le 0, \\ 1, & x > 0, \end{cases} \quad \alpha > 0$;
Gumbel: $\Lambda(x) = \exp\{-e^{-x}\}, \quad x \in \mathbb{R}$.
Conversely, every extreme value distribution in (2) can arise as a limit in (1); in particular, when $H$ itself is the cdf of each $X_i$, the limit is $H$. We say that $F$ belongs to the maximum domain of attraction of the extreme value distribution $H$, written $F \in \mathrm{MDA}(H)$, when (1) holds. $H$ is also called a max-stable distribution since for any $n = 2, 3, \ldots$, there are constants $a_n > 0$ and $b_n$ such that $H^n(a_n x + b_n) = H(x)$. Due to this property, the terms extreme value distribution and max-stable distribution are used interchangeably in practice.
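As a quick empirical illustration of the Fisher-Tippett limit, the following sketch simulates maxima of i.i.d. Exp(1) variables, for which the classical choices $a_n = 1$, $b_n = \log n$ give the Gumbel limit $\Lambda$; the sample sizes are illustrative, not from the paper.

```python
import math
import random

# Sketch: for i.i.d. Exp(1) variables, P(M_n - log n <= x) -> Lambda(x),
# with Lambda(x) = exp(-exp(-x)). We compare the empirical cdf of the
# centered maxima against Lambda at a few points.
random.seed(0)

def empirical_gumbel_check(n=1000, m=3000):
    """Max deviation between the empirical cdf of M_n - log n and Lambda."""
    maxima = [
        max(random.expovariate(1.0) for _ in range(n)) - math.log(n)
        for _ in range(m)
    ]
    worst = 0.0
    for x in (-1.0, 0.0, 1.0, 2.0):
        emp = sum(1 for v in maxima if v <= x) / m   # empirical cdf at x
        worst = max(worst, abs(emp - math.exp(-math.exp(-x))))
    return worst

print(empirical_gumbel_check())
```

For Exp(1), the convergence is very fast, so the deviation is small even for moderate $n$.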

2.2. Maxima of maxima

Suppose that the independent mixed sequence of random variables $\{X_i\}_{i=1}^{n}$ is composed of $k$ subsequences $\{X_{j,i}\}_{i=1}^{n_j}$, $j = 1, 2, \ldots, k$, where $\{X_{j,i}\}_{i=1}^{n_j} \overset{\text{i.i.d.}}{\sim} F_j(x)$, $n_j \to \infty$ as $n \to \infty$, and $n = n_1 + \cdots + n_k$. Denote by $M_{j,n_j} = \max(X_{j,i},\ i = 1, \ldots, n_j)$ the maximum of the $j$th subsequence, $j = 1, 2, \ldots, k$. Suppose $F_j \in \mathrm{MDA}(H_j)$, where $H_j$ is one of the three types of extreme value distributions, i.e., $M_{j,n_j}$ has the following limit distribution with some norming constants $a_{j,n_j} > 0$ and centering constants $b_{j,n_j}$:
(3) $\lim_{n \to \infty} P(a_{j,n_j}(M_{j,n_j} - b_{j,n_j}) \le x) = H_j(x).$
Define $M_n = \max(M_{1,n_1}, M_{2,n_2}, \ldots, M_{k,n_k})$, i.e., $M_n$ is the maximum of the $k$ maxima $M_{j,n_j}$. Throughout the paper, $M_n$ is termed the maxima of maxima. Several questions arise: (1) whether Equation (1) holds with appropriately chosen norming constants $a_n > 0$, $b_n$; (2) if Equation (1) holds, whether $a_n > 0$, $b_n$ are equivalent to any of the $a_{j,n_j} > 0$, $b_{j,n_j}$; (3) whether $H(x)$ is a function of the $H_j(x)$; (4) if Equations (1)-(3) all hold, which method is best in practice. This paper intends to answer these four questions.

Practical examples of the above process abound. For example: (1) The maximum daily temperature of the US can be described as the maximum of regional maximum temperatures, where each regional maximum is the largest temperature recording among all weather stations in the region. Given the regions' spatial and geographical patterns, the regional maxima follow different extreme value distributions from one region to another. The US temperature maxima are the maxima of regional maxima, and should be modelled by a distribution function that is a function of the regional extreme value distribution functions. (2) Considering the daily risk of high-frequency trading in a stock market, one can partition the data into hourly blocks (from 9:00 am to 4:00 pm). Suppose each hourly maximum $M_{j,n_j}$ of negative returns can be approximately modelled by an extreme value distribution $H_j(x)$. Then $M_n$ is better modelled by a function of $H_j(x)$, $j = 1, \ldots, 7$, rather than by a single $H_j(x)$. We use the following simple example with $k = 2$ to illustrate the idea.

Example 2.1

The sequence $\{X_i\}_{i=1}^{n}$ is generated by $X_i = \max(Y_i, Z_i)$, where $\{Y_i\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} F_1(x)$, $\{Z_i\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} F_2(x)$, and $F_1(x)$ and $F_2(x)$ are two distribution functions. Assume $Y_i$ and $Z_i$ are independent. Then $\{X_i\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} F(x) = F_1(x)F_2(x)$.

Remark

The form $X_i = \max(Y_i, Z_i)$ is the simplest case of the general mixture models introduced in Zhao and Zhang (2018). It is also the simplest case of the copula-structured M4 models studied by Zhang and Zhu (2016).

To illustrate Example 2.1, consider two scenarios. Suppose $\{Y_i^{[k]}\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} N(0,1)$ and $\{Z_i^{[k]}\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} U[a,b]$ for $k = 1, \ldots, m$, where $U[a,b]$ denotes the uniform distribution on the interval $[a,b]$ and the superscript $[k]$ indexes the $k$th sample sequence. In Scenario 1, Figure 1 illustrates two different simulated sequences of $\{X_i^{[k]}\}_{i=1}^{n}$, where $X_i^{[k]} = \max(Y_i^{[k]}, Z_i^{[k]})$, and the maximum $M_n^{[k]} = \max(X_1^{[k]}, \ldots, X_n^{[k]})$ for $n = 100$ and a particular $k$, e.g., $k = 1$. Next, we repeatedly generate $m = 10{,}000$ such sequences $\{X_i^{[k]}\}_{i=1}^{n}$, $k = 1, \ldots, m$. Taking the maxima $M_n^{[k]} = \max(X_1^{[k]}, \ldots, X_n^{[k]})$, $k = 1, \ldots, m$, the histogram of $\{M_n^{[k]}\}_{k=1}^{m}$ is displayed in Figure 2(a) with $a = -2.2$ and $b = 2.2$. In Scenario 2, replacing the marginal distribution of $Z_i^{[k]}$ with $U[-2.8, 2.8]$, the histogram of $M_n^{[k]}$ is shown in Figure 2(b). Although $\{X_i^{[k]}\}_{i=1}^{n}$ is i.i.d., the distribution of $M_n$ looks quite different from the three types of extreme value distributions.
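The two scenarios can be sketched as follows; a smaller replication count ($m = 2000$ rather than the paper's $10{,}000$) is used here to keep the run fast.

```python
import random

# Sketch of Scenarios 1 and 2: X_i = max(Y_i, Z_i) with Y_i ~ N(0,1)
# and Z_i ~ U[a,b]; we collect m replicated maxima M_n.
random.seed(1)

def simulate_maxima(a, b, n=100, m=2000):
    """Return m replicated values of M_n = max_{i<=n} max(Y_i, Z_i)."""
    return [
        max(max(random.gauss(0, 1), random.uniform(a, b)) for _ in range(n))
        for _ in range(m)
    ]

s1 = simulate_maxima(-2.2, 2.2)   # Scenario 1
s2 = simulate_maxima(-2.8, 2.8)   # Scenario 2
# M_n stays below the uniform's right endpoint iff no normal draw exceeds
# it, which is far more likely when the endpoint is 2.8 than 2.2.
frac1 = sum(1 for v in s1 if v < 2.2) / len(s1)
frac2 = sum(1 for v in s2 if v < 2.8) / len(s2)
print(round(frac1, 3), round(frac2, 3))
```

The two fractions correspond to the sharp spike just below the uniform endpoint visible in the histograms.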

Figure 1. Simulated mixed sequences from normal and uniform distributions and their maxima (marked with black dots). In (a), the maximum is from the uniform distribution; in (b), the maximum is from N(0,1).


Figure 2. (a) Histogram of Mn from N(0,1) and U[−2.2,2.2]. (b) Histogram of Mn from N(0,1) and U[−2.8,2.8].


In Example 2.1, the larger value of each pair from the two underlying subsequences is observed, while the smaller one is masked and never observed. The sample sizes of the two subsequences are equal. In general mixed sequences, however, the ratio $n_1/n_2$ of the sample sizes from the two subsequences can be any value between 0 and infinity and can vary as the total sample size grows. As a result, many patterns different from those in Figure 2 can arise.

In practice, data generating processes are naturally formed, spatially and temporally, from the underlying physical processes under study. Here we provide two data generating processes used in our simulations.

  1. For a given sample size $n$, we set $n_1$ and $n_2$ with $n_1 + n_2 = n$ and assume $n_1/(n_1+n_2) \to r$ as $n \to \infty$, $r \in (0,1)$. We then generate $n_1$ and $n_2$ observations from the two populations respectively, stack them in a sequence, and randomly permute the combined sequence. As a physical process, the procedure can be described as follows: mix $n_1$ yellow balls and $n_2$ white balls in a bag and draw balls sequentially; if a yellow ball is drawn, generate a number from the first population, otherwise from the second.

  2. Alternatively, supposing $a_{1,n_1}$ and $a_{2,n_2}$ are the norming constants defined in Equation (3) with known population distributions, we can set $n_1 + n_2 = n$, require $a_{1,n_1}/a_{2,n_2} \to r$ as $n \to \infty$, solve for $n_1$ and $n_2$, and then generate the observations as in the last step.
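Process 1 above can be sketched as follows; the two populations ($N(0,1)$ and $U[-2,2]$, matching Example 2.2 below) are illustrative choices.

```python
import random

# Minimal sketch of data-generating process 1: fix n1, n2 with
# n1 + n2 = n, draw from the two populations, then randomly permute
# (the "balls in a bag" step).
random.seed(2)

def mixed_sequence(n1, n2):
    """Stack n1 draws from N(0,1) and n2 draws from U[-2,2], then shuffle."""
    seq = [random.gauss(0, 1) for _ in range(n1)]
    seq += [random.uniform(-2, 2) for _ in range(n2)]
    random.shuffle(seq)
    return seq

x = mixed_sequence(100, 200)
m_n = max(x)  # the maxima of maxima M_n = max(M_{1,n1}, M_{2,n2})
print(len(x), m_n)
```

Since the maximum is permutation-invariant, the shuffle matters only when one studies the sequence itself (e.g., dependence structure), not $M_n$.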

Example 2.2

Using the sampling scheme designed above, suppose there are two sequences $\{X_{1,i}\}_{i=1}^{n_1} \overset{\text{i.i.d.}}{\sim} N(0, 0.9)$ and $\{X_{2,i}\}_{i=1}^{n_2} \overset{\text{i.i.d.}}{\sim} U[-2, 2]$ with $n_1 = 100$ and $n_2 = 200$, and $\{X_k\}_{k=1}^{300}$ is the mixture of these two sequences. Let $M_n^{[j]}$ be the maximum of the $j$th realisation of the sequence, $j = 1, \ldots, m$, $n = 300$. With $m = 10{,}000$, $\{M_n^{[j]}\}_{j=1}^{m}$ are computed and their histogram is shown in Figure 3(a). The case $n_1 = 200$, $n_2 = 100$ is shown in Figure 3(b).

Figure 3. Histograms of combinations of Mn from N(0,0.9) and U[−2,2]. (a) n1=100, n2=200. (b) n1=200, n2=100.


The histograms in Figure 3 look different from any of the three types of extreme value distributions in (2). One feature is that they can be bimodal, whereas the classical GEV distributions are all unimodal. Figure 3 shows two specific choices of $n_1$ and $n_2$; in more general situations, the ratio $n_1/n_2$ can take any value in $(0, \infty)$ and may change as $n$ increases. In Figures 2 and 3, the left parts of the distributions are dominated by the Weibull type induced by the uniform distribution, and the right parts resemble the Gumbel type induced by the normal distribution. The reason is that when we look at the maxima of $\{X_i\}_{i=1}^{n}$, two populations compete with each other. Taking Figure 3(b) as an example, the winners from $U[-2, 2]$ form the steep peak on the left, and the winners from $N(0, 0.9)$ form the smoother peak on the right.

Figure 4(a) shows the distribution of $M_n$ for a sequence mixed from $N(0,1)$ and a Fréchet distribution. Panels (b), (c) and (d) show combinations of one Fréchet distribution and one Weibull distribution. Notice that in panel (b) the distribution looks left-skewed and very similar to a Weibull distribution; however, owing to the Fréchet component, it actually has an infinite right endpoint.

Figure 4. Histograms of Mn. (a) N(0,1) and Fréchet combination. (b)–(d) Some combinations of Fréchet and Weibull.


In Figure 5, histograms of $M_n$ are created from independent sequences $\{X_i\}_{i=1}^{n}$ generated by comparing pairs of observations from normal and Weibull distributions. They can be unimodal or bimodal, left-skewed or right-skewed. If we use the GEV family to characterise the distributions of $M_n$ in these examples, it may not capture the shape of the distribution properly. For example, the left part of the distribution in Figure 5(d) resembles a Weibull distribution with a finite right endpoint; however, because of the effect of the normal distribution on the right tail, the shape changes suddenly to resemble a Gumbel distribution with an infinite right endpoint. If we fit a GEV distribution to $M_n$, the left part, with more sample data, may dominate the fit, and we may underestimate the long right tail.

Figure 5. Histograms of Mn, with combinations of normal distribution and Weibull distribution.


2.3. Convergence to the accelerated max-stable distribution

Throughout the paper, $x_F = \sup\{x : F(x) < 1\}$ denotes the right endpoint of a cdf $F$, and $\bar F(x) = 1 - F(x)$; $M_n = \max(M_{1,n_1}, \ldots, M_{k,n_k})$ is restricted to $k = 2$. For $k > 2$, analogous results can be derived with additional notation. The following theorem shows that, under certain conditions on the norming constants $a_{j,n_j}$ and $b_{j,n_j}$, we can choose one set of norming constants for the global maximum $M_n = \max(M_{1,n_1}, M_{2,n_2})$ and derive its limit distribution. Theorem 2.1 can be derived directly from Khintchine's theorem.

Theorem 2.1

If $M_{1,n_1}$ and $M_{2,n_2}$ satisfy (3) for $j = 1, 2$, the limit distribution of $M_n$ as $n \to \infty$ can be determined in the following cases:

Case 1.

If $a_{1,n_1}/a_{2,n_2} \to a > 0$ and $a_{1,n_1}(b_{2,n_2} - b_{1,n_1}) \to b < +\infty$ for some constants $a$ and $b$, then
(4) $P(a_{2,n_2}(M_n - b_{2,n_2}) \le x) \to H_1(ax + b)H_2(x).$

Case 2.

If $a_{1,n_1}/a_{2,n_2} \to 0$ and $a_{1,n_1}(b_{2,n_2} - b_{1,n_1}) \to +\infty$, then
(5) $P(a_{2,n_2}(M_n - b_{2,n_2}) \le x) \to H_2(x).$

Notice that the limit in Case 1 is the product of two extreme value distributions, $H_1(ax+b)H_2(x)$. Although it is in product form, it can sometimes still be reduced to one of the three classical extreme value distributions; for example, $\exp\{-x^{-\alpha}\}\exp\{-(x/2)^{-\alpha}\} = \exp\{-(1 + 2^{\alpha})x^{-\alpha}\}$ is still of Fréchet type. In other situations, however, the limit product form under the conditions of Case 1 cannot be reduced to any of the three extreme value distributions. We next present several examples illustrating these possibilities.

Example 2.3

Fréchet and Gumbel

Suppose $F_1(x) = \Phi_\alpha(x)$ is a Fréchet distribution function and $F_2(x) = \Lambda(x)$ is the standard Gumbel distribution function. Choosing $a_{1,n_1} = n_1^{-1/\alpha}$, $b_{1,n_1} = 0$, $a_{2,n_2} = 1$, $b_{2,n_2} = \log n_2$, we have
(6) $P(M_{2,n_2} - \log n_2 \le x) \to \Lambda(x).$
Then, when $n_1^{1/\alpha}/\log n_2 \to \infty$, we have
$P(n_1^{-1/\alpha} M_n \le x) = P\left(n_1^{-1/\alpha} M_{1,n_1} \le x,\ M_{2,n_2} - \log n_2 \le n_1^{1/\alpha}x - \log n_2\right) \to \Phi_\alpha(x).$
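The domination argument can be checked deterministically: if the margins are an exact Fréchet($\alpha$) and an exact Gumbel, then $P(n_1^{-1/\alpha} M_n \le x) = \Phi_\alpha(x)\,\Lambda(n_1^{1/\alpha}x)^{n_2}$ for every finite $n_1, n_2$, and the Gumbel factor tends to 1. The values of $\alpha$, $x$, $n_1$, $n_2$ below are illustrative.

```python
import math

# Deterministic check of Example 2.3 with exact Frechet and Gumbel margins:
#   P(n1^{-1/alpha} M_n <= x) = Phi_alpha(x) * Lambda(n1^{1/alpha} x)^{n2}
alpha, x = 1.0, 0.5

def phi_alpha(t, a):
    return 0.0 if t <= 0 else math.exp(-t ** (-a))

def gumbel_cdf(t):
    return math.exp(-math.exp(-t))

cdfs = []
for n1, n2 in [(10, 10), (10**3, 10**3), (10**6, 10**6)]:
    gumbel_factor = gumbel_cdf(n1 ** (1.0 / alpha) * x) ** n2
    cdfs.append(phi_alpha(x, alpha) * gumbel_factor)
    print(n1, n2, round(cdfs[-1], 6))
# The values increase toward Phi_alpha(0.5) = exp(-2) ~ 0.135335.
```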

Example 2.4

Fréchet and Fréchet

Suppose $F_1(x) = \Phi_{\alpha_1}(x)$ and $F_2(x) = \Phi_{\alpha_2}(x)$ are two Fréchet distribution functions with $\alpha_1 > \alpha_2$, which means that the tail of $F_2(x)$ is heavier than the tail of $F_1(x)$. Choosing norming constants $a_{1,n_1} = n_1^{-1/\alpha_1}$, $b_{1,n_1} = 0$ and $a_{2,n_2} = n_2^{-1/\alpha_2}$, $b_{2,n_2} = 0$, we have
(7) $P(n_j^{-1/\alpha_j} M_{j,n_j} \le x) = \Phi_{\alpha_j}(x)$, $j = 1, 2$,
and
(8) $P(n_2^{-1/\alpha_2} M_n \le x) = P\left(n_1^{-1/\alpha_1} M_{1,n_1} \le \frac{n_2^{1/\alpha_2}}{n_1^{1/\alpha_1}}x,\ n_2^{-1/\alpha_2} M_{2,n_2} \le x\right).$
If $n_2^{1/\alpha_2}/n_1^{1/\alpha_1} \to a > 0$, then
(9) $P(n_2^{-1/\alpha_2} M_n \le x) \to \Phi_{\alpha_1}(ax)\Phi_{\alpha_2}(x).$
If $n_2^{1/\alpha_2}/n_1^{1/\alpha_1} \to +\infty$, then
(10) $P(n_2^{-1/\alpha_2} M_n \le x) \to \Phi_{\alpha_2}(x).$
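For exact Fréchet margins, the identity behind (8)-(9) holds for every finite $n_1, n_2$, not only in the limit, which the following sketch verifies numerically ($\alpha_1 = 4$, $\alpha_2 = 2$ and the sample sizes are illustrative).

```python
import math

# Check of Example 2.4 with exact Frechet margins: the identity
#   P(n2^{-1/alpha2} M_n <= x) = Phi_{alpha1}(c x) * Phi_{alpha2}(x),
#   c = n2^{1/alpha2} / n1^{1/alpha1},
# is exact for every finite n1, n2.
alpha1, alpha2 = 4.0, 2.0

def phi(t, a):
    return 0.0 if t <= 0 else math.exp(-t ** (-a))

def cdf_max(x, n1, n2):
    """P(n2^{-1/alpha2} max(M_{1,n1}, M_{2,n2}) <= x), from first principles."""
    u = n2 ** (1.0 / alpha2) * x        # un-normalised threshold
    p1 = phi(u, alpha1) ** n1           # P(M_{1,n1} <= u)
    p2 = phi(u, alpha2) ** n2           # P(M_{2,n2} <= u)
    return p1 * p2

x, n1, n2 = 1.3, 50, 2500
c = n2 ** (1.0 / alpha2) / n1 ** (1.0 / alpha1)
lhs = cdf_max(x, n1, n2)
rhs = phi(c * x, alpha1) * phi(x, alpha2)
print(lhs, rhs)  # agree up to floating-point error
```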

In Example 2.4, the sequence is mixed with two Fréchet distributions with different shape parameters. The limit distribution of Mn for this mixed sequence is the product of two Fréchet distributions, which is different from any of the three types of extreme value distributions.

Example 2.5

Uniform and normal

Suppose $F_1(x)$ is the distribution function of the uniform distribution $U[0,1]$ and $F_2(x)$ is the distribution function of $N(0,1)$. Choosing
(11) $a_{1,n_1} = n_1$, $b_{1,n_1} = 1$,
and
$a_{2,n_2} = (2\log n_2)^{1/2}$, $b_{2,n_2} = (2\log n_2)^{1/2} - \tfrac{1}{2}(2\log n_2)^{-1/2}(\log\log n_2 + \log 4\pi)$,
we have
(12) $P(n_1(M_{1,n_1} - 1) \le x) \to e^{x}$
for $x < 0$, and
(13) $P(a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \to \Lambda(x).$
Then
$P(a_{2,n_2}(M_n - b_{2,n_2}) \le x) = P\left(n_1(M_{1,n_1} - 1) \le n_1\left(\frac{x}{a_{2,n_2}} + b_{2,n_2} - 1\right),\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x\right).$
Since $n_1(x/a_{2,n_2} + b_{2,n_2} - 1) \to +\infty$ for any $x$, we have
(14) $P(a_{2,n_2}(M_n - b_{2,n_2}) \le x) \to \Lambda(x).$

Example 2.6

Weibull and Weibull

Suppose $x_F < \infty$ and $K_1 > 0$, $K_2 > 0$, and
(15) $F_1(x) = 1 - K_1(x_F - x)^{\alpha_1}$, $x_F - K_1^{-1/\alpha_1} \le x \le x_F$,
(16) $F_2(x) = 1 - K_2(x_F - x)^{\alpha_2}$, $x_F - K_2^{-1/\alpha_2} \le x \le x_F$,
are two distribution functions with polynomial behaviour at the common finite endpoint $x_F$, where $\alpha_1 > \alpha_2$. We can choose $a_{1,n_1} = (n_1K_1)^{1/\alpha_1}$, $b_{1,n_1} = x_F$, $a_{2,n_2} = (n_2K_2)^{1/\alpha_2}$, $b_{2,n_2} = x_F$, so that
(17) $P((n_1K_1)^{1/\alpha_1}(M_{1,n_1} - x_F) \le x) \to \Psi_{\alpha_1}(x),$
(18) $P((n_2K_2)^{1/\alpha_2}(M_{2,n_2} - x_F) \le x) \to \Psi_{\alpha_2}(x).$
If $(n_2K_2)^{1/\alpha_2}/(n_1K_1)^{1/\alpha_1} \to a > 0$, then
$P((n_1K_1)^{1/\alpha_1}(M_n - x_F) \le x) = P\left((n_1K_1)^{1/\alpha_1}(M_{1,n_1} - x_F) \le x,\ (n_2K_2)^{1/\alpha_2}(M_{2,n_2} - x_F) \le \frac{(n_2K_2)^{1/\alpha_2}}{(n_1K_1)^{1/\alpha_1}}x\right) \to \Psi_{\alpha_1}(x)\Psi_{\alpha_2}(ax).$

Example 2.7

Normal and Pareto

Suppose $F_1(x)$ is the standard normal distribution function of $N(0,1)$ and $F_2(x) = 1 - Kx^{-\alpha}$, $\alpha > 0$, $K > 0$, is a Pareto distribution function. Let
$a_{1,n_1} = (2\log n_1)^{1/2}$, $b_{1,n_1} = (2\log n_1)^{1/2} - \tfrac{1}{2}(2\log n_1)^{-1/2}(\log\log n_1 + \log 4\pi)$, $a_{2,n_2} = (Kn_2)^{-1/\alpha}$, $b_{2,n_2} = 0$.
Then
$P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \to \begin{cases} 0, & x < 0, \\ \exp(-e^{-x} - x^{-\alpha}), & x \ge 0. \end{cases}$
Furthermore, if $a_{2,n_2}b_{1,n_1} \to \infty$, then $P(a_{1,n_1}(M_n - b_{1,n_1}) \le x) \to \exp(-e^{-x})$.

Example 2.8

Cauchy and uniform distribution

$F_1(x) = \frac{1}{2} + \frac{1}{\pi}\tan^{-1}x$ is the standard Cauchy distribution function and $F_2(x) = x$, $0 \le x \le 1$, is the uniform distribution function. Let $a_{1,n_1} = \tan(\pi/n_1) \approx \pi/n_1$, $b_{1,n_1} = 0$, $a_{2,n_2} = n_2$, $b_{2,n_2} = 1$. Then
$P\left(\frac{\pi}{n_1}M_{1,n_1} \le x,\ n_2(M_{2,n_2} - 1) \le x\right) \to \begin{cases} 0, & x < 0, \\ \exp(-x^{-1}), & x \ge 0, \end{cases}$
and
$P(a_{1,n_1}(M_n - b_{1,n_1}) \le x) \to \begin{cases} 0, & x < 0, \\ \exp(-x^{-1}), & x \ge 0. \end{cases}$

In Example 2.8, the limit distribution of the normalised $M_{1,n_1}$ equals 0 for $x < 0$, and the limit distribution of the normalised $M_{2,n_2}$ equals 1 for $x > 0$. Thus the product coincides with the former limit, the Fréchet distribution $\Phi_1$.

In Examples 2.3 and 2.5, we showed that when $n$ is sufficiently large (tends to infinity), the distribution of $M_n$ is dominated by the subsequence whose marginal distribution has the heavier tail. In Examples 2.4 and 2.6, if the ratio $n_2^{1/\alpha_2}/n_1^{1/\alpha_1}$ converges to a positive constant, then neither subsequence is dominated by the other, and the limit is of product form, which cannot be reduced to a classical extreme value distribution if $\alpha_1 \ne \alpha_2$.

We now introduce the accelerated max-stable distribution (AMSD), also called the accelerated extreme value distribution (AEVD). We consider the convergence of the probabilities of the normalised maxima $M_{1,n_1}$ and $M_{2,n_2}$ of the two subsequences separately. By the relationship $M_n = \max(M_{1,n_1}, M_{2,n_2})$, we can use the accelerated max-stable distribution to approximate the distribution of $M_n$. The classical extreme value distributions are special cases within the accelerated max-stable distribution family.

Definition 2.1

Let $H_1(x)$ and $H_2(x)$ be two max-stable distribution functions. We call $H(x) = H_1(x)H_2(x)$ an accelerated max-stable distribution (AMSD/AEVD) function, i.e., the product of two max-stable distribution functions. More generally, we also say that $H(x)$ belongs to the accelerated max-stable distribution family if it is the product of $k$ max-stable distribution functions, $k \ge 2$.

Remark

If $Z$ follows an accelerated max-stable distribution $H(x)$, then $Z$ can be expressed as $Z = \max(Z_1, \ldots, Z_k)$, where each $Z_i$ follows a max-stable distribution. In taking the maximum of $(Z_1, \ldots, Z_k)$, each $Z_i$ is accelerated by the other components $Z_j$ to produce the observed $Z$ values. On the other hand, we have
$H_1(x)H_2(x)\cdots H_k(x) \le H_1(x)H_2(x)\cdots H_{k-1}(x) \le H_1(x)H_2(x)\cdots H_{k-2}(x) \le \cdots \le H_1(x)$
and
$\overline{H_1(x)H_2(x)\cdots H_k(x)} \ge \overline{H_1(x)H_2(x)\cdots H_{k-1}(x)} \ge \overline{H_1(x)H_2(x)\cdots H_{k-2}(x)} \ge \cdots \ge \overline{H_1(x)},$
where $\bar H(x)$ stands for the survival function, i.e., $\overline{H_1(x)H_2(x)\cdots H_k(x)} = 1 - H_1(x)H_2(x)\cdots H_k(x)$. These inequalities may be regarded as accelerated survival rates, which motivates us to call the new distribution the accelerated max-stable (extreme value) distribution. From the viewpoint of risk analysis, the systemic risk of $Z$ is accelerated relative to the individual risks of the $Z_j$ at a fixed confidence level.
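The representation $Z = \max(Z_1, \ldots, Z_k)$ gives a direct way to sample from an AMSD. The sketch below uses a standard Gumbel and a Fréchet($\alpha = 2$) as illustrative components, draws each by the inverse-cdf transform, and compares the empirical cdf of the maxima with $H_1(x)H_2(x)$.

```python
import math
import random

# Sketch: sampling from an AMSD H(x) = H1(x) H2(x) via Z = max(Z1, Z2),
# with Z1 ~ standard Gumbel and Z2 ~ Frechet(alpha = 2).
random.seed(3)

def gumbel_inv(u):
    return -math.log(-math.log(u))           # Lambda^{-1}(u)

def frechet_inv(u, alpha=2.0):
    return (-math.log(u)) ** (-1.0 / alpha)  # Phi_alpha^{-1}(u)

def amsd_sample():
    return max(gumbel_inv(random.random()), frechet_inv(random.random()))

x, m = 2.0, 20000
emp = sum(1 for _ in range(m) if amsd_sample() <= x) / m
theo = math.exp(-math.exp(-x)) * math.exp(-x ** (-2.0))
print(round(emp, 3), round(theo, 3))  # the two values should be close
```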

For the independent sequence of random variables $\{X_i\}_{i=1}^{n}$ with two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$ defined as above, suppose (3) is satisfied for $j = 1, 2$ with norming constants $a_{j,n_j} > 0$, $b_{j,n_j}$, i.e.,
(19) $\lim_{n \to \infty} P(a_{j,n_j}(M_{j,n_j} - b_{j,n_j}) \le x) = H_j(x)$, $j = 1, 2$;
then
(20) $P(\max(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}),\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2})) \le x) \to H(x) = H_1(x)H_2(x).$

Definition 2.2

Suppose an independent sequence of random variables $\{X_i\}_{i=1}^{n}$ satisfies (19) and (20). We say that the underlying distribution $F_{n_i}$ of $X_i$ belongs to the competing-maximum domain of attraction of $H_1$ and $H_2$, denoted $F_{n_i} \in \mathrm{CMDA}(H_1, H_2)$.

We note that a max-stable distribution may itself be decomposed into a product of two max-stable distributions. As a result, the max-stable distribution family can be thought of as embedded in the accelerated max-stable distribution family. This can be seen in Theorem 2.1, where the limits of $P(a_{2,n_2}(M_n - b_{2,n_2}) \le x)$ under the two different conditions both belong to the accelerated max-stable distribution family. In other words, the accelerated max-stable distributions form an expanded family that can describe the limiting distribution of the normalised maxima of more general sequences.

For k = 2 and FniCMDA(H1,H2), AMSDs/AEVDs can have the following six possible combinations:

Case 1.

$F_j \in \mathrm{MDA}(\Lambda)$, $j = 1, 2$: $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \xrightarrow{w} \Lambda\!\left(\frac{x - b_1}{a_1}\right)\Lambda\!\left(\frac{x - b_2}{a_2}\right).$

Case 2.

$F_1 \in \mathrm{MDA}(\Phi_{\alpha_1})$ and $F_2 \in \mathrm{MDA}(\Phi_{\alpha_2})$: $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \xrightarrow{w} \Phi_{\alpha_1}\!\left(\frac{x - b_1}{a_1}\right)\Phi_{\alpha_2}\!\left(\frac{x - b_2}{a_2}\right).$

Case 3.

$F_1 \in \mathrm{MDA}(\Psi_{\alpha_1})$ and $F_2 \in \mathrm{MDA}(\Psi_{\alpha_2})$: $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \xrightarrow{w} \Psi_{\alpha_1}\!\left(\frac{x - b_1}{a_1}\right)\Psi_{\alpha_2}\!\left(\frac{x - b_2}{a_2}\right).$

Case 4.

$F_1 \in \mathrm{MDA}(\Lambda)$ and $F_2 \in \mathrm{MDA}(\Phi_{\alpha})$: $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \xrightarrow{w} \Lambda\!\left(\frac{x - b_1}{a_1}\right)\Phi_{\alpha}\!\left(\frac{x - b_2}{a_2}\right).$

Case 5.

$F_1 \in \mathrm{MDA}(\Lambda)$ and $F_2 \in \mathrm{MDA}(\Psi_{\alpha})$: $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \xrightarrow{w} \Lambda\!\left(\frac{x - b_1}{a_1}\right)\Psi_{\alpha}\!\left(\frac{x - b_2}{a_2}\right).$

Case 6.

$F_1 \in \mathrm{MDA}(\Phi_{\alpha_1})$ and $F_2 \in \mathrm{MDA}(\Psi_{\alpha_2})$: $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \xrightarrow{w} \Phi_{\alpha_1}\!\left(\frac{x - b_1}{a_1}\right)\Psi_{\alpha_2}\!\left(\frac{x - b_2}{a_2}\right).$

It is easy to see that the classical extreme value distributions are special cases of the AMSD family. For any $a > 0$, $b > 0$ satisfying $\frac{1}{a} + \frac{1}{b} = 1$, we have
$\Lambda(x) = \exp\{-e^{-x}\} = \exp\left\{-\left(\frac{e^{-x}}{a} + \frac{e^{-x}}{b}\right)\right\} = \exp\{-(e^{-x-\log a} + e^{-x-\log b})\} = \Lambda(x + \log a)\Lambda(x + \log b),$
$\Phi_\alpha(x) = \exp\{-x^{-\alpha}\} = \exp\left\{-\left(\frac{x^{-\alpha}}{a} + \frac{x^{-\alpha}}{b}\right)\right\} = \Phi_\alpha(a^{1/\alpha}x)\,\Phi_\alpha(b^{1/\alpha}x),$
$\Psi_\alpha(x) = \exp\{-(-x)^{\alpha}\} = \exp\left\{-\left(\frac{(-x)^{\alpha}}{a} + \frac{(-x)^{\alpha}}{b}\right)\right\} = \Psi_\alpha(a^{-1/\alpha}x)\,\Psi_\alpha(b^{-1/\alpha}x).$
Since $H_1(x)$ and $H_2(x)$ are max-stable distributions, for any $n_1 = 2, 3, \ldots$ and $n_2 = 2, 3, \ldots$, there are constants $a_{1,n_1} > 0$, $b_{1,n_1}$, $a_{2,n_2} > 0$, $b_{2,n_2}$ such that $H_1(x)H_2(x) = H_1^{n_1}(a_{1,n_1}x + b_{1,n_1})\,H_2^{n_2}(a_{2,n_2}x + b_{2,n_2})$.
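These decompositions are easy to verify numerically; the sketch below checks all three identities at a few points, with illustrative choices $a = 3$, $b = 1.5$ (so $1/a + 1/b = 1$) and $\alpha = 2$.

```python
import math

# Numerical check of the decompositions above with 1/a + 1/b = 1:
#   Lambda(x) = Lambda(x + log a) * Lambda(x + log b)
#   Phi_a(x)  = Phi_a(a^{1/alpha} x) * Phi_a(b^{1/alpha} x)
#   Psi_a(x)  = Psi_a(a^{-1/alpha} x) * Psi_a(b^{-1/alpha} x)
a, b = 3.0, 1.5          # 1/3 + 2/3 = 1
alpha = 2.0

def lam(x):
    return math.exp(-math.exp(-x))

def phi(x):
    return 0.0 if x <= 0 else math.exp(-x ** (-alpha))

def psi(x):
    return 1.0 if x > 0 else math.exp(-((-x) ** alpha))

for x in (-1.0, 0.0, 0.7, 2.5):
    assert abs(lam(x) - lam(x + math.log(a)) * lam(x + math.log(b))) < 1e-12

for x in (0.3, 1.0, 4.0):
    assert abs(phi(x) - phi(a ** (1 / alpha) * x) * phi(b ** (1 / alpha) * x)) < 1e-12

for x in (-3.0, -0.5):
    assert abs(psi(x) - psi(a ** (-1 / alpha) * x) * psi(b ** (-1 / alpha) * x)) < 1e-12

print("all identities verified")
```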

In Equation (20), we considered the convergence of $P(\max(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}),\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2})) \le x)$ instead of the traditional $P(a_n(M_n - b_n) \le x)$. If $n_1$ and $n_2$ are sufficiently large, by (19) we have $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x) \approx G_1(x)$ and $P(a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \approx G_2(x)$; then
(21) $P(M_n \le x) = P(\max(M_{1,n_1}, M_{2,n_2}) \le x) = P(M_{1,n_1} \le x)P(M_{2,n_2} \le x) \approx G_1(a_{1,n_1}(x - b_{1,n_1}))\,G_2(a_{2,n_2}(x - b_{2,n_2})) = G_1^{*}(x)G_2^{*}(x),$
where $G_j^{*}$ is of the same type as $G_j$, $j = 1, 2$.

To close this section, we remark that (21) is the basis for applying the newly introduced AMSD/AEVD family to real data. Based on (21), in practice we do not need to worry about the values of $n_1$, $n_2$, $a_{1,n_1}$, $b_{1,n_1}$, $a_{2,n_2}$, $b_{2,n_2}$, as they are absorbed into the fitted distribution functions; see also Coles (2001). In our examples we used fixed values of $n_1$ and $n_2$ for simulation convenience; as $n$ tends to infinity, the values of $n_1$ and $n_2$ depend on $n$.

The next section presents density functions and shapes from which one can see the flexibility of applying the new distribution to real data modelling.

2.4. Density functions and density plots

The density function of the accelerated max-stable distribution requires some discussion of the support of the cumulative distribution function. We can express the two factors of the product in generalised extreme value form,
(22) $F(x) = H_{\xi_1;\mu_1,\sigma_1}(x)\,H_{\xi_2;\mu_2,\sigma_2}(x) = \exp\left\{-\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1} - \left[1 + \xi_2\frac{x-\mu_2}{\sigma_2}\right]^{-1/\xi_2}\right\},$
where $1 + \xi_1(x-\mu_1)/\sigma_1 > 0$ and $1 + \xi_2(x-\mu_2)/\sigma_2 > 0$. We include the special case $H_{0;\mu_i,\sigma_i}$ as the limit of $H_{\xi_i;\mu_i,\sigma_i}$ as $\xi_i \to 0$, $i = 1, 2$. Denote the density function by $f(x)$ and let
$h(x) = \exp\left\{-\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1} - \left[1 + \xi_2\frac{x-\mu_2}{\sigma_2}\right]^{-1/\xi_2}\right\} \times \left\{\frac{1}{\sigma_1}\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1-1} + \frac{1}{\sigma_2}\left[1 + \xi_2\frac{x-\mu_2}{\sigma_2}\right]^{-1/\xi_2-1}\right\}.$
Since the roles of $\xi_1$ and $\xi_2$ are symmetric, we present only one ordering for each pair. The density function has the following six cases.

Case 1.

$\xi_1 = 0$, $\xi_2 = 0$:
$f(x) = \exp\left\{-e^{-\frac{x-\mu_1}{\sigma_1}} - e^{-\frac{x-\mu_2}{\sigma_2}}\right\} \times \left[\frac{1}{\sigma_1}e^{-\frac{x-\mu_1}{\sigma_1}} + \frac{1}{\sigma_2}e^{-\frac{x-\mu_2}{\sigma_2}}\right], \quad x \in \mathbb{R}.$

Case 2.

$\xi_1 > 0$, $\xi_2 > 0$, assuming $\mu_1 - \sigma_1/\xi_1 \ge \mu_2 - \sigma_2/\xi_2$:
$f(x) = \begin{cases} h(x), & x > \mu_1 - \sigma_1/\xi_1, \\ 0, & x \le \mu_1 - \sigma_1/\xi_1. \end{cases}$

Case 3.

$\xi_1 < 0$, $\xi_2 < 0$, assuming $\mu_1 - \sigma_1/\xi_1 \ge \mu_2 - \sigma_2/\xi_2$:
$f(x) = \begin{cases} h(x), & x < \mu_2 - \sigma_2/\xi_2, \\ \exp\left\{-\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1}\right\} \cdot \frac{1}{\sigma_1}\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1-1}, & \mu_2 - \sigma_2/\xi_2 \le x \le \mu_1 - \sigma_1/\xi_1, \\ 0, & x > \mu_1 - \sigma_1/\xi_1. \end{cases}$

Case 4.

$\xi_1 = 0$, $\xi_2 > 0$:
$f(x) = \begin{cases} \exp\left\{-e^{-\frac{x-\mu_1}{\sigma_1}} - \left[1 + \xi_2\frac{x-\mu_2}{\sigma_2}\right]^{-1/\xi_2}\right\} \times \left[\frac{1}{\sigma_1}e^{-\frac{x-\mu_1}{\sigma_1}} + \frac{1}{\sigma_2}\left(1 + \xi_2\frac{x-\mu_2}{\sigma_2}\right)^{-1/\xi_2-1}\right], & x > \mu_2 - \sigma_2/\xi_2, \\ 0, & x \le \mu_2 - \sigma_2/\xi_2. \end{cases}$

Case 5.

$\xi_1 = 0$, $\xi_2 < 0$:
$f(x) = \begin{cases} \exp\left\{-e^{-\frac{x-\mu_1}{\sigma_1}} - \left[1 + \xi_2\frac{x-\mu_2}{\sigma_2}\right]^{-1/\xi_2}\right\} \times \left[\frac{1}{\sigma_1}e^{-\frac{x-\mu_1}{\sigma_1}} + \frac{1}{\sigma_2}\left(1 + \xi_2\frac{x-\mu_2}{\sigma_2}\right)^{-1/\xi_2-1}\right], & x < \mu_2 - \sigma_2/\xi_2, \\ \exp\left\{-e^{-\frac{x-\mu_1}{\sigma_1}}\right\} \cdot \frac{1}{\sigma_1}e^{-\frac{x-\mu_1}{\sigma_1}}, & x \ge \mu_2 - \sigma_2/\xi_2. \end{cases}$

Case 6.

ξ1>0, ξ2<0.

If $\mu_1 - \sigma_1/\xi_1 \ge \mu_2 - \sigma_2/\xi_2$,
$f(x) = \begin{cases} \exp\left\{-\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1}\right\} \cdot \frac{1}{\sigma_1}\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1-1}, & x > \mu_1 - \sigma_1/\xi_1, \\ 0, & x \le \mu_1 - \sigma_1/\xi_1. \end{cases}$
If $\mu_1 - \sigma_1/\xi_1 < \mu_2 - \sigma_2/\xi_2$,
$f(x) = \begin{cases} \exp\left\{-\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1}\right\} \cdot \frac{1}{\sigma_1}\left[1 + \xi_1\frac{x-\mu_1}{\sigma_1}\right]^{-1/\xi_1-1}, & x > \mu_2 - \sigma_2/\xi_2, \\ h(x), & \mu_1 - \sigma_1/\xi_1 < x \le \mu_2 - \sigma_2/\xi_2, \\ 0, & x \le \mu_1 - \sigma_1/\xi_1. \end{cases}$
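As a sanity check on the case analysis, the following sketch implements the Case 1 density ($\xi_1 = \xi_2 = 0$) and verifies by trapezoidal integration that it integrates to one; the parameter values are illustrative.

```python
import math

# Case 1 density of the AMSD (xi1 = xi2 = 0, a Gumbel-Gumbel product),
# integrated numerically over a range wide enough to capture both modes.
mu1, s1, mu2, s2 = 0.0, 1.0, 3.0, 0.5

def f(x):
    e1 = math.exp(-(x - mu1) / s1)
    e2 = math.exp(-(x - mu2) / s2)
    return math.exp(-e1 - e2) * (e1 / s1 + e2 / s2)

lo, hi, steps = -15.0, 25.0, 40000
h = (hi - lo) / steps
total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
total *= h
print(round(total, 6))  # close to 1.0
```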

In Figures 6 and 7, four density plots of the Weibull-Gumbel type are shown. In Figure 8, panel (a) is the density plot of the Fréchet-Fréchet type and panel (b) is the density plot of the Fréchet-Gumbel type. We can observe that they capture the shapes of the histograms shown in Figures 2-5.

Figure 6. Density plots of the accelerated max-stable distributions with Weibull-Gumbel combinations. (a) ξ1=0, μ1=0.5, σ1=1, ξ2=−1, μ2=−1, σ2=1. (b) ξ1=0, μ1=0.5, σ1=1, ξ2=−1, μ2=0.5, σ2=1.


Figure 7. Density plots of the accelerated max-stable distributions with Weibull-Gumbel combinations. (a) ξ1=−1, μ1=−1, σ1=1, ξ2=0, μ2=0.5, σ2=0.7. (b) ξ1=−0.5, μ1=−2, σ1=1, ξ2=0, μ2=−1, σ2=0.7.


Figure 8. (a) Density plot of the accelerated max-stable distribution with Fréchet-Fréchet combination. ξ1=0.5, μ1=0, σ1=1, ξ2=0.9, μ2=0, σ2=1. (b) Density plot of the accelerated max-stable distribution with Fréchet-Gumbel combination. ξ1=0, μ1=0, σ1=3, ξ2=1, μ2=−3, σ2=0.2.


Figure 9(b) shows the case $\xi_1 = \xi_2 = 0$, i.e., the combination of two Gumbel distributions. In this case the density plot is bimodal, which differs from that of a single Gumbel distribution. Suppose that $X_{1,i} \sim N(\mu_1, \sigma_1)$ and $X_{2,j} \sim N(\mu_2, \sigma_2)$, $1 \le i \le n_1$ and $1 \le j \le n_2$; then there are norming constants $a_{1,n_1} > 0$, $b_{1,n_1}$ and $a_{2,n_2} > 0$, $b_{2,n_2}$ such that
(23) $P(a_{1,n_1}(M_{1,n_1} - b_{1,n_1}) \le x,\ a_{2,n_2}(M_{2,n_2} - b_{2,n_2}) \le x) \to \Lambda\!\left(\frac{x-\mu_1}{\sigma_1}\right)\Lambda\!\left(\frac{x-\mu_2}{\sigma_2}\right) = \exp\left\{-e^{-\frac{x-\mu_1}{\sigma_1}} - e^{-\frac{x-\mu_2}{\sigma_2}}\right\}.$
Here a genuine product form requires the two scale parameters to differ, $\sigma_1 \ne \sigma_2$; otherwise the product $\exp\{-e^{-\frac{x-\mu_1}{\sigma}} - e^{-\frac{x-\mu_2}{\sigma}}\}$ reduces to the Gumbel type.
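The reduction when $\sigma_1 = \sigma_2 = \sigma$ can be verified numerically: with $e^{\mu^{*}/\sigma} = e^{\mu_1/\sigma} + e^{\mu_2/\sigma}$, the product collapses to a single Gumbel cdf with scale $\sigma$ and location $\mu^{*}$. A minimal sketch with illustrative parameters:

```python
import math

# With equal scales, exp{-e^{-(x-mu1)/sigma} - e^{-(x-mu2)/sigma}} is
# again Gumbel with location mu* = sigma * log(e^{mu1/sigma} + e^{mu2/sigma}).
mu1, mu2, sigma = 0.0, 1.0, 0.7
mu_star = sigma * math.log(math.exp(mu1 / sigma) + math.exp(mu2 / sigma))

def product_cdf(x):
    return math.exp(-math.exp(-(x - mu1) / sigma)
                    - math.exp(-(x - mu2) / sigma))

def gumbel_cdf(x):
    return math.exp(-math.exp(-(x - mu_star) / sigma))

for x in (-2.0, 0.0, 1.0, 3.0):
    assert abs(product_cdf(x) - gumbel_cdf(x)) < 1e-12
print("reduces to a single Gumbel when scales are equal")
```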


Figure 9. (a) Density plot of the accelerated max-stable distribution with Weibull-Fréchet combination. ξ1=−0.5, μ1=0, σ1=1, ξ2=0.3, μ2=−1, σ2=0.1. (b) Density plot of the accelerated max-stable distribution with Gumbel-Gumbel combination. ξ1=0, μ1=0, σ1=3, ξ2=0, μ2=−3, σ2=0.3.

2.5. Tail equivalence and the existence of moments

In this section, we discuss tail equivalence and determine which moments are finite for certain AMSDs/AEVDs.

Definition 2.3

Two cdfs F and H are called tail-equivalent if they have the same right endpoint, i.e., $x_F=x_H$, and
(24) \( \lim_{x\to x_F} \bar F(x)/\bar H(x) = c \)
for some constant $0<c<\infty$.

We have the following facts.

Fact 2.1

It is clear that the product distribution of a Weibull distribution and another type of extreme value distribution H(x) is tail equivalent to H(x).

Fact 2.2

Suppose $X\sim\Phi_{\alpha_1}\Phi_{\alpha_2}$ and let $\mu_k=E(X^k)$ be the kth moment of X. Then $\mu_k$ is finite only if $k<\min(\alpha_1,\alpha_2)$.

Suppose $\alpha_1<\alpha_2$; then $\Phi_{\alpha_1}$ has a heavier tail than $\Phi_{\alpha_2}$. Let $\mu_k^{(1)}$ be the kth moment of $X\sim\Phi_{\alpha_1}$. We know that $\mu_k^{(1)}<\infty$ only if $k<\alpha_1$. This implies that $\Phi_{\alpha_1}\Phi_{\alpha_2}$ has the same right-tail heaviness as $\Phi_{\alpha_1}$.
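A rough numerical illustration of Fact 2.2 (our own sketch, with $\alpha_1=2$, $\alpha_2=3$ chosen for concreteness): the partial moment integrals $\int_1^X x^k f(x)\,dx$ stabilize for $k=1<\alpha_1$ but keep growing logarithmically for $k=2=\alpha_1$:

```python
import numpy as np

def amsd_frechet_density(x, a1, a2):
    """Density of Phi_{a1}(x) * Phi_{a2}(x) = exp(-x^{-a1} - x^{-a2}), x > 0."""
    return np.exp(-x**(-a1) - x**(-a2)) * (a1 * x**(-a1 - 1) + a2 * x**(-a2 - 1))

def partial_moment(k, upper, a1=2.0, a2=3.0, n=200_000):
    # integrate x^k f(x) on [1, upper] with a log-spaced trapezoid rule
    x = np.geomspace(1.0, upper, n)
    return np.trapz(x**k * amsd_frechet_density(x, a1, a2), x)

# k = 1 < min(a1, a2): the tail beyond x = 1e3 contributes almost nothing
print(partial_moment(1, 1e6) - partial_moment(1, 1e3))   # ~ 2e-3
# k = 2 = a1: the integral diverges like a1 * log(x)
print(partial_moment(2, 1e6) - partial_moment(2, 1e3))   # ~ 13.8
```

For large x the integrand behaves like $\alpha_1 x^{k-\alpha_1-1}$, so the divergent case grows by roughly $\alpha_1\log(10^6/10^3)\approx 13.8$.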

Fact 2.3

If 0<α1<α2, then Φα1Φα2 and Φα1 are tail-equivalent.

Fact 2.4

Suppose $X\sim\Lambda(x)\Phi_\alpha(x)$. Let $\mu_k=E(X^k)$ be the kth moment of X. Then $\mu_k$ is finite only if $k<\alpha$.

Fact 2.5

Λ(x)Φα(x) and Φα(x) are tail-equivalent.

Fact 2.6

If H1(x) has a heavier tail than H2(x), then the accelerated max-stable distribution H1(x)H2(x) is tail-equivalent to H1(x).
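These tail-equivalence facts are straightforward to check numerically. The following sketch (ours, not from the paper; $\alpha_1=1$, $\alpha_2=2$ are illustrative choices) evaluates the survival-function ratios of Facts 2.3 and 2.5 at increasing x:

```python
import numpy as np

a1, a2 = 1.0, 2.0   # 0 < a1 < a2

def sf_frechet(x, a):          # 1 - Phi_a(x), computed stably via expm1
    return -np.expm1(-x**(-a))

def sf_product(x, a1, a2):     # 1 - Phi_{a1}(x) Phi_{a2}(x)
    return -np.expm1(-x**(-a1) - x**(-a2))

def sf_gumbel_frechet(x, a):   # 1 - Lambda(x) Phi_a(x)
    return -np.expm1(-np.exp(-x) - x**(-a))

for x in (1e2, 1e4, 1e6):
    r1 = sf_product(x, a1, a2) / sf_frechet(x, a1)       # Fact 2.3
    r2 = sf_gumbel_frechet(x, a1) / sf_frechet(x, a1)    # Fact 2.5
    print(x, r1, r2)   # both ratios approach 1 as x grows
```

Both ratios converge to the constant c = 1 in Definition 2.3, confirming tail equivalence with the heavier Fréchet component.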

3. Joint convergence and approximation errors

3.1. Convergence of joint probability for general thresholds

It may also be interesting to consider the limits of $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})$ for sequences $u_{1,n_1}$ and $u_{2,n_2}$ not necessarily of the form $x/a_{i,n_i}+b_{i,n_i}$, or even not dependent on x. Here $n_1$ and $n_2$ are the lengths of the two subsequences; we may write them as $n_1(n)$ and $n_2(n)$ since they vary with the total length n. When $u_{j,n_j}=x/a_{j,n_j}+b_{j,n_j}$ for j = 1, 2, this reduces to the problem discussed before. The question is:

Which conditions on $F_1$ and $F_2$ ensure that the limit of $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})$ as $n\to\infty$ exists for appropriate constants $u_{1,n_1}$ and $u_{2,n_2}$?

Some conditions on the tails $\bar F_1$ and $\bar F_2$ are required to ensure that $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})$ converges to a non-trivial limit, i.e., a number in (0,1).

Theorem 3.1

Suppose $\{X_i\}_{i=1}^n$ is an independent sequence of random variables mixed from two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$ with underlying distributions $F_1(x)$ and $F_2(x)$, where $n_1\to\infty$ and $n_2\to\infty$ as $n\to\infty$. Let $0\le\tau<\infty$ and let $\{u_{1,i}\}_{i=1}^{n_1}$ and $\{u_{2,i}\}_{i=1}^{n_2}$ be two sequences of real numbers such that
(25) \( n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2})) \to \tau \quad\text{as } n\to\infty. \)
Then
(26) \( P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \to e^{-\tau} \quad\text{as } n\to\infty. \)
Conversely, if (26) holds for some $0\le\tau<\infty$, then so does (25).

Remark

Since $1-F_j(u_{j,n_j})$ is the probability that $X_{j,i}$ exceeds the level $u_{j,n_j}$, Equation (25) means that the expected total number of exceedances of $u_{1,n_1}$ by $\{X_{1,i}\}_{i=1}^{n_1}$ and of $u_{2,n_2}$ by $\{X_{2,i}\}_{i=1}^{n_2}$ converges to $\tau$. When the sequence is generated from a single distribution F(x), Theorem 3.1 reduces to the classical result by choosing $u_{1,n_1}=u_{2,n_2}=u_n$. That is,
(27) \( n(1-F(u_n)) \to \tau \)
if and only if
(28) \( P(M_n\le u_n) \to e^{-\tau} \)
as $n\to\infty$.
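As a small numerical sanity check of Theorem 3.1 (our own sketch, assuming standard exponential margins), choose $u_{j,n_j}$ so that $n_j(1-F_j(u_{j,n_j}))=\tau_j$ exactly; the exact joint non-exceedance probability then approaches $e^{-(\tau_1+\tau_2)}$:

```python
import math

def joint_prob(n1, n2, tau1, tau2):
    """Exact P(M1 <= u1, M2 <= u2) for independent Exp(1) subsequences with
    levels u_j = log(n_j / tau_j), so that n_j (1 - F_j(u_j)) = tau_j."""
    u1 = math.log(n1 / tau1)      # 1 - F(u1) = exp(-u1) = tau1 / n1
    u2 = math.log(n2 / tau2)
    return (1 - math.exp(-u1)) ** n1 * (1 - math.exp(-u2)) ** n2

tau1, tau2 = 0.5, 1.0
for n in (100, 10_000, 1_000_000):
    print(n, joint_prob(n, n, tau1, tau2))  # approaches exp(-1.5) ≈ 0.2231
```

The error decays at rate $O(1/n)$, which is quantified in Section 3.2.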

The following corollary gives conditions under which we can apply one of $\{u_{1,i}\}_{i=1}^{n_1}$ and $\{u_{2,i}\}_{i=1}^{n_2}$ to $M_n$ and derive a similar limit for $P(M_n\le u_n)$. The condition involves both the ratio of the two tail probabilities $(1-F_1(u_{1,n_1}))/(1-F_2(u_{2,n_2}))$ and the ratio $n_1/n_2$.

Corollary 3.1

Let $0\le\tau_1<\infty$, $0\le\tau_2<\infty$. Suppose that there exist two sequences $u_{1,n_1}$ and $u_{2,n_2}$ such that
(29) \( n_1(1-F_1(u_{1,n_1})) \to \tau_1, \qquad n_2(1-F_2(u_{2,n_2})) \to \tau_2. \)
Then
(30) \( P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \to e^{-\tau_1-\tau_2}. \)
Moreover, if $\dfrac{n_2(1-F_2(u_{1,n_1}))}{n_1(1-F_1(u_{1,n_1}))} \to t$, where $0\le t<\infty$, then
(31) \( P(M_n\le u_{1,n_1}) \to e^{-\tau_1(1+t)}. \)

Specifically, if we choose $u_{1,n_1}=x/a_{1,n_1}+b_{1,n_1}$ and $u_{2,n_2}=x/a_{2,n_2}+b_{2,n_2}$, and suppose that
(32) \( P(a_{1,n_1}(M_{1,n_1}-b_{1,n_1})\le x) \to G_1(x), \)
(33) \( P(a_{2,n_2}(M_{2,n_2}-b_{2,n_2})\le x) \to G_2(x), \)
then $G_1$ and $G_2$ belong to the GEV distribution family and the limit in (31) becomes $G_1(x)G_2(x)$.

The following is an example of mixed sequence and the limit properties of the maxima of subsequences and the global maxima.

Example 3.1

Suppose $\{X_i\}_{i=1}^n$ is a sequence of random variables combining two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$. Suppose $n_1/n\to p$, where $0\le p\le 1$, and that $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$ are i.i.d. from a Pareto distribution with $F_1(x)=1-Kx^{-\alpha_1}$, $\alpha_1>0$, $K>0$, $x\ge K^{1/\alpha_1}$, and a Fréchet distribution with $F_2(x)=\exp(-x^{-\alpha_2})$, $\alpha_2>0$, $x>0$, respectively.

Since $(1-F_1(tx))/(1-F_1(t)) = x^{-\alpha_1}$ for each $x>0$, the Type II (Fréchet) limit applies. For $u_{1,n_1}=(Kn_1/\tau)^{1/\alpha_1}$ we have $1-F_1(u_{1,n_1})=\tau/n_1$, so that
(34) \( P\!\left(M_{1,n_1}\le \left(\frac{Kn_1}{\tau}\right)^{1/\alpha_1}\right) \to e^{-\tau}. \)
Putting $\tau=x^{-\alpha_1}$ for $x\ge 0$,
(35) \( P((Kn_1)^{-1/\alpha_1} M_{1,n_1}\le x) \to \exp(-x^{-\alpha_1}). \)
On the other hand, $F_2^{n_2}(n_2^{1/\alpha_2}x)=F_2(x)$, i.e., $P(n_2^{-1/\alpha_2} M_{2,n_2}\le x)=F_2(x)$.

Then we have, for $x\ge 0$,
(36) \( P((Kn_1)^{-1/\alpha_1}M_{1,n_1}\le x,\; n_2^{-1/\alpha_2}M_{2,n_2}\le x) \to \exp(-x^{-\alpha_1}-x^{-\alpha_2}). \)
Since
\( \lim_{x\to\infty}\frac{\bar F_1(x)}{\bar F_2(x)} = \lim_{x\to\infty}\frac{Kx^{-\alpha_1}}{1-\exp(-x^{-\alpha_2})} = \lim_{x\to\infty}\frac{Kx^{-\alpha_1}}{x^{-\alpha_2}+O(x^{-2\alpha_2})} = \begin{cases} 0, & \alpha_1>\alpha_2,\\ K, & \alpha_1=\alpha_2,\\ \infty, & \alpha_1<\alpha_2. \end{cases} \)
When $\alpha_1=\alpha_2$, the condition $\dfrac{n_2(1-F_2(u_{1,n_1}))}{n_1(1-F_1(u_{1,n_1}))} \to \dfrac{1-p}{pK}$ in Corollary 3.1 is satisfied, hence
(37) \( P(M_n\le (Kn_1)^{1/\alpha_1}x) \to \exp\left\{-x^{-\alpha_1}\left(1+\frac{1-p}{pK}\right)\right\}. \)
Since $n_1/n\to p$, we also have
(38) \( P(M_n\le (Knp)^{1/\alpha_1}x) \to \exp\left\{-x^{-\alpha_1}\left(1+\frac{1-p}{pK}\right)\right\}. \)
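A Monte Carlo sketch of (36) (our own illustration, with the illustrative choices K = 1 and $\alpha_1=\alpha_2=1$): the normalized subsequence maxima jointly fall below x with probability close to $\exp(-2x^{-1})$:

```python
import numpy as np

rng = np.random.default_rng(0)
K, alpha = 1.0, 1.0          # alpha1 = alpha2 = alpha
n1 = n2 = 2000
reps = 10_000
x = 1.0

hits = 0
for _ in range(reps):
    # Pareto subsequence: F1(t) = 1 - K t^{-alpha}, sampled by inversion
    m1 = ((K / (1.0 - rng.random(n1))) ** (1.0 / alpha)).max()
    # Frechet subsequence: F2(t) = exp(-t^{-alpha}), sampled by inversion
    m2 = ((-np.log(rng.random(n2))) ** (-1.0 / alpha)).max()
    hits += (m1 <= (K * n1) ** (1.0 / alpha) * x) and (m2 <= n2 ** (1.0 / alpha) * x)

print(hits / reps, np.exp(-2.0 / x))  # empirical frequency vs the limit in (36)
```

The Fréchet factor is exact for every $n_2$ by max-stability; only the Pareto factor carries an $O(1/n_1)$ approximation error.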

3.2. Approximation error

The convergence results are usually accompanied by the question of the approximation error. Suppose $n_1(1-F_1(u_{1,n_1}))\to\tau_1$ and $n_2(1-F_2(u_{2,n_2}))\to\tau_2$. Writing $\tau_{1,n_1}=n_1(1-F_1(u_{1,n_1}))$ and $\tau_{2,n_2}=n_2(1-F_2(u_{2,n_2}))$, by Theorem 3.1 we have
(39) \( P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \to e^{-\tau_1-\tau_2}. \)
The approximation error can be decomposed into several parts, namely the differences
\( \left(1-\frac{\tau_{1,n_1}}{n_1}\right)^{n_1} - e^{-\tau_{1,n_1}}, \qquad \left(1-\frac{\tau_{2,n_2}}{n_2}\right)^{n_2} - e^{-\tau_{2,n_2}}, \)
and
\( e^{-\tau_{1,n_1}} - e^{-\tau_1}, \qquad e^{-\tau_{2,n_2}} - e^{-\tau_2}. \)
We denote
\( \Delta_{1,n_1} = \left(1-\frac{\tau_{1,n_1}}{n_1}\right)^{n_1} - e^{-\tau_{1,n_1}}, \qquad \Delta'_{1,n_1} = e^{-\tau_{1,n_1}} - e^{-\tau_1}, \)
\( \Delta_{2,n_2} = \left(1-\frac{\tau_{2,n_2}}{n_2}\right)^{n_2} - e^{-\tau_{2,n_2}}, \qquad \Delta'_{2,n_2} = e^{-\tau_{2,n_2}} - e^{-\tau_2}. \)
Then
\( P(M_{1,n_1}\le u_{1,n_1}) - e^{-\tau_1} = \Delta_{1,n_1} + \Delta'_{1,n_1}, \qquad P(M_{2,n_2}\le u_{2,n_2}) - e^{-\tau_2} = \Delta_{2,n_2} + \Delta'_{2,n_2}. \)
The following result gives the bound for the approximation error.

Theorem 3.2

Let $\{X_i\}_{i=1}^n$ be an independent sequence of random variables mixed from two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$, satisfying $n_1(1-F_1(u_{1,n_1}))\to\tau_1$ and $n_2(1-F_2(u_{2,n_2}))\to\tau_2$, and let $\Delta_{1,n_1}$, $\Delta'_{1,n_1}$, $\Delta_{2,n_2}$, $\Delta'_{2,n_2}$ be defined as above. Then
\( |P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) - e^{-\tau_1-\tau_2}| \le |\Delta_{1,n_1}| + |\Delta'_{1,n_1}| + |\Delta_{2,n_2}| + |\Delta'_{2,n_2}|, \)
with
\( 0 \le -\Delta_{j,n_j} \le \frac{\tau_{j,n_j}^2 e^{-\tau_{j,n_j}}}{2}\cdot\frac{1}{n_j-1} \le \frac{0.31}{n_j-1} \to 0, \quad\text{for } j=1,2, \)
where the first bound is asymptotically sharp, in the sense that if $\tau_{j,n_j}\to\tau_j$ then $-n_j\Delta_{j,n_j}\to\tau_j^2 e^{-\tau_j}/2$. Furthermore, for $0\le\tau_j-\tau_{j,n_j}\le\log 2$,
\( \Delta'_{j,n_j} = e^{-\tau_j}\{(\tau_j-\tau_{j,n_j}) + \theta_j(\tau_j-\tau_{j,n_j})^2\}, \)
with $0<\theta_j<1$.

If $\tau_{j,n_j}\to\tau_j$ for $u_{j,n_j}=x/a_{j,n_j}+b_{j,n_j}$, then (39) holds. By Lemma A.1, (39) also holds if $a_{j,n_j}$ and $b_{j,n_j}$ are replaced by different constants $\alpha_{j,n_j}$ and $\beta_{j,n_j}$ satisfying $\alpha_{j,n_j}/a_{j,n_j}\to 1$ and $(\beta_{j,n_j}-b_{j,n_j})/a_{j,n_j}\to 0$. However, the speed of convergence to zero of $\Delta_{j,n_j}$ (and thus the speed at which $P(M_{j,n_j}\le u_{j,n_j})$ approaches $e^{-\tau_j}$) can be very different for different choices of norming constants.
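The first bound of Theorem 3.2 can be checked numerically. In the following sketch (ours, with $\tau_{j,n_j}=\tau_j$ held fixed so that $\Delta'_{j,n_j}=0$), the error and its bound agree to leading order in $1/n$:

```python
import math

def delta(tau, n):
    """Delta_n = (1 - tau/n)^n - e^{-tau}: the error from replacing the
    exact non-exceedance probability by its Poisson-type limit."""
    return (1.0 - tau / n) ** n - math.exp(-tau)

tau = 1.0
for n in (10, 100, 1000, 10_000):
    err = -delta(tau, n)                              # positive by Lemma A.2
    bound = tau**2 * math.exp(-tau) / 2 / (n - 1)     # Theorem 3.2 bound
    print(n, err, bound, err <= bound)
```

As n grows, $n\cdot(-\Delta_n)$ approaches $\tau^2 e^{-\tau}/2$, illustrating the asymptotic sharpness claimed in the theorem.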

4. Weakly dependent sequences

In this section, we extend the results from independent sequences to weakly dependent sequences. A sequence of random variables $\{X_i\}_{i=1}^n$ with identical marginal distribution is stationary if $\{X_{j_1},\ldots,X_{j_n}\}$ and $\{X_{j_1+m},\ldots,X_{j_n+m}\}$ have the same joint distribution for any choice of n, $j_1,\ldots,j_n$, and m. For the mixed sequence, we will provide some alternatives so that the desired results still hold. We assume that the dependence between $X_{i,k}$ and $X_{i,j}$ falls off in some specified way as $|k-j|$ increases.

4.1. Review of some weakly dependent conditions

Some weakly dependent conditions in the literature can be generalised to the scenario of mixed sequences. For an m-dependent sequence $\{X_i\}_{i=1}^n$, $X_i$ and $X_j$ are independent whenever $|i-j|>m$. Another commonly used condition is the strong mixing condition, first introduced by Rosenblatt (Citation1956). A sequence of random variables $\{X_i\}_{i=1}^n$ is said to satisfy the strong mixing condition if, for any $A\in\mathcal F(X_1,\ldots,X_p)$ and $B\in\mathcal F(X_{p+k+1},X_{p+k+2},\ldots)$,
\( |P(A\cap B) - P(A)P(B)| < g(k) \)
for any p and k, where $g(k)\to 0$ as $k\to\infty$; $\mathcal F(\cdot)$ is the $\sigma$-field generated by the indicated random variables. The function g(k) does not depend on the sets A and B, so the strong mixing condition is uniform.

For normal sequences, the correlation between Xk and Xj may be a better measure of dependence. We can also use the dependence restriction |Corr(Xk,Xj)|g(|kj|), where g(k)0 as k.
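To illustrate why this kind of weak dependence leaves the limit unchanged (our own sketch; a stationary Gaussian AR(1) sequence has correlations $\rho^k\to 0$ geometrically, so Berman-type conditions are satisfied), the maxima of an AR(1) sequence and of an i.i.d. Gaussian sequence of the same length behave almost identically at high levels:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, rho, reps = 5000, 0.5, 1000
u = norm.ppf(1.0 - 1.0 / n)        # level u_n with n(1 - Phi(u_n)) = 1

# stationary Gaussian AR(1) paths with N(0, 1) margins, one time step at a time
z = rng.standard_normal((reps, n))
x = np.empty_like(z)
x[:, 0] = z[:, 0]
s = np.sqrt(1.0 - rho**2)
for i in range(1, n):
    x[:, i] = rho * x[:, i - 1] + s * z[:, i]

p_ar = np.mean(x.max(axis=1) <= u)   # dependent sequence, Monte Carlo
p_iid = (1.0 - 1.0 / n) ** n         # exact i.i.d. value, close to exp(-1)
print(p_ar, p_iid)
```

Both probabilities are close to $e^{-1}\approx 0.368$, as the classical theory for weakly dependent sequences predicts.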

Since the event $\{M_n\le u\}$ is the same as $\{X_1\le u, X_2\le u,\ldots,X_n\le u\}$, we may restrict attention to events of this type. Following Leadbetter et al. (Citation2012), we use $F_{i_1,\ldots,i_n}(u)$ to denote $P(X_{i_1}\le u, X_{i_2}\le u,\ldots,X_{i_n}\le u)$. The following condition D is a weakened form of strong mixing.

The condition D will be said to hold if for any integers $i_1<\cdots<i_p$ and $j_1<\cdots<j_{p'}$ for which $j_1-i_p\ge l$, and any real u,
(40) \( |F_{i_1,\ldots,i_p,j_1,\ldots,j_{p'}}(u) - F_{i_1,\ldots,i_p}(u)\,F_{j_1,\ldots,j_{p'}}(u)| \le g(l), \)
where $g(l)\to 0$ as $l\to\infty$.

Under the condition D, the Extremal Types Theorem still holds. Since we usually deal with the event $\{M_n\le u_n\}$ for some sequence of levels $\{u_n\}$, the condition can be weakened further. The condition $D(u_n)$ is defined as follows.

The condition $D(u_n)$ will be said to hold if for any integers
(41) \( 1\le i_1<\cdots<i_p<j_1<\cdots<j_{p'}\le n \)
for which $j_1-i_p\ge l$, we have
(42) \( |F_{i_1,\ldots,i_p,j_1,\ldots,j_{p'}}(u_n) - F_{i_1,\ldots,i_p}(u_n)\,F_{j_1,\ldots,j_{p'}}(u_n)| \le \alpha_{n,l}, \)
where $\alpha_{n,l_n}\to 0$ as $n\to\infty$ for some sequence $l_n=o(n)$.

The condition $D(u_n)$ guarantees that $\liminf_{n\to\infty} P(M_n\le u_n)\ge e^{-\tau}$. We need a further assumption to obtain the opposite inequality for the upper limit. Here we present the $D'(u_n)$ condition used in Watson (Citation1954) and Loynes (Citation1965). This condition bounds the probability of more than one exceedance among $X_1,\ldots,X_{[n/k]}$, so that there are no multiple points in the point process of exceedances.

The condition $D'(u_n)$ will be said to hold for the sequence of random variables $\{X_i\}_{i=1}^n$ if
(43) \( \limsup_{n\to\infty}\; n\sum_{j=2}^{[n/k]} P\{X_1>u_n,\, X_j>u_n\} \to 0 \)
as $k\to\infty$ (where [ ] denotes the integer part).

If both conditions $D(u_n)$ and $D'(u_n)$ are satisfied, then $P(M_n\le u_n)\to e^{-\tau}$ is equivalent to $n(1-F(u_n))\to\tau$ as $n\to\infty$ for $0\le\tau<\infty$.

4.2. Weakly dependent mixed sequences

To generalise the results from non-mixed sequences to mixed sequences, we need to modify the conditions $D(u_n)$ and $D'(u_n)$. We use $\mathbf u_n$ to denote the vector of levels $(u_{1,n_1}, u_{2,n_2})$ when the sequence $\{X_i\}_{i=1}^n$ is composed of two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$, $n_1+n_2=n$. We further assume that $n_1/n\to p$ as $n\to\infty$, $0\le p\le 1$, so that $n_2/n\to 1-p$.

Before introducing the more general $D(\mathbf u_n)$ condition, we introduce some new notation. Let $u_n(i) = u_{1,n_1} I(X_i\in\{X_{1,i}\}_{i=1}^{n_1}) + u_{2,n_2} I(X_i\in\{X_{2,i}\}_{i=1}^{n_2})$, where $I(A)=1$ if the event A is true and $I(A)=0$ otherwise. The notation $u_n(i)$ represents the threshold for $X_i$, which depends on the subsequence that $X_i$ belongs to. For example, if $X_1=X_{1,1}$ and $X_2=X_{2,1}$, then $P(X_1\le u_n(1), X_2\le u_n(2))$ represents $P(X_{1,1}\le u_{1,n_1}, X_{2,1}\le u_{2,n_2})$. With this notation, we can state the condition $D(\mathbf u_n)$ as follows.

The condition $D(\mathbf u_n)$ will be said to hold for the mixed sequence of random variables $\{X_i\}_{i=1}^n$ with two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$ if for any integers
(44) \( 1\le i_1<\cdots<i_p<j_1<\cdots<j_{p'}\le n \)
for which $j_1-i_p\ge l$, we have
\( |P(X_{i_1}\le u_n(i_1),\ldots,X_{i_p}\le u_n(i_p),\, X_{j_1}\le u_n(j_1),\ldots,X_{j_{p'}}\le u_n(j_{p'})) - P(X_{i_1}\le u_n(i_1),\ldots,X_{i_p}\le u_n(i_p))\,P(X_{j_1}\le u_n(j_1),\ldots,X_{j_{p'}}\le u_n(j_{p'}))| < \alpha_{n,l}, \)
where $\alpha_{n,l_n}\to 0$ as $n\to\infty$ for some sequence $l_n=o(n)$.

Similarly, we can extend the condition $D'(u_n)$ to mixed sequences; the extension is denoted $D'(\mathbf u_n)$.

The condition $D'(\mathbf u_n)$ will be said to hold for the mixed sequence of random variables $\{X_i\}_{i=1}^n$ and levels $\mathbf u_n=(u_{1,n_1},u_{2,n_2})$ if
(45) \( \limsup_{n\to\infty}\; k \sum_{1\le i<j\le [n/k]} P(X_i>u_n(i),\, X_j>u_n(j)) \to 0 \quad\text{as } k\to\infty, \)
where $u_n(i) = u_{1,n_1} I(X_i\in\{X_{1,i}\}_{i=1}^{n_1}) + u_{2,n_2} I(X_i\in\{X_{2,i}\}_{i=1}^{n_2})$, and [ ] denotes the integer part.

Equation (45) means that $\limsup_{n\to\infty}\sum_{1\le i<j\le[n/k]} P(X_i>u_n(i), X_j>u_n(j)) = o(1/k)$. It can be observed that if $D(\mathbf u_n)$ holds for the mixed sequence $\{X_i\}_{i=1}^n$, then $D(u_{j,n_j})$ also holds for the subsequence $\{X_{j,i}\}_{i=1}^{n_j}$, for j = 1, 2. The same conclusion is true for the condition $D'(\mathbf u_n)$.

Having introduced the conditions $D(\mathbf u_n)$ and $D'(\mathbf u_n)$, we can state the extended results for mixed sequences. We assume that the two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$ are independent of each other. Also, for any interval $I_n$ with $l_n$ members, there are $a_n$ members from $\{X_{1,i}\}_{i=1}^{n_1}$ and $b_n$ members from $\{X_{2,i}\}_{i=1}^{n_2}$. We assume that the proportions satisfy $a_n/l_n\to p$ and $b_n/l_n\to 1-p$, where $0\le p\le 1$.

Theorem 4.1

Let $\{X_i\}_{i=1}^n$ be a weakly dependent mixed sequence of random variables with two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$, with sample size proportions $n_1/n\to p$ and $n_2/n\to 1-p$ as $n\to\infty$, $0\le p\le 1$. Suppose that $D(\mathbf u_n)$ and $D'(\mathbf u_n)$ hold for $\{X_i\}_{i=1}^n$. Then for $0\le\tau<\infty$,
(46) \( P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \to e^{-\tau} \)
if and only if
(47) \( n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2})) \to \tau. \)

Based on Theorem 4.1, we have the following corollary.

Corollary 4.1

The same conclusions hold with $\tau=\infty$ (i.e., $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})\to 0$ if and only if $n_1(1-F_1(u_{1,n_1}))+n_2(1-F_2(u_{2,n_2}))\to\infty$) if the requirement that $D(\mathbf u_n)$ and $D'(\mathbf u_n)$ hold is replaced by the condition that, for arbitrarily large $\tau(<\infty)$, there exists a vector of levels $\mathbf v_n=(v_{1,n_1},v_{2,n_2})$ with $v_{1,n_1}>u_{1,n_1}$, $v_{2,n_2}>u_{2,n_2}$, satisfying $n_1(1-F_1(v_{1,n_1}))+n_2(1-F_2(v_{2,n_2}))\to\tau$, for which $D(\mathbf v_n)$ and $D'(\mathbf v_n)$ hold.

Theorem 4.1 characterises the joint probability $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})$ in terms of the tail properties of $F_1$ and $F_2$: the quantity $n_1(1-F_1(u_{1,n_1}))+n_2(1-F_2(u_{2,n_2}))$ is the total expected number of exceedances of the two thresholds by the corresponding subsequences. Theorem 4.1 generalises Theorem 3.1 under the condition that the mixed sequence is weakly dependent within each subsequence.

4.3. Associated independent sequences

The 'independent sequence associated with $\{X_i\}_{i=1}^n$', first introduced by Loynes (Citation1965), can be used to study the maxima of a dependent sequence. For a weakly dependent sequence of random variables $\{X_i\}_{i=1}^n$, the notation $\{\hat X_i\}_{i=1}^n$ denotes the independent sequence with the same marginal distribution as $\{X_i\}_{i=1}^n$, and we write $\hat M_n=\max(\hat X_1,\ldots,\hat X_n)$. When $\{X_i\}_{i=1}^n$ is mixed from two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$ with different marginal distributions, we still have the associated independent subsequences $\{\hat X_{1,i}\}_{i=1}^{n_1}$ and $\{\hat X_{2,i}\}_{i=1}^{n_2}$, and we write $\hat M_{i,n_i}=\max(\hat X_{i,1},\ldots,\hat X_{i,n_i})$, for i = 1, 2.

The following Theorem 4.2 tells us that, under the weak dependence conditions, $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})$ and $P(\hat M_{1,n_1}\le u_{1,n_1}, \hat M_{2,n_2}\le u_{2,n_2})$ have the same limit if it exists. By Theorem 4.3, we can choose the same norming constants as for the independent sequence to derive the same limit of $P(a_{1,n_1}(M_{1,n_1}-b_{1,n_1})\le x, a_{2,n_2}(M_{2,n_2}-b_{2,n_2})\le x)$ and $P(a_{1,n_1}(\hat M_{1,n_1}-b_{1,n_1})\le x, a_{2,n_2}(\hat M_{2,n_2}-b_{2,n_2})\le x)$.

Theorem 4.2

Let $\{X_i\}_{i=1}^n$ be a mixed sequence of random variables with two subsequences $\{X_{1,i}\}_{i=1}^{n_1}$ and $\{X_{2,i}\}_{i=1}^{n_2}$ that are independent of each other. Suppose $D(\mathbf u_n)$ and $D'(\mathbf u_n)$ hold for a vector of levels $\mathbf u_n=(u_{1,n_1},u_{2,n_2})$. Then $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})\to\theta>0$ if and only if $P(\hat M_{1,n_1}\le u_{1,n_1}, \hat M_{2,n_2}\le u_{2,n_2})\to\theta$. The same holds with $\theta=0$ if the conditions $D(\mathbf u_n)$ and $D'(\mathbf u_n)$ are replaced by the requirement that for arbitrarily large $\tau<\infty$ there exists $\mathbf v_n=(v_{1,n_1},v_{2,n_2})$ with $v_{1,n_1}>u_{1,n_1}$, $v_{2,n_2}>u_{2,n_2}$, satisfying $n_1(1-F_1(v_{1,n_1}))+n_2(1-F_2(v_{2,n_2}))\to\tau$, for which $D(\mathbf v_n)$ and $D'(\mathbf v_n)$ hold.

Theorem 4.3

Suppose that $D(\mathbf u_n)$ and $D'(\mathbf u_n)$ hold for the mixed sequence of random variables $\{X_i\}_{i=1}^n$, with $u_{1,n_1}=x/a_{1,n_1}+b_{1,n_1}$, $u_{2,n_2}=x/a_{2,n_2}+b_{2,n_2}$ for each real x. Then
(48) \( P(a_{1,n_1}(M_{1,n_1}-b_{1,n_1})\le x,\; a_{2,n_2}(M_{2,n_2}-b_{2,n_2})\le x) \to G(x) \)
if and only if
(49) \( P(a_{1,n_1}(\hat M_{1,n_1}-b_{1,n_1})\le x,\; a_{2,n_2}(\hat M_{2,n_2}-b_{2,n_2})\le x) \to G(x) \)
for some non-degenerate continuous distribution function G(x).

With the results in this section, weakly dependent sequences satisfying the conditions $D(\mathbf u_n)$ and $D'(\mathbf u_n)$ can be treated as independent sequences when studying the limit distribution of the maxima. In the next section, some numerical experiments and estimation results are presented.

5. Numerical experiments

5.1. Simulation

We study the accuracy of the accelerated max-stable distributions in estimating the high quantiles of simulated data, compared with the classical GEV distribution alone. To simulate the data, we first generate two sequences from two different GEV distributions with parameters $\xi_1,\mu_1,\sigma_1$ and $\xi_2,\mu_2,\sigma_2$, denoted $\{X_i\}_{i=1}^n$ and $\{Y_i\}_{i=1}^n$, with n = 2000. We pair them and take their maxima, $Z_i=\max(X_i,Y_i)$, then fit the accelerated max-stable distribution and the GEV distribution separately to the sequence $\{Z_i\}_{i=1}^n$ by the maximum likelihood method. Using each fitted distribution, we generate a new sequence of the same length and calculate the proportion of the new sequence that exceeds the 90th, 95th and 99th percentiles of the original sequence $\{Z_i\}_{i=1}^n$. The simulation scenarios cover all possible combinations of the three types of extreme value distributions. For each combination scenario, the process is repeated 100 times, and the standard deviations of the estimated proportions are shown in parentheses. The results are in Table 1.
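The data-generating step can be sketched as follows (our own illustration, not the authors' code; note that scipy's `genextreme` uses the shape convention c = −ξ, and for brevity we verify the pairwise maxima against the product cdf $G_1G_2$ rather than refitting by maximum likelihood):

```python
import numpy as np
from scipy.stats import genextreme, gumbel_r

rng = np.random.default_rng(2)
n = 200_000   # larger than the paper's n = 2000, to reduce Monte Carlo noise

# subsequence 1: GEV with xi = 0.5 (scipy shape c = -xi); subsequence 2: Gumbel
g1 = genextreme(c=-0.5, loc=0.0, scale=1.0)
g2 = gumbel_r(loc=0.0, scale=1.0)

x = g1.rvs(n, random_state=rng)
y = g2.rvs(n, random_state=rng)
z = np.maximum(x, y)          # Z_i = max(X_i, Y_i)

# by independence, the cdf of Z is the product of the two cdfs
q = 2.0
print(np.mean(z <= q), g1.cdf(q) * g2.cdf(q))
```

Fitting the accelerated max-stable distribution itself requires maximising the likelihood of the six-parameter product cdf, which we omit here.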

Table 1. The proportions of the simulated data based on the fitted accelerated max-stable distributions and GEV distributions that exceeds the 90th, 95th and 99th percentiles of the original data Zi.

From Table 1, for the 90th percentile, we can observe that the accelerated max-stable distributions perform better than the GEV alone, with the exceedance proportion closer to the theoretical value 0.1. The same is true for the 95th percentile. For both of these percentiles, the proportions are generally larger than the theoretical values 0.1 and 0.05, with the GEV distribution deviating more; both estimates thus overestimate the true exceedance rates. For the 99th percentile, the differences are not large overall. In a few cases (the 2nd and 3rd), the accelerated max-stable distribution outperforms the GEV distribution. Also, the proportions for the accelerated max-stable distributions are all larger than 0.01, while those for the GEV distributions are mostly smaller than 0.01. This suggests that the accelerated max-stable distribution may overestimate the 99th percentile, while the GEV distribution may underestimate it.

5.2. Real data

In this section, we apply both AMSD/AEVD and GEV fitting to stock data. The data contain the daily closing prices of 330 S&P 500 companies. Based on the closing prices, we calculate the daily negative log returns using the formula $r_i=-\log(p_i/p_{i-1})$, where $p_i$ is a company's closing price on day i. For each day i, we obtain the 330 negative log returns and take their maximum, denoted $m_i$. The time range is from 3 January 2000 to 30 December 2016, which contains 4277 trading days. The histogram of $\{m_i\}_{i=1}^{4277}$ is shown in Figure 10.
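This preprocessing step can be sketched as follows (our own illustration on a tiny synthetic price table; the ticker names and values are hypothetical):

```python
import numpy as np
import pandas as pd

# rows = trading days, columns = tickers (synthetic prices for illustration)
prices = pd.DataFrame(
    {"AAA": [100.0, 99.0, 101.0], "BBB": [50.0, 50.5, 49.0]},
    index=pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05"]),
)

neg_log_ret = -np.log(prices / prices.shift(1))  # r_i = -log(p_i / p_{i-1})
m = neg_log_ret.max(axis=1).dropna()             # daily maximum across tickers
print(m)
```

Each element of `m` is a daily maximum of maxima-type observation, i.e., the raw material to which the AMSD/AEVD and GEV distributions are fitted.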

Figure 10. The histogram of the daily maxima of negative log returns of 330 stocks in the S&P 500 companies list.


We find the 90th, 95th and 99th sample percentiles of $\{m_i\}_{i=1}^{4277}$, which are 0.1545, 0.2 and 0.3229, respectively. The daily maximal negative log returns exhibit some time dependence; however, for the purpose of demonstration, we treat them as independent and fit the AMSD/AEVD and the GEV distribution to $\{m_i\}_{i=1}^{4277}$. Based on the fitted distributions, we generate random samples of the same size and find the proportions of the samples that exceed the three percentiles. The proportions are shown in Table 2.

Table 2. The proportions of the simulated samples generated from the fitted distributions that exceed the 90th, 95th, and 99th sample percentiles of the maximal daily negative log returns.

Table 2 clearly reveals that the AMSD/AEVD performs better than the GEV alone. The modelling performance may be further improved if time series dependence is incorporated into the model fitting, e.g., via the AcF model proposed by Zhao et al. (Citation2018) and Mao and Zhang (Citation2018). We leave this task as a future project.

6. Conclusions

This paper extends the classical extreme value theory to maxima of maxima of time series with mixture patterns depending on the sample size. It has been shown that the classical extreme value distributions are special cases of the accelerated max-stable (extreme value) distributions (AMSDs/AEVDs). Some basic probabilistic properties are presented in the paper. These properties can serve as the probability foundation of recently proposed statistical models for extreme observations. The AMSDs may shed new light on extreme value studies and inference. Many existing results in the classical extreme value literature can be re-established in this much more general setting. Many real applications, e.g., risk analysis, portfolio management and systemic risk, can be reanalysed, and better results can be expected. Under the newly introduced framework, many new statistical models can be introduced and explored.

Acknowledgments

The authors thank Editor Jun Shao and two referees for their valuable comments. The work by Cao was partially supported by NSF-DMS-1505367 and Wisconsin Alumni Research Foundation #MSN215758. The work by Zhang was partially supported by NSF-DMS-1505367 and NSF-DMS-2012298.

Disclosure statement

No potential conflict of interest was reported by the author(s).


Notes on contributors

Wenzhi Cao

Wenzhi Cao is a PhD student in the Department of Statistics at the University of Wisconsin-Madison. He received his bachelor's degree in Mathematics from Nankai University. His research areas include extreme value theory and machine learning.

Zhengjun Zhang

Zhengjun Zhang is Professor of Statistics at the University of Wisconsin. His main research areas of expertise are financial time series and rare event modelling, virtual standard cryptocurrency, risk management, nonlinear dependence, asymmetric dependence, asymmetric and directed causal inference, and gene-gene relationships in rare diseases.

Appendix

A.1. Proofs of Theorems and Propositions

A.1.1. Proof of Theorem 2.1

For Equation (4),
\( P(a_{2,n_2}(M_n-b_{2,n_2})\le x) = P(\max(a_{2,n_2}(M_{1,n_1}-b_{2,n_2}),\, a_{2,n_2}(M_{2,n_2}-b_{2,n_2}))\le x) \)
\( = P(a_{2,n_2}(M_{1,n_1}-b_{2,n_2})\le x,\; a_{2,n_2}(M_{2,n_2}-b_{2,n_2})\le x) \)
\( = P\!\left(a_{1,n_1}(M_{1,n_1}-b_{1,n_1})\le \frac{a_{1,n_1}}{a_{2,n_2}}x + a_{1,n_1}(b_{2,n_2}-b_{1,n_1}),\; a_{2,n_2}(M_{2,n_2}-b_{2,n_2})\le x\right) \to H_1(ax+b)H_2(x). \)
For Equation (5),
\( P(a_{2,n_2}(M_n-b_{2,n_2})\le x) = P\!\left(a_{1,n_1}(M_{1,n_1}-b_{1,n_1})\le \frac{a_{1,n_1}}{a_{2,n_2}}x + a_{1,n_1}(b_{2,n_2}-b_{1,n_1}),\; a_{2,n_2}(M_{2,n_2}-b_{2,n_2})\le x\right) \to H_2(x). \)

A.1.2. Proof of Fact 2.2

The density of $\Phi_{\alpha_1}\Phi_{\alpha_2}$ is
(A1) \( f(x) = \begin{cases} e^{-x^{-\alpha_1}-x^{-\alpha_2}}(\alpha_1 x^{-\alpha_1-1} + \alpha_2 x^{-\alpha_2-1}), & x\ge 0,\\ 0, & x<0. \end{cases} \)
Thus
(A2) \( \mu_k = \int_0^\infty x^k f(x)\,dx = \int_0^\infty x^k e^{-x^{-\alpha_1}-x^{-\alpha_2}}(\alpha_1 x^{-\alpha_1-1} + \alpha_2 x^{-\alpha_2-1})\,dx. \)
Dividing the integral into two parts, we get
(A3) \( \mu_k = \int_0^1 x^k f(x)\,dx + \int_1^\infty x^k f(x)\,dx. \)
First, consider $\int_0^1 x^k f(x)\,dx$. Since $\lim_{x\to 0+} x^k f(x)=0$ and $x^k f(x)$ is continuous on (0,1], it is bounded on [0,1]. This implies that
(A4) \( \int_0^1 x^k f(x)\,dx < \infty. \)
Next, consider $\int_1^\infty x^k f(x)\,dx$. We have
\( \int_1^\infty x^k f(x)\,dx = \int_1^\infty e^{-x^{-\alpha_1}-x^{-\alpha_2}}(\alpha_1 x^{k-\alpha_1-1} + \alpha_2 x^{k-\alpha_2-1})\,dx. \)
Notice that
(A5) \( \lim_{x\to\infty} e^{-x^{-\alpha_1}-x^{-\alpha_2}} = 1. \)
Therefore $\int_1^\infty e^{-x^{-\alpha_1}-x^{-\alpha_2}}(\alpha_1 x^{k-\alpha_1-1} + \alpha_2 x^{k-\alpha_2-1})\,dx < \infty$ only if $k<\alpha_1$ and $k<\alpha_2$, i.e., $k<\min(\alpha_1,\alpha_2)$.

A.1.3. Proof of Fact 2.3

We need to consider $\lim_{x\to\infty} \dfrac{1-e^{-x^{-\alpha_1}-x^{-\alpha_2}}}{1-e^{-x^{-\alpha_1}}}$.

Since $x^{-\alpha_1}\to 0$ and $x^{-\alpha_2}\to 0$ as $x\to\infty$, we have the Taylor expansions
\( e^{-x^{-\alpha_1}} = 1 - x^{-\alpha_1} + o(x^{-\alpha_1}), \qquad e^{-x^{-\alpha_1}-x^{-\alpha_2}} = 1 - (x^{-\alpha_1}+x^{-\alpha_2}) + o(x^{-\alpha_1}+x^{-\alpha_2}). \)
Therefore
(A6) \( \lim_{x\to\infty} \frac{1-e^{-x^{-\alpha_1}-x^{-\alpha_2}}}{1-e^{-x^{-\alpha_1}}} = \lim_{x\to\infty} \frac{(x^{-\alpha_1}+x^{-\alpha_2}) + o(x^{-\alpha_1}+x^{-\alpha_2})}{x^{-\alpha_1} + o(x^{-\alpha_1})} = \lim_{x\to\infty} (1 + x^{\alpha_1-\alpha_2}) = 1. \)
This proves that $\Phi_{\alpha_1}\Phi_{\alpha_2}$ and $\Phi_{\alpha_1}$ are tail-equivalent.

A.1.4. Proof of Fact 2.4

The density of $\Lambda(x)\Phi_\alpha(x)$ is
\( f(x) = \begin{cases} e^{-e^{-x}-x^{-\alpha}}(e^{-x} + \alpha x^{-\alpha-1}), & x\ge 0,\\ 0, & x<0. \end{cases} \)
Thus
(A7) \( \mu_k = \int_0^\infty x^k f(x)\,dx = \int_0^\infty x^k e^{-e^{-x}-x^{-\alpha}}(e^{-x} + \alpha x^{-\alpha-1})\,dx. \)
Dividing the above integral into two parts, we get
(A8) \( \mu_k = \int_0^1 x^k f(x)\,dx + \int_1^\infty x^k f(x)\,dx. \)
For the first part, since $\lim_{x\to 0+} x^k f(x)=0$ and $x^k f(x)$ is continuous on (0,1], it is bounded on [0,1]. Thus $\int_0^1 x^k f(x)\,dx < \infty$.

For the second part,
(A9) \( \int_1^\infty x^k f(x)\,dx = \int_1^\infty e^{-e^{-x}-x^{-\alpha}}(e^{-x}x^k + \alpha x^{k-\alpha-1})\,dx. \)
Since
(A10) \( \lim_{x\to\infty} e^{-e^{-x}-x^{-\alpha}} = 1, \qquad \int_1^\infty e^{-x}x^k\,dx < \infty \text{ for all } k, \)
we have $\int_1^\infty e^{-e^{-x}-x^{-\alpha}}(e^{-x}x^k + \alpha x^{k-\alpha-1})\,dx < \infty$ if and only if $k<\alpha$.

A.1.5. Proof of Fact 2.5

We need to consider $\lim_{x\to\infty} \dfrac{1-e^{-e^{-x}-x^{-\alpha}}}{1-e^{-x^{-\alpha}}}$.

Since $e^{-x}\to 0$ and $x^{-\alpha}\to 0$ as $x\to\infty$, we have the Taylor expansions
\( e^{-e^{-x}-x^{-\alpha}} = 1 - (e^{-x}+x^{-\alpha}) + o(e^{-x}+x^{-\alpha}), \qquad e^{-x^{-\alpha}} = 1 - x^{-\alpha} + o(x^{-\alpha}). \)
Thus
(A11) \( \lim_{x\to\infty} \frac{1-e^{-e^{-x}-x^{-\alpha}}}{1-e^{-x^{-\alpha}}} = \lim_{x\to\infty} \frac{e^{-x}+x^{-\alpha}+o(e^{-x}+x^{-\alpha})}{x^{-\alpha}+o(x^{-\alpha})} = 1. \)
This implies that $\Lambda(x)\Phi_\alpha(x)$ and $\Phi_\alpha(x)$ are tail-equivalent.

A.1.6. Proof of Theorem 3.1

If (25) holds, we must have $1-F_1(u_{1,n_1})\to 0$ and $1-F_2(u_{2,n_2})\to 0$. Then
(A12) \( n_1\log(1-(1-F_1(u_{1,n_1}))) + n_2\log(1-(1-F_2(u_{2,n_2}))) = -n_1(1-F_1(u_{1,n_1}))(1+o(1)) - n_2(1-F_2(u_{2,n_2}))(1+o(1)) \to -\tau, \)
which is equivalent to
\( P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) = (1-(1-F_1(u_{1,n_1})))^{n_1}(1-(1-F_2(u_{2,n_2})))^{n_2} = \exp\{n_1\log(1-(1-F_1(u_{1,n_1}))) + n_2\log(1-(1-F_2(u_{2,n_2})))\} \to e^{-\tau}. \)
Conversely, if (26) holds, which is equivalent to
(A13) \( n_1\log(1-(1-F_1(u_{1,n_1}))) + n_2\log(1-(1-F_2(u_{2,n_2}))) \to -\tau, \)
we must have $1-F_1(u_{1,n_1})\to 0$ and $1-F_2(u_{2,n_2})\to 0$. Otherwise, suppose $1-F_1(u_{1,n_1})\not\to 0$; then there is a sequence of indices $m_1,m_2,\ldots$ and $\epsilon>0$ such that $1-F_1(u_{1,m_k})>\epsilon$ for all k. This means that
\( n_1\log(1-(1-F_1(u_{1,m_k}))) + n_2\log(1-(1-F_2(u_{2,m_k}))) < n_1\log(1-(1-F_1(u_{1,m_k}))) < n_1\log(1-\epsilon) \to -\infty, \)
which contradicts (A13). We therefore have
(A14) \( n_1[(1-F_1(u_{1,n_1})) + o(1-F_1(u_{1,n_1}))] + n_2[(1-F_2(u_{2,n_2})) + o(1-F_2(u_{2,n_2}))] \to \tau, \)
and Equation (25) holds.

A.1.7. Proof of Corollary 3.1

Since
(A15) \( n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2})) \to \tau_1+\tau_2, \)
(30) is a direct result of Theorem 3.1.

If $\dfrac{n_2(1-F_2(u_{1,n_1}))}{n_1(1-F_1(u_{1,n_1}))} \to t$, then $n_2(1-F_2(u_{1,n_1})) \to t\tau_1$, where $0\le t\tau_1<\infty$. Therefore,
\( P(M_n\le u_{1,n_1}) = P(M_{1,n_1}\le u_{1,n_1})\,P(M_{2,n_2}\le u_{1,n_1}) \to e^{-\tau_1(1+t)}. \)

A.1.8. Proof of Theorem 3.2

Since $P(M_{i,n_i}\le u_{i,n_i}) = (1-\tau_{i,n_i}/n_i)^{n_i}$ and $0\le\tau_{i,n_i}=n_i(1-F_i(u_{i,n_i}))\le n_i$, the result follows from Lemma A.2.

A.1.9. Proof of Theorem 4.1

For fixed k, write $n'=[n/k]$, and suppose that there are $n_1'$ members from $F_1$ and $n_2'$ members from $F_2$ among $\{X_1,\ldots,X_{n'}\}$, $n'=n_1'+n_2'$. If (47) holds, by assumption we have $n_1'\approx pn'\approx n_1/k$ and $n_2'\approx(1-p)n'\approx n_2/k$, thus
(A16) \( n_1'(1-F_1(u_{1,n_1})) + n_2'(1-F_2(u_{2,n_2})) \to \frac{\tau}{k}. \)
Since
\( P(M_{1,n_1'}\le u_{1,n_1},\, M_{2,n_2'}\le u_{2,n_2}) = 1 - P(\{M_{1,n_1'}>u_{1,n_1}\}\cup\{M_{2,n_2'}>u_{2,n_2}\}) = 1 - P\left(\bigcup_{i=1}^{n_1'}\{X_{1,i}>u_{1,n_1}\}\cup\bigcup_{j=1}^{n_2'}\{X_{2,j}>u_{2,n_2}\}\right), \)
we have
(A17) \( 1 - n_1'(1-F_1(u_{1,n_1})) - n_2'(1-F_2(u_{2,n_2})) \le P(M_{1,n_1'}\le u_{1,n_1},\, M_{2,n_2'}\le u_{2,n_2}) \le 1 - n_1'(1-F_1(u_{1,n_1})) - n_2'(1-F_2(u_{2,n_2})) + S_{n'}, \)
where $S_{n'} = \sum_{1\le i<j\le n'} P(X_i>u_n(i),\, X_j>u_n(j))$.

Condition $D'(\mathbf u_n)$ implies that $\limsup_{n\to\infty} S_{n'} = o(1/k)$ as $k\to\infty$. By (A16) and (A17), we have
\( 1-\frac{\tau}{k} \le \liminf_{n\to\infty} P(M_{1,n_1'}\le u_{1,n_1},\, M_{2,n_2'}\le u_{2,n_2}) \le \limsup_{n\to\infty} P(M_{1,n_1'}\le u_{1,n_1},\, M_{2,n_2'}\le u_{2,n_2}) \le 1-\frac{\tau}{k}+o\!\left(\frac{1}{k}\right). \)
Since $D(\mathbf u_n)$ implies $D(u_{1,n_1})$ and $D(u_{2,n_2})$, Lemma A.3 holds for each subsequence. We have
\( \left(1-\frac{\tau}{k}\right)^k \le \liminf_{n\to\infty} P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \le \limsup_{n\to\infty} P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \le \left(1-\frac{\tau}{k}+o\!\left(\frac{1}{k}\right)\right)^k. \)
Letting $k\to\infty$, we have $\lim_{n\to\infty} P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) = e^{-\tau}$.

Conversely, if (46) holds,
(A18) \( 1 - P(M_{1,n_1'}\le u_{1,n_1},\, M_{2,n_2'}\le u_{2,n_2}) \le n_1'(1-F_1(u_{1,n_1})) + n_2'(1-F_2(u_{2,n_2})) \le 1 - P(M_{1,n_1'}\le u_{1,n_1},\, M_{2,n_2'}\le u_{2,n_2}) + S_{n'}. \)
Since $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})\to e^{-\tau}$, we have $P(M_{1,n_1'}\le u_{1,n_1}, M_{2,n_2'}\le u_{2,n_2})\to e^{-\tau/k}$. Letting $n\to\infty$ in (A18),
\( 1-e^{-\tau/k} \le \frac{1}{k}\liminf_{n\to\infty}\big[n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2}))\big] \le \frac{1}{k}\limsup_{n\to\infty}\big[n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2}))\big] \le 1-e^{-\tau/k}+o\!\left(\frac{1}{k}\right), \)
from which (multiplying all sides by k and letting $k\to\infty$) we obtain $n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2})) \to \tau$.

A.1.10. Proof of Corollary 4.1

Suppose $n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2})) \to \infty$. By $u_{1,n_1}<v_{1,n_1}$ and $u_{2,n_2}<v_{2,n_2}$, we have
\( P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \le P(M_{1,n_1}\le v_{1,n_1},\, M_{2,n_2}\le v_{2,n_2}). \)
By Theorem 4.1, $P(M_{1,n_1}\le v_{1,n_1}, M_{2,n_2}\le v_{2,n_2}) \to e^{-\tau}$. Then
\( \limsup_{n\to\infty} P(M_{1,n_1}\le u_{1,n_1},\, M_{2,n_2}\le u_{2,n_2}) \le e^{-\tau}. \)
Letting $\tau\to\infty$, we have $\lim_{n\to\infty} P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2}) = 0$. Conversely, we still have
\( n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2})) \ge n_1(1-F_1(v_{1,n_1})) + n_2(1-F_2(v_{2,n_2})) \to \tau. \)
Since the above inequality holds for arbitrarily large $\tau>0$, we must have $n_1(1-F_1(u_{1,n_1})) + n_2(1-F_2(u_{2,n_2})) \to \infty$.

A.1.11. Proof of Theorem 4.2

For $\theta>0$, the condition $P(\hat M_{1,n_1}\le u_{1,n_1}, \hat M_{2,n_2}\le u_{2,n_2})\to\theta$ may be rewritten as $P(\hat M_{1,n_1}\le u_{1,n_1}, \hat M_{2,n_2}\le u_{2,n_2})\to e^{-\tau}$ with $\tau=-\log\theta$; this holds if and only if $n_1(1-F_1(u_{1,n_1}))+n_2(1-F_2(u_{2,n_2}))\to\tau$. The same is true for $P(M_{1,n_1}\le u_{1,n_1}, M_{2,n_2}\le u_{2,n_2})$ by conditions $D(\mathbf u_n)$ and $D'(\mathbf u_n)$. When $\theta=0$, the result follows from Corollary 4.1.

A.1.12. Proof of Theorem 4.3

If G(x)>0, the equivalence follows from Theorem 4.2, with θ=G(x).

If $G(x)=0$, the continuity of G shows that, for any $0<\tau<\infty$, there exists $x_0$ such that $G(x_0)=e^{-\tau}$. $D(\mathbf v_n)$ and $D'(\mathbf v_n)$ hold for $v_{1,n_1}=x_0/a_{1,n_1}+b_{1,n_1}$, $v_{2,n_2}=x_0/a_{2,n_2}+b_{2,n_2}$, and $P(M_{1,n_1}\le v_{1,n_1}, M_{2,n_2}\le v_{2,n_2})\to e^{-\tau}$ or $P(\hat M_{1,n_1}\le v_{1,n_1}, \hat M_{2,n_2}\le v_{2,n_2})\to e^{-\tau}$ depending on the assumption made, so that $n_1(1-F_1(v_{1,n_1}))+n_2(1-F_2(v_{2,n_2}))\to\tau$. If (49) holds, then we have $n_1(1-F_1(u_{1,n_1}))+n_2(1-F_2(u_{2,n_2}))\to\infty$, and thus $u_{1,n_1}<v_{1,n_1}$ and $u_{2,n_2}<v_{2,n_2}$ eventually (since one of the inequalities must hold and it also implies the other). By Theorem 4.2, (48) holds. The converse direction can be proved similarly.

A.2. Lemmas

Lemma A.1

Khintchine, Theorem 1.2.3 in Leadbetter et al. (Citation2012)

Let $\{F_n\}$ be a sequence of cdfs and H a nondegenerate cdf. Let $a_n>0$ and $b_n$ be constants such that
(A19) \( F_n(a_n x + b_n) \xrightarrow{w} H(x). \)
Then for some nondegenerate cdf $H_*$ and constants $\alpha_n>0$, $\beta_n$,
(A20) \( F_n(\alpha_n x + \beta_n) \xrightarrow{w} H_*(x) \)
if and only if
(A21) \( a_n^{-1}\alpha_n \to a \quad\text{and}\quad a_n^{-1}(\beta_n-b_n) \to b \)
for some $a>0$ and b, and then
(A22) \( H_*(x) = H(ax+b). \)

Lemma A.2

Lemma 2.4.1 in Leadbetter et al. (Citation2012)

  1. If $0\le x\le n$ then
(A23) \( 0 \le e^{-x} - \left(1-\frac{x}{n}\right)^n \le \frac{x^2 e^{-x}}{2}\cdot\frac{1}{n-1} \le \frac{2e^{-2}}{n-1} \le \frac{0.31}{n-1} \quad\text{for } n=2,3,\ldots, \)
and further
(A24) \( e^{-x} - \left(1-\frac{x}{n}\right)^n = \frac{x^2 e^{-x}}{2}\cdot\frac{1}{n} + O\!\left(\frac{1}{n^2}\right) \quad\text{as } n\to\infty, \)
uniformly for x in bounded intervals.

  2. If $0\le x-y\le\log 2$ then
(A25) \( e^{-y} - e^{-x} = e^{-x}\{(x-y) + \theta(x-y)^2\}, \)
with $0<\theta<1$.

Lemma A.3

Lemma 3.3.2 in Leadbetter et al. (Citation2012)

If $D(u_n)$ holds, then for a fixed integer k we have
(A26) \( P(M_n\le u_n) - P^k(M_{[n/k]}\le u_n) \to 0 \quad\text{as } n\to\infty. \)
