
Robust autoregressive modeling and its diagnostic analytics with a COVID-19 related application

Pages 1318-1343 | Received 03 Oct 2022, Accepted 28 Mar 2023, Published online: 19 Apr 2023

Abstract

Autoregressive models in time series are useful in various areas. In this article, we propose a skew-t autoregressive model. We estimate its parameters using the expectation-maximization (EM) method and develop an influence methodology based on local perturbations for its validation. We obtain the normal curvatures for four perturbation strategies to identify influential observations and assess their performance through Monte Carlo simulations. An example of financial data analysis is presented to study daily log-returns for Brent crude futures and to investigate the possible impact of the COVID-19 pandemic.

1. Introduction

Autoregressive (AR) modeling is an essential technique in time series data analysis and is widely applied in biology, economics, finance, health and other areas. Several AR models and their statistical inference have been well established [Citation29,Citation47,Citation48]. Furthermore, influence diagnostics for statistical modeling is equally important nowadays [Citation13,Citation22,Citation28,Citation35].

The local influence technique [Citation9] examines how a minor perturbation affects the model fitting and is a powerful tool for statistical diagnostics when identifying potentially influential observations. Influence diagnostics has been conducted in regression models [Citation45,Citation46] and time-series analysis [Citation31,Citation32]. Among others, [Citation7,Citation13,Citation20,Citation27,Citation40,Citation49] investigated the sensitivity of estimates of regression parameters with AR disturbances or similar assumptions by employing influence diagnostics. A number of authors, such as [Citation17,Citation24,Citation25,Citation30,Citation33,Citation36,Citation51,Citation52], studied estimation and its validation with diagnostic methods for time series models and related structures.

Note that the standard assumption in many settings for regression and time-series analysis is that all errors are mutually independent and follow normal or Student-t (simply t from now on) distributions. For example, [Citation26,Citation34] investigated maximum likelihood (ML) inference and its stability by influence diagnostic methods for a vector AR model under normal and t distributions. However, certain economic, financial and other data are known to exhibit errors following skewed distributions. To study such characteristics, distributions proposed by [Citation2–4], related to the skew-normal (SN) and skew-t (ST) models, have been well received by researchers in recent years. They are appealing alternatives to the Student-t and Gaussian models and can therefore be adopted. As a result, they are becoming increasingly popular; see, for example, [Citation4,Citation5,Citation21]. Moreover, [Citation6–8,Citation13,Citation14,Citation50] studied SN partially linear and nonlinear regression models and/or score test statistics. Robust mixture structures under an ST model have been analyzed by [Citation15,Citation23].

For financial applications, [Citation47,Citation48] discussed ST distributions for their generalized autoregressive conditional heteroscedastic (GARCH) models, and [Citation12] advocated and compared SN and ST distributions. Liu et al. [Citation31] focused on diagnostic analytics for an AR model with SN errors (SNAR model), while [Citation32] studied estimation and other statistical aspects of an SNAR model and, in particular, presented a real-world application to financial data affected by the COVID-19 pandemic. However, we are not aware of any studies that have reported results on influence diagnostic measures in an AR model with ST errors (STAR model).

In the present article, we conduct inference and validation of the STAR model with a financial data analysis related to the COVID-19 pandemic. Our main contributions are threefold:

  1. we propose the STAR model with a systematic methodology for estimation and diagnostics. We rely on likelihood methods to fit the STAR model using the EM algorithm and conduct its influence diagnostics based on four perturbation schemes, including a new one based on skewness. The STAR model complements the ST innovation-based GARCH models discussed in [Citation47,Citation48] and the AR model under the SN distribution studied in [Citation31,Citation32].

  2. we establish our mathematical results for the STAR model using standard matrix differential calculus and examine their statistical implementation using simulations. We compare the ST with the normal, t and skew-normal distributions, especially for the diagnostics. Our findings demonstrate that the ST model is preferable to the alternatives previously considered in [Citation26,Citation31,Citation32,Citation34].

  3. we conduct an empirical study of real-world financial data to illustrate both the STAR model and our methodology to be effective in practical applications and data analytics.

We proceed as follows. In Section 2, we explain the STAR model and develop an efficient algorithm for calculating the ML estimates, whereas in Section 3 we present our curvature diagnostics under the four perturbation schemes using the local influence technique. Section 4 reports our simulation studies, which examine the influence diagnostics and compare the ST model with the normal, t, and SN distributions. Section 5 provides an empirical example involving an STAR model to show potential applications of the results. Our concluding comments are provided in Section 6. Finally, the matrix results for the curvature diagnostics are derived in the appendix.

2. Formulation and estimation

In this section, we propose our STAR model, obtain its parameters' ML estimates, and establish the corresponding Hessian matrix.

2.1. STAR(p) model

We assume an STAR(p) time series model formulated as $y_t = u_t + \beta_1 y_{t-1} + \cdots + \beta_p y_{t-p}$, with $y_t$ being the response observed at time $t$, for $t\in\{1,\dots,R\}$, and $p$ previous values denoted by $y_1,\dots,y_p$; $\beta_i$ is the $i$-th regression coefficient, for $i\in\{1,\dots,p\}$; and $u_t$ is the $t$-th innovation following an ST distribution denoted as $u_t\sim\mathrm{ST}(0,\sigma^2,\lambda,\varsigma)$, with $\sigma^2>0$ being the scale, $\lambda$ the skewness, and $\varsigma$ the degrees of freedom. Conveniently, the model is rewritten as
(1) $y_t = x_t^\top\beta + u_t,$
where $x_t=(y_{t-1},\dots,y_{t-p})^\top$ and $\beta=(\beta_1,\dots,\beta_p)^\top$ are $p\times1$ vectors. The parameters are collected in a $(p+3)\times1$ vector $\Theta=(\beta^\top,\sigma^2,\lambda,\varsigma)^\top$.
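For illustration only (our computations are carried out in Matlab, as noted in Section 4), a minimal Python sketch of the recursion defining the STAR(p) model is given below; the function and argument names are assumptions for exposition, and the generation of the ST innovations themselves is sketched after Lemma 2.

```python
import numpy as np

def simulate_star_path(beta, innovations, y_init):
    """Build y_t = beta_1 y_{t-1} + ... + beta_p y_{t-p} + u_t recursively.

    `innovations` holds the u_t draws (their skew-t generation is sketched
    after Lemma 2); `y_init` holds the p starting values y_1, ..., y_p.
    """
    p = len(beta)
    y = list(y_init)
    for u_t in innovations:
        x_t = y[-1:-p - 1:-1]                      # (y_{t-1}, ..., y_{t-p})
        y.append(float(np.dot(beta, x_t)) + u_t)
    return np.array(y)
```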

Lemma 1

[Citation47]

Let $\beta(z)=1-\beta_1z-\cdots-\beta_pz^p$ be the characteristic polynomial of the AR structure and suppose that all of its roots have modulus greater than one. Then, the associated AR process is stationary.

Lemma 1 is used in the numerical analysis of Sections 4 and 5, and it offers a necessary and sufficient condition for testing whether our data are stationary. Note that, if $Y$ follows an ST model with $\mu$, $\sigma^2$, $\lambda$, and $\varsigma$ being the location, scale, skewness, and degrees-of-freedom parameters, we denote it as $Y\sim\mathrm{ST}(\mu,\sigma^2,\lambda,\varsigma)$. Thus, if $Y\sim\mathrm{ST}(\mu,\sigma^2,\lambda,\varsigma)$, its probability density function (PDF) is established by
(2) $f_Y(y;\varsigma)=\dfrac{2}{\sigma}\,t_\varsigma(\eta)\,T_{\varsigma+1}\!\Big(\lambda\eta\Big(\dfrac{\varsigma+1}{\eta^2+\varsigma}\Big)^{1/2}\Big),\quad y\in\mathbb{R},\quad \eta=\dfrac{y-\mu}{\sigma},$
with $T_\varsigma$ and $t_\varsigma$ being the cumulative distribution function (CDF) and PDF of the Student-t model with $\varsigma$ degrees of freedom. If $\lambda=0$, the PDF of $Y$ stated in (2) reduces to the t PDF; if $\varsigma\to\infty$, it becomes the SN PDF; and if $\lambda=0$ and $\varsigma\to\infty$, it is the normal PDF. Observe that the mean and variance of $Y$ are given by
$E(Y)=\mu+\sigma\delta_\lambda\Big(\dfrac{\varsigma}{\pi}\Big)^{1/2}\dfrac{\Gamma((\varsigma-1)/2)}{\Gamma(\varsigma/2)},\qquad \mathrm{Var}(Y)=\sigma^2\Bigg(\dfrac{\varsigma}{\varsigma-2}-\dfrac{\varsigma\delta_\lambda^2}{\pi}\Big(\dfrac{\Gamma((\varsigma-1)/2)}{\Gamma(\varsigma/2)}\Big)^2\Bigg),$
with $\delta_\lambda=\lambda/(1+\lambda^2)^{1/2}$.
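Both the stationarity condition of Lemma 1 and the PDF in (2) are straightforward to evaluate numerically. The following illustrative Python sketch (not part of our Matlab implementation; the function names are assumptions) checks that all roots of $\beta(z)$ lie outside the unit circle and evaluates the ST density via the Student-t PDF and CDF.

```python
import numpy as np
from scipy import stats

def is_stationary(beta):
    """Lemma 1: all roots of beta(z) = 1 - beta_1 z - ... - beta_p z^p
    must have modulus greater than one."""
    coeffs = np.r_[[-b for b in beta[::-1]], 1.0]   # highest degree first
    return bool(np.all(np.abs(np.roots(coeffs)) > 1.0))

def st_pdf(y, mu, sigma2, lam, nu):
    """Skew-t PDF in (2): (2/sigma) t_nu(eta) T_{nu+1}(lam*eta*((nu+1)/(eta^2+nu))^(1/2))."""
    sigma = np.sqrt(sigma2)
    eta = (y - mu) / sigma
    return (2.0 / sigma) * stats.t.pdf(eta, df=nu) * stats.t.cdf(
        lam * eta * np.sqrt((nu + 1.0) / (eta**2 + nu)), df=nu + 1)

print(is_stationary([0.15, 0.2]))   # AR(2) coefficients used in Section 4.2: True
```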

Lemma 2

[Citation23]

Let $Y\sim\mathrm{ST}(\mu,\sigma^2,\lambda,\varsigma)$. Then, we get
$Y\mid\gamma,\tau\sim\mathrm{N}\big(\mu+\delta_\lambda\gamma,\ (1-\delta_\lambda^2)\sigma^2/\tau\big),\qquad \gamma\mid\tau\sim\mathrm{TN}\big(0,\ \sigma^2/\tau;\ (0,\infty)\big),\qquad \tau\sim\mathrm{Gamma}\Big(\dfrac{\varsigma}{2},\dfrac{\varsigma}{2}\Big),$
where $\mathrm{TN}(\mu,\sigma^2;(a,b))$ represents the $\mathrm{N}(\mu,\sigma^2)$ distribution truncated to the interval $(a,b)$.
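Lemma 2 also provides a direct way to draw ST innovations, which is convenient for simulation studies such as those in Section 4. The sketch below (illustrative Python, with assumed function names; not the authors' Matlab code) samples $\tau$, then the truncated normal $\gamma$, and then $u$, following the hierarchy above.

```python
import numpy as np

def rvs_skew_t(n, sigma2, lam, nu, rng=None):
    """Draw u ~ ST(0, sigma^2, lambda, nu) via the hierarchy of Lemma 2:
    tau ~ Gamma(nu/2, rate nu/2), gamma | tau ~ TN(0, sigma^2/tau; (0, inf)),
    u | gamma, tau ~ N(delta*gamma, (1 - delta^2) sigma^2 / tau)."""
    rng = np.random.default_rng() if rng is None else rng
    delta = lam / np.sqrt(1.0 + lam**2)
    tau = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)    # rate nu/2
    gamma = np.abs(rng.normal(0.0, np.sqrt(sigma2 / tau)))     # half-normal = TN(0,.;(0,inf))
    return rng.normal(delta * gamma, np.sqrt((1.0 - delta**2) * sigma2 / tau))
```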

2.2. EM based ML estimation

In practice, directly maximizing the logarithmic likelihood function by the ML method to find the estimate of $\Theta$ is not an easy task. Instead, we implement the ML method based on the EM algorithm for incomplete data proposed by [Citation11]. Here, $y_{\mathrm{complete}}=(y_{\mathrm{observed}},y_{\mathrm{missing}})$ denotes the set of complete data, where $y_{\mathrm{missing}}$ stands for the set of missing data and $y_{\mathrm{observed}}$ for the set of observed data. Given a starting estimate $\widehat\Theta^{(0)}$, which can be taken from the fit under the normal distribution, we obtain $\widehat\Theta^{(r)}$, for $r\in\{1,2,\dots\}$, by iterating between the E and M steps until reaching convergence, as in [Citation13,Citation32]. Convergence is declared when $\|\widehat\Theta^{(r+1)}-\widehat\Theta^{(r)}\|<\epsilon=10^{-5}$, the same tolerance as used in the simulation study.

According to Lemma 2, the model defined as $y_t=\beta_1y_{t-1}+\cdots+\beta_py_{t-p}+u_t$ can be represented hierarchically as $u_t\mid\gamma_t,\tau_t\sim\mathrm{N}(\delta_\lambda\gamma_t,(1-\delta_\lambda^2)\sigma^2/\tau_t)$, $\gamma_t\mid\tau_t\sim\mathrm{TN}(0,\sigma^2/\tau_t;(0,\infty))$, and $\tau_t\sim\mathrm{Gamma}(\varsigma/2,\varsigma/2)$. The observed data are $y_{\mathrm{observed}}=\{u_{p+1},\dots,u_R\}$ and the missing (unobserved) data are $y_{\mathrm{missing}}=\{c_{p+1},\dots,c_R\}$, with $c_i=(\gamma_i,\tau_i)$ and $y_{\mathrm{complete}}=(y_{\mathrm{observed}},y_{\mathrm{missing}})$ being the complete observations. In such conditions, the logarithmic likelihood function of the complete data for $\Theta=(\beta^\top,\sigma^2,\delta_\lambda,\varsigma)^\top$ is given by
$$\ell_{\mathrm{complete}}(\Theta;Y_{\mathrm{complete}})=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\tau_i-\frac{u_i^2\tau_i}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda u_i\gamma_i\tau_i}{(1-\delta_\lambda^2)\sigma^2}-\frac{\tau_i\gamma_i^2}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\ln(\tau_i)\Big),$$
where $\delta_\lambda=\lambda/(1+\lambda^2)^{1/2}$. For the E step, we obtain, given $\widehat\Theta^{(r)}$, the Q-function defined as
(3) $$Q_\Theta=E\big(\ell_{\mathrm{complete}}(\Theta;Y_{\mathrm{complete}})\mid y_{\mathrm{observed}},\widehat\Theta^{(r)}\big)=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}^{(r)}_{1i}-\frac{u_i^2\hat{s}^{(r)}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda u_i\hat{s}^{(r)}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\hat{s}^{(r)}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}^{(r)}_{4i}\Big),$$
with
$$\hat{s}^{(r)}_{1i}=E\big(\tau_i\mid y_{\mathrm{observed}},\widehat\Theta^{(r)}\big)=\Big(\frac{\hat\varsigma^{(r)}+1}{\hat\varsigma^{(r)}+\hat\eta_i^{2(r)}}\Big)\frac{R_{\hat\varsigma^{(r)}+3}\big(\hat{T}_i^{(r)}\big(\tfrac{\hat\varsigma^{(r)}+3}{\hat\varsigma^{(r)}+1}\big)^{1/2}\big)}{R_{\hat\varsigma^{(r)}+1}\big(\hat{T}_i^{(r)}\big)},$$
$$\hat{s}^{(r)}_{2i}=E\big(\gamma_i\tau_i\mid y_{\mathrm{observed}},\widehat\Theta^{(r)}\big)=\hat\delta_\lambda^{(r)}\hat{u}_i^{(r)}\hat{s}^{(r)}_{1i}+\frac{\big(1-\hat\delta_\lambda^{2(r)}\big)^{1/2}}{\pi\hat{f}_i^{(r)}\big(\hat{u}_i^{(r)}\big)}\Big(\frac{\hat\eta_i^{2(r)}}{\hat\varsigma^{(r)}\big(1-\hat\delta_\lambda^{2(r)}\big)}+1\Big)^{-\frac{\hat\varsigma^{(r)}+2}{2}},$$
$$\hat{s}^{(r)}_{3i}=E\big(\gamma_i^2\tau_i\mid y_{\mathrm{observed}},\widehat\Theta^{(r)}\big)=\hat\delta_\lambda^{2(r)}\hat{u}_i^{2(r)}\hat{s}^{(r)}_{1i}+\big(1-\hat\delta_\lambda^{2(r)}\big)\hat\sigma^{2(r)}+\hat\delta_\lambda^{(r)}\hat{u}_i^{(r)}\frac{\big(1-\hat\delta_\lambda^{2(r)}\big)^{1/2}}{\pi\hat{f}_i^{(r)}\big(\hat{u}_i^{(r)}\big)}\Big(\frac{\hat\eta_i^{2(r)}}{\hat\varsigma^{(r)}\big(1-\hat\delta_\lambda^{2(r)}\big)}+1\Big)^{-\frac{\hat\varsigma^{(r)}+2}{2}},$$
$$\hat{s}^{(r)}_{4i}=E\big(\ln(\tau_i)\mid y_{\mathrm{observed}},\widehat\Theta^{(r)}\big)=G\Big(\frac{\hat\varsigma^{(r)}+1}{2}\Big)+\frac{\hat\varsigma^{(r)}+1}{\hat\varsigma^{(r)}+\hat\eta_i^{2(r)}}\Bigg(\frac{R_{\hat\varsigma^{(r)}+3}\big(\hat{T}_i^{(r)}\big(\tfrac{\hat\varsigma^{(r)}+3}{\hat\varsigma^{(r)}+1}\big)^{1/2}\big)}{R_{\hat\varsigma^{(r)}+1}\big(\hat{T}_i^{(r)}\big)}-1\Bigg)-\ln\Big(\frac{\hat\eta_i^{2(r)}+\hat\varsigma^{(r)}}{2}\Big)+\frac{\hat\delta_\lambda^{(r)}\hat\eta_i^{(r)}\big(\hat\eta_i^{2(r)}-1\big)}{\big((\hat\varsigma^{(r)}+1)(\hat\varsigma^{(r)}+\hat\eta_i^{2(r)})^3\big)^{1/2}}\,\frac{t_{\hat\varsigma^{(r)}+1}\big(\hat{T}_i^{(r)}\big)}{R_{\hat\varsigma^{(r)}+1}\big(\hat{T}_i^{(r)}\big)}+\frac{1}{R_{\hat\varsigma^{(r)}+1}\big(\hat{T}_i^{(r)}\big)}\int_{-\infty}^{\hat{T}_i^{(r)}}\hat{g}_{\hat\varsigma^{(r)}}(x)\,t_{\hat\varsigma^{(r)}+1}(x)\,\mathrm{d}x,$$
where
$$\hat{u}_i^{(r)}=y_i-x_i^\top\hat\beta^{(r)},\quad \hat\eta_i^{(r)}=\frac{\hat{u}_i^{(r)}}{\hat\sigma^{(r)}},\quad \hat\delta_\lambda^{(r)}=\frac{\hat\lambda^{(r)}}{(1+\hat\lambda^{2(r)})^{1/2}},\quad \hat{T}_i^{(r)}=\hat\lambda^{(r)}\hat\eta_i^{(r)}\Big(\frac{\hat\varsigma^{(r)}+1}{\hat\varsigma^{(r)}+\hat\eta_i^{2(r)}}\Big)^{1/2},$$
$$\hat{f}_i^{(r)}\big(\hat{u}_i^{(r)}\big)=\frac{2}{\hat\sigma^{(r)}}\,t_{\hat\varsigma^{(r)}}\big(\hat\eta_i^{(r)}\big)\,R_{\hat\varsigma^{(r)}+1}\big(\hat{T}_i^{(r)}\big),\quad G(x)=\frac{\Gamma'(x)}{\Gamma(x)},\quad \hat{g}_{\hat\varsigma^{(r)}}(x)=G\Big(\frac{\hat\varsigma^{(r)}+2}{2}\Big)-G\Big(\frac{\hat\varsigma^{(r)}+1}{2}\Big)-\ln\Big(1+\frac{x^2}{\hat\varsigma^{(r)}+1}\Big)+\frac{x^2-1}{\hat\varsigma^{(r)}+1+x^2},$$
with $R_\varsigma$ denoting the Student-t CDF with $\varsigma$ degrees of freedom (written $T_\varsigma$ in (2)). For the M step, we update $\widehat\Theta^{(r)}$ by solving $\dot{Q}_{\widehat\Theta^{(r+1)}}=0$ with an iterative Newton-Raphson algorithm, where $\ddot{Q}$ and $\dot{Q}$ denote the Hessian matrix and gradient vector of the Q-function. Then, the update is $\widehat\Theta^{(r+1)}=\widehat\Theta^{(r)}-\ddot{Q}_{\widehat\Theta^{(r)}}^{-1}\dot{Q}_{\widehat\Theta^{(r)}}$. Under mild conditions, and with an appropriate starting value $\widehat\Theta^{(0)}$, which as mentioned can be taken from the fit under the normal distribution, $\widehat\Theta^{(r)}$ converges to the ML estimate $\widehat\Theta$.
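The EM iteration just described can be organized as in the following illustrative Python sketch, where the E-step expectations $\hat{s}_{1i},\dots,\hat{s}_{4i}$ and the M-step Newton update are left as user-supplied callables because their formulas are exactly those displayed above; all names are assumptions for exposition and do not correspond to the Matlab implementation used in Sections 4 and 5.

```python
import numpy as np

def em_star(y, p, e_step, m_step, theta0, tol=1e-5, max_iter=500):
    """Generic EM loop for the STAR(p) model.

    e_step(theta, y, p)    -> conditional expectations s1_i, ..., s4_i
    m_step(s, theta, y, p) -> updated parameters (e.g., one Newton step on Q)
    theta0                 -> starting value, e.g. the fit under normal errors
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        s = e_step(theta, y, p)             # E step
        theta_new = m_step(s, theta, y, p)  # M step
        if np.linalg.norm(theta_new - theta) < tol:
            return theta_new
        theta = theta_new
    return theta
```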

Note that there is an alternative approach based on reparametrizing and estimating the parameters in regression models; see [Citation42]. Thus, a reparametrization could also be used to find closed-form expressions for the estimators of the parameters of the STAR(p) model.

2.3. Observed information matrix

We calculate the observed information matrix $\ddot{Q}_\Theta$ starting from
$$\ell_{\mathrm{complete}}(\Theta;Y_{\mathrm{complete}})=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\tau_i-\frac{u_i^2\tau_i}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda u_i\gamma_i\tau_i}{(1-\delta_\lambda^2)\sigma^2}-\frac{\tau_i\gamma_i^2}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\ln(\tau_i)\Big)$$
and
(4) $$Q_\Theta=E\big(\ell_{\mathrm{complete}}(\Theta;Y_{\mathrm{complete}})\mid y_{\mathrm{observed}},\widehat\Theta\big)=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{u_i^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda u_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big),$$
where $u_i=y_i-x_i^\top\beta$ and
(5) $\hat{s}_{1i}=E(\tau_i\mid y_{\mathrm{observed}},\widehat\Theta)=\Big(\dfrac{\hat\varsigma+1}{\hat\varsigma+\hat\eta_i^2}\Big)\dfrac{R_{\hat\varsigma+3}\big(\hat{T}_i\big(\tfrac{\hat\varsigma+3}{\hat\varsigma+1}\big)^{1/2}\big)}{R_{\hat\varsigma+1}(\hat{T}_i)},$
(6) $\hat{s}_{2i}=E(\gamma_i\tau_i\mid y_{\mathrm{observed}},\widehat\Theta)=\hat\delta_\lambda\hat{u}_i\hat{s}_{1i}+\dfrac{(1-\hat\delta_\lambda^2)^{1/2}}{\pi\hat{f}_i(\hat{u}_i)}\Big(\dfrac{\hat\eta_i^2}{\hat\varsigma(1-\hat\delta_\lambda^2)}+1\Big)^{-\frac{\hat\varsigma+2}{2}},$
(7) $\hat{s}_{3i}=E(\gamma_i^2\tau_i\mid y_{\mathrm{observed}},\widehat\Theta)=\hat\delta_\lambda^2\hat{u}_i^2\hat{s}_{1i}+(1-\hat\delta_\lambda^2)\hat\sigma^2+\hat\delta_\lambda\hat{u}_i\dfrac{(1-\hat\delta_\lambda^2)^{1/2}}{\pi\hat{f}_i(\hat{u}_i)}\Big(\dfrac{\hat\eta_i^2}{\hat\varsigma(1-\hat\delta_\lambda^2)}+1\Big)^{-\frac{\hat\varsigma+2}{2}},$
(8) $\hat{s}_{4i}=E(\ln(\tau_i)\mid y_{\mathrm{observed}},\widehat\Theta)=G\Big(\dfrac{\hat\varsigma+1}{2}\Big)+\dfrac{\hat\varsigma+1}{\hat\varsigma+\hat\eta_i^2}\Bigg(\dfrac{R_{\hat\varsigma+3}\big(\hat{T}_i\big(\tfrac{\hat\varsigma+3}{\hat\varsigma+1}\big)^{1/2}\big)}{R_{\hat\varsigma+1}(\hat{T}_i)}-1\Bigg)-\ln\Big(\dfrac{\hat\eta_i^2+\hat\varsigma}{2}\Big)+\dfrac{\hat\delta_\lambda\hat\eta_i(\hat\eta_i^2-1)}{\big((\hat\varsigma+1)(\hat\varsigma+\hat\eta_i^2)^3\big)^{1/2}}\,\dfrac{t_{\hat\varsigma+1}(\hat{T}_i)}{R_{\hat\varsigma+1}(\hat{T}_i)}+\dfrac{1}{R_{\hat\varsigma+1}(\hat{T}_i)}\displaystyle\int_{-\infty}^{\hat{T}_i}\hat{g}_{\hat\varsigma}(x)\,t_{\hat\varsigma+1}(x)\,\mathrm{d}x,$
with $\hat{u}_i=y_i-x_i^\top\hat\beta$, $\hat\eta_i=\hat{u}_i/\hat\sigma$, $\hat\delta_\lambda=\hat\lambda/(1+\hat\lambda^2)^{1/2}$, $\hat{T}_i=\hat\lambda\hat\eta_i\big(\tfrac{\hat\varsigma+1}{\hat\varsigma+\hat\eta_i^2}\big)^{1/2}$, $\hat{f}_i(\hat{u}_i)=\tfrac{2}{\hat\sigma}t_{\hat\varsigma}(\hat\eta_i)R_{\hat\varsigma+1}(\hat{T}_i)$, and $\hat{g}_{\hat\varsigma}(x)=G\big(\tfrac{\hat\varsigma+2}{2}\big)-G\big(\tfrac{\hat\varsigma+1}{2}\big)-\ln\big(1+\tfrac{x^2}{\hat\varsigma+1}\big)+\tfrac{x^2-1}{\hat\varsigma+1+x^2}$.

Theorem 1

For the STAR model, the $(p+3)\times(p+3)$ observed Fisher information matrix $\ddot{Q}_{\widehat\Theta}$, with $\widehat\Theta=(\hat\beta^\top,\hat\sigma^2,\hat\delta_\lambda,\hat\varsigma)^\top$, is obtained; its diagonal and off-diagonal blocks are provided in the appendix.

3. Diagnostic analysis

We obtain our normal curvatures for the influence diagnostic analysis considering the following perturbation strategies: case-weights, data, variance and skewness.

3.1. Influence diagnostics

For the STAR model postulated in (1), we use the logarithmic likelihood function of the complete data, $\ell(\Theta;Y_{\mathrm{complete}})$, where $\Theta$ is a $(p+3)\times1$ parameter vector. We consider a minor modification represented by a $q\times1$ perturbation vector $\omega=(\omega_1,\dots,\omega_q)^\top$ belonging to $\Omega\subseteq\mathbb{R}^q$, and $\ell(\Theta;\omega;Y_{\mathrm{complete}})$ is the logarithmic likelihood function of the complete data under $\omega$. Consider $\omega_0$ as the $q\times1$ non-perturbation vector satisfying $\ell(\Theta;Y_{\mathrm{complete}})=\ell(\Theta;\omega_0;Y_{\mathrm{complete}})$; this vector can be $\omega_0=(0,\dots,0)^\top$, $\omega_0=(1,\dots,1)^\top$ or another properly chosen alternative, and the dimension $q$ depends on the perturbation strategy adopted. Denote by $\widehat\Theta$ and $\widehat\Theta_\omega$ the ML estimates for the postulated and perturbed models, respectively. Then, as in [Citation13,Citation14], we compare $\widehat\Theta$ and $\widehat\Theta_\omega$ using the influence measure named Q-displacement and derive the normal curvature at a $q\times1$ vector $l$ (with $\|l\|=1$) stated as
(9) $C_l=2\,\big|l^\top\big(\Delta^\top\ddot{Q}^{-1}\Delta\big)l\big|,$
with the $(p+3)\times(p+3)$ matrix $\ddot{Q}=\partial^2Q_\Theta/\partial\Theta\,\partial\Theta^\top$, evaluated at $\Theta=\widehat\Theta$, and the $(p+3)\times q$ matrix $\Delta=\partial^2Q_{\Theta;\omega}/\partial\Theta\,\partial\omega^\top$, evaluated at $\Theta=\widehat\Theta$ and $\omega=\omega_0$. We use the expression given in (9) for each of the proposed perturbation strategies by calculating $\ddot{Q}$ and $\Delta$, whose derivatives are presented in the appendix. Poon and Poon [Citation41] proposed the conformal normal curvature to classify an observation as potentially influential. Note that, for linear regression models but not restricted to them, [Citation9] assessed the influence of $\omega$ using the influence measure named likelihood displacement (LD); a high value of LD indicates that the ML estimates $\widehat\Theta$ and $\widehat\Theta_\omega$ differ considerably. For details, see [Citation26,Citation27,Citation31,Citation32,Citation34,Citation41].
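Given $\ddot{Q}$ and $\Delta$ evaluated at $\widehat\Theta$ and $\omega_0$, the curvature in (9) and the direction of maximum curvature reduce to a few lines of linear algebra; the sketch below is an illustrative Python rendering (not the Matlab implementation used for the numerical results) with assumed function names.

```python
import numpy as np

def normal_curvature(Q_hess, Delta, direction):
    """C_l = 2 | l' (Delta' Q_hess^{-1} Delta) l |, as in (9)."""
    l = np.asarray(direction, dtype=float)
    l = l / np.linalg.norm(l)                       # enforce ||l|| = 1
    B = Delta.T @ np.linalg.solve(Q_hess, Delta)    # Delta' Q_hess^{-1} Delta
    return 2.0 * abs(l @ B @ l)

def max_curvature_direction(Q_hess, Delta):
    """Eigenvector linked to the largest eigenvalue of -Delta' Q_hess^{-1} Delta;
    the position of its largest absolute entry flags the case treated as
    potentially influential (cf. step [S4] in Section 4.2)."""
    B = -Delta.T @ np.linalg.solve(Q_hess, Delta)
    _, eigvec = np.linalg.eigh(B)                   # eigenvalues in ascending order
    return eigvec[:, -1]
```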

3.2. Perturbation strategies

3.2.1. Perturbation of case-weights

As in [Citation32], we make a minor perturbation on the residual of the STAR model by using $\omega_iu_i=\omega_i(y_i-x_i^\top\beta)$ instead of $u_i=y_i-x_i^\top\beta$, where $\omega_i$ is the modification. We consider the $(R-p)\times1$ vectors $\omega=(\omega_{p+1},\dots,\omega_R)^\top$ and $\omega_0=(1,\dots,1)^\top$. Then, the expression for $\ell_{\mathrm{complete}}(\Theta;\omega;Y_{\mathrm{complete}})$ is given by
$$\ell_{\mathrm{complete}}(\Theta;\omega;Y_{\mathrm{complete}})=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\tau_i-\frac{\omega_i^2u_i^2\tau_i}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_iu_i\gamma_i\tau_i}{(1-\delta_\lambda^2)\sigma^2}-\frac{\tau_i\gamma_i^2}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\ln(\tau_i)\Big).$$
Thus, we express the perturbed Q-function by means of
(10) $$Q_{\Theta;\omega}=E\big(\ell_{\mathrm{complete}}(\Theta;\omega;Y_{\mathrm{complete}})\mid y_{\mathrm{observed}},\widehat\Theta\big)=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{\omega_i^2u_i^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_iu_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big),$$
where $u_i=y_i-x_i^\top\beta$ and $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

Theorem 2

For the case-weights perturbation strategy, we get the $(p+3)\times(R-p)$ matrix whose $i$-th column, for $i\in\{p+1,\dots,R\}$, is given by
(11) $$\Delta=\frac{\partial^2Q_{\Theta;\omega}}{\partial\Theta\,\partial\omega^\top}=\begin{pmatrix}\dfrac{2(y_i-x_i^\top\hat\beta)\hat{s}_{1i}-\hat\delta_\lambda\hat{s}_{2i}}{(1-\hat\delta_\lambda^2)\hat\sigma^2}\,x_i\\[2ex]\dfrac{(y_i-x_i^\top\hat\beta)^2\hat{s}_{1i}-\hat\delta_\lambda(y_i-x_i^\top\hat\beta)\hat{s}_{2i}}{(1-\hat\delta_\lambda^2)\hat\sigma^4}\\[2ex]\dfrac{-2\hat\delta_\lambda(y_i-x_i^\top\hat\beta)^2\hat{s}_{1i}+(\hat\delta_\lambda^2+1)(y_i-x_i^\top\hat\beta)\hat{s}_{2i}}{(1-\hat\delta_\lambda^2)^2\hat\sigma^2}\\[2ex]0\end{pmatrix},$$
evaluated at $\Theta=\widehat\Theta$ and $\omega=\omega_0$, where $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

3.2.2. Perturbation of data

As in [Citation32], we perturb the data by replacing $y_i$ with $y_i+\omega_i$. Let $\omega=(\omega_{p+1},\dots,\omega_R)^\top$ and $\omega_0=(0,\dots,0)^\top$ be $(R-p)\times1$ vectors. For the response perturbation, $y_i+\omega_i=\beta_1(y_{i-1}+\omega_{i-1})+\cdots+\beta_p(y_{i-p}+\omega_{i-p})+u_i+\mu(\omega_i)$, where $u_i=y_i-x_i^\top\beta$ and $\mu(\omega_i)=\omega_i-\beta_1\omega_{i-1}-\cdots-\beta_p\omega_{i-p}$, so that we reach
$$\ell_{\mathrm{complete}}(\Theta;\omega;Y_{\mathrm{complete}})=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\tau_i-\frac{(u_i+\mu(\omega_i))^2\tau_i}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda(u_i+\mu(\omega_i))\gamma_i\tau_i}{(1-\delta_\lambda^2)\sigma^2}-\frac{\tau_i\gamma_i^2}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\ln(\tau_i)\Big).$$
Thus, we get the perturbed Q-function by means of
(12) $$Q_{\Theta;\omega}=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{(u_i+\mu(\omega_i))^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda(u_i+\mu(\omega_i))\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big),$$
where $u_i=y_i-x_i^\top\beta$ and $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

Theorem 3

For the data perturbation strategy, we get the $(p+3)\times(R-p)$ matrix whose $i$-th column is given by
(13) $$\Delta=\frac{\partial^2Q_{\Theta;\omega}}{\partial\Theta\,\partial\omega^\top}=\begin{pmatrix}\dfrac{\hat{s}_{1i}}{(1-\hat\delta_\lambda^2)\hat\sigma^2}\,x_i\\[2ex]\dfrac{(y_i-x_i^\top\hat\beta)\hat{s}_{1i}-\hat\delta_\lambda\hat{s}_{2i}}{(1-\hat\delta_\lambda^2)\hat\sigma^4}\\[2ex]\dfrac{(\hat\delta_\lambda^2+1)\hat{s}_{2i}-2\hat\delta_\lambda(y_i-x_i^\top\hat\beta)\hat{s}_{1i}}{(1-\hat\delta_\lambda^2)^2\hat\sigma^2}\\[2ex]0\end{pmatrix},$$
evaluated at $\Theta=\widehat\Theta$ and $\omega=\omega_0$, where $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

3.2.3. Perturbation of variance

As in [Citation32], we replace the variance $\sigma^2$ by $\omega_i^{-1}\sigma^2$, that is, $u_i\sim\mathrm{ST}(0,\omega_i^{-1}\sigma^2,\lambda,\varsigma)$. Consider the $(R-p)\times1$ vectors $\omega=(\omega_{p+1},\dots,\omega_R)^\top$ and $\omega_0=(1,\dots,1)^\top$. Thus, we reach
$$\ell_{\mathrm{complete}}(\Theta;\omega;Y_{\mathrm{complete}})=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\tau_i-\frac{\omega_iu_i^2\tau_i}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_iu_i\gamma_i\tau_i}{(1-\delta_\lambda^2)\sigma^2}-\frac{\omega_i\tau_i\gamma_i^2}{2(1-\delta_\lambda^2)\sigma^2}-\ln\Big(\frac{\sigma^2}{\omega_i}\Big)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\ln(\tau_i)\Big).$$
Then, we get the perturbed Q-function by means of
(14) $$Q_{\Theta;\omega}=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{\omega_iu_i^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_iu_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\omega_i\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln\Big(\frac{\sigma^2}{\omega_i}\Big)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big),$$
where $u_i=y_i-x_i^\top\beta$ and $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

Theorem 4

For the variance perturbation strategy, we attain the $(p+3)\times(R-p)$ matrix whose $i$-th column is expressed as
(15) $$\Delta=\frac{\partial^2Q_{\Theta;\omega}}{\partial\Theta\,\partial\omega^\top}\bigg|_{\Theta=\widehat\Theta,\ \omega=\omega_0}=\begin{pmatrix}\dfrac{(y_i-x_i^\top\hat\beta)\hat{s}_{1i}-\hat\delta_\lambda\hat{s}_{2i}}{(1-\hat\delta_\lambda^2)\hat\sigma^2}\,x_i\\[2ex]\dfrac{(y_i-x_i^\top\hat\beta)^2\hat{s}_{1i}-2\hat\delta_\lambda(y_i-x_i^\top\hat\beta)\hat{s}_{2i}+\hat{s}_{3i}}{2(1-\hat\delta_\lambda^2)\hat\sigma^4}\\[2ex]\dfrac{-\hat\delta_\lambda(y_i-x_i^\top\hat\beta)^2\hat{s}_{1i}+(\hat\delta_\lambda^2+1)(y_i-x_i^\top\hat\beta)\hat{s}_{2i}-\hat\delta_\lambda\hat{s}_{3i}}{(1-\hat\delta_\lambda^2)^2\hat\sigma^2}\\[2ex]0\end{pmatrix},$$
where $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

3.2.4. Perturbation of skewness

In particular, due to the skewed nature of our proposed model, we study its impact by making a small modification in $\lambda$, that is, changing $\delta_\lambda$ to $\omega_i^{1/2}\delta_\lambda$. Consider the $(R-p)\times1$ vectors $\omega=(\omega_{p+1},\dots,\omega_R)^\top$ and $\omega_0=(1,\dots,1)^\top$. Then, we get
$$\ell_{\mathrm{complete}}(\Theta;\omega;Y_{\mathrm{complete}})=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\tau_i-\frac{u_i^2\tau_i}{2(1-\omega_i\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_i^{1/2}u_i\gamma_i\tau_i}{(1-\omega_i\delta_\lambda^2)\sigma^2}-\frac{\tau_i\gamma_i^2}{2(1-\omega_i\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\omega_i\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\ln(\tau_i)\Big).$$
Hence, the perturbed Q-function is given by
(16) $$Q_{\Theta;\omega}=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{u_i^2\hat{s}_{1i}}{2(1-\omega_i\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_i^{1/2}u_i\hat{s}_{2i}}{(1-\omega_i\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\omega_i\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\omega_i\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big),$$
where $u_i=y_i-x_i^\top\beta$ and $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

Theorem 5

For the skewness perturbation strategy, we get the $(p+3)\times(R-p)$ matrix whose $i$-th column, writing $\hat{u}_i=y_i-x_i^\top\hat\beta$, is formulated as
(17) $$\Delta=\frac{\partial^2Q_{\Theta;\omega}}{\partial\Theta\,\partial\omega^\top}\bigg|_{\Theta=\widehat\Theta,\ \omega=\omega_0}=\begin{pmatrix}\Big(\dfrac{\hat\delta_\lambda^2\hat{u}_i\hat{s}_{1i}-\hat\delta_\lambda^3\hat{s}_{2i}}{(1-\hat\delta_\lambda^2)^2\hat\sigma^2}-\dfrac{\hat\delta_\lambda\hat{s}_{2i}}{2(1-\hat\delta_\lambda^2)\hat\sigma^2}\Big)x_i\\[2ex]\dfrac{\hat\delta_\lambda^2\hat{u}_i^2\hat{s}_{1i}-\hat\delta_\lambda(\hat\delta_\lambda^2+1)\hat{u}_i\hat{s}_{2i}+\hat\delta_\lambda^2\hat{s}_{3i}}{2(1-\hat\delta_\lambda^2)^2\hat\sigma^4}\\[2ex]\dfrac{-2\hat\delta_\lambda^3\hat{u}_i^2\hat{s}_{1i}+4\hat\delta_\lambda^4\hat{u}_i\hat{s}_{2i}-2\hat\delta_\lambda^3\hat{s}_{3i}}{(1-\hat\delta_\lambda^2)^3\hat\sigma^2}+\dfrac{\hat{u}_i\hat{s}_{2i}+2\hat\delta_\lambda\hat\sigma^2}{2(1-\hat\delta_\lambda^2)\hat\sigma^2}+\dfrac{-\hat\delta_\lambda\hat{u}_i^2\hat{s}_{1i}+4\hat\delta_\lambda^2\hat{u}_i\hat{s}_{2i}-\hat\delta_\lambda\hat{s}_{3i}+\hat\delta_\lambda^3\hat\sigma^2}{(1-\hat\delta_\lambda^2)^2\hat\sigma^2}\\[2ex]0\end{pmatrix},$$
where $\hat{s}_{1i},\hat{s}_{2i},\hat{s}_{3i},\hat{s}_{4i}$ are as given in (5)-(8).

3.3. Benchmark of influential observations

To assess whether a case is influential, we use the benchmark $1/q+c\,\mathrm{SD}_{M(0)}$, where $q=n-p$, $n$ is the sample size, $c$ is a pre-chosen positive constant and $\mathrm{SD}_{M(0)}$ is the sample standard deviation (SD) of the values $M(0)_s$, for $s\in\{1,\dots,q\}$; see [Citation41]. Then, if a diagnostic value is greater than $1/q+c\,\mathrm{SD}_{M(0)}$, we identify the corresponding case as influential.
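In code, this benchmark amounts to a single comparison. The following illustrative Python sketch (function name assumed) flags case $i$ when its diagnostic value $M(0)_i$ exceeds $1/q+c\,\mathrm{SD}_{M(0)}$, with $c$ chosen by the analyst ($c=3$ is used in Section 5.2); the vector of $M(0)$ values is assumed to have been computed beforehand.

```python
import numpy as np

def influential_cases(M0, c=3.0):
    """Flag indices whose diagnostic value exceeds 1/q + c * SD of M(0)."""
    M0 = np.asarray(M0, dtype=float)
    q = M0.size                                # q = n - p perturbed cases
    benchmark = 1.0 / q + c * M0.std(ddof=1)   # sample SD of the M(0)_s values
    return np.nonzero(M0 > benchmark)[0], benchmark
```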

4. Numerical simulation

We present five simulation studies to examine our estimators and diagnostic results. The results in Sections 4 and 5 are calculated with the software Matlab [Citation44].

4.1. EM algorithm

The first simulation study assesses the EM estimation described in Section 2. For the STAR(p) model with $p\in\{1,2,3\}$, we choose sample sizes $n$ in $\{250,500,1000\}$ and parameters $\lambda\in\{0.2,0.15,0.1\}$ and $\varsigma\in\{3,4,5\}$. The results are presented in Tables 1 and 2. Note that the mean and median of the estimates are consistent with the true values. In addition, the standard errors (SEs) are small, indicating that the estimates are stable. In short, the results support the suitability of our proposal.

Table 1. Values of the SE, median, and mean with n{250,500,1000}, λ{0.1,0.15,0.2} and ς{3,4,5} using the STAR model.

Table 2. Values of the SE, median, and mean with n{250,500,1000}, λ=0.1 and ς{3,4,5} using the STAR model.

4.2. Influence diagnostic analysis

The second study proceeds through the following steps:

[S1]

The STAR(1) model is $y_t=u_t+\beta y_{t-1}$, where $u_t\sim\mathrm{ST}(0,\sigma^2,\lambda,\varsigma)$, with $\varsigma=3$, $\lambda=0.2$, $\beta=0.12$, $\sigma^2=0.1$. Note that this AR(1) model is stationary according to Lemma 1.

[S2]

The model in [S1] is used to generate 1000 samples of 400 observations each.

[S3]

A perturbation scalar $\epsilon$ is added to the 200th observation of each sample obtained in [S2]. That is, $y_t$ is perturbed to $y_t+\epsilon$, where $\epsilon\in\{0,0.2,0.4,\dots,2\}$ and $t=200$, so that 1000 samples are obtained in which the 200th observation of each sample is a possible influential observation.

[S4]

Our influence diagnostics are used to detect influential observations in each of the 1000 samples obtained in [S3]. Under each perturbation strategy, 1000 diagnostic values are calculated by inspecting the eigenvector linked to the largest eigenvalue. Then, the index of the largest element in absolute value of this vector is registered as an influential observation.

[S5]

The performance of our diagnostics is examined in terms of how many of the 1000 samples in [S3] fulfill the above criterion as $\epsilon$ runs from 0 to 2, that is, the number of times the 200th position is identified is counted. The number of correct identifications versus $\epsilon$ for the four schemes is displayed in Figure 1 (left).

Figure 1. Number of correct identifications related to the four perturbation strategies for the STAR(1) (left) and STAR(2) (right) models.


From Figure 1 (left), we see clearly that the number of samples with correct identification of the influential observation increases with the size of the perturbation under all four strategies. When $\epsilon=2$, the differences among the four strategies become apparent: out of 1000 samples, the influential observation is correctly identified 634, 658, 642 and 269 times under the four schemes, respectively. These results are as expected.

Next, we further study the STAR(2) model stated as $y_t=\beta_1y_{t-1}+\beta_2y_{t-2}+u_t$, where $u_t\sim\mathrm{ST}(0,\sigma^2,\lambda,\varsigma)$, with $\beta_1=0.15$, $\beta_2=0.2$, $\sigma^2=0.1$, $\lambda=0.2$, $\varsigma=3$. It is easy to verify that this AR(2) model is stationary. We apply the same five steps as for the AR(1) model to obtain Figure 1 (right).

From Figure 1 (right), we note that the number of samples with correct identification of the influential observation again increases with the size of the perturbation under the four schemes. When $\epsilon=2$, the four schemes become apparent: out of 1000 samples, the influential observation is correctly identified 525, 375, 561 and 277 times, respectively. From both figures, the pattern for the AR(1) model appears stronger than that for the AR(2) model. However, the diagnostic abilities of the four schemes differ, so they should be used in combination in practice.

4.3. Student-t versus Gaussian models

Liu et al. [Citation26] investigated a methodology to detect influential points in an AR model under the normal distribution. Here, we generate 1000 samples following the procedure provided in Section 4.2 (with $\epsilon=2$ in [S3]) based on the ST distribution with $\beta=0.12$, $\sigma^2=0.1$, $\lambda=0.2$, $\varsigma=3$, and then carry out an influence analysis under the ST and normal distributions to compare our results with those provided in [Citation26]. Using the method stated in [S5] of Section 4.2, the comparison is made in Table 3.

Table 3. Comparison between the ST and normal distributions with ε = 2.

There are both similarities and differences. The similarity is that, under both distributions, the numbers of correct detections are mostly large, namely 634, 658, 642 (ST) and 603, 596, 122 (normal) out of 1000, indicating that the diagnostics work well. However, there are also differences. In the case of $\epsilon=2$, the numbers of samples (out of 1000) with correctly identified influential points are 634 and 603 under the case-weights strategy; 658 and 596 under the data strategy; and 642 and 122 under the variance strategy. In terms of diagnostics, the ST model is better than the normal model. Utilizing a correctly specified model when analyzing data is particularly important.

4.4. Skew-student-t versus student-t models

Liu et al. [Citation34] diagnosed influential points in an AR model with the t distribution. We generate 1000 samples following the procedure provided in Section 4.2 (with $\epsilon=2$ in [S3]) based on the ST distribution with $\beta=0.12$, $\sigma^2=0.1$, $\lambda=0.2$, $\varsigma=3$ to evaluate the performance of the local influence analysis under the ST and t distributions. The comparison is reported in Table 4. The numbers of samples in which the influential observation is detected are all large, namely 634, 658 (ST) and 637, 613 (t) out of 1000, implying that the method works well. There are, however, differences between the ST and t distributions. In the case of $\epsilon=2$, the numbers of samples with correctly identified influential observations are 634 and 637 under the case-weights strategy, and 658 and 613 under the data strategy, both out of 1000 samples. The diagnostic effect under the ST distribution is better than under the t distribution.

Table 4. Comparison between the ST and t distributions with ε = 2.

4.5. Skew-student-t versus skew-normal models

Liu et al. [Citation31,Citation32] detected influential observations in an SNAR model. We generate 1000 samples following the procedure provided in Section 4.2 (with $\epsilon=2$ in [S3]) based on the ST distribution with $\beta=0.12$, $\sigma^2=0.1$, $\lambda=0.2$, $\varsigma=3$, and then apply a local influence analysis under the ST and SN distributions. The comparison is provided in Table 5.

Table 5. Comparison between the ST and SN distributions with ϵ=2.

The similarities are noticeable in the local diagnostic study under the ST and SN models, as the numbers of samples with correctly identified influential points are large for most perturbation strategies: 634, 658, 642, 269 (ST) and 611, 609, 124, 46 (SN) out of 1000, meaning that the diagnostics are effective. However, there are also differences. In the case of $\epsilon=2$, these can be observed for the variance and skewness perturbation strategies, for which the numbers of correctly identified influential observations are 642 and 124 under the variance strategy, and 269 and 46 under the skewness strategy, both out of 1000 samples. The local influence analysis performs better assuming an ST distribution than an SN distribution, as shown in the previous cases.

5. Empirical analysis

In this section, we use the results derived in Section 3 to analyze financial data and discuss the performance of our proposed methodology. The Brent crude futures (BIPE hereafter) daily log-return data from 16 January 2007 to 11 March 2021 are chosen to construct an STAR model and perform our diagnostic analysis.

5.1. STAR model for BIPE

Figure 2 (left) shows the BIPE daily log-return time series from 16 January 2007 to 11 March 2021. An exploratory data analysis based on basic statistics of the daily financial returns is the following: n = 3442 (sample size); minimum and maximum returns of -0.30856 and 0.15449; first and third quartiles of -0.010586 and 0.011026; sample mean and median of 0.0000669 and 0.00013246; sample coefficient of skewness of -0.93637; and sample excess kurtosis of 18.812. Figures 2 (right) and 3 present a histogram and a kernel plot employing the t model for the BIPE daily data. From this exploratory data analysis, we detect an asymmetric distributional feature and a high kurtosis level in the data, and we observe that the adjustment with the t model is unsuitable. Furthermore, we use the D'Agostino skewness test [Citation10] and the Anscombe-Glynn kurtosis test [Citation1] to assess the skewness and kurtosis of our data; the D'Agostino statistic is -19.25 and the Anscombe-Glynn statistic is 27.754, with both p-values significantly less than 0.01. These results indicate that the distribution of BIPE is skewed, with an obvious peak and two fat tails. We first select an STAR model and determine its order using the following steps, as stated in [Citation31,Citation32,Citation47] for their t and SN models.
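Both tests are available in standard statistical software. As an illustrative sketch only (our analysis uses Matlab; in Python, scipy's skewtest implements D'Agostino's procedure and kurtosistest the Anscombe-Glynn one), the checks applied to the return vector can be coded as follows.

```python
import numpy as np
from scipy import stats

def asymmetry_checks(returns):
    """Summarize skewness/kurtosis and run the two tests used in the text."""
    returns = np.asarray(returns, dtype=float)
    skew_stat, skew_p = stats.skewtest(returns)       # D'Agostino skewness test
    kurt_stat, kurt_p = stats.kurtosistest(returns)   # Anscombe-Glynn kurtosis test
    return {"skewness": stats.skew(returns),
            "excess_kurtosis": stats.kurtosis(returns),
            "skewtest": (skew_stat, skew_p),
            "kurtosistest": (kurt_stat, kurt_p)}
```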

Figure 2. Time-series (left) and histogram with kernel PDF (right) of BIPE daily returns.


Figure 3. BIPE daily returns with t and ST PDFs.


[S1] Consider the sequence of AR(p) models
(18) $y_t=u_t+\beta_1y_{t-1},\quad\dots,\quad y_t=u_t+\beta_1y_{t-1}+\cdots+\beta_py_{t-p},$
with $p\in\{1,2,\dots\}$.

[S2] In the $i$th equation of (18), that is, the AR($i$) model, the ordinary least squares estimate of $\beta_j$ is $\hat\beta_j^{(i)}$, the residual is $\hat{u}_t^{(i)}=y_t-\hat\beta_1^{(i)}y_{t-1}-\cdots-\hat\beta_i^{(i)}y_{t-i}$, and the estimate of $\sigma_i^2$ is $\hat\sigma_i^2=\frac{1}{R-2i-1}\sum_{t=i+1}^{R}\big(\hat{u}_t^{(i)}\big)^2$.

[S3] The $(i-1)$th and $i$th equations in (18) are used to test whether the coefficient $\beta_i$ is equal to zero or not, that is, to compare the AR($i-1$) and AR($i$) models. The statistic for this hypothesis test is $T_i=-(R-i-2.5)\ln\big(\hat\sigma_i^2/\hat\sigma_{i-1}^2\big)$, which follows an asymptotic $\chi^2(1)$ distribution. Values of $T_i$ for $i\in\{1,\dots,7\}$ are presented in Table 6. As the 95% quantile of the $\chi^2(1)$ distribution is 3.841, we use the empirical values of $T_i$ presented in Table 6 to determine the order $p$ of the AR($p$) model. Hence, $p=1$ because $T_1=3.863>3.841$. Using the EM algorithm, we calculate $\widehat\Theta=(\hat\beta_1,\hat\sigma^2,\hat\lambda,\hat\varsigma)^\top=(0.07232,0.000201,0.024942,3)^\top$. Then, as the absolute value of $\hat\beta_1$ is less than one, the fitted structure is consistent with stationarity of the BIPE data. From Figure 3, we find that the ST distribution fits better than the t distribution. Thus, we fit the STAR(1) model $\hat{y}_t=0.07232\,y_{t-1}+\hat{u}_t$, with $\hat\sigma^2=0.000201$, $\hat\lambda=0.024942$, and $\hat\varsigma=3$. According to Lemma 1, the data fitted by the STAR(1) model are stationary.
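The order-selection recursion in [S1]-[S3] can be coded directly. The sketch below is illustrative only (ordinary least squares via numpy; the function name and the AR(0) baseline variance are assumptions) and returns the sequence of $T_i$ values to be compared with the $\chi^2(1)$ cutoff of 3.841, as in Table 6.

```python
import numpy as np

def ar_order_statistics(y, p_max=7):
    """Compute the statistics T_i of [S3], i = 1, ..., p_max, where sigma_i^2
    is the OLS residual variance of the AR(i) fit in [S2]; sigma_0^2 is taken
    as the sample variance (an assumption made here for the baseline)."""
    y = np.asarray(y, dtype=float)
    R = y.size
    sigma2 = [np.var(y, ddof=1)]                      # AR(0) baseline
    T = []
    for i in range(1, p_max + 1):
        X = np.column_stack([y[i - j - 1:R - j - 1] for j in range(i)])
        beta_hat = np.linalg.lstsq(X, y[i:], rcond=None)[0]
        resid = y[i:] - X @ beta_hat
        sigma2.append(resid @ resid / (R - 2 * i - 1))
        T.append(-(R - i - 2.5) * np.log(sigma2[i] / sigma2[i - 1]))
    return T   # compare each T_i with the 95% chi-square(1) quantile 3.841
```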

Table 6. Empirical values of Ti, with i{1,,7} for BIPE daily data.

5.2. Diagnostic analysis for BIPE

We conduct the local influence analysis in the STAR(1) model for BIPE under the four perturbation strategies proposed. Based on the idea given in Section 3.3 and the results in [Citation31,Citation32], we choose $c=3$ and $1/3441+3\,\mathrm{SD}_{M(0)}$ as the benchmark, obtaining the values 0.0021, 0.0022, 0.0015 and 0.0012 for the four respective perturbation strategies. In each plot of Figure 4, the red line marks the benchmark used to determine whether an observation is potentially influential, that is, when its value lies beyond the red line. The influential observations are listed in Table 7, where * denotes a case identified as potentially influential for BIPE. We can observe that these cases are mainly related to the 2008–2009 Great Recession, as well as to the 2020–2021 SARS-CoV-2 pandemic, which led to large volatility. For example, on 3 October 2008, the U.S. government signed a financial rescue plan totaling about 700 billion dollars. On 24 February 2020, global stock markets fell after the global number of COVID-19 cases increased significantly. In summary, Figure 4 and Table 7 justify the effectiveness and practicability of the diagnostic methods for an STAR model.

Figure 4. BIPE diagnostics under skewness, variance, case-weights, and data perturbations.


Table 7. Summary of the diagnostic analysis by perturbation strategy.

Finally, we compare in Table 8 the two fitted models: the AR(1) model with outliers and the STAR(1) model without outliers. The mean square errors of their predicted values are 0.0014874 and 0.0013266, respectively. It can be seen that the predictions obtained after removing the outliers are better than those obtained before.

Table 8. Predicted results by the listed structure.

6. Conclusions

In this article, our STAR model was studied with an ML estimation methodology. Its validation was performed using local diagnostic analysis based on the EM algorithm, which allowed us to obtain normal curvatures for four perturbation strategies of interest. Our model was compared with alternative models based on the skew-normal, normal, and Student-t distributions. Our findings showed that the proposed STAR model was more accurate and applicable, and better able to diagnose long time series and identify abnormal cases.

The curvature results for our STAR(p) model with four perturbation strategies, including the newly proposed perturbation of skewness, were presented. Monte Carlo simulation studies were conducted to assess the adequacy of our methodology. Approximate numerical benchmark values for detecting possible influential observations were employed to analyze the daily log-returns of Brent crude futures over a period covering events related to the 2008 financial crisis and the COVID-19 pandemic. Many of the influential observations in the BIPE data were identified as related to historical events. Our methodology and findings are shown to be effective.

Further work relates to the study of statistical structures generated from settings associated with functional data, partial least squares, quantile, multivariate, and spatial regression frameworks [Citation16,Citation38,Citation39,Citation43]. Similarly, considering censored data may also be of interest in the context of the current investigation [Citation19]. We plan to conduct studies on these issues in the future.

Acknowledgments

We would like to thank the Editor, Professor Jie Chen, the Associate Editor, and the reviewers for their constructive comments, which led to an improved presentation of this article.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The research of Y. Liu was supported by the National Social Science Fund of China [grant No. 19BTJ036]. The research of V. Leiva was partially funded by the National Agency for Research and Development (ANID) [project grant number FONDECYT 1200525] of the Chilean government under the Ministry of Science, Technology, Knowledge, and Innovation.

References

  • F.J. Anscombe and W.J. Glynn, Distribution of the kurtosis statistic b2 for normal samples, Biometrika 70 (1983), pp. 227–234.
  • A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Stat. 12 (1985), pp. 171–178.
  • A. Azzalini, The Skew-Normal and Related Families, Cambridge University Press, Cambridge UK, 2014.
  • A. Azzalini, An overview on the progeny of the skew-normal family – a personal perspective, J. Multivar. Anal. 188 (2022), 104851.
  • A. Azzalini and A. Capitanio, Distributions generated by perturbation of symmetry with emphasis on a multivariate Student-t distribution, J. R. Stat. Soc. 65 (2003), pp. 367–389.
  • V.G. Cancho, V.H. Lachos, and E.M.M. Ortega, A nonlinear regression model with skew-normal errors, Stat. Pap. 51 (2010), pp. 547–558.
  • C.Z. Cao, J.G. Lin, and J.Q. Shi, Diagnostics on nonlinear model with scale mixtures of skew-normal and first-order autoregressive errors, Statistics 48 (2014), pp. 1033–1047.
  • B. Carmichael and A. Coën, Asset pricing with skewed-normal return, Finance Res. Lett. 10 (2013), pp. 50–57.
  • R.D. Cook, Assessment of local influence, J. R. Stat. Soc. B 48 (1986), pp. 133–169.
  • R.B. D'Agostino, Transformation to normality of the null distribution of g1, Biometrika 57 (1970), pp. 679–681.
  • A.P. Dempster, N.M. Laird, and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. B 39 (1977), pp. 1–38.
  • M. Eling, Fitting asset returns to skewed distributions: Are the skew-normal and skew-Student good models? Insur. Mathem. Econom. 59 (2014), pp. 45–56.
  • C.S. Ferreira, G.A. Paula, and G.C. Lana, Estimation and diagnostic for partially linear models with first-order autoregressive skew-normal errors, Comput. Stat. 37 (2022), pp. 445–468.
  • A.M. Garay, V.H. Lachos, F.V. Labra, and E.M.M. Ortega, Statistical diagnostics for nonlinear regression models based on scale mixtures of skew-normal distributions, J. Stat. Comput. Simul. 84 (2014), pp. 1761–1778.
  • H.J. Ho and T.I. Lin, Robust linear mixed models using the skew-t distribution with application to schizophrenia data, Biom. J. 52 (2010), pp. 449–469.
  • M. Huerta, V. Leiva, S. Liu, M. Rodriguez, and D. Villegas, On a partial least squares regression model for asymmetric data with a chemical application in mining, Chemom. Intell. Lab. Syst. 190 (2019), pp. 55–68.
  • L. Jin, X. Dai, A. Shi, and L. Shi, Detection of outliers in mixed regressive-spatial autoregressive models, Commun. Stat.: Theory Methods 45 (2016), pp. 5179–5192.
  • T. Kollo and D. von Rosen, Advanced Multivariate Statistics with Matrices, Springer, Dordrecht, the Netherlands, 2005.
  • J. Leao, V. Leiva, H. Saulo, and V. Tomazella, Incorporation of frailties into a cure rate regression model and its diagnostics and application to melanoma data, Stat. Med. 37 (2018), pp. 4421–4440.
  • V. Leiva, S. Liu, L. Shi, and F.J.A. Cysneiros, Diagnostics in elliptical regression models with stochastic restrictions applied to econometrics, J. Appl. Stat. 43 (2016), pp. 627–642.
  • V. Leiva, H. Saulo, R. Souza, R.G. Aykroyd, and R. Vila, A new BISARMA time series model for forecasting mortality using weather and particulate matter data, J. Forecast. 40 (2021), pp. 346–364.
  • W.K. Li, Diagnostic Checks in Time Series, Chapman and Hall/CRC, Boca Raton, 2004.
  • T.I. Lin, J.C. Lee, and W.J. Hsieh, Robust mixture modeling using the skew t distribution, Stat. Comput. 17 (2007), pp. 81–92.
  • S. Liu, On diagnostics in conditionally heteroskedastic time series models under elliptical distributions, J. Appl. Probab. 41A (2004), pp. 393–405.
  • S. Liu and C.C. Heyde, On estimation in conditional heteroskedastic time series models under non-normal distributions, Stat. Pap. 49 (2008), pp. 455–469.
  • Y. Liu, G. Ji, and S. Liu, Influence diagnostics in a vector autoregressive model, J. Stat. Comput. Simul. 85 (2015), pp. 2632–2655.
  • S. Liu, V. Leiva, T. Ma, and A.H. Welsh, Influence diagnostic analysis in the possibly heteroskedastic linear model with exact restrictions, Stat. Methods Appl. 25 (2016), pp. 227–249.
  • S. Liu, V. Leiva, D. Zhuang, T. Ma, and J. Figueroa-Zuniga, Matrix differential calculus with applications in the multivariate linear model and its diagnostics, J. Multivar. Anal. 188 (2022), 104849.
  • T. Liu, S. Liu, and L. Shi, Time Series Analysis Using SAS Enterprise Guide, Springer, Singapore, 2020.
  • S. Liu, T. Ma, A. SenGupta, K. Shimizu, and M.Z. Wang, Influence diagnostics in possibly asymmetric circular-linear multivariate regression models, Sankhya B 79 (2017), pp. 76–93.
  • Y. Liu, G. Mao, V. Leiva, S. Liu, and A. Tapia, Diagnostic analytics for an autoregressive model under the skew-normal distribution, Mathematics 8 (2020), 693.
  • Y. Liu, C. Mao, V. Leiva, S. Liu, and W.A. Silva Neto, Asymmetric autoregressive models: Statistical aspects and a financial application under COVID-19 pandemic, J. Appl. Stat. 49 (2022), pp. 1323–1347.
  • S. Liu and H. Neudecker, On pseudo maximum likelihood estimation for multivariate time series models with conditional heteroskedasticity, Math. Comput. Simul. 79 (2009), pp. 2556–2565.
  • Y. Liu, R. Sang, and S. Liu, Diagnostic analysis for a vector autoregressive model under t distributions, Stat. Neerl. 71 (2017), pp. 86–114.
  • S. Liu and A.H. Welsh, Regression diagnostics, in International Encyclopedia of Statistical Science, M. Lovric, ed., Springer, Berlin, Heidelberg, 2011, pp. 1206–1208
  • J. Lu, L. Shi, and F. Chen, Outlier detection in time series models using local influence method, Commun. Stat. Theory Methods 41 (2012), pp. 2202–2220.
  • J.R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd ed. Wiley, Chichester, 2019.
  • C. Marchant and V. Leiva, Robust multivariate control charts based on Birnbaum–Saunders distributions, J. Stat. Comput. Simul. 88 (2018), pp. 182–202.
  • S. Martinez, R. Giraldo, and V. Leiva, Birnbaum–Saunders functional regression models for spatial data, Stoch. Environ. Res. Risk Assess. 33 (2019), pp. 1765–1780.
  • G.A. Paula, Influence diagnostics for linear models with first-order autoregressive elliptical errors, Stat. Probab. Lett. 79 (2009), pp. 339–346.
  • W.Y. Poon and Y.S. Poon, Conformal normal curvature and assessment of local influence, J. R. Stat. Soc. Ser B 61 (1999), pp. 51–61.
  • C.B. Zeller, V.H. Lachos, and F.E. Vilca-Labra, Local influence analysis for regression models with scale mixtures of skew-normal distributions, J. Appl. Stat. 38 (2011), pp. 348–363.
  • L. Sanchez, V. Leiva, M. Galea, and H. Saulo, Birnbaum–Saunders quantile regression models with application to spatial data, Mathematics 8 (2020), 1000.
  • The MathWorks Inc. MATLAB Version: 9.13.0 (R2022b), The MathWorks Inc, Natick, Massachusetts, 2022. Available at https://www.mathworks.com.
  • A. Tapia, V. Leiva, M.P. Diaz, and V. Giampaoli, Influence diagnostics in mixed effects logistic regression models, TEST 28 (2019), pp. 920–942.
  • A. Tapia, V. Giampaoli, M.P. Diaz, and V. Leiva, Sensitivity analysis of longitudinal count responses: A local influence approach and application to medical data, J. Appl. Stat. 46 (2019), pp. 1021–1042.
  • R.S. Tsay, Analysis of Financial Time Series, Wiley, New York, 2010.
  • R.S. Tsay, An Introduction to Analysis of Financial Data with R, Wiley, New York, 2013.
  • I. Vidal and L.M. Castro, Influential observations in the independent student-t measurement error model with weak nondifferential error, Chil. J. Stat. 1 (2010), pp. 17–34.
  • F.C. Xie, J.G. Lin, and B.C. Wei, Diagnostics for skew-normal nonlinear regression models with AR(1) errors, Comput. Stat. Data Anal. 53 (2009), pp. 4403–4416.
  • F. Zhu, S. Liu, and L. Shi, Local influence analysis for poisson autoregression with an application to stock transaction data, Stat. Neerl. 70 (2016), pp. 4–25.
  • F. Zhu, L. Shi, and S. Liu, Influence diagnostics in log-linear integer-valued GARCH models, AStA Adv. Stat. Anal. 99 (2015), pp. 311–335.

Appendix

In this appendix, we obtain the matrix derivatives involved in our diagnostic analytics [Citation18,Citation28,Citation37].

A.1. Proof (Theorem 1)

Proof.

The Q-function is established as
$$Q_\Theta=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{u_i^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda u_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big),$$
where $u_i=y_i-x_i^\top\beta$. Its first derivatives are expressed as
$$\frac{\partial Q_\Theta}{\partial\beta}=\sum_{i=p+1}^{R}\frac{u_i\hat{s}_{1i}-\delta_\lambda\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}\,x_i,\qquad \frac{\partial Q_\Theta}{\partial\sigma^2}=\sum_{i=p+1}^{R}\Big(\frac{u_i^2\hat{s}_{1i}-2\delta_\lambda u_i\hat{s}_{2i}+\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^4}-\frac{1}{\sigma^2}\Big),$$
$$\frac{\partial Q_\Theta}{\partial\delta_\lambda}=\sum_{i=p+1}^{R}\Big(\frac{\delta_\lambda}{1-\delta_\lambda^2}+\frac{-\delta_\lambda u_i^2\hat{s}_{1i}+2\delta_\lambda^2u_i\hat{s}_{2i}+(1-\delta_\lambda^2)u_i\hat{s}_{2i}-\delta_\lambda\hat{s}_{3i}}{(1-\delta_\lambda^2)^2\sigma^2}\Big),\qquad \frac{\partial Q_\Theta}{\partial\varsigma}=\frac{1}{2}\sum_{i=p+1}^{R}\Big(1-\hat{s}_{1i}+\hat{s}_{4i}+\ln\Big(\frac{\varsigma}{2}\Big)-G\Big(\frac{\varsigma}{2}\Big)\Big).$$
The second derivatives are formulated as
$$\frac{\partial^2Q_\Theta}{\partial\beta\,\partial\beta^\top}=-\sum_{i=p+1}^{R}\frac{\hat{s}_{1i}}{(1-\delta_\lambda^2)\sigma^2}\,x_ix_i^\top,\qquad \frac{\partial^2Q_\Theta}{\partial\beta\,\partial\sigma^2}=\sum_{i=p+1}^{R}\frac{\delta_\lambda\hat{s}_{2i}-u_i\hat{s}_{1i}}{(1-\delta_\lambda^2)\sigma^4}\,x_i,$$
$$\frac{\partial^2Q_\Theta}{\partial\beta\,\partial\delta_\lambda}=\sum_{i=p+1}^{R}\frac{2\delta_\lambda u_i\hat{s}_{1i}-(\delta_\lambda^2+1)\hat{s}_{2i}}{(1-\delta_\lambda^2)^2\sigma^2}\,x_i,\qquad \frac{\partial^2Q_\Theta}{\partial(\sigma^2)^2}=\sum_{i=p+1}^{R}\Big(\frac{-u_i^2\hat{s}_{1i}+2\delta_\lambda u_i\hat{s}_{2i}-\hat{s}_{3i}}{(1-\delta_\lambda^2)\sigma^6}+\frac{1}{\sigma^4}\Big),$$
$$\frac{\partial^2Q_\Theta}{\partial\sigma^2\,\partial\delta_\lambda}=\sum_{i=p+1}^{R}\frac{\delta_\lambda u_i^2\hat{s}_{1i}-(\delta_\lambda^2+1)u_i\hat{s}_{2i}+\delta_\lambda\hat{s}_{3i}}{(1-\delta_\lambda^2)^2\sigma^4},$$
$$\frac{\partial^2Q_\Theta}{\partial\delta_\lambda^2}=\sum_{i=p+1}^{R}\Big(\frac{\delta_\lambda^2+1}{(1-\delta_\lambda^2)^2}+\frac{-u_i^2\hat{s}_{1i}+6\delta_\lambda u_i\hat{s}_{2i}-\hat{s}_{3i}}{(1-\delta_\lambda^2)^2\sigma^2}+\frac{-4\delta_\lambda^2u_i^2\hat{s}_{1i}+8\delta_\lambda^3u_i\hat{s}_{2i}-4\delta_\lambda^2\hat{s}_{3i}}{(1-\delta_\lambda^2)^3\sigma^2}\Big),$$
$$\frac{\partial^2Q_\Theta}{\partial\delta_\lambda\,\partial\varsigma}=\frac{\partial^2Q_\Theta}{\partial\beta\,\partial\varsigma}=\frac{\partial^2Q_\Theta}{\partial\sigma^2\,\partial\varsigma}=0,\qquad \frac{\partial^2Q_\Theta}{\partial\varsigma^2}=\frac{R-p}{4}\Big(\frac{2}{\varsigma}-G'\Big(\frac{\varsigma}{2}\Big)\Big),$$
where $G'$ denotes the derivative of the digamma function $G$. Hence, we obtain $\ddot{Q}_{\widehat\Theta}$ with $\widehat\Theta=(\hat\beta^\top,\hat\sigma^2,\hat\delta_\lambda,\hat\varsigma)^\top$.

A.2. Proof (Theorem 2)

Proof.

The perturbed Q-function is stated as
$$Q_{\Theta;\omega}=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{\omega_i^2u_i^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_iu_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big).$$
Taking the differentials of the perturbed Q-function with respect to $\beta$, $\sigma^2$, $\delta_\lambda$, $\varsigma$ and then $\omega_i$, for $i\in\{p+1,\dots,R\}$, the first derivatives are calculated as
$$\frac{\partial Q_{\Theta;\omega}}{\partial\beta}=\sum_{i=p+1}^{R}\frac{\omega_i^2u_i\hat{s}_{1i}-\delta_\lambda\omega_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}\,x_i,\qquad \frac{\partial Q_{\Theta;\omega}}{\partial\sigma^2}=\sum_{i=p+1}^{R}\Big(\frac{\omega_i^2u_i^2\hat{s}_{1i}-2\delta_\lambda\omega_iu_i\hat{s}_{2i}+\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^4}-\frac{1}{\sigma^2}\Big),$$
$$\frac{\partial Q_{\Theta;\omega}}{\partial\delta_\lambda}=\sum_{i=p+1}^{R}\Big(\frac{\delta_\lambda}{1-\delta_\lambda^2}+\frac{-\delta_\lambda\omega_i^2u_i^2\hat{s}_{1i}+2\delta_\lambda^2\omega_iu_i\hat{s}_{2i}+(1-\delta_\lambda^2)\omega_iu_i\hat{s}_{2i}-\delta_\lambda\hat{s}_{3i}}{(1-\delta_\lambda^2)^2\sigma^2}\Big),\qquad \frac{\partial Q_{\Theta;\omega}}{\partial\varsigma}=\frac{1}{2}\sum_{i=p+1}^{R}\Big(1-\hat{s}_{1i}+\hat{s}_{4i}+\ln\Big(\frac{\varsigma}{2}\Big)-G\Big(\frac{\varsigma}{2}\Big)\Big),$$
and the second derivatives are stated as
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\beta\,\partial\omega_i}=\frac{2\omega_iu_i\hat{s}_{1i}-\delta_\lambda\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}\,x_i,\qquad \frac{\partial^2Q_{\Theta;\omega}}{\partial\sigma^2\,\partial\omega_i}=\frac{\omega_iu_i^2\hat{s}_{1i}-\delta_\lambda u_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^4},$$
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\delta_\lambda\,\partial\omega_i}=\frac{-2\delta_\lambda\omega_iu_i^2\hat{s}_{1i}+(\delta_\lambda^2+1)u_i\hat{s}_{2i}}{(1-\delta_\lambda^2)^2\sigma^2},\qquad \frac{\partial^2Q_{\Theta;\omega}}{\partial\varsigma\,\partial\omega_i}=0.$$
Letting $\Theta=\widehat\Theta=(\hat\beta^\top,\hat\sigma^2,\hat\delta_\lambda,\hat\varsigma)^\top$ and $\omega=\omega_0=(1,\dots,1)^\top$, we obtain the expression presented in (11).

A.3. Proof (Theorem 3)

Proof.

We get the perturbed Q-function
$$Q_{\Theta;\omega}=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{(u_i+\mu(\omega_i))^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda(u_i+\mu(\omega_i))\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big).$$
Taking the differentials of the perturbed Q-function with respect to $\beta$, $\sigma^2$, $\delta_\lambda$, $\varsigma$ and then $\omega_i$, for $i\in\{p+1,\dots,R\}$, the first derivatives are found to be
$$\frac{\partial Q_{\Theta;\omega}}{\partial\beta}=\sum_{i=p+1}^{R}\frac{(u_i+\mu(\omega_i))\hat{s}_{1i}-\delta_\lambda\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}\,\big(x_i+(\omega_{i-1},\dots,\omega_{i-p})^\top\big),$$
$$\frac{\partial Q_{\Theta;\omega}}{\partial\sigma^2}=\sum_{i=p+1}^{R}\Big(\frac{(u_i+\mu(\omega_i))^2\hat{s}_{1i}-2\delta_\lambda(u_i+\mu(\omega_i))\hat{s}_{2i}+\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^4}-\frac{1}{\sigma^2}\Big),$$
$$\frac{\partial Q_{\Theta;\omega}}{\partial\delta_\lambda}=\sum_{i=p+1}^{R}\Big(\frac{\delta_\lambda}{1-\delta_\lambda^2}+\frac{-\delta_\lambda(u_i+\mu(\omega_i))^2\hat{s}_{1i}+(\delta_\lambda^2+1)(u_i+\mu(\omega_i))\hat{s}_{2i}-\delta_\lambda\hat{s}_{3i}}{(1-\delta_\lambda^2)^2\sigma^2}\Big),\qquad \frac{\partial Q_{\Theta;\omega}}{\partial\varsigma}=\frac{1}{2}\sum_{i=p+1}^{R}\Big(1-\hat{s}_{1i}+\hat{s}_{4i}+\ln\Big(\frac{\varsigma}{2}\Big)-G\Big(\frac{\varsigma}{2}\Big)\Big),$$
and the second derivatives are established as
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\beta\,\partial\omega_i}=\frac{\hat{s}_{1i}}{(1-\delta_\lambda^2)\sigma^2}\,\big(x_i+(\omega_{i-1},\dots,\omega_{i-p})^\top\big),\qquad \frac{\partial^2Q_{\Theta;\omega}}{\partial\sigma^2\,\partial\omega_i}=\frac{(u_i+\mu(\omega_i))\hat{s}_{1i}-\delta_\lambda\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^4},$$
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\delta_\lambda\,\partial\omega_i}=\frac{(\delta_\lambda^2+1)\hat{s}_{2i}-2\delta_\lambda(u_i+\mu(\omega_i))\hat{s}_{1i}}{(1-\delta_\lambda^2)^2\sigma^2},\qquad \frac{\partial^2Q_{\Theta;\omega}}{\partial\varsigma\,\partial\omega_i}=0.$$
Noting that $\Theta=\widehat\Theta=(\hat\beta^\top,\hat\sigma^2,\hat\delta_\lambda,\hat\varsigma)^\top$ and $\omega=\omega_0=(0,\dots,0)^\top$, we reach the formula given in (13).

A.4. Proof (Theorem 4)

Proof.

We attain the perturbed Q-function
$$Q_{\Theta;\omega}=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{\omega_iu_i^2\hat{s}_{1i}}{2(1-\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_iu_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}-\frac{\omega_i\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^2}-\ln\Big(\frac{\sigma^2}{\omega_i}\Big)-\frac{1}{2}\ln(1-\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big).$$
Taking the differentials of the perturbed Q-function with respect to $\beta$, $\sigma^2$, $\delta_\lambda$, $\varsigma$ and then $\omega_i$, for $i\in\{p+1,\dots,R\}$, the first derivatives are stated as
$$\frac{\partial Q_{\Theta;\omega}}{\partial\beta}=\sum_{i=p+1}^{R}\frac{\omega_iu_i\hat{s}_{1i}-\delta_\lambda\omega_i\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}\,x_i,\qquad \frac{\partial Q_{\Theta;\omega}}{\partial\sigma^2}=\sum_{i=p+1}^{R}\Big(\frac{\omega_iu_i^2\hat{s}_{1i}-2\delta_\lambda\omega_iu_i\hat{s}_{2i}+\omega_i\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^4}-\frac{1}{\sigma^2}\Big),$$
$$\frac{\partial Q_{\Theta;\omega}}{\partial\delta_\lambda}=\sum_{i=p+1}^{R}\Big(\frac{\delta_\lambda}{1-\delta_\lambda^2}+\frac{-\delta_\lambda\omega_iu_i^2\hat{s}_{1i}+(\delta_\lambda^2+1)\omega_iu_i\hat{s}_{2i}-\delta_\lambda\omega_i\hat{s}_{3i}}{(1-\delta_\lambda^2)^2\sigma^2}\Big),\qquad \frac{\partial Q_{\Theta;\omega}}{\partial\varsigma}=\frac{1}{2}\sum_{i=p+1}^{R}\Big(1-\hat{s}_{1i}+\hat{s}_{4i}+\ln\Big(\frac{\varsigma}{2}\Big)-G\Big(\frac{\varsigma}{2}\Big)\Big),$$
and the second derivatives are presented as
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\beta\,\partial\omega_i}=\frac{u_i\hat{s}_{1i}-\delta_\lambda\hat{s}_{2i}}{(1-\delta_\lambda^2)\sigma^2}\,x_i,\qquad \frac{\partial^2Q_{\Theta;\omega}}{\partial\sigma^2\,\partial\omega_i}=\frac{u_i^2\hat{s}_{1i}-2\delta_\lambda u_i\hat{s}_{2i}+\hat{s}_{3i}}{2(1-\delta_\lambda^2)\sigma^4},$$
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\delta_\lambda\,\partial\omega_i}=\frac{-\delta_\lambda u_i^2\hat{s}_{1i}+(\delta_\lambda^2+1)u_i\hat{s}_{2i}-\delta_\lambda\hat{s}_{3i}}{(1-\delta_\lambda^2)^2\sigma^2},\qquad \frac{\partial^2Q_{\Theta;\omega}}{\partial\varsigma\,\partial\omega_i}=0.$$
Noting that $\Theta=\widehat\Theta=(\hat\beta^\top,\hat\sigma^2,\hat\delta_\lambda,\hat\varsigma)^\top$ and $\omega=\omega_0=(1,\dots,1)^\top$, we get the expression formulated in (15).

A.5. Proof (Theorem 5)

Proof.

We obtain the perturbed Q-function
$$Q_{\Theta;\omega}=\sum_{i=p+1}^{R}\Big(-\frac{\varsigma}{2}\hat{s}_{1i}-\frac{u_i^2\hat{s}_{1i}}{2(1-\omega_i\delta_\lambda^2)\sigma^2}+\frac{\delta_\lambda\omega_i^{1/2}u_i\hat{s}_{2i}}{(1-\omega_i\delta_\lambda^2)\sigma^2}-\frac{\hat{s}_{3i}}{2(1-\omega_i\delta_\lambda^2)\sigma^2}-\ln(\sigma^2)-\frac{1}{2}\ln(1-\omega_i\delta_\lambda^2)+\frac{\varsigma}{2}\ln\Big(\frac{\varsigma}{2}\Big)-\ln\Gamma\Big(\frac{\varsigma}{2}\Big)+\frac{\varsigma}{2}\hat{s}_{4i}\Big).$$
Taking the differentials of the perturbed Q-function with respect to $\beta$, $\sigma^2$, $\delta_\lambda$, $\varsigma$ and then $\omega_i$, for $i\in\{p+1,\dots,R\}$, we obtain the first derivatives
$$\frac{\partial Q_{\Theta;\omega}}{\partial\beta}=\sum_{i=p+1}^{R}\frac{u_i\hat{s}_{1i}-\delta_\lambda\omega_i^{1/2}\hat{s}_{2i}}{(1-\omega_i\delta_\lambda^2)\sigma^2}\,x_i,\qquad \frac{\partial Q_{\Theta;\omega}}{\partial\sigma^2}=\sum_{i=p+1}^{R}\Big(\frac{u_i^2\hat{s}_{1i}-2\delta_\lambda\omega_i^{1/2}u_i\hat{s}_{2i}+\hat{s}_{3i}}{2(1-\omega_i\delta_\lambda^2)\sigma^4}-\frac{1}{\sigma^2}\Big),$$
$$\frac{\partial Q_{\Theta;\omega}}{\partial\delta_\lambda}=\sum_{i=p+1}^{R}\Big(\frac{-\delta_\lambda\omega_iu_i^2\hat{s}_{1i}+\omega_i^{1/2}(1+\omega_i\delta_\lambda^2)u_i\hat{s}_{2i}-\omega_i\delta_\lambda\hat{s}_{3i}}{(1-\omega_i\delta_\lambda^2)^2\sigma^2}+\frac{\omega_i\delta_\lambda}{1-\omega_i\delta_\lambda^2}\Big),\qquad \frac{\partial Q_{\Theta;\omega}}{\partial\varsigma}=\frac{1}{2}\sum_{i=p+1}^{R}\Big(1-\hat{s}_{1i}+\hat{s}_{4i}+\ln\Big(\frac{\varsigma}{2}\Big)-G\Big(\frac{\varsigma}{2}\Big)\Big),$$
and the second derivatives are
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\beta\,\partial\omega_i}=\Big(\frac{\delta_\lambda^2u_i\hat{s}_{1i}-\delta_\lambda^3\omega_i^{1/2}\hat{s}_{2i}}{(1-\omega_i\delta_\lambda^2)^2\sigma^2}-\frac{\delta_\lambda\hat{s}_{2i}}{2\omega_i^{1/2}(1-\omega_i\delta_\lambda^2)\sigma^2}\Big)x_i,$$
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\sigma^2\,\partial\omega_i}=\frac{\delta_\lambda^2\omega_i^{1/2}u_i^2\hat{s}_{1i}-\delta_\lambda(\omega_i\delta_\lambda^2+1)u_i\hat{s}_{2i}+\delta_\lambda^2\omega_i^{1/2}\hat{s}_{3i}}{2\omega_i^{1/2}(1-\omega_i\delta_\lambda^2)^2\sigma^4},\qquad \frac{\partial^2Q_{\Theta;\omega}}{\partial\varsigma\,\partial\omega_i}=0,$$
$$\frac{\partial^2Q_{\Theta;\omega}}{\partial\delta_\lambda\,\partial\omega_i}=\frac{-2\delta_\lambda^3\omega_iu_i^2\hat{s}_{1i}+4\delta_\lambda^4\omega_i^{3/2}u_i\hat{s}_{2i}-2\delta_\lambda^3\omega_i\hat{s}_{3i}}{(1-\omega_i\delta_\lambda^2)^3\sigma^2}+\frac{-\delta_\lambda u_i^2\hat{s}_{1i}+4\delta_\lambda^2\omega_i^{1/2}u_i\hat{s}_{2i}-\delta_\lambda\hat{s}_{3i}+\omega_i\delta_\lambda^3\sigma^2}{(1-\omega_i\delta_\lambda^2)^2\sigma^2}+\frac{u_i\hat{s}_{2i}+2\delta_\lambda\omega_i^{1/2}\sigma^2}{2\omega_i^{1/2}(1-\omega_i\delta_\lambda^2)\sigma^2}.$$
Using $\Theta=\widehat\Theta=(\hat\beta^\top,\hat\sigma^2,\hat\delta_\lambda,\hat\varsigma)^\top$ and $\omega=\omega_0=(1,\dots,1)^\top$, we get the expression introduced in (17).