Research Article

Local asymptotic normality and efficient estimation for multivariate GINAR(p) models

Article: 1695437 | Received 06 Aug 2019, Accepted 13 Nov 2019, Published online: 06 Dec 2019

Abstract

We derive the Local Asymptotic Normality (LAN) property for a multivariate generalized integer-valued autoregressive (MGINAR) process of order p. The generalized thinning operator in the MGINAR(p) process includes not only the usual Binomial thinning but also Poisson thinning, Geometric thinning, Negative Binomial thinning, and so on. Using the LAN property, we propose an efficient estimation method for the parameter of the MGINAR(p) process. Our procedure is based on the one-step method, which updates initial $\sqrt{n}$-consistent estimators to efficient ones. The one-step method has advantages in both computational simplicity and efficiency. Some numerical results for the asymptotic relative efficiency (ARE) of our estimators and the CLS estimators are presented. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method.

PUBLIC INTEREST STATEMENT

Recently, there has been a growing interest in modelling discrete-valued time series that arise in various fields of statistics. The MGINAR process is one such class of discrete-valued time series models and contains various subclasses. We derive the Local Asymptotic Normality (LAN) property for a multivariate generalized integer-valued autoregressive (MGINAR) process of order p.

Using the LAN property, we propose an efficient estimation method for the parameter. Our procedure is based on the one-step method, which updates initial $\sqrt{n}$-consistent estimators to efficient ones. The one-step method has advantages in both computational simplicity and efficiency. Some numerical results for the asymptotic relative efficiency (ARE) of our estimators and the CLS estimators are presented. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method.

1. Introduction

Recently, there has been a growing interest in modelling discrete-valued time series that arise in various fields of statistics (e.g., Weiß, Citation2018). This paper is concerned with a special class of observation-driven models termed "integer-valued autoregressive processes" (INAR processes), which were introduced independently by Al-Osh and Alzaid (Citation1987) and McKenzie (Citation1985, Citation1988). They introduced the integer-valued autoregressive process of order 1 (INAR(1) process) to model non-negative integer-valued phenomena with time dependence. The more general INAR(p) processes were considered by Alzaid and Al-Osh (Citation1990), Du and Li (Citation1991), among others. The INAR process combines the distribution of the thinning operator with the distribution of the innovation process. Alzaid and Al-Osh (Citation1990) discussed the INAR(p) process in the case where the thinning operator follows a Binomial distribution (i.e., is specified), while the distribution of the innovation process is left unspecified. Latour (Citation1998) introduced a generalized version of the INAR(p) process, namely, the causal and stationary generalized integer-valued autoregressive (GINAR(p)) process. The generalized thinning operator in the GINAR(p) process includes not only the usual Binomial thinning but also Poisson thinning, Geometric thinning, and Negative Binomial thinning (Ristić, Bakouch, & Nastić, Citation2009). The history above concerns the univariate case. Nowadays, extensions to the multivariate case are being actively researched and applied. One of the first approaches to a multivariate thinning mechanism was by McKenzie (Citation1988). After that, Franke and Subba Rao (Citation1993) introduced a multivariate INAR(1) (MINAR(1)) model based on independent Binomial thinning operators. Extensions of the MINAR(1) model were discussed by Latour (Citation1997), Karlis and Pedeli (Citation2013), and others.
This paper discusses a parameter estimation problem for a multivariate version of the GINAR(p) (MGINAR(p)) process. The MGINAR model is quite a large class, including all the models described above.

Estimation of the parameter of the INAR(p) process can be carried out in a variety of ways. Common approaches include the method of moments (MM), based on the Yule-Walker equations, and conditional least squares (CLS). The main advantages of both approaches are their simplicity, due to the closed-form formulae, and their robustness, since they require no distributional assumptions. It is known that the MM and CLS estimators are asymptotically equivalent. However, Al-Osh and Alzaid (Citation1987), among others, have recommended using maximum likelihood (ML) estimators instead of MM and CLS estimators because they are less biased for small sample sizes. On the other hand, it is well known that the ML method is computationally unattractive due to complicated transition probabilities involving many convolutions. To overcome this problem, Drost, Van Den Akker, and Werker (Citation2008) considered one-step, asymptotically efficient estimation of the INAR(p) model. Their method reduces the high computational cost caused by the convolutions involved in the ML method. Following Drost et al. (Citation2008), this paper provides a one-step method, which updates initial $\sqrt{n}$-consistent estimators to efficient ones, for MGINAR(p) processes. In the class of multivariate integer-valued models, the number of convolutions involved in the likelihood function is very large. For some distributions with the reproductive property, the likelihood can be simplified, but such a simplification is not available for all models in the MGINAR(p) class. Therefore, it is quite important to reduce the computational cost by using one-step estimation. We first establish the local asymptotic normality (LAN) structure for experiments of the MGINAR(p) process. Taking the CLS estimator as the initial estimator, a one-step update estimator is proposed. We also show that this estimator is asymptotically efficient.

The organization of this paper is as follows. In Section 2, the MGINAR (p) process is introduced and the LAN property is established. Section 3 discusses the efficient estimation. The CLS estimator is introduced and its asymptotic property is shown. Then, by using the LAN property, a one-step estimator is proposed to update the CLS estimator. In Section 4, the asymptotic relative efficiency (ARE) of the one-step estimator and the CLS estimator is examined through some simulation experiments. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method. The proofs and other details are included in the Appendix.

2. The LAN property

Let $\{X_t=(X_{1,t},\dots,X_{d,t})^T;\,t\in\mathbb{Z}_+=\mathbb{N}\cup\{0\}\}$ be a $d$-dimensional non-negative integer-valued random process (i.e., $X_t(\omega)\in\mathbb{Z}_+^d$). The multivariate generalized integer-valued autoregressive process of order $p$ (MGINAR(p)) is defined by

$X_t=\sum_{k=1}^{p}A^{(k)}\circ X_{t-k}+\epsilon_t$ (1)

where $A^{(k)}=(A^{(k)}_{i,j})_{i,j=1,\dots,d}$ is a $d\times d$ matrix for $k=1,\dots,p$, and the matrix thinning $A^{(k)}\circ X_{t-k}$ gives a $d$-dimensional random vector with $i$th component

$(A^{(k)}\circ X_{t-k})_i=\sum_{j=1}^{d}A^{(k)}_{i,j}\circ X_{j,t-k}=\sum_{j=1}^{d}\sum_{r=1}^{X_{j,t-k}}\xi^{(k)}_{i,j,r},\quad i=1,\dots,d.$ (2)

Note that, for any $A\in\mathbb{R}$, $A\circ 0=0$; $\{\xi^{(k)}_{i,j,r};\,r\in\mathbb{Z}_+\}$ is a collection of independent and identically distributed (i.i.d.), non-negative, integer-valued random variables with distribution function $G_{A^{(k)}_{i,j}}$ and mean $A^{(k)}_{i,j}$; and $\{\epsilon_t=(\epsilon_{1,t},\dots,\epsilon_{d,t})^T;\,t\in\mathbb{Z}_+\}$ is a collection of i.i.d. non-negative, integer-valued random vectors, where the $i$th component $\epsilon_{i,t}$ has distribution function $F_{a_i}$ and mean $a_i$. Suppose that the starting values $\{X_{-p},\dots,X_{-1}\}$ have a distribution $\nu$ on $\mathbb{Z}_+^{d\times p}$, and that $X_{-p},\dots,X_{-1}$, the $\xi^{(1)}_{i,j,r},\dots,\xi^{(p)}_{i,j,r}$, and the $\epsilon_t$ are independent of each other.
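To make the matrix thinning in (2) concrete, the following Python sketch draws one realization of $A\circ x$ under Binomial (Bernoulli counting series) thinning. This is our own illustration, not code from the paper; the function name `binomial_thinning` and the use of NumPy are assumptions.

```python
import numpy as np

def binomial_thinning(A, x, rng):
    """One draw of the matrix thinning A ∘ x from Equation (2): component i
    is sum_j sum_{r=1}^{x[j]} xi_{i,j,r}, with each counting variable
    xi_{i,j,r} distributed Bernoulli(A[i, j])."""
    d = len(x)
    out = np.zeros(d, dtype=int)
    for i in range(d):
        for j in range(d):
            # sum of x[j] i.i.d. Bernoulli(A[i, j]) counting variables
            out[i] += rng.binomial(1, A[i, j], size=x[j]).sum()
    return out

rng = np.random.default_rng(0)
A = np.array([[0.24, 0.18], [0.12, 0.06]])
x = np.array([5, 3])
y = binomial_thinning(A, x, rng)
```

Note that $A\circ 0=0$ falls out automatically: an empty Bernoulli sum is zero.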

Throughout the paper, the number of lags ($p\in\mathbb{N}$) and the dimension ($d\in\mathbb{N}$) are fixed and known. Let $\mathcal{F}$ and $\mathcal{G}$ denote the classes of distributions of $\epsilon$ and $\xi$, which belong to parametric classes, say $\mathcal{F}\equiv\{F_a;\,a\in\Theta_a\}$ and $\mathcal{G}\equiv\{G_A;\,A\in\Theta_A\}$. For instance, when we consider the Binomial thinning operator and the Binomial innovation, $F$ and $G$ are defined by $F_a(x)=a^x(1-a)^{1-x}$ and $G_A(x)=A^x(1-A)^{1-x}$ for $a\in\Theta_a=(0,1)$, $A\in\Theta_A=(0,1)$, and $x\in\{0,1\}$. Our goal is to estimate the parameter $\theta=(a^T,A^T)^T\in\Theta\subset\mathbb{R}^q$ with $q=d+d^2p$ efficiently, where

$\Theta=\big\{(a^T,A^T)^T;\ a=(a_1,\dots,a_d)^T\in\Theta_a^d,\ A=\mathrm{vec}(A^{(1)},\dots,A^{(p)})=(A^{(1)}_{1,1},\dots,A^{(p)}_{d,d})^T\in\Theta_A^{d^2p}\big\}.$

For (probability) measures $F_1,\dots,F_k\in\mathcal{F}\cup\mathcal{G}$, we introduce the following notation for convolutions:

  • $F_1*F_2$: the convolution of $F_1$ and $F_2$

  • $*_{k=1,\dots,p}F_k$: the convolution of $F_1,\dots,F_p$ (i.e., $*_{k=1,\dots,p}F_k=F_1*\cdots*F_p$)

  • $F_1^{*n}$: the $n$-fold convolution of $F_1$ (i.e., $F_1^{*n}=\underbrace{F_1*\cdots*F_1}_{n\ \text{times}}$)
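The convolution notation above is easy to realize numerically when each distribution is represented as a pmf array on $\{0,1,2,\dots\}$. The sketch below is our own illustration (the helper names are assumptions), using NumPy's `convolve` for $F_1*F_2$ and iterating it for $F_1^{*n}$:

```python
import numpy as np

def convolve_pmf(p, q):
    """F1 * F2: convolution of two pmfs supported on {0, 1, 2, ...}."""
    return np.convolve(p, q)

def nfold_convolution(p, n):
    """F^{*n}: n-fold convolution; the 0-fold convolution is the point
    mass at 0."""
    out = np.array([1.0])
    for _ in range(n):
        out = convolve_pmf(out, p)
    return out

# Bernoulli(0.3) pmf; its 2-fold convolution is the Binomial(2, 0.3) pmf
p = np.array([0.7, 0.3])
p2 = nfold_convolution(p, 2)
```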

Based on the above notation, we consider the corresponding probability space for $\{X_t\}$, denoted by $(\Omega,\mathcal{F},P_{\nu,\theta,F,G})$, where $\Omega$ is the sample space, $\mathcal{F}$ is the $\sigma$-algebra, and $P_{\nu,\theta,F,G}$ is the probability measure determined by $\nu$ (the distribution of $\{X_{-p},\dots,X_{-1}\}$), $\theta$ (the parameter), $F$ (the distribution class of $\epsilon$), and $G$ (the distribution class of $\xi$). Furthermore, we introduce $\mathbb{F}=\{\mathcal{F}_t;\,t\geq-p\}$ as the natural filtration generated by $\{X_{-p},\dots,X_t\}$ (i.e., $\mathcal{F}_t=\sigma(X_{-p},\dots,X_t)\subset\sigma(X_{-p},\dots,X_{-1},\epsilon_0,\dots,\epsilon_t)$).

From (1) and (2), we can write

$X_t=\sum_{k=1}^{p}\sum_{j=1}^{d}\sum_{r=1}^{X_{j,t-k}}\xi^{(k)}_{j,r}+\epsilon_t$

where $\xi^{(k)}_{j,r}=(\xi^{(k)}_{1,j,r},\dots,\xi^{(k)}_{d,j,r})^T$. Hence, it follows that, for any $t\in\mathbb{Z}_+$,

$P_{\nu,\theta,F,G}\{X_t=x_t\,|\,\mathcal{F}_{t-1}\}=P_{\nu,\theta,F,G}\{X_t=x_t\,|\,X_{t-1},\dots,X_{t-p}\}=:P^{\theta}_{(X_{t-1},\dots,X_{t-p}),x_t}$ (say).

For $x_{t-k}=(x_{1,t-k},\dots,x_{d,t-k})^T\in\mathbb{Z}_+^d$, $k=0,1,\dots,p$, the transition probability $P^{\theta}_{(x_{t-1},\dots,x_{t-p}),x_t}$ is given by

$P^{\theta}_{(x_{t-1},\dots,x_{t-p}),x_t}=P_{\nu,\theta,F,G}\{X_t=x_t\,|\,X_{t-1}=x_{t-1},\dots,X_{t-p}=x_{t-p}\}$
$=P_{\nu,\theta,F,G}\Big\{\sum_{k=1}^{p}\sum_{j=1}^{d}\sum_{r=1}^{x_{j,t-k}}\xi^{(k)}_{j,r}+\epsilon_t=x_t\,\Big|\,X_{t-1}=x_{t-1},\dots,X_{t-p}=x_{t-p}\Big\}$
$=\prod_{i=1}^{d}P_{\nu,\theta,F,G}\Big\{\sum_{k=1}^{p}\sum_{j=1}^{d}\sum_{r=1}^{x_{j,t-k}}\xi^{(k)}_{i,j,r}+\epsilon_{i,t}=x_{i,t}\,\Big|\,X_{t-1}=x_{t-1},\dots,X_{t-p}=x_{t-p}\Big\}$
$=\prod_{i=1}^{d}\Big\{F_{a_i}*\Big(\underset{k=1,\dots,p;\,j=1,\dots,d}{*}G_{A^{(k)}_{i,j}}^{*x_{j,t-k}}\Big)\Big\}(x_{i,t}).$
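For small counts, the product-of-convolutions form of the transition probability can be evaluated directly. The sketch below is our own illustration (names such as `transition_prob` are assumptions), specialized to a bivariate MGINAR(1) with Bernoulli thinning and Bernoulli innovation as in Case 1 of Section 4:

```python
import numpy as np

def _bern(p):
    # pmf of a Bernoulli(p) variable on {0, 1}
    return np.array([1.0 - p, p])

def _nfold(pmf, n):
    # n-fold convolution; 0-fold is the point mass at 0
    out = np.array([1.0])
    for _ in range(n):
        out = np.convolve(out, pmf)
    return out

def transition_prob(x_prev, x_curr, a, A):
    """P^theta_{x_prev, x_curr} for a bivariate MGINAR(1) with Binomial
    thinning and Binomial innovation: for each component i, convolve the
    innovation pmf F_{a_i} with the x_prev[j]-fold convolutions of the
    thinning pmfs G_{A[i, j]}, evaluate at x_curr[i], and multiply over i."""
    prob = 1.0
    for i in range(len(x_curr)):
        pmf = _bern(a[i])                       # F_{a_i}
        for j in range(len(x_prev)):
            pmf = np.convolve(pmf, _nfold(_bern(A[i, j]), int(x_prev[j])))
        prob *= pmf[x_curr[i]] if x_curr[i] < len(pmf) else 0.0
    return prob

a = np.array([0.09, 0.21])
A = np.array([[0.24, 0.18], [0.12, 0.06]])
p00 = transition_prob([1, 1], [0, 0], a, A)
```

Summing `transition_prob` over all reachable `x_curr` recovers total probability one, which is a useful sanity check on the convolution bookkeeping.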

We consider parametric MGINAR models in which the parameter space is restricted to the stationary parameter space; for instance, in the case of the Binomial thinning operator with $d=1$, the thinning part of the parameter space is $\{A;\,A=(A^{(1)},\dots,A^{(p)})^T\in(0,1)^p,\ \sum_{k=1}^{p}A^{(k)}<1\}$. Suppose that $\mathcal{F}\times\mathcal{G}$ is a combination of parametric families of distributions for the thinning operator and the innovation (immigration), given by:

$\mathcal{F}=\{F_a=(F_{a_1},\dots,F_{a_d});\ a=(a_1,\dots,a_d)^T\in\mathbb{R}^d\}$
$\mathcal{G}=\{G_A=(G_{A^{(1)}_{1,1}},\dots,G_{A^{(p)}_{d,d}});\ A=\mathrm{vec}(A^{(1)},\dots,A^{(p)})=(A^{(1)}_{1,1},\dots,A^{(p)}_{d,d})^T\in\mathbb{R}^{d^2p}\}$

Suppose that for any $\theta=(a^T,A^T)^T\in\Theta$, $\{X_t\}$ with $F_a\times G_A\in\mathcal{F}\times\mathcal{G}$ is a strictly stationary process. Let $P^{(n)}_{\nu,\theta,F,G}$ denote the law of $\{X_{-p},\dots,X_n\}$ on the measurable space $\big(\mathbb{Z}_+^{d(n+1+p)},\mathcal{N}_{\mathbb{Z}_+^{d(n+1+p)}}\big)$ under $P_{\nu,\theta,F,G}$. Here $\mathbb{Z}_+^{d(n+1+p)}$ is the sample space and the power set $\mathcal{N}_{\mathbb{Z}_+^{d(n+1+p)}}$ is the $\sigma$-algebra on this sample space. Observing $(X_{-p},\dots,X_n)$ yields the following sequence of experiments:

$\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G})=\big(\mathbb{Z}_+^{d(n+1+p)},\mathcal{N}_{\mathbb{Z}_+^{d(n+1+p)}},\{P^{(n)}_{\nu,\theta,F,G};\,\theta\in\Theta\}\big),\qquad n\in\mathbb{Z}_+,$

where the initial distribution ν is fixed and the distributions for the thinning operator and innovation (immigrant) are parametrized (i.e., once θ=(aT,AT)TΘ is fixed, the distributions are given by Fa and GA).

To prove the LAN property for the sequence of experiments $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$, we impose the following assumptions.

Assumption 1. (A1) $\Theta$ is an open, convex subset of $\mathbb{R}^q$ with $q=d+d^2p$.

(A2) The supports of $F_a$ and $G_A$ do not depend on $a$ and $A$, and for all $\theta=(a^T,A^T)^T$, $F_a(0)\times G_A(0)\in(0,1)^d\times(0,1)^{d^2p}$.

(A3) For all $e\in\mathbb{Z}_+$ and all $\theta=(a^T,A^T)^T=((a_i),(A^{(k)}_{i,j}))^T\in\Theta$,

$f_{a_i}(e)=\frac{\partial F_{a_i}}{\partial a_i}(e),\quad \dot f_{a_i}(e)=\frac{\partial^2 F_{a_i}}{\partial a_i^2}(e),\quad g_{A^{(k)}_{i,j}}(e)=\frac{\partial G_{A^{(k)}_{i,j}}}{\partial A^{(k)}_{i,j}}(e),\quad \dot g_{A^{(k)}_{i,j}}(e)=\frac{\partial^2 G_{A^{(k)}_{i,j}}}{\partial (A^{(k)}_{i,j})^2}(e)$

are defined and continuous in $\Theta$, and satisfy $\frac{\partial F_{a_{i'}}}{\partial a_i}=0$ and $\frac{\partial G_{A^{(k')}_{i',j'}}}{\partial A^{(k)}_{i,j}}=0$ for $i'\neq i$ and $(i',j',k')\neq(i,j,k)$, respectively.

(A4) Let $\sup^*(\cdot)$ denote

$\sup{}^*(Z):=\sup_{\tilde\theta=((\tilde a_i),(\tilde A^{(k)}_{i,j}))^T;\,\|\tilde\theta-\theta\|<\delta} E_{\nu,\tilde\theta,F,G}\big[Z\,\big|\,X_{-p},\dots,X_0\big].$

For every $\theta\in\Theta$, there exist a $\delta>0$ and a random variable $M_\theta$ such that, for all $i,j=1,\dots,d$ and $k=1,\dots,p$: $\sup^*(|f_{\tilde a_i}(\epsilon_{i,0})|^2)\leq M_\theta$, $\sup^*(|\dot f_{\tilde a_i}(\epsilon_{i,0})|)\leq M_\theta$, $\sup^*(|g_{\tilde A^{(k)}_{i,j}}(\xi^{(k)}_{i,j,1})|^2)\leq M_\theta$, $\sup^*(|\dot g_{\tilde A^{(k)}_{i,j}}(\xi^{(k)}_{i,j,1})|)\leq M_\theta$, and $E_{\nu,\theta,F,G}[M_\theta]<\infty$.

(A5) Let $\dot\ell_n$ and $\ddot\ell_n$ be the first and second derivatives of the log-likelihood for $\{X_{-p},\dots,X_n\}$. The information equality $E_{\nu,\theta,F,G}[\dot\ell_n\dot\ell_n^T]=-E_{\nu,\theta,F,G}[\ddot\ell_n]$ holds, and $E_{\nu,\theta,F,G}[\dot\ell_n\dot\ell_n^T]$ is nonsingular and continuous in $\Theta$.

(A6) For all $\theta\in\Theta$, $\|E_{\nu,\theta,F,G}[\epsilon_0\epsilon_0^T]\|<\infty$ and $\|E_{\nu,\theta,F,G}[X_0X_0^T]\|<\infty$.

(A7) $F_a\times G_A=F_{a'}\times G_{A'}$ implies $(a,A)=(a',A')$.

To prove that $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$ has the LAN property, we need to determine the behavior of a localized log-likelihood ratio. To this end, we first write down the likelihood:

$L_n(\theta\,|\,X_{-p},\dots,X_n)=\nu(X_{-p},\dots,X_{-1})\prod_{t=0}^{n}P^{\theta}_{(X_{t-1},\dots,X_{t-p}),X_t}.$

In addition, we introduce the log-likelihood:

$\ell_n(\theta\,|\,X_{-p},\dots,X_n)=\log L_n(\theta\,|\,X_{-p},\dots,X_n)=\log\nu(X_{-p},\dots,X_{-1})+\sum_{t=0}^{n}\log P^{\theta}_{(X_{t-1},\dots,X_{t-p}),X_t}=:\log\nu(X_{-p},\dots,X_{-1})+\sum_{t=0}^{n}\ell(X_{t-p},\dots,X_t;\theta).$

Following Drost et al. (Citation2008), we establish the LAN property via a Taylor expansion, for which the transition score with respect to $\theta$ is needed. The transition score $\dot\ell$ can be derived by computing the partial derivatives of $\ell=\log P$ as follows. For the partial derivative with respect to $a_i$ ($i=1,\dots,d$),

$\dot\ell_{a_i}(x_{t-p},\dots,x_t;\theta)=\frac{\partial}{\partial a_i}\ell(x_{t-p},\dots,x_t;\theta)=\frac{\frac{\partial}{\partial a_i}P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}}{P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}}\,1_{(0,1]}\big(P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}\big)$
$=\frac{\frac{\partial}{\partial a_i}\prod_{i'=1}^{d}\big\{F_{a_{i'}}*\big(*_{k=1,\dots,p;\,j=1,\dots,d}G_{A^{(k)}_{i',j}}^{*x_{j,t-k}}\big)\big\}(x_{i',t})}{P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}}\,1_{(0,1]}\big(P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}\big)$
$=\frac{\prod_{i'=1}^{d}\big\{\tilde F_{i,i'}*\big(*_{k=1,\dots,p;\,j=1,\dots,d}G_{A^{(k)}_{i',j}}^{*x_{j,t-k}}\big)\big\}(x_{i',t})}{P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}}\,1_{(0,1]}\big(P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}\big)$
$=E_{\nu,\theta,F,G}\Big[\frac{\partial}{\partial a_i}\log F_{a_i}(\epsilon_{i,t})\,\Big|\,X_{t-p}=x_{t-p},\dots,X_t=x_t\Big]$ (3)

where $\tilde F_{i,i'}=F_{a_{i'}}$ if $i'\neq i$ and $\tilde F_{i,i'}=f_{a_i}$ if $i'=i$. For the partial derivatives with respect to $A^{(k)}_{i,j}$ ($i,j=1,\dots,d$, $k=1,\dots,p$),

$\dot\ell_{A^{(k)}_{i,j}}(x_{t-p},\dots,x_t;\theta)=\frac{\frac{\partial}{\partial A^{(k)}_{i,j}}\prod_{i'=1}^{d}\big\{F_{a_{i'}}*\big(*_{k'=1,\dots,p;\,j'=1,\dots,d}G_{A^{(k')}_{i',j'}}^{*x_{j',t-k'}}\big)\big\}(x_{i',t})}{P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}}\,1_{(0,1]}\big(P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}\big)$
$=\frac{\prod_{i'=1}^{d}\big\{F_{a_{i'}}*\big(*_{k'=1,\dots,p;\,j'=1,\dots,d}\tilde G_{i,j,k,i',j',k'}\big)\big\}(x_{i',t})}{P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}}\,1_{(0,1]}\big(P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}\big)$
$=E_{\nu,\theta,F,G}\Big[\frac{\partial}{\partial A^{(k)}_{i,j}}\log G_{A^{(k)}_{i,j}}(\xi^{(k)}_{i,j,r})\,\Big|\,X_{t-p}=x_{t-p},\dots,X_t=x_t\Big]$ (4)

where $\tilde G_{i,j,k,i',j',k'}=G_{A^{(k')}_{i',j'}}^{*x_{j',t-k'}}$ if $(i',j',k')\neq(i,j,k)$ and $\tilde G_{i,j,k,i',j',k'}=x_{j,t-k}\,g_{A^{(k)}_{i,j}}*G_{A^{(k)}_{i,j}}^{*(x_{j,t-k}-1)}$ if $(i',j',k')=(i,j,k)$.

Then, by using Equations (3) and (4), we can derive a Taylor expansion of the localized log-likelihood ratio, and the appropriate limit theorems show that $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$ has the LAN property, as follows:

Theorem 1. Suppose that every $F_a\times G_A\in\mathcal{F}\times\mathcal{G}$, $\theta=(a^T,A^T)^T\in\Theta$, satisfies Assumptions (A1)-(A7), and let $\nu$ be a probability measure on $\mathbb{Z}_+^{dp}$ with finite support. Then, the sequence of experiments $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$ has the LAN property at every $\theta\in\Theta$, i.e., for every $u\in\mathbb{R}^q$ the following expansion holds:

$\log\frac{dP^{(n)}_{\nu,\theta+u/\sqrt{n},F,G}}{dP^{(n)}_{\nu,\theta,F,G}}=\log\frac{L_n(\theta+u/\sqrt{n}\,|\,X_{-p},\dots,X_n)}{L_n(\theta\,|\,X_{-p},\dots,X_n)}=u^TS_n-\frac{1}{2}u^TJu+R_n$

where the score (also called the central sequence)

$S_n\equiv S_n(\theta)=\frac{1}{\sqrt{n}}\sum_{t=0}^{n}\begin{pmatrix}\big(\dot\ell_{a_i}(X_{t-p},\dots,X_t;\theta)\big)_{i=1,\dots,d}\\ \big(\dot\ell_{A^{(k)}_{i,j}}(X_{t-p},\dots,X_t;\theta)\big)_{i,j=1,\dots,d,\ k=1,\dots,p}\end{pmatrix}$ (5)

satisfies

$S_n\xrightarrow{d}N(0,J)$ under $P_{\nu,\theta,F,G}$. (6)

The Fisher information matrix, defined by

$J\equiv J(\theta)=\begin{pmatrix}\big(J_{a_i,a_{i'}}\big)_{i,i'}&\big(J_{a_i,A^{(k')}_{i',j'}}\big)_{i,i',j',k'}\\ \big(J_{A^{(k)}_{i,j},a_{i'}}\big)_{i,j,k,i'}&\big(J_{A^{(k)}_{i,j},A^{(k')}_{i',j'}}\big)_{i,j,k,i',j',k'}\end{pmatrix}=\begin{pmatrix}\big(E_{\nu,\theta,F,G}[\dot\ell_{a_i}\dot\ell_{a_{i'}}]\big)_{i,i'}&\big(E_{\nu,\theta,F,G}[\dot\ell_{a_i}\dot\ell_{A^{(k')}_{i',j'}}]\big)_{i,i',j',k'}\\ \big(E_{\nu,\theta,F,G}[\dot\ell_{A^{(k)}_{i,j}}\dot\ell_{a_{i'}}]\big)_{i,j,k,i'}&\big(E_{\nu,\theta,F,G}[\dot\ell_{A^{(k)}_{i,j}}\dot\ell_{A^{(k')}_{i',j'}}]\big)_{i,j,k,i',j',k'}\end{pmatrix}$

with $\dot\ell\equiv\dot\ell(X_{-p},\dots,X_0;\theta)$, is nonsingular, and $R_n\equiv R_n(u,\theta)\xrightarrow{p}0$ under $P_{\nu,\theta,F,G}$.

3. Efficient estimation

This section provides efficient estimators based on the one-step update method. First, we use the multivariate conditional least squares estimator as an initial estimator of θ (e.g., Bu, McCabe, & Hadri, Citation2008).

Definition 1. Suppose that $\{X_{-p},\dots,X_0,X_1,\dots,X_n\}$ is observed from the MGINAR(p) process defined by (1). Then, the conditional least squares (CLS) estimator $\hat\theta_{CLS}$ of $\theta$ is defined by

$\hat\theta_{CLS}=\arg\min_{\theta\in\Theta}Q_n(\theta)$

where

$Q_n(\theta)=\sum_{t=0}^{n}\big(X_t-g_t(\theta)\big)^T\big(X_t-g_t(\theta)\big)\qquad\text{and}\qquad g_t(\theta)=E[X_t\,|\,\mathcal{F}_{t-1}]=a+\sum_{k=1}^{p}A^{(k)}X_{t-k}.$

Note that, by setting the derivative of $Q_n$ with respect to the entries of $\theta$ to zero, we have

$\hat\theta_{CLS}=(Z^TZ)^{-1}Z^TY,$

where $Y\in\mathbb{R}^{d(n+1)}$ and $Z\in\mathbb{R}^{d(n+1)\times q}$ are

$Y=(X_0^T,\dots,X_n^T)^T\qquad\text{and}\qquad Z=(Z_0,\dots,Z_n)^T\otimes I_d$

with $Z_t=(1,X_{t-1}^T,\dots,X_{t-p}^T)^T\in\mathbb{R}^{dp+1}$.
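The closed-form CLS solution amounts to an ordinary least-squares regression of $X_t$ on $(1,X_{t-1}^T,\dots,X_{t-p}^T)$. The following Python code is a minimal sketch of this (the helper name `cls_estimate` and the univariate INAR(1) sanity check are our own illustration, not from the paper):

```python
import numpy as np

def cls_estimate(X, p=1):
    """Closed-form CLS: regress X_t on Z_t = (1, X_{t-1}^T, ..., X_{t-p}^T),
    mirroring theta_hat_CLS = (Z^T Z)^{-1} Z^T Y."""
    d = X.shape[1]
    Y = X[p:].astype(float)                       # responses X_t, t = p..n
    cols = [np.ones(len(Y))]
    for k in range(1, p + 1):
        cols.append(X[p - k:len(X) - k])          # lagged block X_{t-k}
    Z = np.column_stack(cols)
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)     # (dp+1) x d coefficients
    a_hat = B[0]                                  # estimated innovation means
    A_hats = [B[1 + (k - 1) * d: 1 + k * d].T for k in range(1, p + 1)]
    return a_hat, A_hats

# sanity check on a simulated univariate INAR(1): alpha = 0.5, Poisson(1) innovation
rng = np.random.default_rng(1)
n, alpha, lam = 5000, 0.5, 1.0
x = np.zeros(n + 1, dtype=int)
for t in range(1, n + 1):
    x[t] = rng.binomial(x[t - 1], alpha) + rng.poisson(lam)
a_hat, (A1,) = cls_estimate(x.reshape(-1, 1), p=1)
```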

Then, Du and Li (Citation1991) showed the following.

Proposition 1.

$\sqrt{n}\big(\hat\theta_{CLS}-\theta\big)\xrightarrow{d}N\big(0,\Gamma^{-1}\Sigma\Gamma^{-1}\big)$ under $P_{\nu,\theta,F,G}$,

where

$\Gamma=E_{\nu,\theta,F,G}\Big[\frac{\partial g_t(\theta)^T}{\partial\theta}\frac{\partial g_t(\theta)}{\partial\theta^T}\Big],\qquad \Sigma=E_{\nu,\theta,F,G}\Big[\frac{\partial g_t(\theta)^T}{\partial\theta}\big(X_t-g_t(\theta)\big)\big(X_t-g_t(\theta)\big)^T\frac{\partial g_t(\theta)}{\partial\theta^T}\Big].$

Moreover, we have the following.

Proposition 2. The CLS estimator $\hat\theta_{CLS}$ is not asymptotically efficient, since for some $w\in\mathbb{R}^q$,

$w^T\big(\Gamma^{-1}\Sigma\Gamma^{-1}-J^{-1}\big)w>0$ under $P_{\nu,\theta,F,G}$,

except in some special cases.

Since we have a $\sqrt{n}$-consistent but inefficient estimator of $\theta$, we update the CLS estimator to an efficient estimator by using the LAN result.

Theorem 2. Let $\nu$ be a probability measure on $\mathbb{Z}_+^{dp}$ with finite support, and let $\hat\theta_{CLS}$ be a CLS estimator. Define

$\hat\theta:=\hat\theta_{CLS}+\frac{1}{\sqrt{n}}\hat J_n(\hat\theta_{CLS})^{-1}S_n(\hat\theta_{CLS})$

where

$\hat J_n(\theta)=\frac{1}{n}\sum_{t=0}^{n}\begin{pmatrix}\big(\dot\ell_{a_i}(\underline{X}_t;\theta)\,\dot\ell_{a_{i'}}(\underline{X}_t;\theta)\big)&\big(\dot\ell_{a_i}(\underline{X}_t;\theta)\,\dot\ell_{A^{(k')}_{i',j'}}(\underline{X}_t;\theta)\big)\\ \big(\dot\ell_{A^{(k)}_{i,j}}(\underline{X}_t;\theta)\,\dot\ell_{a_{i'}}(\underline{X}_t;\theta)\big)&\big(\dot\ell_{A^{(k)}_{i,j}}(\underline{X}_t;\theta)\,\dot\ell_{A^{(k')}_{i',j'}}(\underline{X}_t;\theta)\big)\end{pmatrix}$

with $\dot\ell(\underline{X}_t;\theta)=\dot\ell(X_{t-p},\dots,X_t;\theta)$. Then, under Assumptions (A1)-(A7), $\hat\theta$ is an asymptotically efficient estimator of $\theta$ in the sequence of experiments $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$. Moreover, $\hat J_n^{-1}$ yields a consistent estimator of $J^{-1}$, i.e., $\hat J_n(\hat\theta_{CLS})^{-1}\xrightarrow{p}J^{-1}$ under $P_{\nu,\theta,F,G}$.
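The mechanics of the one-step update can be sketched generically, assuming a user-supplied per-transition score function. The names `one_step_update` and `score_fn` are our own, and the toy example replaces the MGINAR transition score with an i.i.d. Poisson score purely to show that a single Newton-type step from a rough starting value lands near the MLE:

```python
import numpy as np

def one_step_update(theta0, score_fn, transitions):
    """One-step update theta = theta0 + n^{-1/2} J_hat^{-1} S_n(theta0),
    with J_hat the averaged outer product of the per-transition scores
    (an empirical Fisher information) and S_n the central sequence."""
    scores = np.array([score_fn(prev, curr, theta0)
                       for prev, curr in transitions])
    n = len(scores)
    J_hat = scores.T @ scores / n                 # empirical information
    S_n = scores.sum(axis=0) / np.sqrt(n)         # central sequence
    return theta0 + np.linalg.solve(J_hat, S_n) / np.sqrt(n)

# toy illustration: i.i.d. Poisson(2.0) observations, score x/theta - 1;
# starting from a rough theta0 = 1.8, one step moves close to the sample mean
rng = np.random.default_rng(2)
xs = rng.poisson(2.0, size=5000)
transitions = [(None, x) for x in xs]
score_fn = lambda prev, curr, th: np.array([curr / th[0] - 1.0])
theta1 = one_step_update(np.array([1.8]), score_fn, transitions)
```

For the MGINAR(p) model, `score_fn` would evaluate the transition scores (3) and (4) at the CLS estimate; the update itself is a single linear solve, which is the computational advantage over full ML.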

4. Numerical study

In this section, we first examine asymptotic relative efficiency (ARE) of our proposed estimator and the CLS estimator through some simulation experiments. Then, we present a real data analysis to illustrate the application of the proposed estimation method.

4.1. Simulation study

Our proposed estimator ($\hat\theta$) and the conditional least squares estimator ($\hat\theta_{CLS}$) under the MGINAR model are compared through a series of simulation experiments. Specifically, we assess the small-sample properties of the two estimators in the following cases: a Binomial thinning operator with a Binomial innovation (Case 1), and a Poisson thinning operator with a Poisson innovation (Case 2), with $d=2$ and $p=1$. The count series $\{X_t\}$ is defined by

$X_t=\begin{pmatrix}X_{1,t}\\X_{2,t}\end{pmatrix}=\begin{pmatrix}\sum_{r=1}^{X_{1,t-1}}\xi_{1,1,r}+\sum_{r=1}^{X_{2,t-1}}\xi_{1,2,r}+\epsilon_{1,t}\\ \sum_{r=1}^{X_{1,t-1}}\xi_{2,1,r}+\sum_{r=1}^{X_{2,t-1}}\xi_{2,2,r}+\epsilon_{2,t}\end{pmatrix}=A\circ X_{t-1}+\epsilon_t$

where

$a=\begin{pmatrix}a_1\\a_2\end{pmatrix}=\begin{pmatrix}0.09\\0.21\end{pmatrix},\qquad A=\begin{pmatrix}A_{1,1}&A_{1,2}\\A_{2,1}&A_{2,2}\end{pmatrix}=\begin{pmatrix}0.24&0.18\\0.12&0.06\end{pmatrix},$

and $\xi_{i,j,r}\overset{\text{i.i.d.}}{\sim}G_{A_{i,j}}$, $\epsilon_{i,t}\overset{\text{i.i.d.}}{\sim}F_{a_i}$. We take the initial distribution $\nu(x_0)=1\{x_0=(1,1)^T\}$ (i.e., the initial value is fixed at $x_0=(1,1)^T$). The $G_{A_{i,j}}$ and $F_{a_i}$ are defined as follows.

Case 1: Binomial thinning operator and Binomial innovation

$G_{A_{i,j}}(x):=A_{i,j}^{x}(1-A_{i,j})^{1-x},\qquad F_{a_i}(x):=a_i^{x}(1-a_i)^{1-x}\qquad\text{for }x\in\{0,1\},$

Case 2: Poisson thinning operator and Poisson innovation

$G_{A_{i,j}}(x):=A_{i,j}^{x}\exp(-A_{i,j})/x!,\qquad F_{a_i}(x):=a_i^{x}\exp(-a_i)/x!\qquad\text{for }x\in\mathbb{Z}_+.$

Then, the true parameter vector $\theta$ is

$\theta=(a^T,\mathrm{vec}(A)^T)^T=(a_1,a_2,A_{1,1},A_{2,1},A_{1,2},A_{2,2})^T=(0.09,\,0.21,\,0.24,\,0.12,\,0.18,\,0.06)^T.$

Note that the parameter vector is chosen to obtain stationary count series.
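A minimal simulation of Case 1 with the parameter values above might look as follows (our own sketch; the function name `simulate_case1` is illustrative, and the stationary-mean check $(I-A)^{-1}a$ follows from taking expectations in the model equation):

```python
import numpy as np

def simulate_case1(n, a, A, x0, rng):
    """Simulate Case 1 (Binomial thinning, Binomial innovation): each
    counting variable is Bernoulli(A[i, j]) and eps_{i,t} is Bernoulli(a[i])."""
    d = len(a)
    X = np.zeros((n + 1, d), dtype=int)
    X[0] = x0
    for t in range(1, n + 1):
        for i in range(d):
            # thinning part: sum over j of X[t-1, j] Bernoulli(A[i, j]) draws
            thin = sum(int(rng.binomial(1, A[i, j], size=X[t - 1, j]).sum())
                       for j in range(d))
            X[t, i] = thin + rng.binomial(1, a[i])
    return X

rng = np.random.default_rng(3)
a = np.array([0.09, 0.21])
A = np.array([[0.24, 0.18], [0.12, 0.06]])
X = simulate_case1(2000, a, A, np.array([1, 1]), rng)
# stationary mean of the process is (I - A)^{-1} a
mu = np.linalg.solve(np.eye(2) - A, a)
```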

We ran 1000 Monte Carlo replications with sample sizes $n=10,100,1000,10000$. For each replication, we estimate the model parameters by the two procedures (CLS: $\hat\theta_{CLS}$; efficient estimation: $\hat\theta$) and calculate the (approximated) bias (Table 1) and the diagonal part of the (approximated) MSE (Table 2) of the parameter estimators. Finally, we calculate the (approximated) ARE (Table 3), defined by $\{|\text{MSE of }\hat\theta_{CLS}|/|\text{MSE of }\hat\theta|\}^{1/q}$. Simulations are carried out in R. For the calculation of $\hat\theta$, we need an explicit form of the score $\dot\ell$ under the given distributions $F$ and $G$; the derivation of the score $\dot\ell$ for each case is given in the Appendix.

Table 1. Bias results for the MGINAR(1) model

Table 2. Diagonal part of MSE Results for the MGINAR(1) model

Table 3. ARE (asymptotic relative efficiency) Results for the MGINAR(1) model

The bias results are reported in Table 1. It can be seen that the biases of both estimators tend to 0 as the sample size grows, which implies that both estimators are asymptotically unbiased. However, for the CLS estimator of $A_{1,2}$ and $A_{2,1}$, the biases are relatively large, which suggests that the convergence rate is relatively slow. In contrast, our proposed estimator improves on the CLS estimator in terms of bias.

The corresponding MSE results are displayed in Table 2. Similar to the bias results, the MSEs of both estimators tend to 0 as the sample size grows, which implies that both estimators converge to the true values in probability. However, for the CLS estimator of $A_{1,2}$ and $A_{2,1}$, the MSEs are relatively large, which suggests that the convergence rate of the variance is relatively slow. In contrast, our proposed estimator improves on the CLS estimator, as the MSEs of all its components converge to 0.

Finally, the ARE (asymptotic relative efficiency) results are given in Table 3. The ARE of two estimators is defined through the ratio of their asymptotic covariance determinants (e.g., Cox & Hinkley, Citation1974; Serfling, Citation2011). Let $\hat\theta_1$ and $\hat\theta_2$ be two estimators of $\theta\in\Theta\subset\mathbb{R}^q$, and let $\Sigma_1$ and $\Sigma_2$ be their asymptotic covariance matrices, i.e.,

$\sqrt{n}\big(\hat\theta_i-\theta\big)\xrightarrow{d}N(0,\Sigma_i)$

for $i=1,2$, respectively. Then, the ARE of $\hat\theta_2$ relative to $\hat\theta_1$ is given by

$ARE(\hat\theta_2,\hat\theta_1)=\big(|\Sigma_1|/|\Sigma_2|\big)^{1/q}.$

In our study, we consider the ARE of our proposed estimator (θˆ) and the conditional least squares estimator (θˆCLS) as follows:

$ARE(\hat\theta,\hat\theta_{CLS})=\big(|\Gamma^{-1}\Sigma\Gamma^{-1}|/|J^{-1}|\big)^{1/q}.$
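The determinant-ratio ARE is simple to compute once asymptotic covariance (or sample MSE) matrices are available. A minimal sketch with a toy check (our own illustration; the function name `are` is an assumption):

```python
import numpy as np

def are(Sigma1, Sigma2, q):
    """ARE(theta2_hat, theta1_hat) = (|Sigma1| / |Sigma2|)^{1/q};
    a value above 1 means estimator 2 is the more efficient one."""
    return (np.linalg.det(Sigma1) / np.linalg.det(Sigma2)) ** (1.0 / q)

# toy check: estimator 2 has half the variance in each of q = 2 components,
# so the ARE is ((2*2)/(1*1))^(1/2) = 2
val = are(np.diag([2.0, 2.0]), np.diag([1.0, 1.0]), q=2)
```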

Clearly, in this setup, an ARE larger than 1 indicates that our proposed estimator improves on the CLS estimator in terms of efficiency. Table 3 reports the ARE results, with $\Gamma^{-1}\Sigma\Gamma^{-1}$ and $J^{-1}$ replaced by the sample MSEs. It can be seen that the ARE tends to exceed 1 as the sample size increases, which implies that our estimator improves on the CLS estimator in terms of efficiency. We also tried the same simulation studies under different parameter settings; we omit the details, but the results are similar.

4.2. Real data analysis

The data set consists of weekly counts of cases of infectious diseases by prefecture for the period 2015-2018 (208 weeks), as reported by the National Institute of Infectious Diseases (NIID) in Japan (URL: https://www.niid.go.jp/niid/en/). Here we use the number of cases of "Epidemic keratoconjunctivitis (EK)" and "Aseptic meningitis (AM)" in Shimane prefecture. The sample path plot in Figure 1 reveals some seasonality or periodicity, but there appears to be no trend, so stationarity seems plausible.

Figure 1. The number of cases of epidemic keratoconjunctivitis (left hand side) and aseptic meningitis (right hand side) per week in Shimane prefecture.


Figure 2 shows the sample ACF and PACF plots for each disease. The figure shows time dependence, but no long-range dependence is observed, so it is acceptable to fit the MGINAR(1) model.

Figure 2. The sample ACF and PACF for the epidemic keratoconjunctivitis data (left hand side) and aseptic meningitis data (right hand side).


We suppose that the time series count data $\{X_t=(X_t^{(EK)},X_t^{(AM)})^T\}$ follows

$X_t=A\circ X_{t-1}+\epsilon_t$

where both the thinning operator and the innovation follow the Binomial or the Poisson distribution. The CLS estimator $(\hat a_{CLS},\hat A_{CLS})$ of the parameter $(a,A)$ is obtained as follows:

$(\hat a_{CLS},\hat A_{CLS})=\left(\begin{pmatrix}0.4050\\0.5534\end{pmatrix},\ \begin{pmatrix}0.4845&0.1240\\0.0188&0.1179\end{pmatrix}\right).$

Next, our proposed estimators for the Binomial thinning and the Binomial innovation (aˆE.B,AˆE.B) and for the Poisson thinning and the Poisson innovation (aˆE.P,AˆE.P) are obtained as follows:

$(\hat a_{E.B},\hat A_{E.B})=\left(\begin{pmatrix}0.1813\\0.3404\end{pmatrix},\ \begin{pmatrix}0.5208&0.1222\\0.0338&0.1664\end{pmatrix}\right),$
$(\hat a_{E.P},\hat A_{E.P})=\left(\begin{pmatrix}0.1703\\0.2575\end{pmatrix},\ \begin{pmatrix}0.4747&0.0534\\0.0191&0.1207\end{pmatrix}\right).$

For both the Binomial and the Poisson case, $\hat a_{CLS}$ changes greatly under our proposed estimation, whereas $\hat A_{CLS}$ changes little.

Finally, we evaluate the goodness of fit for each estimator based on AIC. Denote AIC for an estimator (aˆ,Aˆ) under the Binomial distribution by AICBin(aˆ,Aˆ), and under the Poisson distribution by AICPois(aˆ,Aˆ). Then, we obtained

$AIC_{Bin}(\hat a_{CLS},\hat A_{CLS})=794.0651,\qquad AIC_{Bin}(\hat a_{E.B},\hat A_{E.B})=718.821,$
$AIC_{Pois}(\hat a_{CLS},\hat A_{CLS})=43240.3805,\qquad AIC_{Pois}(\hat a_{E.P},\hat A_{E.P})=43081.671.$

From the above results, we can see that our proposed estimators perform well in terms of goodness of fit.

Acknowledgements

We thank the anonymous referees for constructive comments. This work was supported by JSPS KAKENHI, Grant Number JP16K00036.

Additional information

Funding

This work was supported by the JSPS [JP16K00036].

Notes on contributors

Hiroshi Shiraishi

Hiroshi Shiraishi received the B.S. degree in mathematics in 1998 and the M.S. and Dr. degrees in mathematical science from Waseda University, Japan, in 2004 and 2007, respectively. He joined the GE Edison Life Insurance Company, the Prudential Life Insurance Company of Japan, and the Hannover-Re Reinsurance Company in 1998, 2000, and 2005, respectively. His research interests are actuarial science, time series analysis, econometric theory, and financial engineering. In particular, he investigates the statistical analysis of discrete-valued time series, the statistical estimation of optimal dividend problems in actuarial science, and the statistical estimation of Hawkes graphs, among other topics. He is currently an associate professor in the Department of Mathematics, Keio University, Japan. He is a fellow of the Institute of Actuaries of Japan (FIAJ).

References

  • Al-Osh, M. A., & Alzaid, A. A. (1987). First-order integer-valued autoregressive (INAR(1)) process. Journal of Time Series Analysis, 8(3), 261–275. doi:10.1111/j.1467-9892.1987.tb00438.x
  • Alzaid, A. A., & Al-Osh, M. A. (1990). An integer-valued pth-order autoregressive structure (INAR(p)) process. Journal of Applied Probability, 27(2), 314–324. doi:10.2307/3214650
  • Bu, R., McCabe, B., & Hadri, K. (2008). Maximum likelihood estimation of higher-order integer-valued autoregressive processes. Journal of Time Series Analysis, 29(6), 973–994. doi:10.1111/j.1467-9892.2008.00590.x
  • Cox, D. R., & Hinkley, D. (1974). Theoretical statistics. London: Chapman and Hall.
  • Drost, F. C., Van Den Akker, R., & Werker, B. J. M. (2008). Local asymptotic normality and efficient estimation for INAR (p) models. Journal of Time Series Analysis, 29(5), 783–801. doi:10.1111/j.1467-9892.2008.00581.x
  • Du, J. G., & Li, Y. (1991). The integer-valued autoregressive (INAR(p)) model. Journal of Time Series Analysis, 12(2), 129–142. doi:10.1111/j.1467-9892.1991.tb00073.x
  • Franke, J., & Subba Rao, T. (1993). Multivariate first-order integer-valued autoregression (Technical Report No. 95). Universität Kaiserslautern.
  • Karlis, D., & Pedeli, X. (2013). Flexible bivariate INAR(1) processes using copulas. Communications in Statistics-Theory and Methods, 42(4), 723–740. doi:10.1080/03610926.2012.754466
  • Latour, A. (1997). The multivariate GINAR(p) process. Advances in Applied Probability, 29(1), 228–248. doi:10.2307/1427868
  • Latour, A. (1998). Existence and stochastic structure of a non-negative integer-valued autoregressive process. Journal of Time Series Analysis, 19(4), 439–455. doi:10.1111/jtsa.1998.19.issue-4
  • McKenzie, E. (1985). Some simple models for discrete variate time series. Water Resources Bulletin, 21(4), 645–650. doi:10.1111/j.1752-1688.1985.tb05379.x
  • McKenzie, E. (1988). Some ARMA models for dependent sequences of Poisson counts. Advances in Applied Probability, 20(4), 822–835. doi:10.2307/1427362
  • Ristić, M. M., Bakouch, H. S., & Nastić, A. S. (2009). A new geometric first-order integer-valued autoregressive (NGINAR(1)) process. Journal of Statistical Planning and Inference, 139(7), 2218–2226. doi:10.1016/j.jspi.2008.10.007
  • Serfling, R. (2011). Asymptotic relative efficiency in estimation. In M. Lovric (Ed.), International encyclopedia of statistical science, (pp. 68–82). Berlin, Heidelberg: Springer.
  • Weiß, C. H. (2018). An introduction to discrete-valued time series. Chichester: John Wiley & Sons.

Appendix

Proof of Theorem 1. The proof is similar to that of Theorem 1 in Drost et al. (Citation2008).

Expansion of the log-likelihood ratio: Let $u=(u_1,\dots,u_q)^T\in\mathbb{R}^q\setminus\{0\}$ with $q=d+d^2p$. Under Assumption (A1), Taylor's theorem gives

$\log\frac{L_n(\theta+u/\sqrt{n}\,|\,X_{-p},\dots,X_n)}{L_n(\theta\,|\,X_{-p},\dots,X_n)}=u^TS_n(\theta)-\frac{1}{2}u^TJ_n(\tilde\theta_n)u$ (7)

where $\tilde\theta_n$ is a random point on the line segment between $\theta$ and $\theta+u/\sqrt{n}$, and

$J_n(\theta)=-\frac{1}{\sqrt{n}}\frac{\partial}{\partial\theta^T}S_n(\theta).$ (8)

Then, we show

  • Part 0: auxiliary calculations

  • Part 1: $S_n(\theta)\xrightarrow{d}N(0,J)$ under $P_{\nu,\theta,F,G}$

  • Part 2: $J_n(\tilde\theta_n)\xrightarrow{p}J$ under $P_{\nu,\theta,F,G}$

  • Part 3: non-singularity of $J$.

In what follows, for simplicity, we write $(X_s,\dots,X_t)=:X_{s:t}$ and $(x_s,\dots,x_t)=:x_{s:t}$ for $s,t\in\mathbb{Z}$ with $s<t$.

Part 0: We first show the existence of $J$. To do so, we need to show that, for each $\theta_i,\theta_j\in\theta$,

$E_{\nu,\theta,F,G}\big[\big|\dot\ell_{\theta_i}(X_{(-p):0};\theta)\,\dot\ell_{\theta_j}(X_{(-p):0};\theta)\big|\big]<\infty.$ (9)

This follows from Assumptions (A4) and (A6). □

Part 1: From Equations (3) and (4), it follows that

$E_{\nu,\theta,F,G}\big[\dot\ell_{a_i}(X_{(t-p):t};\theta)\,\big|\,X_{(t-p):(t-1)}\big]=E_{\nu,\theta,F,G}\big[\tfrac{\partial}{\partial a_i}\log F_{a_i}(\epsilon_{i,t})\,\big|\,X_{(t-p):(t-1)}\big]=0$
$E_{\nu,\theta,F,G}\big[\dot\ell_{A^{(k)}_{i,j}}(X_{(t-p):t};\theta)\,\big|\,X_{(t-p):(t-1)}\big]=E_{\nu,\theta,F,G}\big[\tfrac{\partial}{\partial A^{(k)}_{i,j}}\log G_{A^{(k)}_{i,j}}(\xi^{(k)}_{i,j,r})\,\big|\,X_{(t-p):(t-1)}\big]=0$

since $\epsilon_{i,t}$ and $\xi^{(k)}_{i,j,r}$ are independent of $\{X_{t-p},\dots,X_{t-1}\}$. Let $w\in\mathbb{R}^q$ and $\dot\ell_\theta=(\dot\ell_{a_1},\dots,\dot\ell_{a_d},\dot\ell_{A^{(1)}_{1,1}},\dots,\dot\ell_{A^{(p)}_{d,d}})^T$. From the above equations, it follows that

$E_{\nu,\theta,F,G}\big[w^T\dot\ell_\theta(X_{(t-p):t};\theta)\,\big|\,X_{(t-p):(t-1)}\big]=0$

and by Part 0, it follows that

$E_{\nu,\theta,F,G}\big[\big(w^T\dot\ell_\theta(X_{(t-p):t};\theta)\big)^2\big]=w^TJw<\infty.$

Hence, by Lemma B.1 of Drost et al. (Citation2008),

$\frac{1}{\sqrt{n}}\sum_{t=0}^{n}w^T\dot\ell_\theta(X_{(t-p):t};\theta)\xrightarrow{d}N(0,w^TJw)$ under $P_{\nu,\theta,F,G}$.

An application of the Cramér-Wold device concludes the proof of Part 1.□

Part 2: Assumption (A3) implies that, for fixed $x_{-p},\dots,x_0\in\mathbb{Z}_+^d$ and each $\theta_i,\theta_j\in\theta$, the mapping

$\theta\mapsto(\partial/\partial\theta_i)\,\dot\ell_{\theta_j}(x_{(-p):0};\theta)$

is continuous. Since we have already proved (9) in Part 0, by Lemma B.3 of Drost et al. (Citation2008), the proof is complete if we show $J_n(\theta)\xrightarrow{p}J$, where

$J_n(\theta)=-\frac{1}{n}\sum_{t=0}^{n}\begin{pmatrix}\frac{\partial}{\partial a_{i'}}\dot\ell_{a_i}(X_{(t-p):t};\theta)&\frac{\partial}{\partial A^{(k')}_{i',j'}}\dot\ell_{a_i}(X_{(t-p):t};\theta)\\ \frac{\partial}{\partial a_{i'}}\dot\ell_{A^{(k)}_{i,j}}(X_{(t-p):t};\theta)&\frac{\partial}{\partial A^{(k')}_{i',j'}}\dot\ell_{A^{(k)}_{i,j}}(X_{(t-p):t};\theta)\end{pmatrix}.$

For $i,i'\in\{1,\dots,d\}$ and $t\in\{0,1,\dots,n\}$, it follows from Equation (3) that

$\frac{\partial}{\partial a_{i'}}\dot\ell_{a_i}(x_{(t-p):t};\theta)=\frac{\prod_{i''=1}^{d}\big\{\tilde{\tilde F}_{i,i',i''}*\big(*_{k=1,\dots,p;\,j=1,\dots,d}G_{A^{(k)}_{i'',j}}^{*x_{j,t-k}}\big)\big\}(x_{i'',t})}{P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}}\,1_{(0,1]}\big(P^{\theta}_{(x_{t-p},\dots,x_{t-1}),x_t}\big)-\dot\ell_{a_i}(x_{(t-p):t};\theta)\,\dot\ell_{a_{i'}}(x_{(t-p):t};\theta)$
$=E_{\nu,\theta,F,G}\big[\partial_{a_i,a_{i'}}\log F_a(\epsilon_t)\,\big|\,X_{(t-p):t}=x_{(t-p):t}\big]-\dot\ell_{a_i}(x_{(t-p):t};\theta)\,\dot\ell_{a_{i'}}(x_{(t-p):t};\theta)$

where

$\tilde{\tilde F}_{i,i',i''}=\begin{cases}\dot f_{a_i}&\text{if }i=i'=i''\\ f_{a_i}&\text{if }i=i''\neq i'\\ f_{a_{i'}}&\text{if }i'=i''\neq i\\ F_{a_{i''}}&\text{otherwise}\end{cases}$

and

$\partial_{a_i,a_{i'}}\log F_a(\epsilon_t)=\begin{cases}\dfrac{\partial^2}{\partial a_i^2}\log F_{a_i}(\epsilon_{i,t})&\text{if }i=i'\\[4pt]\dfrac{\partial}{\partial a_i}\log F_{a_i}(\epsilon_{i,t})\,\dfrac{\partial}{\partial a_{i'}}\log F_{a_{i'}}(\epsilon_{i',t})&\text{if }i\neq i'\end{cases}$

which implies that

$E_{\nu,\theta,F,G}\Big[\frac{\partial}{\partial a_{i'}}\dot\ell_{a_i}(X_{(t-p):t};\theta)\Big]=E_{\nu,\theta,F,G}\big[\partial_{a_i,a_{i'}}\log F_a(\epsilon_t)\big]-J_{a_i,a_{i'}}=-J_{a_i,a_{i'}},$

where the first equality uses the independence between $\epsilon_t$ and $X_{(t-p):(t-1)}$, and the second follows from the information equality (A5). This result and Lemma B.3 of Drost et al. (Citation2008) yield

$-\frac{1}{n}\sum_{t=0}^{n}\frac{\partial}{\partial a_{i'}}\dot\ell_{a_i}(X_{(t-p):t};\theta)\xrightarrow{p}-E_{\nu,\theta,F,G}\Big[\frac{\partial}{\partial a_{i'}}\dot\ell_{a_i}(X_{(-p):0};\theta)\Big]=J_{a_i,a_{i'}}.$

By the same argument, we obtain

$-\frac{1}{n}\sum_{t=0}^{n}\frac{\partial}{\partial A^{(k')}_{i',j'}}\dot\ell_{a_i}(X_{(t-p):t};\theta)\xrightarrow{p}J_{a_i,A^{(k')}_{i',j'}},\qquad -\frac{1}{n}\sum_{t=0}^{n}\frac{\partial}{\partial a_{i'}}\dot\ell_{A^{(k)}_{i,j}}(X_{(t-p):t};\theta)\xrightarrow{p}J_{A^{(k)}_{i,j},a_{i'}},\qquad -\frac{1}{n}\sum_{t=0}^{n}\frac{\partial}{\partial A^{(k')}_{i',j'}}\dot\ell_{A^{(k)}_{i,j}}(X_{(t-p):t};\theta)\xrightarrow{p}J_{A^{(k)}_{i,j},A^{(k')}_{i',j'}}.$

This completes the proof of Part 2.

Part 3: The non-singularity of $J$ follows from Assumption (A5), in the same way as in Drost et al. (Citation2008). □

Proof of Theorem 2. The proof is the same as that of Theorem 3.2 of Drost et al. (Citation2008). □

To calculate the one-step estimator $\hat\theta$, we need the score functions $\dot\ell_{a_i}$ and $\dot\ell_{A^{(k)}_{i,j}}$ for each distribution. In what follows, we derive these functions for Cases 1 and 2 of Section 4.

Derivation of the score $\dot\ell$ for Case 1. For Case 1, the probability functions $G_{A_{i,j}}$ and $F_{a_i}$ are given by

$G_{A_{i,j}}(x)=A_{i,j}^{x}(1-A_{i,j})^{1-x},\qquad F_{a_i}(x)=a_i^{x}(1-a_i)^{1-x}$

for $x\in\{0,1\}$, which implies that the derivatives $g_{A_{i,j}}$ and $f_{a_i}$ are

$g_{A_{i,j}}(x)=\Big(\frac{x}{A_{i,j}}-\frac{1-x}{1-A_{i,j}}\Big)G_{A_{i,j}}(x),\qquad f_{a_i}(x)=\Big(\frac{x}{a_i}-\frac{1-x}{1-a_i}\Big)F_{a_i}(x).$

Let

$\mathrm{Bin}(n,A)(x)=\binom{n}{x}A^{x}(1-A)^{n-x},\qquad \mathrm{Bin}'(n,A)(x)=\Big(\frac{x}{A}-\frac{n-x}{1-A}\Big)\mathrm{Bin}(n,A)(x)$

for $n\in\mathbb{N}$ and $x\in\{0,\dots,n\}$, and let $\mathrm{Bin}(0,A)(0)=1$, $\mathrm{Bin}'(0,A)(0)=0$, and $\mathrm{Bin}(0,A)(x)=\mathrm{Bin}'(0,A)(x)=0$ for $x\neq 0$. Then we obtain, for $i,j=1,2$,

$P^{\theta}_{(x_{t-1}),x_t}=\sum_{(1)}\mathrm{Bin}(1,a_1)(n_1)\,\mathrm{Bin}(x_{1,t-1},A_{1,1})(n_2)\,\mathrm{Bin}(x_{2,t-1},A_{1,2})(n_3)\times\sum_{(2)}\mathrm{Bin}(1,a_2)(n_1)\,\mathrm{Bin}(x_{1,t-1},A_{2,1})(n_2)\,\mathrm{Bin}(x_{2,t-1},A_{2,2})(n_3),$

$\dot\ell_{a_i}(x_{t-1},x_t;\theta)=\Big[\sum_{(1)}\mathrm{Bin}'(1,a_i)(n_1)\,\mathrm{Bin}(x_{1,t-1},A_{i,1})(n_2)\,\mathrm{Bin}(x_{2,t-1},A_{i,2})(n_3)\times\sum_{(2)}\mathrm{Bin}(1,a_{3-i})(n_1)\,\mathrm{Bin}(x_{1,t-1},A_{3-i,1})(n_2)\,\mathrm{Bin}(x_{2,t-1},A_{3-i,2})(n_3)\Big]\Big/P^{\theta}_{(x_{t-1}),x_t},$

$\dot\ell_{A_{i,j}}(x_{t-1},x_t;\theta)=\Big[\sum_{(1)}\mathrm{Bin}(1,a_i)(n_1)\,\mathrm{Bin}'(x_{j,t-1},A_{i,j})(n_{j+1})\,\mathrm{Bin}(x_{3-j,t-1},A_{i,3-j})(n_{4-j})\times\sum_{(2)}\mathrm{Bin}(1,a_{3-i})(n_1)\,\mathrm{Bin}(x_{1,t-1},A_{3-i,1})(n_2)\,\mathrm{Bin}(x_{2,t-1},A_{3-i,2})(n_3)\Big]\Big/P^{\theta}_{(x_{t-1}),x_t},$

where $\sum_{(1)}$ runs over $\{(n_1,n_2,n_3):n_1+n_2+n_3=x_{i,t},\ n_1\in\{0,1\},\ n_2\in\{0,\dots,x_{1,t-1}\},\ n_3\in\{0,\dots,x_{2,t-1}\}\}$ and $\sum_{(2)}$ over the same set with $x_{i,t}$ replaced by $x_{3-i,t}$ (for $P^{\theta}_{(x_{t-1}),x_t}$, take $i=1$).

Derivation of the score $\dot\ell$ for Case 2. For Case 2, the probability functions $G_{A_{i,j}}$ and $F_{a_i}$ are given by

$G_{A_{i,j}}(x):=A_{i,j}^{x}\exp(-A_{i,j})/x!,\qquad F_{a_i}(x):=a_i^{x}\exp(-a_i)/x!$

for $x\in\mathbb{Z}_+$, which implies that the derivatives $g_{A_{i,j}}$ and $f_{a_i}$ are

$g_{A_{i,j}}(x)=\Big(\frac{x}{A_{i,j}}-1\Big)G_{A_{i,j}}(x),\qquad f_{a_i}(x)=\Big(\frac{x}{a_i}-1\Big)F_{a_i}(x).$

Let

$\mathrm{Pois}(\lambda)(x)=\frac{\lambda^{x}e^{-\lambda}}{x!},\qquad \mathrm{Pois}'(\lambda)(x)=\Big(\frac{x}{\lambda}-1\Big)\frac{\lambda^{x}e^{-\lambda}}{x!}$

and define the derivative with respect to $A$ of the $n$-fold convolution by

$\mathrm{Pois}'(nA)(x)=\begin{cases}0&\text{if }n=0\\ \mathrm{Pois}'(A)(x)&\text{if }n=1\\ \sum_{x_1+x_2=x}n\,\mathrm{Pois}'(A)(x_1)\,\mathrm{Pois}((n-1)A)(x_2)&\text{if }n>1\end{cases}$

for $n\in\mathbb{Z}_+$, $A>0$, $x\in\mathbb{Z}_+$. Then we obtain, for $i,j=1,2$,

$P^{\theta}_{(x_{t-1}),x_t}=\sum_{(3)}\mathrm{Pois}(a_1)(n_1)\,\mathrm{Pois}(x_{1,t-1}A_{1,1})(n_2)\,\mathrm{Pois}(x_{2,t-1}A_{1,2})(n_3)\times\sum_{(4)}\mathrm{Pois}(a_2)(n_1)\,\mathrm{Pois}(x_{1,t-1}A_{2,1})(n_2)\,\mathrm{Pois}(x_{2,t-1}A_{2,2})(n_3),$

$\dot\ell_{a_i}(x_{t-1},x_t;\theta)=\Big[\sum_{(3)}\mathrm{Pois}'(a_i)(n_1)\,\mathrm{Pois}(x_{1,t-1}A_{i,1})(n_2)\,\mathrm{Pois}(x_{2,t-1}A_{i,2})(n_3)\times\sum_{(4)}\mathrm{Pois}(a_{3-i})(n_1)\,\mathrm{Pois}(x_{1,t-1}A_{3-i,1})(n_2)\,\mathrm{Pois}(x_{2,t-1}A_{3-i,2})(n_3)\Big]\Big/P^{\theta}_{(x_{t-1}),x_t},$

$\dot\ell_{A_{i,j}}(x_{t-1},x_t;\theta)=\Big[\sum_{(3)}\mathrm{Pois}(a_i)(n_1)\,\mathrm{Pois}'(x_{j,t-1}A_{i,j})(n_{j+1})\,\mathrm{Pois}(x_{3-j,t-1}A_{i,3-j})(n_{4-j})\times\sum_{(4)}\mathrm{Pois}(a_{3-i})(n_1)\,\mathrm{Pois}(x_{1,t-1}A_{3-i,1})(n_2)\,\mathrm{Pois}(x_{2,t-1}A_{3-i,2})(n_3)\Big]\Big/P^{\theta}_{(x_{t-1}),x_t},$

where $\sum_{(3)}$ runs over $\{(n_1,n_2,n_3)\in\mathbb{Z}_+^3:n_1+n_2+n_3=x_{i,t}\}$ and $\sum_{(4)}$ over $\{(n_1,n_2,n_3)\in\mathbb{Z}_+^3:n_1+n_2+n_3=x_{3-i,t}\}$ (for $P^{\theta}_{(x_{t-1}),x_t}$, take $i=1$).