Abstract
We derive the Local Asymptotic Normality (LAN) property for a multivariate generalized integer-valued autoregressive (MGINAR) process of order p. The generalized thinning operator in the MGINAR(p) process includes not only the usual Binomial thinning but also Poisson thinning, Geometric thinning, Negative Binomial thinning, and so on. By using the LAN property, we propose an efficient estimation method for the parameter of the MGINAR(p) process. Our procedure is based on the one-step method, which updates initial $\sqrt{n}$-consistent estimators to efficient ones. The one-step method has advantages in both computational simplicity and efficiency. Some numerical results for the asymptotic relative efficiency (ARE) of our estimators and the CLS estimators are presented. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method.
PUBLIC INTEREST STATEMENT
Recently, there has been a growing interest in modelling discrete-valued time series that arise in various fields of statistics. This paper derives the Local Asymptotic Normality (LAN) property for a multivariate generalized integer-valued autoregressive (MGINAR) process of order p.
The MGINAR process is one of the classes of discrete-valued time series models and contains various subclasses. By using the LAN property, we propose an efficient estimation method for the parameter. Our procedure is based on the one-step method, which updates initial root-n consistent estimators to efficient ones. The one-step method has advantages in both computational simplicity and efficiency. Some numerical results for the asymptotic relative efficiency (ARE) of our estimators and the CLS estimators are presented. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method.
1. Introduction
Recently, there has been a growing interest in modelling discrete-valued time series that arise in various fields of statistics (e.g., Weiß, Citation2018). This paper is concerned with a special class of observation-driven models termed “integer-valued autoregressive processes” (INAR processes), which were introduced independently by Al-Osh and Alzaid (Citation1987) and McKenzie (Citation1985, Citation1988). They introduced the integer-valued autoregressive process of order 1 (INAR(1) process) to model non-negative integer-valued phenomena with time dependence. The more general INAR(p) processes were considered by Alzaid and Al-Osh (Citation1990), Du and Li (Citation1991), and others. An INAR process combines the distribution of the thinning operator with the distribution of the innovation process. Alzaid and Al-Osh (Citation1990) discussed the INAR(p) process in the case where the thinning operator follows a Binomial distribution (i.e., is specified), while the distribution of the innovation process is left unspecified. Latour (Citation1998) introduced a generalized version of the INAR(p) process, namely, the causal and stationary generalized integer-valued autoregressive (GINAR(p)) process. The generalized thinning operator in the GINAR(p) process includes not only the usual Binomial thinning but also Poisson thinning, Geometric thinning, and Negative Binomial thinning (Ristić, Bakouch, & Nastić, Citation2009). The history above concerns the univariate case; extensions to the multivariate case are now being actively researched and applied. One of the first approaches to a multivariate thinning mechanism was by McKenzie (Citation1988). After that, Franke and Subba Rao (Citation1993) introduced a multivariate INAR(1) (MINAR(1)) model based on independent Binomial thinning operators. Extensions of the MINAR(1) model were discussed by Latour (Citation1997), Karlis and Pedeli (Citation2013), and others.
This paper discusses a parameter estimation problem for a multivariate version of the GINAR(p) (MGINAR(p)) process. The MGINAR model is quite a large class, including all the models described above.
Estimation of the parameter for the INAR(p) process can be carried out in a variety of ways. Common approaches include the method of moments (MM), based on the Yule-Walker equations, and conditional least squares (CLS). The main advantage of both approaches is their simplicity, due to the closed-form formulae, and their robustness, since they require no distributional assumption. It is known that the MM and CLS estimators are asymptotically equivalent. However, Al-Osh and Alzaid (Citation1987) and others have recommended using maximum likelihood (ML) estimators instead of MM and CLS estimators because they are less biased for small sample sizes. On the other hand, it is well known that the ML method is computationally unattractive due to complicated transition probabilities involving many convolutions. To overcome this problem, Drost, Van Den Akker, and Werker (Citation2008) considered one-step, asymptotically efficient estimation of the INAR(p) model. Their method reduces the high computational cost caused by the convolutions involved in the ML method. Following Drost et al. (Citation2008), this paper provides a one-step method, which updates initial $\sqrt{n}$-consistent estimators to efficient ones for the MGINAR(p) process. In the class of multivariate integer-valued models, the number of convolutions involved in the likelihood function is very large. For some distributions with the reproductive property, the likelihood can be simplified, but such a simplification is not available for all models in the MGINAR(p) class. Therefore, it is quite important to reduce the computational cost by using a one-step estimation. We first establish the local asymptotic normality (LAN) structure for experiments of the MGINAR(p) process. Taking the CLS estimator as the initial estimator, a one-step update estimator is proposed. We also show that this estimator is asymptotically efficient.
The organization of this paper is as follows. In Section 2, the MGINAR(p) process is introduced and the LAN property is established. Section 3 discusses the efficient estimation. The CLS estimator is introduced and its asymptotic property is shown. Then, by using the LAN property, a one-step estimator is proposed to update the CLS estimator. In Section 4, the asymptotic relative efficiency (ARE) of the one-step estimator and the CLS estimator is examined through some simulation experiments. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method. The proofs and other details are included in the Appendix.
2. The LAN property
Let $\{X_t = (X_{1,t}, \ldots, X_{d,t})'\}$ be a $d$-dimensional non-negative integer-valued random process (i.e., $X_t \in \mathbb{N}_0^d$). The multivariate generalized integer-valued autoregressive process of order $p$ (MGINAR($p$)) is defined by
$$X_t = A_1 \circ X_{t-1} + \cdots + A_p \circ X_{t-p} + \epsilon_t, \quad t \in \mathbb{Z}, \tag{1}$$
where $A_i = (\alpha^{(i)}_{jk})$ is a $(d \times d)$-matrix for $i = 1, \ldots, p$, and the matrix thinning $A_i \circ X$ gives a $d$-dimensional random vector with $j$th component
$$(A_i \circ X)_j = \sum_{k=1}^{d} \sum_{l=1}^{X_k} U^{(i)}_{jk}(l). \tag{2}$$
Note that for any $i, j, k$, $\{U^{(i)}_{jk}(l) : l \in \mathbb{N}\}$ is a collection of independent and identically distributed (i.i.d.), non-negative, integer-valued random variables with the distribution function $G^{(i)}_{jk}$ and the mean $\alpha^{(i)}_{jk}$; $\{\epsilon_t = (\epsilon_{1,t}, \ldots, \epsilon_{d,t})'\}$ is a collection of i.i.d. non-negative, integer-valued random vectors, where the $j$th component $\epsilon_{j,t}$ has an independent distribution function $F_j$ and the mean $\lambda_j$. Suppose that the starting values $(X_0', \ldots, X_{1-p}')'$ have a distribution $\nu$ on $\mathbb{N}_0^{pd}$, and the collections $\{U^{(i)}_{jk}(l)\}$, $\{\epsilon_t\}$, and the starting values are independent of each other.
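To make the matrix thinning in (1)-(2) concrete, here is a minimal Python sketch (ours, not the paper's code; all function names are illustrative) that simulates a bivariate MGINAR(1) path with Binomial thinning (Bernoulli counting series) and Poisson innovations:

```python
import numpy as np

def matrix_thinning(A, x, rng):
    """Generalized matrix thinning A ∘ x with a Bernoulli counting series:
    component j is the sum over k of x_k Bernoulli(A[j, k]) draws, i.e.,
    a Binomial(x_k, A[j, k]) count (the usual Binomial thinning)."""
    d = len(x)
    out = np.zeros(d, dtype=int)
    for j in range(d):
        for k in range(d):
            out[j] += rng.binomial(int(x[k]), A[j, k])
    return out

def simulate_mginar1(A, lam, n, x0, seed=0):
    """Simulate n steps of X_t = A ∘ X_{t-1} + eps_t with Poisson(lam) innovations."""
    rng = np.random.default_rng(seed)
    X = np.zeros((n, len(lam)), dtype=int)
    x = np.asarray(x0, dtype=int)
    for t in range(n):
        x = matrix_thinning(A, x, rng) + rng.poisson(lam)
        X[t] = x
    return X

A = np.array([[0.3, 0.1], [0.2, 0.4]])   # spectral radius < 1 for stationarity
lam = np.array([1.0, 2.0])
X = simulate_mginar1(A, lam, n=500, x0=[0, 0])
```

Any other counting-series or innovation distribution with finite mean (Poisson, Geometric, Negative Binomial) can be swapped in without changing the structure.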
Throughout the paper, the number of lags ($p$) and the dimensions ($d$) are fixed and known. Let $\mathcal{G}$ and $\mathcal{F}$ denote the classes of distributions of $U^{(i)}_{jk}(l)$ and $\epsilon_{j,t}$ that belong to parametric classes, respectively, say $\mathcal{G} = \{G^{(i)}_{jk}(\cdot\,; \alpha^{(i)}_{jk})\}$ and $\mathcal{F} = \{F_j(\cdot\,; \lambda_j)\}$. For instance, when we consider the Binomial thinning operator and the Binomial innovation, $G^{(i)}_{jk}$ and $F_j$ should be defined by a Bernoulli distribution with mean $\alpha^{(i)}_{jk}$ and a Binomial distribution with mean $\lambda_j$ for $i = 1, \ldots, p$ and $j, k = 1, \ldots, d$. Our goal is to estimate the parameter $\theta$ with $\theta \in \Theta$, efficiently, where
$$\theta = \bigl(\mathrm{vec}(A_1)', \ldots, \mathrm{vec}(A_p)', \lambda_1, \ldots, \lambda_d\bigr)'.$$
For (probability) measures $\mu_1, \mu_2, \ldots$, we introduce the following notations for the convolutions:
- $\mu_1 \circledast \mu_2$: the convolution of $\mu_1$ and $\mu_2$;
- $\circledast_{i=1}^{n} \mu_i$: the convolution of $\mu_1, \ldots, \mu_n$ (i.e., $\mu_1 \circledast \mu_2 \circledast \cdots \circledast \mu_n$);
- $\mu^{\circledast n}$: the $n$-times convolution of $\mu$ (i.e., $\circledast_{i=1}^{n} \mu$).
Based on the above notations, we consider the corresponding probability space for $\{X_t\}$, denoted by $(\Omega, \mathcal{A}, P_\theta)$, where $\Omega$ is a sample space, $\mathcal{A}$ is the $\sigma$-algebra, and $P_\theta$ is the probability measure given by $\nu$ (distribution of the starting values), $\theta$ (parameter), $\mathcal{G}$ (class of distribution of $U^{(i)}_{jk}(l)$) and $\mathcal{F}$ (class of distribution of $\epsilon_{j,t}$). Furthermore, we introduce $\{\mathcal{F}_t\}$ as the natural filtration generated by $\{X_t\}$ (i.e., $\mathcal{F}_t = \sigma(X_s : s \le t)$).
From (1) and (2), we can write $X_t = \sum_{i=1}^{p} A_i \circ X_{t-i} + \epsilon_t$, where each term $A_i \circ X_{t-i}$ is given by (2). Hence, it follows, for any $t$,
$$\mathrm{E}_\theta[X_t \mid \mathcal{F}_{t-1}] = \sum_{i=1}^{p} A_i X_{t-i} + \lambda, \quad \lambda = (\lambda_1, \ldots, \lambda_d)'.$$
For $x_t, x_{t-1}, \ldots, x_{t-p} \in \mathbb{N}_0^d$, the transition probability $P_\theta(X_t = x_t \mid X_{t-1} = x_{t-1}, \ldots, X_{t-p} = x_{t-p})$ is given by
$$\prod_{j=1}^{d} \Bigl[ \Bigl( \circledast_{i=1}^{p} \circledast_{k=1}^{d} \bigl(G^{(i)}_{jk}\bigr)^{\circledast x_{k,t-i}} \Bigr) \circledast F_j \Bigr](x_{j,t}).$$
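The convolution form of the transition probability can be evaluated directly in the simplest univariate case ($p = d = 1$). The following is an illustrative sketch under Binomial thinning and a Poisson innovation (our helper names, not from the paper):

```python
import math

def binom_pmf(k, n, p):
    """Binomial(n, p) probability of k successes."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson(lam) probability of k events."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def transition_pmf(x_new, x_prev, alpha, lam):
    """P(X_t = x_new | X_{t-1} = x_prev) for a univariate INAR(1):
    the convolution of the Binomial(x_prev, alpha) thinning survivors
    and the Poisson(lam) innovation."""
    return sum(binom_pmf(s, x_prev, alpha) * poisson_pmf(x_new - s, lam)
               for s in range(min(x_new, x_prev) + 1))

# With x_prev = 0 the thinning term vanishes and the transition is pure Poisson.
p0 = transition_pmf(3, 0, 0.4, 1.5)
```

In the multivariate case the same convolution is computed for each component and each lag, which is exactly the computational burden that motivates the one-step method below.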
We consider parametric MGINAR models in which the parameter space is restricted to the stationary parameter space; for instance, in the case of the Binomial thinning operator and $p = 1$, the thinning part of the parameter space should be defined by $\{\alpha^{(1)}_{jk} \in (0, 1) : \rho(A_1) < 1\}$, where $\rho(\cdot)$ denotes the spectral radius. Suppose that $P_\theta$ is a combination of a family of parametric distributions for the thinning operator and innovation (immigrant). Suppose that for any $\theta \in \Theta$, $\{X_t\}$ with distribution $P_\theta$ is a strictly stationary process. Let $P^{(n)}_\theta$ denote the law of $(X_1, \ldots, X_n)$ on the measurable space $(\mathbb{N}_0^{nd}, 2^{\mathbb{N}_0^{nd}})$ under $P_\theta$. Here $\mathbb{N}_0^{nd}$ is the sample space and the power set $2^{\mathbb{N}_0^{nd}}$ is the $\sigma$-algebra on this sample space. Observing $(X_1, \ldots, X_n)$ yields the following sequence of experiments:
$$\mathcal{E}^{(n)} = \bigl(\mathbb{N}_0^{nd}, \, 2^{\mathbb{N}_0^{nd}}, \, \{P^{(n)}_\theta : \theta \in \Theta\}\bigr), \quad n \in \mathbb{N},$$
where the initial distribution $\nu$ is fixed and the distributions for the thinning operator and innovation (immigrant) are parametrized (i.e., once $\theta$ is fixed, the distributions are given by $G^{(i)}_{jk}(\cdot\,; \alpha^{(i)}_{jk})$ and $F_j(\cdot\,; \lambda_j)$).
To prove the LAN property for the sequence of experiments $\mathcal{E}^{(n)}$, we impose the following assumptions.
Assumption 1.
(A1) $\Theta$ is an open, convex subset of $\mathbb{R}^{pd^2 + d}$ with $\theta \in \Theta$.
(A2) The supports of $G^{(i)}_{jk}(\cdot\,; \alpha^{(i)}_{jk})$ and $F_j(\cdot\,; \lambda_j)$ do not depend on $\alpha^{(i)}_{jk}$ and $\lambda_j$, and we have, for all $\theta \in \Theta$, strictly positive probabilities on those supports.
(A3) For all $i, j, k$ and all $m$ in the supports, the derivatives $\partial G^{(i)}_{jk}(m; \alpha^{(i)}_{jk})/\partial \alpha^{(i)}_{jk}$ and $\partial F_j(m; \lambda_j)/\partial \lambda_j$ are defined, continuous in $\theta$, and satisfy $\sum_m \partial G^{(i)}_{jk}(m; \alpha^{(i)}_{jk})/\partial \alpha^{(i)}_{jk} = 0$ and $\sum_m \partial F_j(m; \lambda_j)/\partial \lambda_j = 0$, respectively.
(A4) For every $\theta \in \Theta$, there exist a $\delta > 0$ and an integrable random variable $M$ such that, for all $\theta'$ with $\|\theta' - \theta\| < \delta$, the transition score and its components are dominated by $M$.
(A5) Let $\dot{\ell}_\theta$ and $\ddot{\ell}_\theta$ be the first and second derivatives of the log-likelihood for one transition. The information equality $\mathrm{E}_\theta[\dot{\ell}_\theta \dot{\ell}_\theta'] = -\mathrm{E}_\theta[\ddot{\ell}_\theta]$ is satisfied, and the Fisher information $I(\theta)$ is nonsingular and continuous in $\theta$.
(A6) For all $j$, $\mathrm{E}_\theta[X_{j,t}^2] < \infty$ and $\mathrm{E}_\theta[\epsilon_{j,t}^2] < \infty$.
(A7) $P_\theta = P_{\theta'}$ implies $\theta = \theta'$.
To prove that $\mathcal{E}^{(n)}$ has the LAN property, we need to determine the behavior of a localized log-likelihood ratio. To this end, we first write down the likelihood
$$L^{(n)}(\theta) = \nu(X_0, \ldots, X_{1-p}) \prod_{t=1}^{n} P_\theta(X_t \mid X_{t-1}, \ldots, X_{t-p}).$$
In addition, we introduce the log-likelihood $\ell^{(n)}(\theta) = \log L^{(n)}(\theta)$. Following Drost et al. (Citation2008), we establish the LAN property by using a Taylor expansion. To do so, the transition score for $\theta$ is needed. The transition score $s_\theta$ can be derived by calculating the partial derivatives of $\log P_\theta(X_t \mid X_{t-1}, \ldots, X_{t-p})$ as follows. For the partial derivatives with respect to $\alpha^{(i)}_{jk}$ ($i = 1, \ldots, p$; $j, k = 1, \ldots, d$), the factor $(G^{(i)}_{jk})^{\circledast X_{k,t-i}}$ in the convolution is replaced by its derivative, which is zero if $X_{k,t-i} = 0$ and a sum of convolutions involving $\partial G^{(i)}_{jk}/\partial \alpha^{(i)}_{jk}$ if $X_{k,t-i} \ge 1$. (3)
For the partial derivatives with respect to $\lambda_j$ ($j = 1, \ldots, d$), the factor $F_j$ is replaced by $\partial F_j/\partial \lambda_j$, which contributes only to the $j$th component of the transition probability. (4)
Then, by using Equations (3) and (4), we can derive a Taylor expansion of the localized log-likelihood ratio, and the appropriate limit theorems show that $\mathcal{E}^{(n)}$ has the LAN property, as follows.
Theorem 1. Suppose that any given $\theta \in \Theta$ satisfies Assumptions (A1)-(A7), and let $\nu$ be a probability measure on $\mathbb{N}_0^{pd}$ with finite support. Then, the sequence of experiments $\mathcal{E}^{(n)}$ has the LAN property in $\theta$, i.e., for every $h \in \mathbb{R}^{pd^2+d}$ the following expansion holds:
$$\log \frac{\mathrm{d}P^{(n)}_{\theta + h/\sqrt{n}}}{\mathrm{d}P^{(n)}_{\theta}} = h' \Delta^{(n)}_\theta - \frac{1}{2} h' I(\theta) h + o_{P_\theta}(1),$$
where the score (also called the central sequence) $\Delta^{(n)}_\theta = n^{-1/2} \sum_{t=1}^{n} s_\theta(X_t \mid X_{t-1}, \ldots, X_{t-p})$ satisfies $\Delta^{(n)}_\theta \xrightarrow{d} N(0, I(\theta))$ under $P_\theta$. The Fisher information defined by $I(\theta) = \mathrm{E}_\theta[s_\theta s_\theta']$, with $s_\theta = s_\theta(X_t \mid X_{t-1}, \ldots, X_{t-p})$, is nonsingular.
3. Efficient estimation
This section provides efficient estimators based on the one-step update method. First, we use the multivariate conditional least squares estimator as an initial estimator of $\theta$ (e.g., Bu, McCabe, & Hadri, Citation2008).
Definition 1. Suppose that $(X_1, \ldots, X_n)$ is observed from the MGINAR($p$) process defined by (1). Then, the conditional least squares (CLS) estimator $\hat{\theta}_{\mathrm{CLS}}$ for $\theta$ is defined by
$$\hat{\theta}_{\mathrm{CLS}} = \arg\min_{\theta \in \Theta} \sum_{t=p+1}^{n} \bigl\| X_t - \mathrm{E}_\theta[X_t \mid \mathcal{F}_{t-1}] \bigr\|^2,$$
where $\mathrm{E}_\theta[X_t \mid \mathcal{F}_{t-1}] = \sum_{i=1}^{p} A_i X_{t-i} + \lambda$. Note that by calculating the derivative of the criterion with respect to all entries of $\theta$, we have the closed-form solution
$$[\hat{A}_1, \ldots, \hat{A}_p, \hat{\lambda}] = \Bigl( \sum_{t=p+1}^{n} X_t Z_{t-1}' \Bigr) \Bigl( \sum_{t=p+1}^{n} Z_{t-1} Z_{t-1}' \Bigr)^{-1},$$
where $Z_{t-1} = (X_{t-1}', \ldots, X_{t-p}', 1)'$.
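Since the conditional mean is linear in the lagged counts, the closed form is an ordinary least-squares regression of $X_t$ on the lagged values and an intercept. A minimal Python sketch for $p = 1$ (our own notation, not the paper's code):

```python
import numpy as np

def cls_estimate(Y, Xlag):
    """CLS for MGINAR(1): least-squares regression of X_t (rows of Y)
    on (X_{t-1}', 1) (rows of Xlag plus an intercept column).
    Returns (A_hat, lam_hat)."""
    Z = np.hstack([Xlag, np.ones((len(Xlag), 1))])
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # B stacks [A_hat', lam_hat']
    return B[:-1].T, B[-1]

# For an observed series X (an n x d array), take Y = X[1:], Xlag = X[:-1].
# Sanity check on exact (noise-free) conditional means:
A = np.array([[0.3, 0.1], [0.2, 0.4]])
lam = np.array([1.0, 2.0])
rng = np.random.default_rng(1)
Xlag = rng.integers(0, 10, size=(200, 2)).astype(float)
Y = Xlag @ A.T + lam
A_hat, lam_hat = cls_estimate(Y, Xlag)
```

With noise-free conditional means the regression recovers $(A, \lambda)$ exactly; with an actual count series the estimate is only $\sqrt{n}$-consistent, which is all the one-step update requires.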
Then, Du and Li (Citation1991) showed the following.
Proposition 1. $\sqrt{n}(\hat{\theta}_{\mathrm{CLS}} - \theta) \xrightarrow{d} N(0, \Sigma_{\mathrm{CLS}})$ under $P_\theta$, where $\Sigma_{\mathrm{CLS}}$ is the sandwich-form asymptotic covariance matrix of the CLS estimator.
Moreover, we have the following.
Proposition 2. The CLS estimator is not asymptotically efficient, because it is evident that $\Sigma_{\mathrm{CLS}} \neq I(\theta)^{-1}$ for some $\theta \in \Theta$, except for some special cases.
Since we have a $\sqrt{n}$-consistent but inefficient estimator of $\theta$, we update the CLS estimator to an efficient estimator by using the LAN result.
Theorem 2. Let $\nu$ be a probability measure on $\mathbb{N}_0^{pd}$ with finite support. Let $\hat{\theta}_{\mathrm{CLS}}$ be a CLS estimator. Define
$$\hat{\theta} = \hat{\theta}_{\mathrm{CLS}} + \frac{1}{\sqrt{n}} \hat{I}^{-1}(\hat{\theta}_{\mathrm{CLS}}) \, \Delta^{(n)}_{\hat{\theta}_{\mathrm{CLS}}},$$
where $\hat{I}(\theta) = n^{-1} \sum_{t} s_\theta s_\theta'$ with $s_\theta = s_\theta(X_t \mid X_{t-1}, \ldots, X_{t-p})$. Then, under Assumptions (A1)-(A7), $\hat{\theta}$ is an asymptotically efficient estimator of $\theta$ in the sequence of experiments $\mathcal{E}^{(n)}$. Moreover, $\hat{I}(\hat{\theta}_{\mathrm{CLS}})$ yields a consistent estimator of $I(\theta)$, i.e., $\hat{I}(\hat{\theta}_{\mathrm{CLS}}) \xrightarrow{p} I(\theta)$ under $P_\theta$.
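As a sketch of the update in Theorem 2, the following Python code performs one Newton-type step for a univariate INAR(1) with Binomial thinning and Poisson innovation. The score is approximated here by finite differences of the log transition probability rather than by the closed-form expressions derived in the Appendix, and all names are ours:

```python
import math
import numpy as np

def trans_pmf(x_new, x_prev, theta):
    """Univariate INAR(1) transition pmf: Binomial(x_prev, alpha) thinning
    convolved with Poisson(lam) innovations; theta = (alpha, lam)."""
    alpha, lam = theta
    return sum(math.comb(x_prev, s) * alpha**s * (1 - alpha)**(x_prev - s)
               * math.exp(-lam) * lam**(x_new - s) / math.factorial(x_new - s)
               for s in range(min(x_new, x_prev) + 1))

def one_step(X, theta0, h=1e-6):
    """One-step update theta0 + I_hat^{-1} * (average score), where the
    transition score is approximated by central finite differences."""
    theta0 = np.asarray(theta0, dtype=float)
    scores = np.zeros((len(X) - 1, 2))
    for t in range(1, len(X)):
        for j in range(2):
            e = np.zeros(2)
            e[j] = h
            scores[t - 1, j] = (math.log(trans_pmf(X[t], X[t - 1], theta0 + e))
                                - math.log(trans_pmf(X[t], X[t - 1], theta0 - e))) / (2 * h)
    I_hat = scores.T @ scores / len(scores)     # empirical Fisher information
    return theta0 + np.linalg.solve(I_hat, scores.mean(axis=0))

# Simulate a short Binomial-thinning/Poisson INAR(1) path and update a rough guess.
rng = np.random.default_rng(2)
X = [3]
for _ in range(300):
    X.append(rng.binomial(X[-1], 0.4) + rng.poisson(1.5))
theta1 = one_step(X, [0.35, 1.7])
```

In practice the CLS estimate plays the role of the rough initial guess, and the closed-form score from the Appendix replaces the finite differences; either way, only one evaluation pass over the data is needed, avoiding iterative maximization of the convolution likelihood.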
4. Numerical study
In this section, we first examine the asymptotic relative efficiency (ARE) of our proposed estimator relative to the CLS estimator through some simulation experiments. Then, we present a real data analysis to illustrate the application of the proposed estimation method.
4.1. Simulation study
Our proposed estimator ($\hat{\theta}$) and the conditional least squares estimator ($\hat{\theta}_{\mathrm{CLS}}$) under the MGINAR model are compared through a series of simulation experiments. Specifically, we assess the small sample properties of the two estimators in the following cases: a Binomial thinning operator and a Binomial innovation (Case 1); and a Poisson thinning operator and a Poisson innovation (Case 2), with $p = 1$ and $d = 2$. The count series $\{X_t = (X_{1,t}, X_{2,t})'\}$ are defined by
$$X_t = A \circ X_{t-1} + \epsilon_t,$$
where $A = (\alpha_{jk})_{j,k=1,2}$ and $\epsilon_t = (\epsilon_{1,t}, \epsilon_{2,t})'$. We suppose the initial distribution $\nu$ is degenerate (i.e., the initial value is fixed). The distributions $G_{jk}$ and $F_j$ are defined by:
Case 1: Binomial thinning operator and Binomial innovation;
Case 2: Poisson thinning operator and Poisson innovation.
Then, the true parameter vector is written as $\theta = (\alpha_{11}, \alpha_{21}, \alpha_{12}, \alpha_{22}, \lambda_1, \lambda_2)'$. Note that the parameter vector is chosen to obtain a stationary count series.
We ran Monte Carlo replicas with several sample sizes $n$. For each replica, we estimate the model parameters based on two procedures (CLS: $\hat{\theta}_{\mathrm{CLS}}$; Efficient Est: $\hat{\theta}$) and calculate the (approximated) bias (Table 1) and the diagonal part of the (approximated) MSE (Table 2) of the parameter estimators. Finally, we calculate the (approximated) ARE (Table 3), defined by the componentwise ratio (MSE of $\hat{\theta}_{\mathrm{CLS}}$)/(MSE of $\hat{\theta}$). Simulations are carried out in R. For the calculation of $\hat{\theta}$, we need an explicit form of the score under the given distributions $G_{jk}$ and $F_j$. Please see the derivation of the score for each case in the Appendix.
Table 1. Bias results for the MGINAR(1) model
Table 2. Diagonal part of MSE results for the MGINAR(1) model
Table 3. ARE (asymptotic relative efficiency) results for the MGINAR(1) model
The bias results are reported in Table 1. It can be seen that the biases for both estimators tend to zero when the sample size is sufficiently large. This implies that both estimators are asymptotically unbiased. However, for some components of the CLS estimator, the biases are relatively large, which implies that the convergence rate is relatively slow. In contrast, our proposed estimator improves on the CLS estimator in terms of bias.
The corresponding MSE results are displayed in Table 2. Similar to the bias results, the MSEs for both estimators tend to zero when the sample size is sufficiently large, which implies that both estimators converge to the true values in probability. However, for the same components of the CLS estimator, the MSEs are relatively large, which implies that the convergence rate of the variance is relatively slow. In contrast, our proposed estimator improves on the CLS estimator, because the MSEs of all its components appear to converge to zero.
Finally, the ARE (asymptotic relative efficiency) results are given in Table 3. The ARE of two estimators is defined as the ratio of their asymptotic variances (e.g., Cox & Hinkley, Citation1974; Serfling, Citation2011). Let $\hat{\theta}^{(1)}$ and $\hat{\theta}^{(2)}$ be two estimators of $\theta$, and let $\Sigma_1$ and $\Sigma_2$ be their asymptotic covariance matrices, i.e., $\sqrt{n}(\hat{\theta}^{(i)} - \theta) \xrightarrow{d} N(0, \Sigma_i)$ for $i = 1, 2$, respectively. Then, the ARE of $\hat{\theta}^{(1)}$ and $\hat{\theta}^{(2)}$ is given by the componentwise ratio of the diagonals of $\Sigma_1$ and $\Sigma_2$. In our study, we consider the ARE of our proposed estimator ($\hat{\theta}$) and the conditional least squares estimator ($\hat{\theta}_{\mathrm{CLS}}$). Clearly, in this setup, if an ARE is larger than 1, it suggests that our proposed estimator improves on the CLS estimator in terms of efficiency. Table 3 reports the ARE results, with the asymptotic variances replaced by the sample MSEs. It can be seen that the ARE tends to be larger than 1 as the sample size increases, which implies that our estimator improves on the CLS estimator in terms of efficiency. We tried the same simulation studies under some different settings of the parameter values. We omit them, but the results are similar.
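As reported here, the approximated ARE is simply the componentwise ratio of sample MSEs across Monte Carlo replicas. A small sketch (illustrative numbers, not those of the tables):

```python
import numpy as np

def sample_mse(estimates, theta_true):
    """Componentwise mean squared error over Monte Carlo replicas."""
    est = np.asarray(estimates, dtype=float)
    return ((est - theta_true) ** 2).mean(axis=0)

def are(est_cls, est_eff, theta_true):
    """ARE > 1 in a component means the efficient estimator
    has a smaller MSE in that component."""
    return sample_mse(est_cls, theta_true) / sample_mse(est_eff, theta_true)

theta = np.array([0.4, 1.5])
cls_reps = np.array([[0.30, 1.9], [0.55, 1.2], [0.35, 1.6]])   # toy replicas
eff_reps = np.array([[0.38, 1.6], [0.45, 1.45], [0.41, 1.52]])
ratios = are(cls_reps, eff_reps, theta)
```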
4.2. Real data analysis
The data set consists of the number of cases of infectious diseases per week by prefecture for the period 2015–2018 (208 weeks), as reported by the National Institute of Infectious Diseases (NIID) in Japan (URL: https://www.niid.go.jp/niid/en/). Here we use the number of cases of “Epidemic keratoconjunctivitis (EK)” and “Aseptic meningitis (AM)” in Shimane prefecture. The sample path plot for the data in Figure 1 reveals some seasonality or periodicity, but there appears to be no trend, so the series look stationary.
Figure 1. The number of cases of epidemic keratoconjunctivitis (left hand side) and aseptic meningitis (right hand side) per week in Shimane prefecture.
![Figure 1. The number of cases of epidemic keratoconjunctivitis (left hand side) and aseptic meningitis (right hand side) per week in Shimane prefecture.](/cms/asset/cc20573c-585b-4b4c-ae7f-081b8d8683f1/oama_a_1695437_f0001_b.gif)
Figure 2 shows the sample ACF and PACF plots for each disease. The figure shows time dependency, but no long-range dependence is observed, so it is acceptable to fit the MGINAR(1) model.
Figure 2. The sample ACF and PACF for the epidemic keratoconjunctivitis data (left hand side) and aseptic meningitis data (right hand side).
![Figure 2. The sample ACF and PACF for the epidemic keratoconjunctivitis data (left hand side) and aseptic meningitis data (right hand side).](/cms/asset/12942a6a-9ac5-4c43-8e03-11c5322fc4ea/oama_a_1695437_f0002_b.gif)
We suppose that the time series count data follow the MGINAR(1) model, where both the thinning operator and the innovation follow the Binomial or the Poisson distribution. The CLS estimator ($\hat{\theta}_{\mathrm{CLS}}$) for the parameter ($\theta$) is obtained first. Next, our proposed estimators for the Binomial thinning and Binomial innovation, and for the Poisson thinning and Poisson innovation, are obtained. For both the Binomial and the Poisson specifications, some components of the estimate are greatly changed by our proposed estimation, while the remaining components change little.
Finally, we evaluate the goodness of fit for each estimator based on AIC. Denote the AIC for an estimator under the Binomial distribution by $\mathrm{AIC}_{\mathrm{Bin}}$, and under the Poisson distribution by $\mathrm{AIC}_{\mathrm{Poi}}$. Comparing the resulting AIC values, we can see that our proposed estimators perform well in terms of goodness of fit.
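For the Poisson case, the AIC can be computed without any convolution thanks to the reproductive property of the Poisson distribution. A univariate sketch with hypothetical data (the paper's computation is bivariate and analogous; the Binomial case has no such shortcut and needs the convolution pmf):

```python
import math

def poisson_pmf(k, mu):
    """Poisson(mu) probability of k events."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def log_lik_poisson_inar1(X, alpha, lam):
    """Poisson-thinning/Poisson-innovation INAR(1): by the reproductive
    property, X_t | X_{t-1} is Poisson(alpha * X_{t-1} + lam), so the
    conditional log-likelihood needs no explicit convolution."""
    return sum(math.log(poisson_pmf(X[t], alpha * X[t - 1] + lam))
               for t in range(1, len(X)))

def aic(loglik, n_params):
    """Akaike information criterion: smaller is better."""
    return 2 * n_params - 2 * loglik

X = [2, 3, 1, 4, 2, 2, 5, 3]          # hypothetical weekly counts
aic_value = aic(log_lik_poisson_inar1(X, 0.4, 1.5), n_params=2)
```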
Acknowledgements
We thank the anonymous referees for constructive comments. This work was supported by JSPS KAKENHI, Grant Number JP16K00036.
Notes on contributors
Hiroshi Shiraishi
Hiroshi Shiraishi received the BS degree in mathematics in 1998 and the MS and doctoral degrees in mathematical science from Waseda University, Japan, in 2004 and 2007, respectively. He joined the GE Edison Life Insurance Company, the Prudential Life Insurance Company of Japan and the Hannover-Re Reinsurance Company, in 1998, 2000 and 2005, respectively. His research interests are actuarial science, time series analysis, econometric theory and financial engineering. In particular, he investigates the statistical analysis of discrete-valued time series, the statistical estimation of optimal dividend problems in the field of actuarial science, the statistical estimation of Hawkes graphs, and so on. He is currently an associate professor in the Department of Mathematics, Keio University, Japan. He is a fellow of the Institute of Actuaries of Japan (FIAJ).
References
- Al-Osh, M. A., & Alzaid, A. A. (1987). First-order integer-valued autoregressive (INAR(1)) process. Journal of Time Series Analysis, 8(3), 261–275. doi:10.1111/j.1467-9892.1987.tb00438.x
- Alzaid, A. A., & Al-Osh, M. A. (1990). An integer-valued pth-order autoregressive structure (INAR(p)) process. Journal of Applied Probability, 27(2), 314–324. doi:10.2307/3214650
- Bu, R., McCabe, B., & Hadri, K. (2008). Maximum likelihood estimation of higher-order integer-valued autoregressive processes. Journal of Time Series Analysis, 29(6), 973–994. doi:10.1111/j.1467-9892.2008.00590.x
- Cox, D. R., & Hinkley, D. (1974). Theoretical statistics. London: Chapman and Hall.
- Drost, F. C., Van Den Akker, R., & Werker, B. J. M. (2008). Local asymptotic normality and efficient estimation for INAR (p) models. Journal of Time Series Analysis, 29(5), 783–801. doi:10.1111/j.1467-9892.2008.00581.x
- Du, J. G., & Li, Y. (1991). The integer-valued autoregressive (INAR(p)) model. Journal of Time Series Analysis, 12(2), 129–142. doi:10.1111/j.1467-9892.1991.tb00073.x
- Franke, J., & Subba Rao, T. (1993). Multivariate first-order integer-valued autoregression. Technical report No. 95, Universität Kaiserslautern.
- Karlis, D., & Pedeli, X. (2013). Flexible bivariate INAR(1) processes using copulas. Communications in Statistics-Theory and Methods, 42(4), 723–740. doi:10.1080/03610926.2012.754466
- Latour, A. (1997). The multivariate GINAR(p) process. Advances in Applied Probability, 29(1), 228–248. doi:10.2307/1427868
- Latour, A. (1998). Existence and stochastic structure of a non-negative integer-valued autoregressive process. Journal of Time Series Analysis, 19(4), 439–455. doi:10.1111/jtsa.1998.19.issue-4
- McKenzie, E. (1985). Some simple models for discrete variate time series. Water Resources Bulletin, 21(4), 645–650. doi:10.1111/j.1752-1688.1985.tb05379.x
- McKenzie, E. (1988). Some ARMA models for dependent sequences of Poisson counts. Advances in Applied Probability, 20(4), 822–835. doi:10.2307/1427362
- Ristić, M. M., Bakouch, H. S., & Nastić, A. S. (2009). A new geometric first-order integer-valued autoregressive (NGINAR(1)) process. Journal of Statistical Planning and Inference, 139(7), 2218–2226. doi:10.1016/j.jspi.2008.10.007
- Serfling, R. (2011). Asymptotic relative efficiency in estimation. In M. Lovric (Ed.), International encyclopedia of statistical science, (pp. 68–82). Berlin, Heidelberg: Springer.
- Weiß, C. H. (2018). An introduction to discrete-valued time series. Chichester: John Wiley & Sons.
Appendix
Proof of Theorem 1. The proof is similar to that of Theorem 1 in Drost et al. (Citation2008).
Expansion of the log-likelihood ratio: Let $\theta_n = \theta + h/\sqrt{n}$ with $h \in \mathbb{R}^{pd^2+d}$. Under Assumption (A1), we obtain by Taylor’s theorem
$$\ell^{(n)}(\theta_n) - \ell^{(n)}(\theta) = \frac{1}{\sqrt{n}} h' \dot{\ell}^{(n)}(\theta) + \frac{1}{2n} h' \ddot{\ell}^{(n)}(\bar{\theta}_n) h,$$
where $\bar{\theta}_n$ is a random point on the line segment between $\theta$ and $\theta_n$.
Then, we show:
Part 0: auxiliary calculations;
Part 1: $n^{-1/2} \dot{\ell}^{(n)}(\theta) = \Delta^{(n)}_\theta \xrightarrow{d} N(0, I(\theta))$ under $P_\theta$;
Part 2: $n^{-1} \ddot{\ell}^{(n)}(\bar{\theta}_n) \xrightarrow{p} -I(\theta)$ under $P_\theta$;
Part 3: non-singularity of $I(\theta)$.
In what follows, for simplicity, we write $s_t$ and $\dot{s}_t$ for the transition score and its derivative evaluated at $\theta$.
Part 0: We first show the existence of the moments of $s_t$ and $\dot{s}_t$. To do so, we need to show suitable domination bounds for each $t$; this can be shown by Assumptions (A4) and (A6). □
Part 1: From Equations (3) and (4), it follows that $\mathrm{E}_\theta[s_t \mid \mathcal{F}_{t-1}] = 0$, since the counting series $U^{(i)}_{jk}(l)$ and the innovations $\epsilon_t$ are independent of $\mathcal{F}_{t-1}$. Hence $\{s_t\}$ is a stationary, ergodic martingale difference sequence, and by Part 0 its second moments are finite. Hence, we have, by Lemma B.1 of Drost et al. (Citation2008), the asymptotic normality of $\Delta^{(n)}_\theta$. An application of the Cramér-Wold device concludes the proof of Part 1. □
Part 2: Assumption (A3) implies that, for fixed arguments and for each component, the mapping $\theta \mapsto \ddot{\ell}_\theta$ is continuous. Since we have already proved (9) in Part 0, by Lemma B.3 of Drost et al. (Citation2008), the proof is completed if a law of large numbers holds for the second derivatives. For the thinning part, it follows from Equation (3) that the conditional expectation of the second derivative equals minus the corresponding block of the conditional information, where the key equality follows from the independence between the counting series and the innovations. This result and Lemma B.3 of Drost et al. (Citation2008) yield the convergence of the thinning block. By the same argument, we obtain the convergence of the innovation block. This completes the proof of Part 2. □
Part 3: The proof of the non-singularity of $I(\theta)$ is provided by Assumption (A5), in the same way as Drost et al. (Citation2008). □
Proof of Theorem 2. The proof is the same as that of Theorem 3.2 of Drost et al. (Citation2008). □
To calculate the one-step estimator $\hat{\theta}$, we need to know the score functions with respect to $\alpha_{jk}$ and $\lambda_j$ for each distribution. In what follows, we show the derivation of these functions for Cases 1 and 2 in Section 4.
Derivation of the score for Case 1. For Case 1, the probability functions $G_{jk}(\cdot\,; \alpha_{jk})$ and $F_j(\cdot\,; \lambda_j)$ are the Binomial probability mass functions, for which the derivatives $\partial G_{jk}/\partial \alpha_{jk}$ and $\partial F_j/\partial \lambda_j$ are available in closed form. Substituting these derivatives into Equations (3) and (4) and dividing by the transition probability, we obtain the score components with respect to $\alpha_{jk}$ and $\lambda_j$. □
Derivation of the score for Case 2. For Case 2, the probability functions $G_{jk}(\cdot\,; \alpha_{jk})$ and $F_j(\cdot\,; \lambda_j)$ are the Poisson probability mass functions with means $\alpha_{jk}$ and $\lambda_j$, for which the derivatives $\partial G_{jk}/\partial \alpha_{jk}$ and $\partial F_j/\partial \lambda_j$ are available in closed form. Substituting these derivatives into Equations (3) and (4) and dividing by the transition probability, we obtain the score components with respect to $\alpha_{jk}$ and $\lambda_j$. □