
ABSTRACT

We consider local polynomial estimation for varying coefficient models and derive corresponding equivalent kernels that provide insights into the role of smoothing on the data and fill a gap in the literature. We show that the asymptotic equivalent kernels have an explicit decomposition with three parts: the inverse of the conditional moment matrix of covariates given the smoothing variable, the covariate vector, and the equivalent kernels of univariable local polynomials. We discuss finite-sample reproducing property which leads to zero bias in linear models with interactions between covariates and polynomials of the smoothing variable. By expressing the model in a centered form, equivalent kernels of estimating the intercept function are asymptotically identical to those of univariable local polynomials and estimators of slope functions are local analogues of slope estimators in linear models with weights assigned by equivalent kernels. Two examples are given to illustrate the weighting schemes and reproducing property.


1. Introduction

Motivated by the need to analyse complex data, several flexible regression models have been developed during the last three decades. Among these are the varying coefficient models (Hastie and Tibshirani 1993). They differ from classical linear models in that the regression coefficients are no longer constants but rather functions of a smoothing variable. The model has the form $Y = \sum_{g = 1}^{d} a_{g} (U) X_{g} + ϵ = (1, x^{T}) a (U) + ϵ,$ (1) where $x = (X_{2}, \dots, X_{d})^{T}$ are continuous covariates, $X_{1} \equiv 1$, U is a continuous smoothing variable, $a (U) = (a_{1} (U), \dots, a_{d} (U))^{T}$ is the functional coefficient vector, and $ϵ$ is the error term with $E (ϵ | U, x) = 0$ and $\mathrm{Var} (ϵ | U, x) = σ^{2} (U)$. When d = 1, model (1) reduces to a univariable nonparametric model. If the varying coefficients are constants, i.e. $a_{g} (U) = a_{g}, g = 1, \dots, d$, then (1) is the multiple linear regression model. The model retains general nonparametric characteristics and allows nonlinear interactions between the smoothing variable U and the covariates $x$. Methods of estimating $a (\cdot)$ include the local polynomial approach (Fan and Zhang 1999), smoothing splines (Hastie and Tibshirani 1993; Chiang, Rice, and Wu 2001), and penalised splines (Ruppert, Wand, and Carroll 2003). An overview of the methodology of varying coefficient models is given in Fan and Zhang (2008) and Park, Mammen, Lee, and Lee (2015).

This paper focuses on the local polynomial approach. When d = 1, with local polynomial fitting of pth order, it is well known that the estimators are linear smoothers (linear in the $Y_{i}$'s) and the associated equivalent kernels are available (Fan and Gijbels 1996). The equivalent kernels give insights into the role of kernel smoothing on the data, and the corresponding estimator of $a_{1} (\cdot)$ has a remarkable ‘reproducing’ property (Tsybakov 2009, p. 36): it reproduces polynomials of degree $\leq p$. However, to our knowledge, there are no results in the literature on equivalent kernels for local polynomial estimators in the varying coefficient model (1). In this paper, we fill the gap by deriving asymptotically equivalent kernels for estimating $a_{g} (\cdot), g = 1, \dots, d$ and their derivatives, providing explicit forms of the connections between the equivalent kernels for general d and those for d = 1, and studying an extension of the reproducing property. The contributions of our paper include the following: (i) under some conditions, the asymptotic equivalent kernels corresponding to estimating $a_{g}^{(ν)} (\cdot), g = 1, \dots, d, ν = 0, \dots, p$ in (1) have explicit decomposition forms that connect to those for d = 1 (Theorem 3.2); (ii) the finite-sample equivalent kernels corresponding to estimating $a_{g}^{(ν)} (u), g = 1, \dots, d, ν = 0, \dots, p$ reproduce the νth derivative of polynomials of degree $\leq p$ in u (Proposition 3.1); (iii) with centered covariates, estimators of $a_{k} (\cdot), k = 2, \dots, d$ are local analogues of the slope estimators in linear models (Corollary 3.2).

We start the discussion with local linear fitting (p = 1) and d = 2 in Theorem 3.1, and then extend the results to a general d in (1) with pth order local polynomial fitting in Theorem 3.2. Equations (16) and (26) in Theorems 3.1 and 3.2, respectively, show that there are direct connections between the asymptotic equivalent kernels of (1) and those of univariable local polynomial regression ((1) with d = 1). It may be seen from the second equation in (26) that the asymptotic equivalent kernels are decomposed into three main parts: the inverse of the conditional moment matrix of $(1, x^{T})^{T}$ given $U = u_{0}$, the covariate vector, and the equivalent kernels of univariable local polynomials. The finite-sample reproducing property is given in Proposition 3.1, which leads to zero bias when the true regression mean is a multiple linear model with interactions between $x$ and polynomials of U up to order p. In Section 3.3, we present two corollaries of Theorems 3.1 and 3.2 for the case when $x$ is centered. It turns out that the centered form leads to simpler and more interpretable results: the equivalent kernels for estimating $a_{1}^{(ν)} (\cdot)$ in (1) with a general d are asymptotically identical to those for d = 1, and the estimators of $a_{2}^{(ν)} (\cdot), \dots, a_{d}^{(ν)} (\cdot)$ may be asymptotically expressed in a form analogous to the slope estimators in linear models with weights assigned by equivalent kernels. This interpretation appears to be new in the literature. We conjecture that these equivalent kernel results may be useful for developing methodology when the responses are random objects, i.e. Fréchet regression. Petersen and Müller (2019) propose to utilise the Euclidean local linear weights to fit Fréchet local linear regression. Thus Fréchet varying coefficient models will be an interesting topic for future research.

The article is organised as follows. In Section 2, we summarise the local polynomial approach for estimating (1) (Fan and Zhang 1999) and the equivalent kernels of local polynomial regression (Fan and Gijbels 1996), with the reproducing property stated in Proposition 2.1 (Tsybakov 2009). We present the main results in Section 3 and give two examples in Section 4 to illustrate, respectively, the weighting schemes of equivalent kernels when d = 2 and p = 1, and the reproducing property of equivalent kernels when d = 2 and p = 2. Proofs of Proposition 3.1 and Theorem 3.2 are provided in the Appendix.

2. Background

Consider a random sample $\{(Y_{i}, U_{i}, X_{i 2}, \dots, X_{id}), i = 1, \dots, n\}$ from model (1) and define $X_{i 1} = 1, i = 1, \dots, n$. In this article, we adopt the local polynomial approach (Fan and Gijbels 1996) for estimating the coefficient functions $a_{g} (\cdot), g = 1, \dots, d$ in (1). For $U_{i}$ in a neighbourhood of a grid point $u_{0}$, $a_{g} (U_{i})$ is approximated locally by a polynomial of order p, $\sum_{j = 0}^{p} (a_{g}^{(j)} (u_{0}) / j!) (U_{i} - u_{0})^{j}$, based on a Taylor expansion. Estimation is then carried out by weighted least squares (Fan and Zhang 1999): $\min_{β} \sum_{i = 1}^{n} \left(Y_{i} - \sum_{g = 1}^{d} \left[\sum_{j = 0}^{p} β_{g, j} (U_{i} - u_{0})^{j}\right] X_{ig}\right)^{2} K_{h} (U_{i} - u_{0}),$ (2) where $β = (β_{1, 0}, \dots, β_{1, p}, \dots, β_{d, 0}, \dots, β_{d, p})^{T}$, $K (\cdot)$ is a symmetric probability density function, h is the bandwidth determining the size of the local neighbourhood, and $K_{h} (\cdot) = K (\cdot / h) / h$. Throughout the paper, the dependence of β on $u_{0}$ and h is suppressed if no ambiguity results. Under some conditions, (2) has a unique solution, denoted by $\hat{β} = ({\hat{β}}_{1, 0}, \dots, {\hat{β}}_{1, p}, \dots, {\hat{β}}_{d, 0}, \dots, {\hat{β}}_{d, p})^{T}$. It is clear that ${\hat{β}}_{g, 0} (u_{0})$ estimates the quantity of interest $a_{g} (u_{0})$ and $j! {\hat{β}}_{g, j} (u_{0})$ estimates the jth derivative $a_{g}^{(j)} (u_{0})$. The expression (2) and its solution can be written in matrix notation. Let $X_{u_{0}} = \begin{bmatrix} 1 & \dots & (U_{1} - u_{0})^{p} & \dots & X_{1 d} & \dots & X_{1 d} (U_{1} - u_{0})^{p} \\ ⋮ & & ⋮ & & ⋮ & & ⋮ \\ 1 & \dots & (U_{n} - u_{0})^{p} & \dots & X_{nd} & \dots & X_{nd} (U_{n} - u_{0})^{p} \end{bmatrix},$ $W_{u_{0}}$ be the $n \times n$ diagonal matrix of weights $K_{h} (U_{i} - u_{0})$, $i = 1, \dots, n$, and $y = (Y_{1}, \dots, Y_{n})^{T}$. Then (2) can be expressed as $\min_{β} (y - X_{u_{0}} β)^{T} W_{u_{0}} (y - X_{u_{0}} β)$, which yields $\hat{β} (u_{0}) = (X_{u_{0}}^{T} W_{u_{0}} X_{u_{0}})^{- 1} X_{u_{0}}^{T} W_{u_{0}} y .$ (3) The estimator of $a (u_{0})$ is $\hat{a} (u_{0}) = ({\hat{β}}_{1, 0} (u_{0}), \dots, {\hat{β}}_{d, 0} (u_{0}))^{T} = (I_{d} \otimes e_{p + 1, 1}^{T}) \hat{β} (u_{0}),$ (4) where ⊗ denotes the Kronecker product, $e_{p + 1, k}$ is a column vector of length $(p + 1)$ with 1 at the kth position and 0 elsewhere, and $I_{d}$ is the d-dimensional identity matrix. Let $ζ_{g, ν + 1}$ be a column vector of length $d \times (p + 1)$ with 1 at the $[(g - 1) \times (p + 1) + (ν + 1)]$th position and 0 elsewhere, $g = 1, \dots, d$, $ν = 0, \dots, p$. Then ${\hat{β}}_{g, ν} (u_{0}) = ζ_{g, ν + 1}^{T} \hat{β} (u_{0})$. In other words, if $ζ_{g, ν + 1}$ is partitioned into d groups of length $(p + 1)$, then $ζ_{g, ν + 1}$ indicates the $(ν + 1)$th position in the gth group, and $ζ_{g, ν + 1} = e_{d, g} \otimes e_{p + 1, ν + 1}$. In the special case of d = 1, $ζ_{1, ν + 1} = e_{p + 1, ν + 1}$.
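The matrix form (3)-(4) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the Epanechnikov kernel, the function names `epanechnikov` and `local_poly_vc`, and the simulated data are not from the paper. The check uses noise-free responses whose coefficient functions are linear in U, so a local linear fit should recover them up to floating-point error.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel: a symmetric probability density on [-1, 1] (assumed choice)."""
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def local_poly_vc(y, U, X, u0, h, p=1, kernel=epanechnikov):
    """Solve the weighted least squares problem (2) at u0 via equation (3).

    X is the n x d covariate matrix whose first column is all ones.
    Returns a (d, p+1) array: row g-1 holds (beta_{g,0}, ..., beta_{g,p}),
    so column 0 is the estimate of a(u0) in equation (4).
    """
    n, d = X.shape
    powers = (U - u0)[:, None] ** np.arange(p + 1)           # n x (p+1)
    # Row i of the design matrix is x_tilde_i kron (1, U_i-u0, ..., (U_i-u0)^p)
    design = (X[:, :, None] * powers[:, None, :]).reshape(n, d * (p + 1))
    w = kernel((U - u0) / h) / h                              # K_h(U_i - u0)
    XtW = design.T * w                                        # X^T W without forming W
    beta = np.linalg.solve(XtW @ design, XtW @ y)             # equation (3)
    return beta.reshape(d, p + 1)

# Hypothetical noise-free data: a_1(u) = 1 + 2u, a_2(u) = 3 - u
rng = np.random.default_rng(0)
n = 200
U = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = (1.0 + 2.0 * U) * X[:, 0] + (3.0 - U) * X[:, 1]

beta_hat = local_poly_vc(y, U, X, u0=0.5, h=0.3, p=1)
a_hat = beta_hat[:, 0]    # estimates (a_1(0.5), a_2(0.5))
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse in (3); for ill-conditioned designs a least squares routine would be a reasonable alternative.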

The behaviour of $\hat{β} (u_{0})$ differs depending on whether $u_{0}$ is an interior or a boundary point. In this paper, we consider the case of interior points only. An informative tool for understanding $\hat{β} (u_{0})$ is the equivalent kernel. For d = 1 in model (1), the weight function $W_{ν} (\cdot)$ for ${\hat{β}}_{1, ν} (u_{0})$, i.e. ${\hat{β}}_{1, ν} (u_{0}) = \sum_{i = 1}^{n} W_{ν} ((U_{i} - u_{0}) / h) Y_{i}, ν = 0, \dots, p$, is given as follows (Fan and Gijbels 1996, p. 63): $W_{ν} (t) = e_{p + 1, ν + 1}^{T} S_{n}^{- 1} (1, th, \dots, (th)^{p})^{T} K (t) / h,$ (5) where $S_{n} = X_{u_{0}}^{T} W_{u_{0}} X_{u_{0}}$. For an interior point $u_{0}$, $W_{ν} (\cdot)$ satisfies the following discrete moment conditions (Fan and Gijbels 1996, p. 63 and p. 103): $\sum_{i = 1}^{n} (U_{i} - u_{0})^{q} W_{ν} (\frac{U_{i} - u_{0}}{h}) = δ_{ν, q}, \quad 0 \leq ν, q \leq p,$ (6) where $δ_{ν, q}$ is the indicator of $\{ν = q\}$. For $ν = 0$, the expression (6) is referred to as the ‘reproducing’ property by Tsybakov (2009, p. 36, Proposition 1.12) because ${\hat{β}}_{1, 0} (\cdot)$ reproduces polynomials of degree $\leq p$. Here, we extend Tsybakov's statement to a general $ν = 0, \dots, p$ in the following proposition, which shows that the local polynomial kernel approach has a derivative reproducing property.

Proposition 2.1

Let $P (\cdot)$ be a polynomial of degree $\leq p$. Then (6) implies the reproducing property for $ν = 0, \dots, p$: $ν! \sum_{i = 1}^{n} P (U_{i}) W_{ν} (\frac{U_{i} - u_{0}}{h}) = \frac{d^{ν}}{d u^{ν}} P (u) |_{u = u_{0}} .$ (7)

The proof of Proposition 2.1, given in Tsybakov (2009, pp. 36-37), is straightforward based on (6) and hence is omitted. For $ν = 1, \dots, p$, the reproducing property (7) means that the weight function $ν! W_{ν} (\cdot)$ corresponding to $ν! {\hat{β}}_{1, ν} (\cdot)$ reproduces the νth derivative of a polynomial $P (\cdot)$ of degree $\leq p$. This includes shrinking polynomials of degree $< ν$ to 0.
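The moment conditions (6) and the derivative reproducing property (7) are easy to check numerically for d = 1. The sketch below assumes an Epanechnikov kernel and a simulated uniform design; the helper name `lp_weights` and the test polynomial are illustrative choices, not part of the paper.

```python
import numpy as np
from math import factorial

def lp_weights(U, u0, h, p):
    """Return a (p+1) x n array whose row nu holds W_nu((U_i - u0)/h) from (5)."""
    t = (U - u0) / h
    k = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h   # K_h(U_i - u0)
    D = (U - u0)[:, None] ** np.arange(p + 1)                       # d = 1 design matrix
    S_n = D.T @ (k[:, None] * D)                                    # X^T W X
    return np.linalg.solve(S_n, (k[:, None] * D).T)                 # S_n^{-1} X^T W

rng = np.random.default_rng(0)
U = rng.uniform(0.0, 1.0, 150)
u0, h, p = 0.4, 0.3, 2
W = lp_weights(U, u0, h, p)

# Discrete moment conditions (6): entry (nu, q) should equal delta_{nu, q}
D = (U - u0)[:, None] ** np.arange(p + 1)
moments = W @ D

# Reproducing property (7) for P(u) = 1 - 2u + 3u^2, a polynomial of degree <= p
P = lambda u: 1.0 - 2.0 * u + 3.0 * u**2
reproduced = np.array([factorial(nu) * W[nu] @ P(U) for nu in range(p + 1)])
exact = np.array([P(u0), -2.0 + 6.0 * u0, 6.0])   # P, P', P'' evaluated at u0
```

Here `moments` should agree with the identity matrix and `reproduced` with `exact` up to rounding, whatever the design points, as long as $S_{n}$ is invertible.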

Let $S_{p} = (μ_{i + j})_{0 \leq i, j \leq p}$ with $μ_{i + j}$ being the $(i + j)$th moment of $K (\cdot)$. In an asymptotic form (Fan and Gijbels 1996, p. 64), ${\hat{β}}_{1, ν} (u_{0}) = \frac{1}{n h^{ν + 1} f_{U} (u_{0})} \sum_{i = 1}^{n} K_{ν}^{*} ((U_{i} - u_{0}) / h) Y_{i} (1 + o_{p} (1)), ν = 0, \dots, p$, where $f_{U} (\cdot)$ is the density function of U and $K_{ν}^{*} (t) = e_{p + 1, ν + 1}^{T} S_{p}^{- 1} (1, t, \dots, t^{p})^{T} K (t) = \left(\sum_{j = 0}^{p} t^{j} s^{(ν + 1, j + 1)}\right) K (t)$ (8) with $s^{(i, j)}$ being the $(i, j)$th element of $S_{p}^{- 1}$. The function $K_{ν}^{*} (\cdot)$ is the asymptotic equivalent kernel for ${\hat{β}}_{1, ν}$, satisfying the following property (Fan and Gijbels 1996, p. 64): $\int t^{q} K_{ν}^{*} (t) d t = δ_{ν, q}, \quad 0 \leq ν, q \leq p,$ (9) which is an asymptotic version of (6); that is, $K_{ν}^{*} (\cdot)$ is a kernel of order $(ν, p + 1)$ (see Gasser, Müller, and Mammitzsch 1985 for the definition). It has been shown in Fan and Gijbels (1996) that local polynomials with $p - ν = 1$ outperform those with $p - ν = 0$ asymptotically.
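As a concrete instance of (8) and (9), the following sketch builds $S_{p}$ for the Epanechnikov kernel, an assumed choice whose moments have the closed form $μ_{j} = 3 / ((j + 1)(j + 3))$ for even j and 0 for odd j, and checks the moment property by Gauss-Legendre quadrature. The function names are illustrative only.

```python
import numpy as np

def epan_moment(j):
    """mu_j = int t^j K(t) dt for the Epanechnikov kernel (zero for odd j)."""
    return 0.0 if j % 2 else 3.0 / ((j + 1) * (j + 3))

def equivalent_kernel(nu, p, t):
    """Evaluate K*_nu(t) from equation (8) for the Epanechnikov kernel."""
    t = np.asarray(t, dtype=float)
    S_p = np.array([[epan_moment(i + j) for j in range(p + 1)]
                    for i in range(p + 1)])
    coef = np.linalg.inv(S_p)[nu]          # row nu+1 of S_p^{-1}: s^{(nu+1, j+1)}
    K = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)
    return np.polynomial.polynomial.polyval(t, coef) * K

# Moment property (9): int t^q K*_nu(t) dt = delta_{nu, q}.
# The integrand is a polynomial on [-1, 1], so 20-node quadrature is exact here.
p = 2
nodes, wts = np.polynomial.legendre.leggauss(20)
moments = np.array([[np.sum(wts * nodes**q * equivalent_kernel(nu, p, nodes))
                     for q in range(p + 1)] for nu in range(p + 1)])
```

For p = 1 the same function gives $K_{0}^{*} (t) = K (t)$ and $K_{1}^{*} (t) = μ_{2}^{- 1} t K (t)$, since $S_{1}$ is diagonal for a symmetric kernel.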

In the next section, we derive the equivalent kernels of ${\hat{β}}_{g, ν} (u_{0})$, $g = 1, \dots, d$, $ν = 0, \dots, p$, for the varying coefficient model (1), and investigate their reproducing property and connection to $K_{ν}^{*} (\cdot)$ in (8).

3. Results

3.1. Local linear case with d = 2

For clarity of presentation, we start with the simple case d = 2 and p = 1, i.e. $Y = a_{1} (U) + a_{2} (U) X_{2} + ϵ .$ (10) For a given interior point $u_{0}$, with p = 1, the matrix $S_{n}$ defined below (5) is $S_{n} = \begin{bmatrix} △_{11} & △_{12} \\ △_{21} & △_{22} \end{bmatrix}$, where $△_{jk} = \begin{bmatrix} S_{jk}^{0} & S_{jk}^{1} \\ S_{jk}^{1} & S_{jk}^{2} \end{bmatrix}$ with $S_{jk}^{l} = \sum_{i = 1}^{n} X_{ij} X_{ik} (U_{i} - u_{0})^{l} K_{h} (U_{i} - u_{0})$. Under model (10), based on (3), ${\hat{β}}_{g, ν} (u_{0})$, g = 1, 2 and $ν = 0, 1$, is a linear smoother: ${\hat{β}}_{g, ν} (u_{0}) = \sum_{i = 1}^{n} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; X_{i 2}) Y_{i},$ (11) where $W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; X_{i 2}) = ζ_{g, ν + 1}^{T} S_{n}^{- 1} \left(\begin{bmatrix} 1 \\ X_{i 2} \end{bmatrix} \otimes \begin{bmatrix} 1 \\ U_{i} - u_{0} \end{bmatrix}\right) K_{h} (U_{i} - u_{0}) .$ (12) It is straightforward to show that $W_{ν}^{g} (\cdot; \cdot)$ satisfies the following discrete moment conditions: for $ν, q = 0, 1$ and g = 1, 2, $\begin{aligned} \sum_{i = 1}^{n} (U_{i} - u_{0})^{q} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; X_{i 2}) &= δ_{g, 1} δ_{ν, q}; \\ \sum_{i = 1}^{n} X_{i 2} (U_{i} - u_{0})^{q} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; X_{i 2}) &= δ_{g, 2} δ_{ν, q} . \end{aligned}$ (13) For each combination of $(g, ν)$, Equation (13) provides four moment conditions. For g = 1, $W_{ν}^{1} (\cdot; \cdot)$, $ν = 0, 1$, satisfies the same reproducing property as in Proposition 2.1 by the first equation in (13). In addition, by the second equation in (13), $W_{ν}^{1} (\cdot; \cdot)$ shrinks the covariate terms $X_{i 2}$ and the interaction terms $X_{i 2} U_{i}$ to 0. In other words, the equivalent kernels corresponding to estimating $a_{1}^{(ν)} (\cdot)$ when d = 2 satisfy more moment conditions than (6) in the d = 1 case. For g = 2, we list the properties of $W_{ν}^{2} (\cdot; \cdot)$ below, while the general statement is given in Proposition 3.1 in Section 3.2.

For simplicity, denote $u_{i 0} = (U_{i} - u_{0}) / h$. When g = 2, (13) gives the following results:

1. When $ν = q = 0$, $\sum_{i = 1}^{n} W_{0}^{2} (u_{i 0}; X_{i 2}) = 0$, i.e. shrinking constants to 0, and $\sum_{i = 1}^{n} X_{i 2} W_{0}^{2} (u_{i 0}; X_{i 2}) = 1$, i.e. reproducing the first derivative with respect to $x_{2}$ $(\frac{\partial}{\partial x_{2}} x_{2} = 1)$.
2. When $ν = 0, q = 1$, $\sum_{i = 1}^{n} (U_{i} - u_{0}) W_{0}^{2} (u_{i 0}; X_{i 2}) = 0$ or, equivalently, $\sum_{i = 1}^{n} U_{i} W_{0}^{2} (u_{i 0}; X_{i 2}) = 0$, i.e. shrinking linear terms in $U_{i}$ to 0; $\sum_{i = 1}^{n} X_{i 2} (U_{i} - u_{0}) W_{0}^{2} (u_{i 0}; X_{i 2}) = 0$ or, equivalently, $\sum_{i = 1}^{n} X_{i 2} U_{i} W_{0}^{2} (u_{i 0}; X_{i 2}) = u_{0}$, reproducing the first partial derivative with respect to $x_{2}$ $(\frac{\partial}{\partial x_{2}} (u x_{2}) |_{u_{0}} = u_{0})$.
3. When $ν = 1, q = 0$, $\sum_{i = 1}^{n} W_{1}^{2} (u_{i 0}; X_{i 2}) = 0$ and $\sum_{i = 1}^{n} X_{i 2} W_{1}^{2} (u_{i 0}; X_{i 2}) = 0$, shrinking constants and linear terms in $X_{i 2}$ to 0.
4. When $ν = q = 1$, $\sum_{i = 1}^{n} (U_{i} - u_{0}) W_{1}^{2} (u_{i 0}; X_{i 2}) = 0$ or, equivalently, $\sum_{i = 1}^{n} U_{i} W_{1}^{2} (u_{i 0}; X_{i 2}) = 0$, shrinking linear terms in $U_{i}$ to 0; $\sum_{i = 1}^{n} X_{i 2} (U_{i} - u_{0}) W_{1}^{2} (u_{i 0}; X_{i 2}) = 1$ or, equivalently, $\sum_{i = 1}^{n} X_{i 2} U_{i} W_{1}^{2} (u_{i 0}; X_{i 2}) = 1$, reproducing the mixed partial derivative with respect to both $x_{2}$ and u $(\frac{\partial^{2}}{\partial u \partial x_{2}} (u x_{2}) |_{u_{0}} = 1)$.
It follows from items 1–4 that the finite-sample bias is zero when the true regression mean is $E (Y | U, X_{2}) = b_{0} + b_{1} U + b_{2} X_{2} + b_{3} U X_{2}$.
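The moment conditions (13) and the resulting zero-bias statement can be verified on simulated data. The sketch below is a hedged illustration: the Epanechnikov kernel, the design in which $X_{2}$ depends on U, and the coefficients $b_{0}, \dots, b_{3}$ are all hypothetical choices. The rows of `W` are ordered as $(g, ν) = (1, 0), (1, 1), (2, 0), (2, 1)$, matching the ordering of $\hat{β}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, u0, h = 300, 0.5, 0.25
U = rng.uniform(0.0, 1.0, n)
X2 = 1.0 + U + rng.normal(size=n)        # hypothetical covariate that depends on U

t = (U - u0) / h
k = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h     # K_h(U_i - u0)

# Row i of the design matrix is (1, X_i2) kron (1, U_i - u0), as in (12)
x_tilde = np.column_stack([np.ones(n), X2])
u_vec = np.column_stack([np.ones(n), U - u0])
design = (x_tilde[:, :, None] * u_vec[:, None, :]).reshape(n, 4)

S_n = design.T @ (k[:, None] * design)
W = np.linalg.solve(S_n, (k[:, None] * design).T)   # 4 x n matrix of weights W^g_nu

# Moment conditions (13): W @ design should be the 4 x 4 identity matrix
moment_matrix = W @ design

# Zero bias for the mean b0 + b1*U + b2*X2 + b3*U*X2 (noise-free responses)
b0, b1, b2, b3 = 1.0, -2.0, 0.5, 3.0
mean_y = b0 + b1 * U + b2 * X2 + b3 * U * X2
beta_hat = W @ mean_y
# Targets: (a_1(u0), a_1'(u0), a_2(u0), a_2'(u0))
target = np.array([b0 + b1 * u0, b1, b2 + b3 * u0, b3])
```

Because the mean function lies in the column space of the local design matrix, `beta_hat` should match `target` up to rounding for any bandwidth that keeps $S_{n}$ invertible.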

Next we study the asymptotic forms of $W_{ν}^{g} (\cdot; \cdot)$, $ν = 0, 1$ and g = 1, 2, to understand the local asymptotic behaviour of ${\hat{β}}_{g, ν} (\cdot)$. Let $\tilde{x} = (1, X_{2}, \dots, X_{d})^{T}$, let $f (x_{2}, \dots, x_{d} | u)$ be the conditional density function of $x$ given U = u, and let $Ω_{d} (u)$ be the conditional expectation of $\tilde{x} {\tilde{x}}^{T}$ given U = u, with $(j, k)$th element $r_{jk} (u) = E (X_{j} X_{k} | U = u)$, $j, k = 1, \dots, d$. Mimicking the derivation of (8) in Fan and Gijbels (1996) and using the asymptotic forms of $S_{jk}^{l}$ in Zhang and Lee (2000, equation (5.2)), we obtain $S_{n}^{- 1} = \frac{1}{n f_{U} (u_{0})} \left([Ω_{2} (u_{0})]^{- 1} \otimes H_{1}^{- 1} S_{1}^{- 1} H_{1}^{- 1}\right) \left(1 + O_{p} \left(\frac{\log n}{\sqrt{n h}}\right)\right),$ (14) where $H_{p} = \mathrm{diag} (1, h, \dots, h^{p})$. Note that when $r_{12} (u_{0}) = r_{21} (u_{0}) = E (X_{2} | U = u_{0}) = 0$, $Ω_{2} (u_{0})$ in (14) is a diagonal matrix. The following theorem gives the explicit forms of the asymptotic equivalent kernel of ${\hat{β}}_{g, ν} (u_{0})$ for g = 1, 2 and $ν = 0, 1$, and provides their moment properties.

Theorem 3.1

Consider a random sample $\{(Y_{i}, U_{i}, X_{i 2}), i = 1, \dots, n\}$ from model (10) (d = 2 in (1)) with local linear (p = 1) estimators. For an interior point $u_{0}$, assume that $f (x_{2} | u_{0})$ is bounded away from 0 and ∞ and has compact support. Conditioned on $\{U_{i}, X_{i 2}\}_{i = 1}^{n}$ and under Conditions A in the Appendix, ${\hat{β}}_{g, ν} (u_{0})$, g = 1, 2, $ν = 0, 1$, has the following asymptotic form: ${\hat{β}}_{g, ν} (u_{0}) = \frac{1}{n h^{ν + 1} f_{U} (u_{0})} \sum_{i = 1}^{n} K_{g, ν}^{*} (\frac{U_{i} - u_{0}}{h}; X_{i 2}) Y_{i} (1 + o_{p} (1)),$ (15) where the equivalent kernel is $\begin{aligned} K_{g, ν}^{*} (\frac{U_{i} - u_{0}}{h}; X_{i 2}) &= ζ_{g, ν + 1}^{T} {[Ω_{2} (u_{0}) \otimes S_{1}]}^{- 1} \left(\begin{bmatrix} 1 \\ X_{i 2} \end{bmatrix} \otimes \begin{bmatrix} 1 \\ \frac{U_{i} - u_{0}}{h} \end{bmatrix}\right) K (\frac{U_{i} - u_{0}}{h}) \\ &= e_{2, g}^{T} [Ω_{2} (u_{0})]^{- 1} \begin{bmatrix} 1 \\ X_{i 2} \end{bmatrix} K_{ν}^{*} (\frac{U_{i} - u_{0}}{h}) \\ &= \left(τ^{(g, 1)} (u_{0}) + X_{i 2} τ^{(g, 2)} (u_{0})\right) \left(s^{(ν + 1, 1)} + s^{(ν + 1, 2)} \frac{U_{i} - u_{0}}{h}\right) K (\frac{U_{i} - u_{0}}{h}) \end{aligned}$ (16) with $τ^{(j, l)} (u_{0})$ and $s^{(j, l)}$ the $(j, l)$th elements of $[Ω_{2} (u_{0})]^{- 1}$ and $S_{1}^{- 1}$, respectively. Then for $ν, q = 0, 1$, $\begin{aligned} \int t^{q} \left(\int K_{g, ν}^{*} (t; x_{2}) f (x_{2} | u_{0}) d x_{2}\right) d t &= δ_{g, 1} δ_{ν, q}, \\ \int t^{q} \left(\int x_{2} K_{g, ν}^{*} (t; x_{2}) f (x_{2} | u_{0}) d x_{2}\right) d t &= δ_{g, 2} δ_{ν, q}, \end{aligned}$ (17) which are the asymptotic counterparts of (13).

Theorem 3.1 is a special case (d = 2, p = 1) of Theorem 3.2, and hence its proof follows from that of Theorem 3.2. Expression (16) gives different decomposition forms of $K_{g, ν}^{*} (\cdot; \cdot)$; interpretations of Theorem 3.1 are discussed as follows.

From the first equation in (16), $K_{g, ν}^{*} (\cdot; \cdot)$ consists of three main parts: (i) the inverse of the Kronecker product of the conditional moments $Ω_{2} (u_{0})$ and the kernel moments $S_{1}$, (ii) the Kronecker product of the covariate vector $(1, X_{i 2})^{T}$ and the local linear term $(1, (U_{i} - u_{0}) / h)^{T}$, and (iii) the kernel function $K (\cdot)$.
From the second equation in (16), $K_{g, ν}^{*} (\cdot; \cdot)$ can be rewritten as a product of (i) the gth row of $[Ω_{2} (u_{0})]^{- 1}$, (ii) the covariate vector $(1, X_{i 2})^{T}$, and (iii) the equivalent kernel $K_{ν}^{*} (\cdot)$ in (8) of local linear regression. This expression shows the connection of $K_{g, ν}^{*} (\cdot; x_{2})$ to $K_{ν}^{*} (\cdot)$ in (8).
It is clear that the kernel $ν! K_{1, ν}^{*} (\cdot; \cdot)$ for estimating $a_{1}^{(ν)} (\cdot)$ under model (10) with d = 2 differs from (8) for d = 1. More explicitly, from (16), $\begin{aligned} \hat{β} (u_{0}) &= \sum_{i = 1}^{n} \frac{[\det (Ω_{2} (u_{0}))]^{- 1}}{n f_{U} (u_{0})} \begin{bmatrix} r_{22} (u_{0}) - r_{12} (u_{0}) X_{i 2} \\ h^{- 1} μ_{2}^{- 1} ((U_{i} - u_{0}) / h) (r_{22} (u_{0}) - r_{12} (u_{0}) X_{i 2}) \\ X_{i 2} - r_{21} (u_{0}) \\ h^{- 1} μ_{2}^{- 1} ((U_{i} - u_{0}) / h) (X_{i 2} - r_{21} (u_{0})) \end{bmatrix} \\ &\quad \times K_{h} (U_{i} - u_{0}) Y_{i} (1 + o_{p} (1)) . \end{aligned}$ (18) Based on (18), even when $X_{2}$ and U are independent (so that $r_{12} = E (X_{2})$ and $r_{22} = E (X_{2}^{2})$ are free of $u_{0}$), $K_{1, ν}^{*} (\cdot; \cdot)$ still involves the $X_{i 2}$'s. Thus a sufficient condition for the equivalent kernels $K_{1, ν}^{*} (\cdot; \cdot)$ to be identical to those in (8) for d = 1 is $r_{12} (\cdot) \equiv 0$; that is, given U, $X_{2}$ has a conditional mean of 0.
For g = 1, 2, the equivalent kernels for estimating the first derivative $a_{g}^{'} (\cdot)$ are connected to those for estimating $a_{g} (\cdot)$: $K_{g, 1}^{*} (t; \cdot) = μ_{2}^{- 1} t K_{g, 0}^{*} (t; \cdot)$. This is analogous to $K_{1}^{*} (t) = μ_{2}^{- 1} t K_{0}^{*} (t)$ when p = 1 in (8).
Equation (17) involves the conditional density $f (x_{2} | u)$ and shows the moment property of $K_{g, ν}^{*} (t; x_{2})$ with respect to the conditional density when estimating $a_{1}^{(ν)} (\cdot)$ and $a_{2}^{(ν)} (\cdot)$ in (10).
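The equality of the first two forms in (16) is an instance of the mixed-product property of the Kronecker product, and it can be checked numerically. In the sketch below the values of $r_{12}$ and $r_{22}$ are hypothetical, the Epanechnikov kernel (with $μ_{2} = 1/5$) is an assumed choice, and the function names are illustrative.

```python
import numpy as np

MU2 = 0.2                      # second moment of the Epanechnikov kernel
S1 = np.diag([1.0, MU2])       # S_1 for a symmetric kernel (mu_0 = 1, mu_1 = 0)

def K(t):
    """Epanechnikov kernel evaluated at a scalar t."""
    return 0.75 * (1.0 - t**2) if abs(t) <= 1.0 else 0.0

def kstar_kron(g, nu, t, x2, Omega):
    """First line of (16): Kronecker form of the equivalent kernel."""
    zeta = np.kron(np.eye(2)[g - 1], np.eye(2)[nu])   # e_{2,g} kron e_{2,nu+1}
    v = np.kron([1.0, x2], [1.0, t])                  # (1, x2)^T kron (1, t)^T
    return zeta @ np.linalg.solve(np.kron(Omega, S1), v) * K(t)

def kstar_product(g, nu, t, x2, Omega):
    """Second line of (16): covariate part times the univariable kernel K*_nu."""
    covariate_part = np.linalg.inv(Omega)[g - 1] @ np.array([1.0, x2])
    k_nu = np.linalg.inv(S1)[nu] @ np.array([1.0, t]) * K(t)
    return covariate_part * k_nu

# Hypothetical conditional moments at u0: r_12 = 0.7, r_22 = 1.9
Omega = np.array([[1.0, 0.7], [0.7, 1.9]])
# Centered covariate, r_12 = 0: the intercept kernel no longer depends on x2
Omega_centered = np.array([[1.0, 0.0], [0.0, 1.9]])

value = kstar_kron(1, 0, 0.3, 2.0, Omega)
```

With `Omega_centered`, `kstar_product(1, nu, t, x2, Omega_centered)` reduces to the univariable kernel $K_{ν}^{*} (t)$ for every value of `x2`, which illustrates the sufficient condition $r_{12} (\cdot) \equiv 0$ discussed above.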

Next we discuss the decomposition and reproducing property of equivalent kernels for model (1) with general d and p.

3.2. The case with general d and p

Results in Theorem 3.1 and the reproducing property (13) are extended to general d and p in this subsection. For ease of notation, let $u_{i 0} = (1, U_{i} - u_{0}, \dots, (U_{i} - u_{0})^{p})^{⊤}$ and $X = [\begin{matrix} 1 & X_{12} & \dots & X_{1 d} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & X_{n 2} & \dots & X_{nd} \end{matrix}] = [\begin{matrix} {\tilde{x}}_{(1)}^{T} \\ ⋮ \\ {\tilde{x}}_{(n)}^{T} \end{matrix}] = [\begin{matrix} 1 & x_{(1)}^{T} \\ ⋮ & ⋮ \\ 1 & x_{(n)}^{T} \end{matrix}],$ (19) where ${\tilde{x}}_{(i)}^{⊤}$ is the ith row vector of $X$ and $x_{(i)} = (X_{i 2}, \dots, X_{id})^{T}$ excludes the intercept. For $0 \leq ν \leq p$ and $1 \leq g \leq d$, ${\hat{β}}_{g, ν} (u_{0})$ is written as ${\hat{β}}_{g, ν} (u_{0}) = \sum_{i = 1}^{n} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) Y_{i},$ (20) where the weight function $W_{ν}^{g} (\cdot; x_{(i)}^{⊤})$ is $W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) = ζ_{g, ν + 1}^{T} S_{n}^{- 1} ({\tilde{x}}_{(i)} \otimes u_{i 0}) K_{h} (U_{i} - u_{0})$ (21) with $ζ_{g, ν + 1}$ defined in Section 2. Based on (21), we show in the Appendix that $W_{ν}^{g} (\cdot; x_{(i)}^{T})$ enjoys the following property analogous to (13): for $0 \leq ν, q \leq p$, $2 \leq k \leq d$, and $1 \leq g \leq d$, $\begin{aligned} \sum_{i = 1}^{n} (U_{i} - u_{0})^{q} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) & = δ_{g, 1} δ_{ν, q}, \\ \sum_{i = 1}^{n} X_{ik} (U_{i} - u_{0})^{q} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) & = δ_{g, k} δ_{ν, q} . \end{aligned}$ (22) We state the reproducing property of $W_{ν}^{g} (\cdot; x_{(i)}^{T})$ formally in the following Proposition.

Proposition 3.1

Let $P (\cdot)$ be a polynomial of degree $\leq p$ and $Q (\cdot, x_{k}) = x_{k} P (\cdot)$, $k = 2, \dots, d$. Then (22) implies the reproducing property, for $ν = 0, \dots, p$: $ν! \sum_{i = 1}^{n} P (U_{i}) W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) = δ_{g, 1} (\frac{d^{ν}}{d u^{ν}} P (u) |_{u = u_{0}});$ (23) and, for $k = 2, \dots, d$, $ν! \sum_{i = 1}^{n} Q (U_{i}; X_{ik}) W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) = δ_{g, k} \frac{\partial}{\partial x_{k}} (\frac{\partial^{ν}}{\partial u^{ν}} Q (u, x_{k}) |_{u = u_{0}}) .$ (24)

The outline of the proof of Proposition 3.1 is given in the Appendix. Proposition 3.1 shows that the equivalent kernel $ν! W_{ν}^{1} (\cdot; \cdot)$ for estimating $a_{1}^{(ν)} (\cdot)$ reproduces the νth derivative of polynomials of the $U_{i}$'s with degree $\leq p$, while shrinking the $Q (U_{i}; X_{ik})$'s to 0. Moreover, based on (24), the equivalent kernel $ν! W_{ν}^{k} (\cdot; \cdot)$, $k = 2, \dots, d$, for estimating $a_{k}^{(ν)} (\cdot)$ reproduces the νth derivative of $a_{k} (\cdot)$ when $a_{k} (\cdot)$ is a polynomial with degree $\leq p$. These results imply that the finite-sample bias when estimating $E (Y | U, X_{2}, \dots, X_{d}) = \sum_{j = 0}^{p} b_{1 j} U^{j} + \sum_{k = 2}^{d} \sum_{j = 0}^{p} b_{kj} U^{j} X_{k}$ is zero, and that the polynomial reproducing property in Proposition 2.1 with d = 1 remains valid for the interaction terms between the covariates $X_{k}$ and pth order polynomials of U under (1) with a general d.
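The finite-sample property (22) and the resulting zero bias can be verified directly on simulated data. The following is a minimal sketch under stated assumptions: $S_{n}$ is taken to be the kernel-weighted Gram matrix $\sum_{i} ({\tilde{x}}_{(i)} \otimes u_{i 0}) ({\tilde{x}}_{(i)} \otimes u_{i 0})^{⊤} K_{h} (U_{i} - u_{0})$, $ζ_{g, ν + 1}$ is assumed to select entry $(g - 1) (p + 1) + ν + 1$, and the data-generating choices and function names are hypothetical.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 (1 - t^2) on [-1, 1]."""
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def local_design(U, X, u0, p):
    """Return x_tilde (n x d) and Z, whose i-th row is x_tilde_i kron u_i0."""
    n = len(U)
    x_tilde = np.column_stack([np.ones(n), X])
    u_pow = np.vander(U - u0, N=p + 1, increasing=True)   # columns (U_i - u0)^q
    Z = (x_tilde[:, :, None] * u_pow[:, None, :]).reshape(n, -1)
    return x_tilde, Z

def weight_matrix(U, X, u0, h, p):
    """Row (g-1)(p+1)+nu holds W_nu^g(.; x_(i)) for i = 1..n, following (21)."""
    _, Z = local_design(U, X, u0, p)
    k_h = epanechnikov((U - u0) / h) / h
    S_n = Z.T @ (k_h[:, None] * Z)          # assumed form of S_n
    return np.linalg.solve(S_n, Z.T * k_h), Z

rng = np.random.default_rng(0)
n, d, p, u0, h = 400, 3, 2, 0.5, 0.2
U = rng.uniform(0.0, 1.0, n)
X = rng.normal(size=(n, d - 1)) + U[:, None]   # covariates may depend on U

# Property (22): summing the weights against each design column gives deltas.
W, Z = weight_matrix(U, X, u0, h, p)
reproducing_error = np.max(np.abs(W @ Z - np.eye(d * (p + 1))))

# Zero bias: noise-free response from a polynomial-interaction model with
# arbitrary coefficients b[g, j] multiplying U^j X_g (X_1 = 1).
b = rng.normal(size=(d, p + 1))
x_tilde, _ = local_design(U, X, u0, p)
a_of_U = np.vander(U, N=p + 1, increasing=True) @ b.T   # (n, d): a_g(U_i)
Y = np.sum(x_tilde * a_of_U, axis=1)
beta_hat = (W @ Y).reshape(d, p + 1)                    # beta_hat[g-1, nu]
bias = beta_hat[:, 0] - b @ (u0 ** np.arange(p + 1))    # estimate minus a_g(u0)

print(reproducing_error, np.max(np.abs(bias)))
```

Both printed quantities should be at the level of floating-point rounding, whatever the bandwidth, as long as the local design matrix has full column rank.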

Theorem 3.2 below gives the decomposition and moment property of asymptotic equivalent kernels for general d and p, which is an extension of Theorem 3.1.

Theorem 3.2

Consider a random sample $\{(Y_{i}, U_{i}, x_{(i)}^{T}), i = 1, \dots, n\}$ from model (1) with local pth order polynomial estimators, $p \geq 0$. For an interior point $u_{0}$, assume that $f (x_{2}, \dots, x_{d} | u_{0})$ is bounded away from 0 and ∞ and has a compact support. Conditioned on $\{U_{i}, x_{(i)}^{T}\}_{i = 1}^{n}$ and under Conditions A in the Appendix, ${\hat{β}}_{g, ν} (u_{0})$ has the following asymptotic form for $0 \leq ν \leq p$ and $1 \leq g \leq d$: ${\hat{β}}_{g, ν} (u_{0}) = \frac{1}{n h^{ν + 1} f_{U} (u_{0})} \sum_{i = 1}^{n} K_{g, ν}^{*} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) Y_{i} (1 + o_{p} (1)),$ (25) where the equivalent kernel is $\begin{aligned} K_{g, ν}^{*} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) & = ζ_{g, ν + 1}^{T} (Ω_{d} (u_{0}) \otimes S_{p})^{- 1} ({\tilde{x}}_{(i)} \otimes H_{p}^{- 1} u_{i 0}) K (\frac{U_{i} - u_{0}}{h}) \\ & = e_{d, g}^{T} [Ω_{d} (u_{0})]^{- 1} {\tilde{x}}_{(i)}^{T} K_{ν}^{*} (\frac{U_{i} - u_{0}}{h}) \\ & = (τ^{(g, 1)} (u_{0}) + \sum_{j = 2}^{d} X_{ij} τ^{(g, j)} (u_{0})) (\sum_{l = 0}^{p} {(\frac{U_{i} - u_{0}}{h})}^{l} s^{(ν + 1, l + 1)}) K (\frac{U_{i} - u_{0}}{h}) \end{aligned}$ (26) with $τ^{(j, l)} (u_{0})$ and $s^{(j, l)}$ the $(j, l)$th elements of $[Ω_{d} (u_{0})]^{- 1}$ and $S_{p}^{- 1}$ respectively. The moment property of $K_{g, ν}^{*} (t; x_{2}, \dots, x_{d})$ is given below: $\begin{aligned} \int t^{q} (\int \dots \int K_{g, ν}^{*} (t; x_{2}, \dots, x_{d}) f (x_{2}, \dots, x_{d} | u_{0}) d x_{2} \dots d x_{d}) d t & = δ_{g, 1} δ_{ν, q}, \\ \int t^{q} (\int \dots \int x_{k} K_{g, ν}^{*} (t; x_{2}, \dots, x_{d}) f (x_{2}, \dots, x_{d} | u_{0}) d x_{2} \dots d x_{d}) d t & = δ_{g, k} δ_{ν, q}, \end{aligned}$ (27) where $k = 2, \dots, d$, and $q = 0, \dots, p$. Equation (27) is the asymptotic counterpart of (22) and contains $(p + 1) d$ conditions for each combination of $(g, ν)$.

The proof of Theorem 3.2 is given in the Appendix. From (26), the equivalent kernel for ${\hat{β}}_{g, ν} (u_{0})$ has decomposition forms analogous to the case of d = 2 and p = 1 in Theorem 3.1, while (26) involves higher orders of local polynomials and more covariates. Some interpretations of (26) are given below:

The second equation in (26) shows that $K_{g, ν}^{*} (\cdot; \cdot)$ consists of three parts: (i) $[Ω_{d} (u_{0})]^{- 1}$, the inverse of the conditional moment matrix of $\tilde{x}$ given $U = u_{0}$, (ii) the covariate vector ${\tilde{x}}_{(i)}^{⊤}$, and (iii) the equivalent kernels $K_{ν}^{*} (\cdot)$ in (8) of univariable local polynomials.
The second equation in (26) not only gives an explicit connection between $K_{g, ν}^{*} (\cdot; \cdot)$ and $K_{ν}^{*} (\cdot)$, but also shows that the product $[Ω_{d} (u_{0})]^{- 1} {\tilde{x}}_{(i)}^{T}$ is analogous to the form of slopes, $(X^{⊤} X)^{- 1} {\tilde{x}}_{(i)}^{T}$, in classical linear models.
Based on the first equation in (26), one alternative form of $\hat{β} (u_{0})$ gives another connection to the case of d = 1: $\begin{aligned} \hat{β} (u_{0}) & = \sum_{i = 1}^{n} \frac{1}{n f_{U} (u_{0})} (([Ω_{d} (u_{0})]^{- 1} {\tilde{x}}_{(i)}) \otimes [H_{p}^{- 1} S_{p}^{- 1} H_{p}^{- 1} u_{i 0}]) K_{h} (U_{i} - u_{0}) Y_{i} \\ & \quad \times (1 + o_{p} (1)), \end{aligned}$ which reduces to the case in Section 2 when d = 1.
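The step from the first to the second line of (26) rests on the Kronecker identity $(A \otimes B)^{- 1} (a \otimes b) = (A^{- 1} a) \otimes (B^{- 1} b)$. The sketch below checks this numerically for d = 3 and p = 2. It is an illustration only: the matrix `Omega` is a made-up conditional moment matrix, $H_{p}$ is assumed to be $diag (1, h, \dots, h^{p})$ so that $H_{p}^{- 1} u_{i 0} = (1, t, \dots, t^{p})^{⊤}$ with $t = (U_{i} - u_{0}) / h$, and the function names are hypothetical.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 (1 - t^2) on [-1, 1]."""
    return 0.75 * (1.0 - t**2) if abs(t) <= 1.0 else 0.0

def moment_matrix(p):
    """S_p with entries mu_{j+l} for the Epanechnikov kernel (assumed definition)."""
    mu = lambda j: 0.0 if j % 2 else 0.75 * (2.0 / (j + 1) - 2.0 / (j + 3))
    return np.array([[mu(j + l) for l in range(p + 1)] for j in range(p + 1)])

def kernel_via_kronecker(t, x_tilde, Omega, g, nu, p):
    """First line of (26): zeta^T (Omega kron S_p)^{-1} (x_tilde kron t_vec) K(t)."""
    t_vec = t ** np.arange(p + 1)
    zeta = np.zeros(len(x_tilde) * (p + 1))
    zeta[(g - 1) * (p + 1) + nu] = 1.0      # assumed ordering of the entries
    big = np.kron(Omega, moment_matrix(p))
    return (zeta @ np.linalg.solve(big, np.kron(x_tilde, t_vec))) * epanechnikov(t)

def kernel_via_product(t, x_tilde, Omega, g, nu, p):
    """Second line of (26): (e_g^T Omega^{-1} x_tilde) times K*_nu(t)."""
    t_vec = t ** np.arange(p + 1)
    covariate_part = np.linalg.solve(Omega, x_tilde)[g - 1]
    smoothing_part = np.linalg.solve(moment_matrix(p), t_vec)[nu] * epanechnikov(t)
    return covariate_part * smoothing_part

# Hypothetical Omega_3(u0) = E(x_tilde x_tilde^T | U = u0) and one observation.
Omega = np.array([[1.0, 0.2, -0.1],
                  [0.2, 1.0, 0.3],
                  [-0.1, 0.3, 0.8]])
x_tilde = np.array([1.0, 0.7, -1.2])
p = 2

max_gap = max(
    abs(kernel_via_kronecker(t, x_tilde, Omega, g, nu, p)
        - kernel_via_product(t, x_tilde, Omega, g, nu, p))
    for t in (-0.8, -0.3, 0.0, 0.4, 0.9)
    for g in (1, 2, 3)
    for nu in range(p + 1)
)
print(max_gap)
```

The factorised form only needs a d × d and a (p + 1) × (p + 1) solve, instead of one d(p + 1) × d(p + 1) solve, which is what makes the three-part reading of (26) convenient in practice.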

3.3. Centering covariates

For classical linear models, centering covariates is useful in interpreting the effects of covariates, and the slope estimators via least squares are the same with or without centering. In this subsection, we explore an analogous centered form for (1) and show that the resulting asymptotic equivalent kernels for estimating $a_{1} (\cdot)$ are identical to $K_{ν}^{*} (\cdot)$ in (8) for the d = 1 case. Moreover, the varying coefficient model may be interpreted as a locally multiple linear model with interactions.

Let ${\bar{X}}_{k}$ be the sample mean of the kth covariate $\{X_{ik}\}_{i = 1}^{n}$, $k = 2, \dots, d$. Then rewrite model (1) in terms of centered covariates for the ith observation, $i = 1, \dots, n$, $Y_{i} = (a_{1} (U_{i}) + {\bar{X}}_{2} a_{2} (U_{i}) + \dots + {\bar{X}}_{d} a_{d} (U_{i})) + \sum_{k = 2}^{d} a_{k} (U_{i}) (X_{ik} - {\bar{X}}_{k}) + ϵ_{i} .$ (28) It is straightforward to observe that the coefficient functions $a_{k} (\cdot), k = 2, \dots, d$, are the same whether the covariates are centered or not, while the intercept function $a_{1} (\cdot)$ will be different. For ease of notation, define $X_{ik}^{c} \equiv X_{ik} - {\bar{X}}_{k}$, $k = 2, \dots, d$, $[\begin{matrix} X_{12} - {\bar{X}}_{2} & \dots & X_{1 d} - {\bar{X}}_{d} \\ ⋮ & ⋱ & ⋮ \\ X_{n 2} - {\bar{X}}_{2} & \dots & X_{nd} - {\bar{X}}_{d} \end{matrix}] \equiv [\begin{matrix} X_{12}^{c} & \dots & X_{1 d}^{c} \\ ⋮ & ⋱ & ⋮ \\ X_{n 2}^{c} & \dots & X_{nd}^{c} \end{matrix}] \equiv [\begin{matrix} (x_{(1)}^{c})^{⊤} \\ ⋮ \\ (x_{(n)}^{c})^{⊤} \end{matrix}],$ (29) and let $M_{xx} (u)$ be the $(d - 1) \times (d - 1)$ matrix with $(j - 1, k - 1)$th element $E (X_{ij}^{c} X_{ik}^{c} | u) \equiv r_{jk}^{c} (u)$, $j, k = 2, \dots, d$. When the conditional density of $X_{k}$ given U = u is well defined, $E (X_{ik} | u)$ is a function of u and $E (X_{ik}^{c} | u) = 0$. The following Corollary for d = 2 and p = 1 is a special case of Theorem 3.1 either when the $X_{i 2}$'s are centered or when $E (X_{2} | U = u) = r_{12} (u) = 0$.

Corollary 3.1

Under the conditions in Theorem 3.1, results (a) and (b) below hold when $X_{i 2}$ 's are centered.

(a)	$K_{1, ν}^{*} (\cdot)$, $ν = 0, 1$, is identical to $K_{ν}^{*} (\cdot)$ in (8) and does not involve the $X_{i 2}^{c}$'s;
(b)	the form of $K_{2, ν}^{*} (\cdot; \cdot)$, $ν = 0, 1$, becomes simpler: $\begin{aligned} K_{2, 0}^{*} (\frac{U_{i} - u_{0}}{h}; X_{i 2}^{c}) & = (\frac{X_{i 2}^{c}}{r_{22}^{c} (u_{0})}) K (\frac{U_{i} - u_{0}}{h}), \\ K_{2, 1}^{*} (\frac{U_{i} - u_{0}}{h}; X_{i 2}^{c}) & = (\frac{X_{i 2}^{c}}{r_{22}^{c} (u_{0})}) K_{1}^{*} (\frac{U_{i} - u_{0}}{h}) \\ & = (\frac{X_{i 2}^{c}}{r_{22}^{c} (u_{0})}) (\frac{U_{i} - u_{0}}{h μ_{2}}) K (\frac{U_{i} - u_{0}}{h}) \\ & = (\frac{U_{i} - u_{0}}{h μ_{2}}) K_{2, 0}^{*} (\frac{U_{i} - u_{0}}{h}; X_{i 2}^{c}) . \end{aligned}$ (30)
(c)	When $E (X_{2} | u_{0}) = 0$, the results in (a) and (b) hold without centering of the $X_{i 2}$'s, and $X_{i 2}^{c}$ in (30) can be replaced by $X_{i 2}$.

Corollary 3.1 shows that with the $X_{i 2}^{c}$'s, the equivalent kernels $K_{2, ν}^{*} (\cdot; \cdot)$, $ν = 0, 1$, corresponding to ${\hat{β}}_{2, 0} (u_{0})$ and ${\hat{β}}_{2, 1} (u_{0})$ respectively, involve a factor $X_{i 2}^{c} / r_{22}^{c} (u_{0})$. In addition, when $X_{2}$ and U are independent, $r_{22}^{c}$ is a constant free of $u_{0}$, and further adopting standardised $X_{i 2}$ in (28) by $X_{i 2}^{std} = X_{i 2}^{c} / \sqrt{r_{22}^{c}}$, $i = 1, \dots, n$, leads to $K_{2, ν}^{*} ((U_{i} - u_{0}) / h; X_{i 2}^{std}) = X_{i 2}^{std} K_{1, ν}^{*} ((U_{i} - u_{0}) / h)$, $ν = 0, 1$.
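The remark following (28), that centering leaves the slope functions unchanged and only alters the intercept function, also holds exactly for the finite-sample local polynomial fit, because the centered and uncentered local designs span the same column space. A minimal sketch for d = 2 follows; the simulated model and the helper `fit_local_poly_vcm` are hypothetical and only serve to illustrate the algebra.

```python
import numpy as np

def fit_local_poly_vcm(U, X, Y, u0, h, p):
    """Kernel-weighted least squares fit of a local p-th order polynomial
    varying coefficient model; returns beta_hat as a (d, p+1) array whose
    row g-1 holds beta_hat_{g,0}, ..., beta_hat_{g,p}."""
    n = len(U)
    x_tilde = np.column_stack([np.ones(n), X])
    u_pow = np.vander(U - u0, N=p + 1, increasing=True)
    Z = (x_tilde[:, :, None] * u_pow[:, None, :]).reshape(n, -1)
    t = (U - u0) / h
    k_h = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h  # Epanechnikov
    sw = np.sqrt(k_h)
    beta, *_ = np.linalg.lstsq(Z * sw[:, None], Y * sw, rcond=None)
    return beta.reshape(x_tilde.shape[1], p + 1)

rng = np.random.default_rng(1)
n, u0, h, p = 300, 0.5, 0.2, 1
U = rng.uniform(0.0, 1.0, n)
X2 = 1.0 + U + rng.normal(size=n)                 # covariate with nonzero mean
Y = np.sin(2 * np.pi * U) + (1.0 + U**2) * X2 + 0.2 * rng.normal(size=n)

beta_raw = fit_local_poly_vcm(U, X2, Y, u0, h, p)
beta_centered = fit_local_poly_vcm(U, X2 - X2.mean(), Y, u0, h, p)

# Slopes (g = 2) agree; the intercept row absorbs X2_bar times the slope row,
# matching the intercept function a_1(U) + X2_bar a_2(U) in (28).
slope_gap = np.max(np.abs(beta_centered[1] - beta_raw[1]))
intercept_gap = np.max(np.abs(beta_centered[0]
                              - (beta_raw[0] + X2.mean() * beta_raw[1])))
print(slope_gap, intercept_gap)
```

Both gaps should be at rounding level. The asymptotic statements in Corollary 3.1 concern the form of the equivalent kernels and are a separate matter from this exact reparameterisation.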

Let us further explore Corollary 3.1 by centering the $Y_{i}$'s as well, denoted by $Y_{i}^{c}$. With centered observations $\{(Y_{i}^{c}, U_{i}, X_{i 2}^{c}), i = 1, \dots, n\}$, and based on (30), ${\hat{β}}_{2, 0} (u_{0}) = [\frac{n^{- 1} \sum_{i = 1}^{n} X_{i 2}^{c} Y_{i}^{c} K_{h} (U_{i} - u_{0})}{r_{22}^{c} (u_{0}) f_{U} (u_{0})}] (1 + o_{p} (1)) .$ (31) The denominator in (31) can be viewed as a local variance of $X_{2}$ at $u_{0}$, while the numerator in (31) can be interpreted as the locally weighted sample covariance between the $X_{i 2}$'s and $Y_{i}$'s with weights assigned by $K_{h} (U_{i} - u_{0})$ around $u_{0}$, denoted by ${\hat{Cov}}_{u_{0}} (X_{2}, Y)_{K}$. Hence ${\hat{β}}_{2, 0} (u_{0})$ may be interpreted as ${\hat{Cov}}_{u_{0}} (X_{2}, Y)_{K} / [\hat{Var} (X_{2} | U = u_{0}) f_{U} (u_{0})]$. This enhances the interpretation of ${\hat{β}}_{2, 0} (u_{0})$ for estimating $a_{2} (u_{0})$ and presents a local analogue of the slope in simple linear regression. From (30), ${\hat{β}}_{2, 1} (u_{0})$ for estimating $a_{2}^{'} (\cdot)$ has an analogous interpretation as ${\hat{Cov}}_{u_{0}} (X_{2}, Y)_{K_{1}^{*}} / [\hat{Var} (X_{2} | U = u_{0}) f_{U} (u_{0})]$ with weights assigned by $K_{1}^{*} (\cdot)$. When d = 2 and p = 1, (28) can thus be interpreted as a locally multiple linear model with interactions, since $E (Y_{i}^{c} | u_{0}) \approx {\hat{β}}_{1, 0} (u_{0}) + {\hat{β}}_{1, 1} (u_{0}) (U_{i} - u_{0}) + {\hat{β}}_{2, 0} (u_{0}) X_{i 2}^{c} + {\hat{β}}_{2, 1} (u_{0}) X_{i 2}^{c} (U_{i} - u_{0})$.
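The reading of (31) as a local covariance divided by a local variance can be tried on simulated data. The sketch below is an illustration under assumptions that are not part of the derivation above: the covariate design is borrowed from Example 4.1, the coefficient functions $a_{1} (u) = 1 + 2 u$ and $a_{2} (u) = 1 + u$ and the noise level are made up, and the true values $r_{22} (u_{0})$ and $f_{U} (u_{0}) = 1$ are plugged into the denominator.

```python
import numpy as np

rng = np.random.default_rng(2024)
n, u0, h = 50_000, 0.5, 0.1

# Design as in Example 4.1: U ~ Uniform(0, 1), X2 | U = u ~ Uniform(-(1+u)/2, (1+u)/2).
U = rng.uniform(0.0, 1.0, n)
half_width = (1.0 + U) / 2.0
X2 = rng.uniform(-half_width, half_width)

# Hypothetical coefficient functions and noise level.
a1 = lambda u: 1.0 + 2.0 * u
a2 = lambda u: 1.0 + u
Y = a1(U) + a2(U) * X2 + 0.1 * rng.normal(size=n)

t = (U - u0) / h
K_h = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h   # Epanechnikov

Xc = X2 - X2.mean()
Yc = Y - Y.mean()

# Asymptotic form (31): local covariance over local variance times density.
r22_u0 = (1.0 + u0) ** 2 / 12.0     # Var(X2 | U = u0) for this design
f_U_u0 = 1.0                        # density of Uniform(0, 1) at u0
local_cov = np.mean(Xc * Yc * K_h)
beta20_asymptotic = local_cov / (r22_u0 * f_U_u0)

# Exact local linear fit (d = 2, p = 1) for comparison.
Z = np.column_stack([np.ones(n), U - u0, Xc, Xc * (U - u0)])
sw = np.sqrt(K_h)
coef, *_ = np.linalg.lstsq(Z * sw[:, None], Yc * sw, rcond=None)
beta20_exact = coef[2]

print(a2(u0), beta20_exact, beta20_asymptotic)
```

The two estimates should both be close to $a_{2} (u_{0})$, with the covariance-ratio form noticeably noisier than the exact fit, since it replaces $S_{n}$ by its limit and does not adjust for the local intercept.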

We now present a Corollary of Theorem 3.2 that extends the results in Corollary 3.1 to the case with general d and p.

Corollary 3.2

Under the conditions in Theorem 3.2, with centered covariates ${X_{ik}^{c}, i = 1, \dots, n, k = 2, \dots, d}$ ,

(a)	the asymptotic equivalent kernel $K_{1, ν}^{*} (\cdot)$ corresponding to estimating $a_{1}^{(ν)} (\cdot)$ is identical to $K_{ν}^{*} (\cdot)$ in (8), $ν = 0, \dots, p$;
(b)	the asymptotic equivalent kernel $K_{g, ν}^{*} (\cdot; \cdot)$, $g = 2, \dots, d$, $ν = 0, \dots, p$, possesses a simpler form, $K_{g, ν}^{*} (\frac{U_{i} - u_{0}}{h}; (x_{(i)}^{c})^{⊤}) = e_{d, g}^{T} [M_{xx} (u_{0})]^{- 1} (x_{(i)}^{c})^{⊤} K_{ν}^{*} (\frac{U_{i} - u_{0}}{h}) .$ (32)
(c)	Suppose that the $Y_{i}$'s are centered as $Y_{i}^{c}$'s. Then for $ν = 0, \dots, p,$ $({\hat{β}}_{2, ν} (u_{0}), \dots, {\hat{β}}_{d, ν} (u_{0}))^{⊤} = n^{- 1} h^{- ν} [f_{U} (u_{0}) M_{xx} (u_{0})]^{- 1} {\hat{Cov}}_{u_{0}} (x, Y)_{K_{ν}^{*}} (1 + o_{p} (1)) .$ (33)

Corollary 3.2(a) shows that with centered covariates, the asymptotic equivalent kernels corresponding to estimating $a_{1}^{(ν)} (\cdot)$ with $d \geq 2$, $K_{1, ν}^{*} (\cdot)$, $ν = 0, \dots, p$, are identical to $K_{ν}^{*} (\cdot)$ in (8) with d = 1. The expression (33) in the case of $ν = 0$, $({\hat{β}}_{2, 0} (u_{0}), \dots, {\hat{β}}_{d, 0} (u_{0}))^{⊤}$, presents a local analogue of the slope estimators in multiple linear regression, since $M_{xx} (u_{0})$ is approximately the conditional variance matrix of $x$ given $U = u_{0}$. The derivative terms $({\hat{β}}_{2, ν} (u_{0}), \dots, {\hat{β}}_{d, ν} (u_{0}))^{⊤}$, $ν = 1, \dots, p$, could be interpreted as $[M_{xx} (u_{0})]^{- 1} {\hat{Cov}}_{u_{0}} (x, Y)_{K_{ν}^{*}}$ asymptotically through equivalent kernels $K_{ν}^{*} (\cdot)$ of d = 1. We conjecture that these interpretations may be useful to develop methodology for Fréchet regression (Petersen and Müller 2019) when responses are random objects in a metric space. Petersen and Müller propose to adopt Euclidean local linear weights to fit Fréchet local linear regression, and in a similar approach, the equivalent kernels in Theorem 3.2 and Corollary 3.2 may be utilised to develop Fréchet varying coefficient models.

4. Examples

In Example 4.1, we demonstrate the weighting schemes of equivalent kernels in (13) and Corollary 3.1 when d = 2 and p = 1. Example 4.2 illustrates the reproducing property in Proposition 3.1 when d = 2 and p = 2. For illustration, the Epanechnikov kernel is used with a pre-specified bandwidth h = 0.1. The issues of bandwidth selection and the choice of local polynomial orders are beyond the scope of this work and the reader may explore the related discussion in Fan and Zhang (2008) and Park et al. (2015).

Example 4.1

Let $(X_{2} | U = u) \sim Uniform (- (1 + u) / 2, (1 + u) / 2)$, where U is Uniform $(0, 1)$. For this example, $r_{12} (u) = E (X_{2} | U = u) = 0$, $r_{22} (u) = (1 + u)^{2} / 12$, and $X_{2}$ and U are not independent. Since equivalent kernels (8) for estimating $a_{1}^{(ν)} (\cdot)$ are known in the literature, we illustrate equivalent kernels for estimating $a_{2}^{(ν)} (\cdot)$, $ν = 0, 1$. A random sample $\{(U_{i}, X_{i 2}), i = 1, \dots, n\}$ with size n = 100 was drawn and at a fixed $u_{0} = 0.5$, its neighbourhood (0.4, 0.6) contains 18 data points, $i = 39, \dots, 56$.

For ${\hat{β}}_{2, 0}$ estimating $a_{2} (\cdot)$, Figure 1(a) shows the finite-sample weights (solid line) $\{X_{i 2} W_{0}^{2} ((U_{i} - 0.5) / h; X_{i 2}), i = 39, \dots, 56\}$, whose sum equals 1 ((13) with g = 2, $ν = 0$, and q = 0), reproducing the first derivative with respect to $x_{2}$. The asymptotic weights $\{X_{i 2} K_{2, 0}^{*} ((U_{i} - 0.5) / h; X_{i 2}) / (nh), i = 39, \dots, 56\}$ with $K_{2, 0}^{*} (\cdot)$ in (30) are normalised to have a sum of 1 and shown as a dashed line in Figure 1(a).
Figure 1(b) shows $\{X_{i 2} (U_{i} - 0.5) W_{0}^{2} ((U_{i} - 0.5) / h; X_{i 2}), i = 39, \dots, 56\}$ (solid line), whose sum equals 0 ((13) with g = 2, $ν = 0$, and q = 1), so that $\sum_{i = 39}^{56} X_{i 2} U_{i} W_{0}^{2} ((U_{i} - 0.5) / h; X_{i 2}) = u_{0} = 0.5$, reproducing the first derivative of $u x_{2}$ with respect to $x_{2}$ at $u_{0} = 0.5$. Their normalised asymptotic counterparts $X_{i 2} (U_{i} - 0.5) K_{2, 0}^{*} ((U_{i} - 0.5) / h; X_{i 2}) / (nh)$ are shown as a dashed line.
For ${\hat{β}}_{2, 1}$ estimating $a_{2}^{'} (\cdot)$, Figure 1(c) shows the finite-sample weights (solid line) $\{X_{i 2} W_{1}^{2} ((U_{i} - 0.5) / h; X_{i 2}), i = 39, \dots, 56\}$, whose sum is 0, i.e. shrinking linear terms of $X_{i 2}$ to 0. Their normalised asymptotic weights $X_{i 2} K_{2, 1}^{*} ((U_{i} - 0.5) / h; X_{i 2}) / (n h^{2})$ are shown as a dashed line, with $K_{2, 1}^{*} (\cdot)$ in (30).
Figure 1(d) shows $\{X_{i 2} (U_{i} - 0.5) W_{1}^{2} ((U_{i} - 0.5) / h; X_{i 2}), i = 39, \dots, 56\}$, whose sum is 1, i.e. $\sum_{i = 39}^{56} X_{i 2} U_{i} W_{1}^{2} ((U_{i} - 0.5) / h; X_{i 2}) = 1,$ reproducing the first partial derivative with respect to both $x_{2}$ and u, as well as their normalised asymptotic weights $X_{i 2} (U_{i} - 0.5) K_{2, 1}^{*} ((U_{i} - 0.5) / h; X_{i 2}) / (n h^{2})$ (dashed line).
In contrast to univariable local linear regression, where the weights are typically concentrated around the target point of estimation, the weights in Figure 1(a,c) are influenced by the covariate values $X_{i 2}$ and the conditional variance function $r_{22}^{c} (\cdot)$ (Corollary 3.1), and may not be concentrated around the target point.
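The sums quoted in Example 4.1 can be recomputed on a fresh sample from the same design. The sketch below is not the authors' code: the seed is arbitrary, so the observations falling in (0.4, 0.6) will differ from $i = 39, \dots, 56$, and $S_{n}$ is assumed to be the kernel-weighted Gram matrix of the local design. Asymptotic weights are normalised only for the two panels whose target sum is 1, since the normalisation used for the panels with target sum 0 is not stated.

```python
import numpy as np

rng = np.random.default_rng(41)
n, u0, h = 100, 0.5, 0.1
U = np.sort(rng.uniform(0.0, 1.0, n))
X2 = rng.uniform(-(1.0 + U) / 2.0, (1.0 + U) / 2.0)

t = (U - u0) / h
K = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)   # Epanechnikov
K_h = K / h

# Local linear design for d = 2; columns ordered (g, nu) = (1,0), (1,1), (2,0), (2,1).
Z = np.column_stack([np.ones(n), U - u0, X2, X2 * (U - u0)])
S_n = Z.T @ (K_h[:, None] * Z)                 # assumed form of S_n
W = np.linalg.solve(S_n, Z.T * K_h)
W20, W21 = W[2], W[3]                          # W_0^2 and W_1^2

# Finite-sample weights behind the four panels and their sums.
sum_a = np.sum(X2 * W20)                       # target 1
sum_b = np.sum(X2 * (U - u0) * W20)            # target 0
sum_c = np.sum(X2 * W21)                       # target 0
sum_d = np.sum(X2 * (U - u0) * W21)            # target 1
sum_xu = np.sum(X2 * U * W20)                  # target u0

# Asymptotic kernels from (30); E(X2 | U) = 0 here, so no centering is needed.
r22_u0 = (1.0 + u0) ** 2 / 12.0
mu2 = 0.2                                      # second moment of the Epanechnikov kernel
K20_star = (X2 / r22_u0) * K
K21_star = (t / mu2) * K20_star
asym_a = X2 * K20_star / (n * h)
asym_a = asym_a / asym_a.sum()                 # normalised to sum to 1
asym_d = X2 * (U - u0) * K21_star / (n * h**2)
asym_d = asym_d / asym_d.sum()

print(sum_a, sum_b, sum_c, sum_d, sum_xu)
```

Plotting `X2 * W20` against `asym_a` over the observations with nonzero kernel weight gives a picture analogous to the first panel, with weights that follow the sign and size of $X_{i 2}$ rather than peaking at $u_{0}$.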

Figure 1. Example 4.1 of Section 4: comparison between the exact weight function in (13) (solid lines) and its normalised asymptotic form (30) (dashed lines) for ${\hat{β}}_{2, 0}$ with (a) q = 0 and (b) q = 1, and for ${\hat{β}}_{2, 1}$ with (c) q = 0 and (d) q = 1.


Example 4.2

We set the $X_{i 2}$'s and $U_{i}$'s the same as those in Example 4.1, and further set $P (U_{i}) = U_{i}^{2} - 1.5 U_{i}$ to illustrate the reproducing property in Proposition 3.1 when d = 2 and p = 2. Again, let $u_{0} = 0.5$ and h = 0.1. Figure 2(a) plots the points $(U_{i}, P (U_{i}))$ as +'s, $i = 39, \dots, 56$, and the weights $W_{0}^{1} ((U_{i} - u_{0}) / h; X_{i 2})$ of g = 1 and $ν = 0$ reproducing $P (0.5) = - 0.5$ (a solid black-square point) are plotted as circles with lines connecting them. There are a few negative weights since the weights are not necessarily nonnegative when p = 2 and d = 2. Figure 2(b,c) are similar to Figure 2(a) except for $ν = 1$ and 2 respectively; i.e. the + points $(U_{i}, P^{(ν)} (U_{i}))$ and the weights $W_{ν}^{1} ((U_{i} - u_{0}) / h; X_{i 2})$ reproducing $P^{(ν)} (0.5)$ ($P^{'} (0.5) = - 0.5$ and $P^{′′} (0.5) = 2$, solid black-square points) are shown. The variation of weights increases as ν increases, as shown by the scale of the y-axis.

Figure 2. Example 4.2 of Section 4. (a)–(c) for g = 1 and $ν = 0, 1, 2$, respectively: the + points are $(U_{i}, P^{(ν)} (U_{i}))$, $i = 39, \dots, 56$, and the weights $W_{ν}^{1} ((U_{i} - u_{0}) / h; X_{i 2})$ reproducing $P^{(ν)} (0.5)$ (a solid black-square point) are plotted as circles with lines. (d)–(f) for g = 2 and $ν = 0, 1, 2$, respectively: the + points $(U_{i}, \frac{\partial^{ν}}{\partial u^{ν}} Q (U_{i}, X_{i 2}))$ and the weights $W_{ν}^{2} ((U_{i} - u_{0}) / h; X_{i 2})$ that reproduce $\frac{\partial}{\partial x_{2}} (\frac{\partial^{ν}}{\partial u^{ν}} Q (u, x_{2}) |_{u = 0.5})$ (a solid black-square point) are shown.


An analogous illustration is given in Figure 2(d–f) for g = 2 and $Q (U_{i}, X_{i 2}) = X_{i 2} P (U_{i})$. The + points $(U_{i}, \frac{\partial^{ν}}{\partial u^{ν}} Q (U_{i}, X_{i 2}))$ and the weights $W_{ν}^{2} ((U_{i} - u_{0}) / h; X_{i 2})$ that reproduce $\frac{\partial}{\partial x_{2}} (\frac{\partial^{ν}}{\partial u^{ν}} Q (u, x_{2}) |_{u = 0.5}) = P^{(ν)} (0.5)$ (a solid black-square point) are shown. Again, the variation of weights increases as ν increases, as shown by the scale of the y-axis. The differences in weights between g = 1 and g = 2 are visually obvious when comparing Figure 2(a–c) with Figure 2(d–f).
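The reproducing property used in Example 4.2 can be confirmed numerically for the stated quadratic $P$. The sketch below is an illustration on a freshly simulated sample (arbitrary seed, so not the authors' data), again assuming that $S_{n}$ is the kernel-weighted Gram matrix of the local quadratic design. Target derivatives are computed from the polynomial itself rather than typed in.

```python
import numpy as np
from math import factorial
from numpy.polynomial import Polynomial

rng = np.random.default_rng(42)
n, u0, h, p = 100, 0.5, 0.1, 2
U = rng.uniform(0.0, 1.0, n)
X2 = rng.uniform(-(1.0 + U) / 2.0, (1.0 + U) / 2.0)

P = Polynomial([0.0, -1.5, 1.0])                 # P(u) = u^2 - 1.5 u

t = (U - u0) / h
K_h = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h   # Epanechnikov

# Local quadratic design for d = 2: columns (g=1, nu=0..p) then (g=2, nu=0..p).
u_pow = np.vander(U - u0, N=p + 1, increasing=True)
Z = np.hstack([u_pow, X2[:, None] * u_pow])
S_n = Z.T @ (K_h[:, None] * Z)                   # assumed form of S_n
W = np.linalg.solve(S_n, Z.T * K_h)              # row nu: W_nu^1; row p+1+nu: W_nu^2

PU = P(U)
QU = X2 * PU                                     # Q(U_i, X_i2) = X_i2 P(U_i)

# Derivatives of P at u0, taken from the polynomial object.
targets = [P(u0) if nu == 0 else P.deriv(nu)(u0) for nu in range(p + 1)]

# (23): nu! sum_i P(U_i) W_nu^1 reproduces P^(nu)(u0), while W_nu^2 annihilates P.
repro_P = [factorial(nu) * (W[nu] @ PU) for nu in range(p + 1)]
cross_P = [factorial(nu) * (W[p + 1 + nu] @ PU) for nu in range(p + 1)]

# (24): nu! sum_i Q(U_i, X_i2) W_nu^2 reproduces P^(nu)(u0), while W_nu^1 annihilates Q.
repro_Q = [factorial(nu) * (W[p + 1 + nu] @ QU) for nu in range(p + 1)]
cross_Q = [factorial(nu) * (W[nu] @ QU) for nu in range(p + 1)]

print(targets, repro_P, repro_Q)
```

Inspecting `W[0]`, `W[1]` and `W[2]` on the observations inside the window also shows the growth in the spread of the weights with ν, and the occasional negative weights, described in the example.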

Acknowledgments

We thank the Editor, an Associate Editor, and two referees for constructive suggestions and insightful comments.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

C-Y Wu and L-H Huang were partially supported by the Ministry of Science and Technology, Taiwan, under Grants 107-2118-M-007-002-MY2, 105-2118-M-007-006-MY2 and 103-2118-M-007-001-MY2.

References

Chiang, C.-T., Rice, J.A., and Wu, C.O. (2001), ‘Smoothing Spline Estimation for Varying Coefficient Models with Repeatedly Measured Dependent Variables’, Journal of the American Statistical Association, 96, 605–619.
Fan, J., and Gijbels, I. (1996), Local Polynomial Modelling and Its Applications, London: Chapman & Hall.
Fan, J., and Zhang, W. (1999), ‘Statistical Estimation in Varying Coefficient Models’, Annals of Statistics, 27, 1491–1518.
Fan, J., and Zhang, W. (2008), ‘Statistical Methods with Varying Coefficient Models’, Statistics and its Interface, 1, 179–195.
Gasser, T., Müller, H.-G., and Mammitzsch, V. (1985), ‘Kernels for Nonparametric Curve Estimation’, Journal of the Royal Statistical Society: Series B Statistical Methodology, 47, 238–252.
Hastie, T.J., and Tibshirani, R.J. (1993), ‘Varying-Coefficient Models’, Journal of the Royal Statistical Society: Series B Statistical Methodology, 55, 757–796.
Park, B.U., Mammen, E., Lee, Y.K., and Lee, E.R. (2015), ‘Varying Coefficient Regression Models: a Review and New Developments’, International Statistical Review, 83, 36–64.
Petersen, A., and Müller, H.-G. (2019), ‘Fréchet Regression for Random Objects with Euclidean Predictors’, Annals of Statistics, 47, 691–719.
Ruppert, D., Wand, M.P., and Carroll, R.J. (2003), Semiparametric Regression, London: Cambridge University Press.
Tsybakov, A.B. (2009), Introduction to Nonparametric Estimation, New York: Springer-Verlag.
Zhang, W., and Lee, S.Y. (2000), ‘Variable Bandwidth Selection in Varying-Coefficient Models’, Journal of Multivariate Analysis, 74, 116–134.

Appendix

Conditions A. The following assumptions are taken from Zhang and Lee (2000).

(A1)	$E X_{j}^{2 s} < \infty$ for some $s > 2, j = 2, \dots, d$ .
(A2)	Let $a_{g}^{(l)}$ denote the lth derivative of $a_{g} (\cdot)$ ; $a_{g}^{(p + 1)} (\cdot)$ is continuous in a neighbourhood of $u_{0}$ for $g = 1, \dots, d$ . Further, assume $a_{g}^{(p + 1)} (u_{0}) \neq 0$ , for $g = 1, \dots, d$ .
(A3)	The functions $r_{jk} (\cdot)$ , $j, k = 1, \dots, d$ and $σ^{2} (\cdot)$ have bounded second derivatives in a neighbourhood of $u_{0}$ .
(A4)	The marginal density $f_{U} (u)$ of U has a continuous second derivative in some neighbourhood of $u_{0}$ and $f_{U} (u_{0}) \neq 0$ .
(A5)	The kernel function $K (\cdot)$ is a symmetric density function with compact support.

Proof of (22) and Proposition 3.1

From (21), the LHS of the first equation in (22) is $\begin{aligned} \sum_{i = 1}^{n} (U_{i} - u_{0})^{q} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) & = ζ_{g, ν + 1}^{T} S_{n}^{- 1} \sum_{i = 1}^{n} (U_{i} - u_{0})^{q} ({\tilde{x}}_{(i)} \otimes u_{i 0}) K_{h} (U_{i} - u_{0}) \\ & = ζ_{g, ν + 1}^{T} S_{n}^{- 1} S_{n} ζ_{1, q + 1} = δ_{g, 1} δ_{ν, q} . \end{aligned}$ Analogously, the LHS of the second equation in (22) is $\begin{aligned} \sum_{i = 1}^{n} X_{ik} (U_{i} - u_{0})^{q} W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) & = ζ_{g, ν + 1}^{T} S_{n}^{- 1} \sum_{i = 1}^{n} X_{ik} (U_{i} - u_{0})^{q} ({\tilde{x}}_{(i)} \otimes u_{i 0}) K_{h} (U_{i} - u_{0}) \\ & = ζ_{g, ν + 1}^{T} S_{n}^{- 1} S_{n} ζ_{k, q + 1} = δ_{g, k} δ_{ν, q} . \end{aligned}$ Hence (22) is obtained.

Next we show the results of Proposition 3.1. For (23), since $P (\cdot)$ is a polynomial of degree $\leq p$ , its Taylor expansion at $u_{0}$ is exact: $P (U_{i}) = P (u_{0}) + P^{'} (u_{0}) (U_{i} - u_{0}) + \dots + \frac{P^{(p)} (u_{0})}{p!} (U_{i} - u_{0})^{p} .$ Plugging this expansion into the LHS of (23) and applying the first equation in (22) term by term gives $ν! \sum_{i = 1}^{n} P (U_{i}) W_{ν}^{g} (\frac{U_{i} - u_{0}}{h}; x_{(i)}^{T}) = ν! \sum_{q = 0}^{p} \frac{P^{(q)} (u_{0})}{q!} δ_{g, 1} δ_{ν, q} = δ_{g, 1} P^{(ν)} (u_{0}) ,$ which is the RHS of (23). Since $Q (\cdot, x_{k}) = x_{k} P (\cdot)$ , the same argument with the second equation in (22) yields (24).
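The cancellation $S_{n}^{- 1} S_{n}$ at the heart of this proof can be seen numerically. The sketch below is illustrative only: the simulated data, the choice d = 3 and p = 1, the bandwidth and the Epanechnikov kernel are assumptions, not taken from the paper. It stacks the weights for all pairs $(g, ν)$ as rows of one matrix and forms every discrete moment $\sum_{i} X_{ik} (U_{i} - u_{0})^{q} W_{ν}^{g}$ in (22) at once, which should give the identity matrix.

```python
import numpy as np

# Assumed simulated data; X has first column identically 1
rng = np.random.default_rng(1)
n, d, p, u0, h = 200, 3, 1, 0.5, 0.2
U = rng.uniform(size=n)
X = np.column_stack([np.ones(n), rng.normal(size=(n, d - 1))])

t = U - u0
K = 0.75 * np.clip(1.0 - (t / h) ** 2, 0.0, None) / h     # Epanechnikov K_h (assumed)
powers = t[:, None] ** np.arange(p + 1)                     # rows (1, t_i, ..., t_i^p)
D = (X[:, :, None] * powers[:, None, :]).reshape(n, d * (p + 1))  # rows x_i kron u_i0
S = D.T @ (K[:, None] * D)                                  # S_n

# Row (g, nu) of W_all holds the weights W_nu^g at observations i = 1, ..., n
W_all = np.linalg.solve(S, D.T * K)

# M[(g, nu), (k, q)] = sum_i X_ik (U_i - u0)^q W_nu^g(i), i.e. S_n^{-1} S_n
M = W_all @ D
print(np.allclose(M, np.eye(d * (p + 1))))
```

Column $(k, q)$ of the design matrix is exactly $X_{ik} (U_{i} - u_{0})^{q}$ , so `W_all @ D` collects both equations in (22), with k = 1 corresponding to the first.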

Proof of Theorem 3.2

Extending (14) to general d and p, the matrix $S_{n}^{- 1}$ has the asymptotic form (Zhang and Lee 2000, equation (5.2)) $S_{n}^{- 1} = \frac{1}{n f_{U} (u_{0})} ([Ω_{d} (u_{0})]^{- 1} \otimes H_{p}^{- 1} S_{p}^{- 1} H_{p}^{- 1}) (1 + O_{p} (\frac{\log n}{\sqrt{n h}})) .$ Then, based on (3), $\hat{β} (u_{0}) = \sum_{i = 1}^{n} \frac{1}{nh f_{U} (u_{0})} ([Ω_{d} (u_{0})]^{- 1} \otimes H_{p}^{- 1} S_{p}^{- 1} H_{p}^{- 1}) ({\tilde{x}}_{(i)} \otimes u_{i}) K (\frac{U_{i} - u_{0}}{h}) Y_{i} \times (1 + o_{p} (1)) .$ By properties of the Kronecker product, the equivalent kernel corresponding to ${\hat{β}}_{g, ν} (u_{0}) = ζ_{g, ν + 1}^{T} \hat{β} (u_{0})$ , with $ζ_{g, ν + 1}$ as in (21), is derived as follows: $\begin{aligned} {\hat{β}}_{g, ν} (u_{0}) & = \sum_{i = 1}^{n} \frac{1}{n h^{1 + ν} f_{U} (u_{0})} (e_{d, g}^{T} [Ω_{d} (u_{0})]^{- 1} {\tilde{x}}_{(i)}) \otimes (e_{p + 1, ν + 1}^{T} S_{p}^{- 1} H_{p}^{- 1} u_{i} K (\frac{U_{i} - u_{0}}{h})) Y_{i} \\ & \quad \times (1 + o_{p} (1)) \\ & = \sum_{i = 1}^{n} \frac{1}{n h^{1 + ν} f_{U} (u_{0})} (e_{d, g}^{T} [Ω_{d} (u_{0})]^{- 1} {\tilde{x}}_{(i)} K_{ν}^{*} (\frac{U_{i} - u_{0}}{h})) Y_{i} (1 + o_{p} (1)) . \end{aligned}$

To show the moment property (27), again by properties of the Kronecker product, for $k = 1, \dots, d$ (with $x_{1} \equiv 1$ , so that k = 1 gives the first equation in (27)), $\begin{aligned} & \int (\int \dots \int ζ_{g, ν + 1}^{T} (Ω_{d} (u_{0}) \otimes S_{p})^{- 1} ([\begin{matrix} x_{k} \\ ⋮ \\ x_{d} x_{k} \end{matrix}] \otimes [\begin{matrix} t^{q} \\ ⋮ \\ t^{p + q} \end{matrix}]) K (t) f (x_{2}, \dots, x_{d} | u_{0}) d x_{2} \dots d x_{d}) d t \\ & \quad = \int ζ_{g, ν + 1}^{T} (Ω_{d} (u_{0}) \otimes S_{p})^{- 1} ([\begin{matrix} r_{1 k} (u_{0}) \\ ⋮ \\ r_{dk} (u_{0}) \end{matrix}] \otimes [\begin{matrix} t^{q} \\ ⋮ \\ t^{p + q} \end{matrix}]) K (t) d t \\ & \quad = ζ_{g, ν + 1}^{T} (Ω_{d} (u_{0}) \otimes S_{p})^{- 1} [(Ω_{d} (u_{0}) e_{d, k}) \otimes (S_{p} e_{p + 1, q + 1})] \\ & \quad = ζ_{g, ν + 1}^{T} (Ω_{d} (u_{0}) \otimes S_{p})^{- 1} (Ω_{d} (u_{0}) \otimes S_{p}) ζ_{k, q + 1} = δ_{g, k} δ_{ν, q} . \end{aligned}$
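The two algebraic facts used in this proof can be verified numerically. The sketch below rests on assumptions that are not in the paper: an Epanechnikov kernel, p = 2, d = 3, and a randomly generated positive definite matrix standing in for $Ω_{d} (u_{0})$ . It first checks that the univariable equivalent kernel $K_{ν}^{*} (t) = e_{p + 1, ν + 1}^{T} S_{p}^{- 1} (1, t, \dots, t^{p})^{T} K (t)$ satisfies $\int t^{q} K_{ν}^{*} (t) d t = δ_{ν, q}$ , and then checks the Kronecker step $(Ω_{d} \otimes S_{p})^{- 1} [(Ω_{d} e_{d, k}) \otimes (S_{p} e_{p + 1, q + 1})] = ζ_{k, q + 1}$ .

```python
import numpy as np

p, d = 2, 3

# Gauss-Legendre rule on [-1, 1]; exact for the polynomial integrands used here
nodes, wts = np.polynomial.legendre.leggauss(20)
K = 0.75 * (1.0 - nodes**2)                      # Epanechnikov kernel (assumed)
T = nodes[:, None] ** np.arange(p + 1)           # rows (1, t, ..., t^p)

# S_p has entries mu_{j+l} = int t^{j+l} K(t) dt
S_p = T.T @ ((wts * K)[:, None] * T)

# Row nu of K_star is K_nu^*(t) evaluated at the quadrature nodes
K_star = np.linalg.solve(S_p, T.T) * K
moments = (K_star * wts) @ T                     # [nu, q] = int t^q K_nu^*(t) dt

# Hypothetical positive definite stand-in for Omega_d(u0)
rng = np.random.default_rng(2)
A = rng.normal(size=(d, d))
Omega = A @ A.T + d * np.eye(d)

# Mixed-product property: (Omega kron S_p)^{-1} = Omega^{-1} kron S_p^{-1}
lhs = np.linalg.inv(np.kron(Omega, S_p))
rhs = np.kron(np.linalg.inv(Omega), np.linalg.inv(S_p))

# Kronecker step of the proof, with 0-based indices k and q
k, q = 1, 2
sel = lhs @ np.kron(Omega[:, k], S_p[:, q])      # should equal the unit vector zeta_{k+1, q+1}

print(np.allclose(moments, np.eye(p + 1)), np.allclose(lhs, rhs))
```

The vector `sel` has a single entry equal to one at position `k * (p + 1) + q`, which is the role played by $ζ_{k, q + 1}$ in the last line of the display above.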
