Articles in the special topic of Bayesian analysis

Bayesian analysis for quantile smoothing spline

Zhongheng Cai, (a) Faculty of Economics and Management, School of Statistics, East China Normal University, Shanghai, People's Republic of China. Corresponding author.

ORCID: https://orcid.org/0000-0002-0000-3389

Dongchu Sun, (a) Faculty of Economics and Management, School of Statistics, East China Normal University, Shanghai, People's Republic of China; (b) Department of Statistics, University of Nebraska-Lincoln, Lincoln, NE, USA.

Abstract

In Bayesian quantile smoothing spline [Thompson, P., Cai, Y., Moyeed, R., Reeve, D., & Stander, J. (2010). Bayesian nonparametric quantile regression using splines. Computational Statistics and Data Analysis, 54, 1138–1150.], a fixed-scale parameter in the asymmetric Laplace likelihood tends to result in misleading fitted curves. To solve this problem, we propose a new Bayesian quantile smoothing spline (NBQSS), which considers a random scale parameter. To begin with, we justify its objective prior options by establishing one sufficient and one necessary condition of the posterior propriety under two classes of general priors including the invariant prior for the scale component. We then develop partially collapsed Gibbs sampling to facilitate the computation. Out of a practical concern, we extend the theoretical results to NBQSS with unobserved knots. Finally, simulation studies and two real data analyses reveal three main findings. Firstly, NBQSS usually outperforms other competing curve fitting methods. Secondly, NBQSS considering unobserved knots behaves better than the NBQSS without unobserved knots in terms of estimation accuracy and precision. Thirdly, NBQSS is robust to possible outliers and could provide accurate estimation.

1. Introduction

While mean regression has been widely used in statistics, quantile regression, first proposed by Koenker and Bassett (1978), is regarded as a powerful supplement to mean regression. It models the conditional quantiles of the dependent variable as a function of the covariates. There is an extensive literature on quantile regression in classical statistics from various fields, including the works of Koenker (2004), Chernozhukov (2005), Wang et al. (2007), Li and Zhu (2008) and Wu and Liu (2009). We recommend Koenker (2005) for a comprehensive review of quantile regression. Suppose that we have N samples $(x_{1}, y_{1}), \dots, (x_{N}, y_{N})$. The linear quantile regression model for the pth quantile is $y_{i} = x_{i}^{'} β + u_{i}$, where the $u_{i}$ are independent with pth quantile zero. Koenker and Bassett (1978) proposed to obtain the estimator of the coefficient $β = (β_{1}, \dots, β_{m})^{'}$ by solving the minimisation $\min_{β} \sum_{i = 1}^{N} ρ_{p} (y_{i} - x_{i}^{'} β),$ (1) where $ρ_{p} (x)$ is called the check function, with $ρ_{p} (x) = \begin{cases} p x, & \text{if } x > 0, \\ (p - 1) x, & \text{if } x \leq 0. \end{cases}$ (2) From a frequentist view, the consistency and the asymptotic normality of the estimators have been well established (Koenker, 2005). However, the Bayesian approach is preferred when the priority is making inference with a small sample size or obtaining credible intervals for all the parameters simultaneously. Under the Bayesian framework, Yu and Moyeed (2001) assumed that the error term $u_{i}$ follows the asymmetric Laplace distribution (ALD). This choice has received tremendous attention for the guaranteed posterior consistency of the Bayesian estimators (Sriram et al., 2013) and for its robustness (Yu & Moyeed, 2001) even if the random errors do not follow the ALD.
The density of the ALD with location parameter $μ \in R$, scale parameter $σ > 0$ and pth quantile, denoted ALD $(μ, σ, p)$, is $p (y | μ, σ, p) = p (1 - p) σ \exp (- σ ρ_{p} (y - μ)) .$ (3) With the ALD, the likelihood function of $β$ is

$p (y | β, σ, p) = p^{N} (1 - p)^{N} σ^{N} \exp (- σ \sum_{i = 1}^{N} ρ_{p} (y_{i} - x_{i}^{'} β)) .$ (4)

Clearly, with a constant prior on $β$, the posterior mode of $β$ given σ is the minimiser of (1).
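As a small illustration of the check function (2), the ALD density (3) and the minimisation (1), the following Python sketch fits an intercept-only model to toy data. The function names and the toy responses are assumptions made for illustration and are not part of the paper; the minimiser is found by searching over the observed values, which suffices because the objective is piecewise linear in the intercept.

```python
import numpy as np

def check_loss(x, p):
    """Check function rho_p from (2), applied elementwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, p * x, (p - 1.0) * x)

def ald_density(y, mu, sigma, p):
    """ALD density from (3); sigma multiplies the check loss in the exponent."""
    return p * (1.0 - p) * sigma * np.exp(-sigma * check_loss(y - mu, p))

def best_constant(y, p):
    """Minimise sum_i rho_p(y_i - c) over c by searching the observed values."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, p).sum() for c in y]
    return float(y[int(np.argmin(losses))])

y = np.arange(11.0)            # toy responses 0, 1, ..., 10 (hypothetical)
c_hat = best_constant(y, 0.3)  # intercept-only quantile fit at p = 0.3
```

With a constant prior and fixed σ, maximising the likelihood built from `ald_density` over the intercept returns the same value as `best_constant`, since the exponent in (4) is a negative multiple of the objective in (1).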

Notice that (4) is based on a linear relationship between the response variable and a set of predictors. In reality, the underlying relationship between a univariate covariate x and y may be a nonlinear function $f (x)$. In the discussion of Cole (1988), Jones (1988) proposed to estimate the quantile smoothing spline by $\min_{f} \{\sum_{i = 1}^{N} ρ_{p} (y_{i} - f (x_{i})) + η \int_{0}^{T} (f^{''} (x))^{2} d x\},$ (5) where $η \in R_{+}$ controls the smoothness of the spline. Koenker et al. (1994) pointed out that the resulting quadratic program in (5) poses some serious computational issues, and they instead estimated $f (x)$ by solving $\min_{f} \{\sum_{i = 1}^{N} ρ_{p} (y_{i} - f (x_{i})) + η \int_{0}^{1} | f^{''} (x) | d x\},$ (6) where $f (x)$ belongs to the space $U^{2}$ defined by $U^{2} = \{g : g (x) = a_{0} + a_{1} x + \int_{0}^{1} (x - y)_{+} d μ (y), V (μ) < + \infty\} .$ (7) They showed that the solution is the linear spline with knots at the $x_{i}$ and that the computation enjoys the virtues of linear programming. However, the frequentist approaches cannot easily provide uncertainty evaluations for the estimation of $f (\cdot)$ or a stable estimator of η. From a Bayesian perspective, Thompson et al. (2010) proposed the original Bayesian quantile smoothing spline to solve (5).
Using the ALD in (3) with $σ = 1$ as the likelihood, they imposed on $z = (f (x_{1}), \dots, f (x_{n}))$ the prior $p (z | η) \propto \frac{η^{\frac{N - 2}{2}}}{(2 π)^{\frac{N - 2}{2}} (μ_{1} \dots μ_{N - 2})^{1 / 2}} \exp (- \frac{η}{2} z^{'} A z),$ (8) where $A$ is a smoothing matrix (see Section 2) and $μ_{1}, \dots, μ_{N - 2}$ are the inverses of the N−2 non-zero eigenvalues of $A$. However, this method has two undesirable features. Firstly, as demonstrated by Santos and Bolfarine (2016), the likelihood with σ fixed is not flexible enough to capture the variability of the data, especially when p is away from 0.5, which leads to inaccurate estimates. Secondly, and worse still, our extensive simulation studies and one real data example (Section 4) show that this method tends to place either an extremely large or an extremely small value on η, so that the resulting curve is either too flat or overfitted and fails to capture the true trend of the underlying curve.

One natural way to alleviate this problem is to employ a random σ instead of a fixed value. We explore this option for the Bayesian quantile smoothing spline using the likelihood (4), and we refer to it as the new Bayesian quantile smoothing spline (NBQSS). In fact, a random σ has been studied in other Bayesian quantile curve fitting methods. For instance, Yue and Rue (2011) proposed Bayesian quantile regression for additive mixed models, and Hu et al. (2015) established semiparametric Bayesian quantile regression for partially linear additive models. However, they all employed a proper conjugate prior on the scale parameter, which is not favourable when little prior information is available. In this paper, we explore objective priors for the scale parameter σ and investigate the propriety of the joint posterior under a general class of priors that includes the invariant prior on σ. We also develop a partially collapsed Gibbs sampling algorithm for the joint posterior. Furthermore, to deal with unobserved knots in some intervals, we extend our framework to the case of unobserved knots.

The remainder of the paper is organised as follows. In Section 2, we present NBQSS and establish the corresponding conditions for posterior propriety under two common prior specifications. We also develop a ready-to-use partially collapsed Gibbs sampler. In Section 3, we extend our conclusions to the case with unobserved knots, motivated by practical considerations. In Section 4, to evaluate the performance of our method, we compare NBQSS with four competing curve fitting methods through extensive simulation studies. In addition, for the case of unobserved knots, we investigate the behaviour of NBQSS under two mechanisms that generate unobserved knots. In Section 5, we conclude with the highlights of this paper, recommendations for the priors and priorities for future research.

2. Bayesian quantile smoothing spline

2.1. Prior specification and posterior propriety

We formally address the minimisation (5). To be more specific, we assume the knots $x_{i}$ are arranged in increasing order, $0 \leq x_{1} < \dots < x_{n} \leq T$. Suppose that $w_{i}$ is the number of observations at the knot $x_{i}$, $i = 1, \dots, n$, with $\sum_{i = 1}^{n} w_{i} = N$, and that $y_{i j}$ is the jth observation at $x_{i}$, $j = 1, \dots, w_{i}$. Then (5) can be rewritten as $\min_{f \in W_{2}^{2}} \{\sum_{i = 1}^{n} \sum_{j = 1}^{w_{i}} ρ_{p} (y_{i j} - f (x_{i})) + η \int_{0}^{T} (f^{''} (x))^{2} d x\},$ (9) where $W_{2}^{2} = \{f : f, f^{'} \text{ absolutely continuous}, \int_{0}^{T} (f^{''} (t))^{2} d t < + \infty\}$. The solution to (9) is the natural cubic spline with knots $x_{i}, i = 1, \dots, n$ (Koenker et al., 1994; Wahba, 1990).
Therefore, solving (9) is equivalent to solving $\min_{θ_{0}, θ_{1}, a} \{\sum_{i = 1}^{n} \sum_{j = 1}^{w_{i}} ρ_{p} (y_{i j} - θ_{0} - θ_{1} x_{i} - \sum_{k = 1}^{n} a_{k} R (x_{k}, x_{i})) + η a^{'} R a\},$ (10) where $R$ is an $n \times n$ positive matrix with $(i, j)$ element $R (x_{i}, x_{j})$ and $R (\cdot, \cdot)$ is a reproducing kernel (Gu, 2013) of the form $R (x, y) = \int_{0}^{1} (x - u)_{+} (y - u)_{+} d u = \frac{1}{6} (x \land y)^{2} \{3 (x \lor y) - (x \land y)\},$ (11) where $(x - u)_{+} = \max \{x - u, 0\}$, $x \lor y = \max (x, y)$, $x \land y = \min (x, y)$, $a = (a_{1}, \dots, a_{n})^{'}$ and $a^{'} R a$ is the penalty term corresponding to $\int_{0}^{T} (f^{''} (t))^{2} d t$ in (9).
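The closed form of the reproducing kernel in (11) can be checked numerically against its integral form. The sketch below is illustrative only: the helper names and the choice of five equally spaced knots in (0, 1] are assumptions, not taken from the paper.

```python
import numpy as np

def rk(x, y):
    """Closed form of the reproducing kernel R(x, y) in (11)."""
    lo, hi = np.minimum(x, y), np.maximum(x, y)
    return lo**2 * (3.0 * hi - lo) / 6.0

def rk_numeric(x, y, m=200000):
    """Midpoint-rule approximation of the integral form of (11) on [0, 1]."""
    u = (np.arange(m) + 0.5) / m
    return float(np.mean(np.maximum(x - u, 0.0) * np.maximum(y - u, 0.0)))

def gram_matrix(knots):
    """n x n matrix with (i, j) entry R(x_i, x_j)."""
    k = np.asarray(knots, dtype=float)
    return rk(k[:, None], k[None, :])

knots = np.linspace(0.2, 1.0, 5)   # hypothetical knots in (0, 1]
R = gram_matrix(knots)
```

For knots in [0, 1] the two forms agree up to quadrature error, and `R` is symmetric by construction.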

To handle the minimisation in (10) under the Bayesian framework, we assume the likelihood of $(θ_{0}, θ_{1}, σ, a)$ based on $y = (y_{11}, \dots, y_{1 w_{1}}, \dots, y_{n w_{n}})^{'}$ is $p (y | θ_{0}, θ_{1}, a, σ, p) \propto σ^{N} \exp \{- σ \sum_{i = 1}^{n} \sum_{j = 1}^{w_{i}} ρ_{p} (y_{i j} - θ_{0} - θ_{1} x_{i} - \sum_{k = 1}^{n} a_{k} R (x_{k}, x_{i}))\} .$ (12) Defining $θ = (θ_{0}, θ_{1})^{'}$, we assign the prior for $(θ, a)$ to be $π (θ, a | δ) \propto \frac{1}{δ^{n / 2}} \exp \{- \frac{1}{2 δ} a^{'} R a\} .$ (13) Note that with the likelihood (12) and the prior (13), the posterior mode of $(θ, a | δ, σ, y)$ is the solution to (10) with $η = 1 / (2 σ δ)$.
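The link between the posterior mode and (10) can be made explicit with a short derivation. Treating σ and δ as fixed and taking the negative logarithm of the product of (12) and (13) gives, up to an additive constant that does not depend on $(θ, a)$,

$- \log π (θ, a | δ, σ, y) = σ \sum_{i = 1}^{n} \sum_{j = 1}^{w_{i}} ρ_{p} (y_{i j} - θ_{0} - θ_{1} x_{i} - \sum_{k = 1}^{n} a_{k} R (x_{k}, x_{i})) + \frac{1}{2 δ} a^{'} R a = σ \{\sum_{i = 1}^{n} \sum_{j = 1}^{w_{i}} ρ_{p} (y_{i j} - θ_{0} - θ_{1} x_{i} - \sum_{k = 1}^{n} a_{k} R (x_{k}, x_{i})) + \frac{1}{2 σ δ} a^{'} R a\} .$

Since $σ > 0$, maximising the conditional posterior over $(θ, a)$ is the same as minimising the expression in braces, which is the objective in (10) with the smoothing parameter identified as $1 / (2 σ δ)$.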

As our primary interest lies in the fitted curves rather than in the estimation of $(θ, a)$, and inspired by Speckman and Sun (2003), we instead express the prior of $(θ, a)$ as a prior on the unknown function values, $z_{i} = θ_{0} + θ_{1} x_{i} + \sum_{k = 1}^{n} a_{k} R (x_{k}, x_{i})$. We denote $z = (z_{1}, \dots, z_{n})^{'} = T θ + R a$, $T = (1_{n}, x)$, $x = (x_{1}, \dots, x_{n})^{'}$. Then the likelihood (12) can be written as $p (y | z, p, σ) \propto σ^{N} \exp \{- σ ρ_{p} (y - C z)\},$ (14) where $ρ_{p}$ applied to a vector denotes the sum of the check function over its components, $C = \mathrm{diag} (1_{w_{1}}, \dots, 1_{w_{n}})$ and $C^{'} C = \mathrm{diag} (w_{1}, w_{2}, \dots, w_{n})$. Notice that the reparameterisation reduces the number of parameters in the model.

Using (13) and the relationship between $z$ and $(θ, a)$, we find that the prior distribution of $z$ is a partial information normal (PIN) prior (Speckman & Sun, 2003) with density $π (z | δ) \propto \frac{| A |_{+}^{1 / 2}}{δ^{(n - 2) / 2}} \exp (- \frac{1}{2 δ} z^{'} A z),$ (15)

where $A = R^{- 1} - R^{- 1} T (T^{'} R^{- 1} T)^{- 1} T^{'} R^{- 1}$ is a positive semi-definite matrix with rank n−2 and $| A |_{+}$ is the product of the positive eigenvalues of $A$. According to Sun et al. (1999), the PIN prior can be interpreted as two parts: a constant prior on the null space of $A$ and a proper multivariate normal prior on the range of $A$.
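The stated properties of the smoothing matrix $A$ (rank n−2, with the columns of $T$ in its null space) can be verified numerically. The following sketch assumes five equally spaced knots in (0, 1], chosen so that $R$ is invertible; the function names and the eigenvalue threshold are illustrative choices, not part of the paper.

```python
import numpy as np

def rk(x, y):
    """Reproducing kernel R(x, y) from (11)."""
    lo, hi = np.minimum(x, y), np.maximum(x, y)
    return lo**2 * (3.0 * hi - lo) / 6.0

def smoothing_matrix(knots):
    """A = R^{-1} - R^{-1} T (T' R^{-1} T)^{-1} T' R^{-1}, with T = (1_n, x)."""
    x = np.asarray(knots, dtype=float)
    R = rk(x[:, None], x[None, :])
    T = np.column_stack([np.ones_like(x), x])
    Rinv = np.linalg.inv(R)
    Rinv_T = Rinv @ T
    A = Rinv - Rinv_T @ np.linalg.solve(T.T @ Rinv_T, Rinv_T.T)
    return 0.5 * (A + A.T), T   # symmetrise against round-off

knots = np.linspace(0.2, 1.0, 5)            # hypothetical knots
A, T = smoothing_matrix(knots)
eigvals = np.linalg.eigvalsh(A)
# Count eigenvalues that are positive relative to the largest one.
n_positive = int(np.sum(eigvals > 1e-9 * eigvals.max()))
```

Here `n_positive` counts the eigenvalues entering $| A |_{+}$, and `A @ T` being numerically zero reflects the constant prior on the null space spanned by the intercept and the linear term.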

For the fully Bayesian analysis, one common prior for $(σ, δ)$ is the pair of independent conjugate priors $π (σ, δ) \propto σ^{a_{0} - 1} \exp (- b_{0} σ) \frac{1}{δ^{a_{1} + 1}} \exp (- \frac{b_{1}}{δ}) .$ (16) The prior (16) has been widely used in Bayesian P-splines (Lang & Brezger, 2004) and Bayesian hierarchical linear mixed models (Hobert & Casella, 1996). Lang and Brezger (2004) suggested the invariant prior ($a_{0} = b_{0} = 0$) on σ, together with $a_{1} = 1$ and a small value of $b_{1}$ (such as $10^{- 2}$). However, introducing this specification in NBQSS raises two issues. To start with, the posterior estimates may be sensitive to the choice of the hyperparameter $b_{1}$ (see Section 4). More importantly, the joint posterior could be improper in some important cases. To demonstrate this point, we allow $a_{0}, a_{1}$ to be real and $b_{0}, b_{1}$ to be nonnegative, so that limiting cases of Gamma priors such as the invariant prior are included. With the likelihood (14) and the priors (15) and (16), the following theorem establishes the corresponding sufficient and necessary conditions for posterior propriety.

Theorem 2.1

Define $SSE = y^{'} (I_{N} - C (C^{'} C)^{- 1} C^{'}) y$. The posterior of $(z, σ, δ | y)$ is proper if Conditions A, B and C hold.

Condition A. One of the following holds:

(A1) $b_{1} > 0, N - 2 + a_{0} > 0;$
(A2) $b_{1} = 0, a_{1} < 0$.

Condition B. One of the following holds:

(B1) $SSE + 2 b_{0} > 0, n - 2 + 2 a_{1} > 0;$
(B2) $SSE + 2 b_{0} = 0, N - n + a_{0} < 0.$

Condition C. $N - 2 + a_{0} + 2 a_{1} > 0.$

By Theorem 2.1, the posterior of $(z, σ, δ | y)$ does not exist for $a_{0} = b_{0} = 0$ and SSE = 0 or equivalently each $x_{i}$ corresponds to only one observation, since Condition (B2) in Theorem 2.1 is violated.
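The role of SSE can be illustrated with a short computation. The sketch below reads SSE as the within-knot sum of squares $y^{'} (I_{N} - C (C^{'} C)^{- 1} C^{'}) y$, taking the last factor to be the transpose of $C$ as the dimensions require; the helper names and the toy responses are assumptions made for illustration.

```python
import numpy as np

def incidence(counts):
    """N x n matrix C whose i-th column flags the observations at knot i."""
    counts = np.asarray(counts, dtype=int)
    N, n = int(counts.sum()), counts.size
    C = np.zeros((N, n))
    C[np.arange(N), np.repeat(np.arange(n), counts)] = 1.0
    return C

def sse(y, C):
    """SSE = y' (I_N - C (C'C)^{-1} C') y, assuming every knot has data."""
    y = np.asarray(y, dtype=float)
    fitted = C @ np.linalg.solve(C.T @ C, C.T @ y)   # knot-wise means
    return float(y @ (y - fitted))

# One observation per knot: the knot-wise means reproduce the data.
sse_single = sse([1.0, 4.0, 2.0], incidence([1, 1, 1]))
# Two observations at the first knot: SSE picks up their spread.
sse_repl = sse([1.0, 3.0, 4.0, 2.0], incidence([2, 1, 1]))
```

With a single observation at every knot the projection reproduces $y$ exactly and SSE vanishes, which is the situation in which the invariant prior on σ combined with (16) leads to an improper posterior.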

To overcome the deficiencies of (16), instead of $(σ, δ)$ we consider priors on σ and the smoothing parameter $η = 1 / (2 σ δ)$. To include a slightly more general class of independent priors for $(σ, η)$, we specify $π (σ, η) \propto \frac{1}{σ^{a + 1}} h (η), σ > 0, η > 0,$ (17) where $a \geq 0$ is fixed and $h (η)$ is a general prior for η. The idea of placing the prior on $(σ, η)$ rather than $(σ, δ)$ is motivated by the prior in Bayesian additive smoothing splines (Sun & Speckman, 2008). With the likelihood (14) and the priors (15) and (17), we have the following theorem.

Theorem 2.2

We have the following two cases.

(a) Under the priors (15) and (17), the joint posterior of $(z, σ, η | y)$ is proper if $h (η)$ is proper and $n > 2 + a$.
(b) When SSE > 0, suppose there exist L > 0 and b > 0 such that $h (η) \leq \frac{L}{η^{1 + b}}, η > 0.$ (18) Then the joint posterior of $(z, σ, η | y)$ is proper if $n > 2 + 2 b$ and $N > 2 + a + b$.

Theorem 2.2 establishes sufficient conditions for the propriety of the joint posterior. By Theorem 2.2(a), when we use the invariant prior for σ $(a = 0)$, regardless of whether SSE = 0 or SSE > 0, the joint posterior is proper when $h (η)$ is proper and there are at least three knots. When SSE > 0, Theorem 2.2(b) allows an improper prior on η. With small values of a and b, for example a = 0 and b = 0.01, the joint posterior is proper when there are at least three knots and four observations in total. For the necessary condition, we have the following theorem.

Theorem 2.3

Necessary condition

When SSE = 0, a necessary condition for the propriety of the joint posterior of $(z, σ, η | y)$ is that $n > 2 + a$ and that there exists a positive ϵ such that $\int_{0}^{ϵ} η^{a} h (η) d η + \int_{ϵ}^{+ \infty} h (η) d η < + \infty .$ (19)

By Theorems 2.2(a) and 2.3, when SSE = 0 and the invariant prior is imposed on σ $(a = 0)$, a proper $h (η)$ is a necessary and sufficient condition for the propriety of the joint posterior of $(z, σ, η | y)$.

Remark 2.1

When SSE = 0 and a = 0, the condition for posterior propriety is the same as in the Bayesian smoothing spline (Tong et al., 2018).

Here, we impose the invariant prior on σ for objectivity. Finally, we need to specify a prior for η to complete the fully Bayesian analysis. In accordance with Theorem 2.2, when the invariant prior is used for σ, we need a proper prior for η to ensure a proper joint posterior. We adopt the scaled Pareto prior for η, $h (η) \propto \frac{c}{(c + η)^{2}}, η > 0, c > 0.$ (20) The prior (20) has been widely used in Bayesian smoothing splines (Cheng & Speckman, 2012; Tong et al., 2018) because of its heavy tail and the straightforward interpretation of the hyperparameter c, which is the median of the distribution. Furthermore, (20) is equivalent to the hierarchical structure $η | s \sim \mathrm{Exp} (s), s \sim \mathrm{Exp} (c),$ (21) where s is a latent variable. This hierarchical structure offers computational benefits through the resulting conjugacy of the full conditional distribution of η. For the hyperparameter c, we select the tuned smoothing parameter $η_{opt}$ obtained by generalised approximate cross-validation (GACV) (Yuan, 2006). A sensitivity analysis of the hyperparameter c suggests that the fitted curves are quite robust to the choice of c (see Section 4).
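The equivalence between (20) and the hierarchy (21) follows from $\int_{0}^{\infty} s e^{- s η} c e^{- c s} d s = c / (c + η)^{2}$, provided both exponentials are parameterised by their rates. A minimal simulation sketch, assuming this rate parameterisation and an arbitrary illustrative value of c, checks that the median of η is close to c:

```python
import numpy as np

def sample_scaled_pareto(c, size, rng):
    """Draw eta via (21): s ~ Exp(rate c), then eta | s ~ Exp(rate s)."""
    s = rng.exponential(scale=1.0 / c, size=size)   # NumPy uses scale = 1 / rate
    return rng.exponential(scale=1.0 / s)

rng = np.random.default_rng(0)
c = 0.5                                  # hypothetical hyperparameter value
eta = sample_scaled_pareto(c, 200_000, rng)
sample_median = float(np.median(eta))
frac_below_2c = float(np.mean(eta <= 2.0 * c))
```

Under (20) the distribution function is $η / (c + η)$, so the sample median should be near c and about two thirds of the draws should fall below 2c.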

2.2. Partially collapsed Gibbs sampling

To facilitate Gibbs sampling with the ALD, we decompose the ALD as a mixture of an exponential and a scaled normal distribution (Kozumi & Kobayashi, 2011). Specifically, if Y is distributed as $ALD (μ, σ, p)$, then $Y \overset{d}{=} ξ_{1} ν + ξ_{2} σ^{- \frac{1}{2}} \sqrt{ν} z + μ,$ (22) where $ξ_{1} = \frac{1 - 2 p}{p (1 - p)}$, $ξ_{2} = \sqrt{\frac{2}{p (1 - p)}}$, $ν \sim \mathrm{Exp} (σ)$, $z \sim N (0, 1)$, and ν and z are independent. This representation allows us to express the quantile regression model as a normal regression with an exponentially distributed latent variable ν. Incorporating the latent variables s and $ν = (ν_{1 w_{1}}, \dots, ν_{n w_{n}})$ through (21) and (22), we can obtain the full conditional distributions of the unknown parameters, including the latent variables. We further apply partially collapsed Gibbs sampling (van Dyk & Park, 2008), marginalising the latent variable $ν$ in $(σ, ν | z, η, s, y)$ to achieve better mixing. The sampling procedure for the joint distribution of $(z, σ, η, s, ν | y)$ is as follows.

Step 1. Sample σ from $\mathrm{Gamma} (N + \frac{n}{2} - 1, ρ_{p} (y - C z) + η z^{'} A z) .$
Step 2. Sample $z$ from $N (μ, Σ),$ where $Σ = (Ω + 2 σ η A)^{- 1}, μ = Σ B,$ and $B = (B_{1}, \dots, B_{n}), B_{i} = \sum_{j = 1}^{w_{i}} \frac{y_{i j} - ξ_{1} ν_{i j}}{ξ_{2}^{2} ν_{i j}}, \quad Ω = \mathrm{diag} (C_{1}, \dots, C_{n}), C_{i} = \sum_{j = 1}^{w_{i}} \frac{1}{ξ_{2}^{2} ν_{i j}} .$
Step 3. Sample η from $\mathrm{Gamma} (\frac{n}{2}, s + σ z^{'} A z) .$
Step 4. Sample s from $\mathrm{Gamma} (2, η + c) .$
Step 5. The density of $ν_{i j} \in ν$ is proportional to $\frac{1}{\sqrt{ν_{i j}}} \exp \{- (\frac{ξ_{1}^{2}}{2 ξ_{2}^{2}} + σ) ν_{i j} - \frac{(y_{i j} - z_{i})^{2}}{2 ξ_{2}^{2} ν_{i j}}\}, i = 1, \dots, n, j = 1, \dots, w_{i} .$ If we set $ν_{i j}^{*} = 1 / ν_{i j}$, then $ν_{i j}^{*}$ follows the inverse Gaussian (IG) distribution, IG $(σ ξ_{1} ξ_{2}^{2} + 2 σ, | y_{i j} - z_{i} |^{- 1} \sqrt{ξ_{1} + 2 σ ξ_{2}^{2}})$, where a random variable X follows the IG distribution (Jørgensen, 1982), IG $(ν, λ)$, if its density is $\sqrt{\frac{λ}{2 π x^{3}}} \exp (- \frac{λ (x - ν)^{2}}{2 ν^{2} x}) .$
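The mixture representation (22) that underlies these steps can be checked by simulation. The sketch below assumes that $\mathrm{Exp} (σ)$ denotes an exponential distribution with rate σ, which matches the density (3) where σ multiplies the check loss; the parameter values and function name are arbitrary illustrative choices. It verifies that μ is the pth quantile of the simulated draws and that their mean is close to $μ + ξ_{1} / σ$.

```python
import numpy as np

def sample_ald_mixture(mu, sigma, p, size, rng):
    """Draw from ALD(mu, sigma, p) through the normal-exponential mixture (22)."""
    xi1 = (1.0 - 2.0 * p) / (p * (1.0 - p))
    xi2 = np.sqrt(2.0 / (p * (1.0 - p)))
    nu = rng.exponential(scale=1.0 / sigma, size=size)   # Exp with rate sigma
    z = rng.standard_normal(size)
    return xi1 * nu + xi2 * np.sqrt(nu / sigma) * z + mu

rng = np.random.default_rng(1)
mu, sigma, p = 2.0, 4.0, 0.3             # hypothetical values
draws = sample_ald_mixture(mu, sigma, p, 200_000, rng)
prop_below = float(np.mean(draws <= mu))     # should be close to p
sample_mean = float(draws.mean())
expected_mean = mu + (1.0 - 2.0 * p) / (p * (1.0 - p)) / sigma
```

Conditional on ν, each response is normal, which is what makes the update of $z$ a multivariate normal draw.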

3. Smoothing spline with unobserved knots

In practice, it is crucial to consider situations where knots may be unobserved in several intervals but we are interested in the fitted values at some given knots. Here, an unobserved knot is a knot at which there is no observation. This situation is common. For instance, in bond transaction data, trading data for the short term are more abundant than for the long term because of better liquidity. However, we still need fitted values at key long-term knots that have no observations. Notice that a regular NBQSS provides the whole fitted curve based on the observed knots and hence an estimate at any given knot, including the unobserved ones. In contrast, in this section we directly incorporate the function values at the unobserved knots into the model as unknown parameters, which makes it feasible to borrow information from the observed knots. We expect NBQSS with unobserved knots to exhibit less uncertainty than a regular NBQSS and to give better predictions at the unobserved knots. We prespecify the positions and the number of the unobserved knots. Admittedly, choosing the optimal locations or number of added knots remains a challenging problem, but it is beyond the scope of this paper.

Consider $\{x_{1}, x_{2}, \dots, x_{n}\}$ as the complete set of knots. Assume that there are m unobserved knots, say $\{{\tilde{x}}_{1}, {\tilde{x}}_{2}, \dots, {\tilde{x}}_{m}\}$, in $\{x_{1}, x_{2}, \dots, x_{n}\}$, and that the remaining observed knots are relabelled as $\{x_{1}, x_{2}, \dots, x_{n - m}\}$. We still assume that there are $w_{i}$ observations at $x_{i}$, $i = 1, \dots, n - m$, and $\sum_{i = 1}^{n - m} w_{i} = N$.

Notice that one nice aspect of the priors associated with smoothing splines is that they extend naturally to $f (x)$ at arbitrary unobserved knots (Nychka, 2000). Therefore, we introduce the incidence matrix, denoted by $C$, into the likelihood of $(z, σ)$ with kernel $σ^{N} \exp (- σ ρ_{p} (y - C z)), \quad z = (z_{1}, \dots, z_{n - m}, {\tilde{z}}_{1}, \dots, {\tilde{z}}_{m}),$ (23) where $(z_{1}, \dots, z_{n - m})$ are the unknown function values at the knots $(x_{1}, \dots, x_{n - m})$ and $({\tilde{z}}_{1}, \dots, {\tilde{z}}_{m})$ correspond to the knots $({\tilde{x}}_{1}, \dots, {\tilde{x}}_{m})$. Here $C^{'} C = \mathrm{diag} (w_{1}, \dots, w_{n})$, where $w_{i}$ is positive when $z_{i}$ corresponds to an observed knot and zero otherwise. The matrix $C$ indicates which responses correspond to each $z_{i}, i = 1, \dots, n - m$. As an illustration, suppose there are three knots $x_{1} < {\tilde{x}}_{1} < x_{2}$, with 2 observations at $x_{1}$, 2 observations at $x_{2}$ and no observation at ${\tilde{x}}_{1}$. Then $C = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}$ and $C^{'} C = \mathrm{diag} (2, 0, 2)$.
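The three-knot illustration can be reproduced with a few lines of code. This is a sketch only: the function name is hypothetical, and the columns are ordered by knot position ($x_{1}$, ${\tilde{x}}_{1}$, $x_{2}$) so as to match the matrix displayed in the text.

```python
import numpy as np

def incidence_with_gaps(counts):
    """Incidence matrix C; a zero count yields an all-zero column (unobserved knot)."""
    counts = np.asarray(counts, dtype=int)
    N, n = int(counts.sum()), counts.size
    C = np.zeros((N, n))
    # Row r gets a 1 in the column of the knot at which observation r was taken.
    C[np.arange(N), np.repeat(np.arange(n), counts)] = 1.0
    return C

# Knots ordered as x_1 < x~_1 < x_2 with 2, 0 and 2 observations.
C = incidence_with_gaps([2, 0, 2])
CtC = C.T @ C
```

The zero on the diagonal of `CtC` shows that the likelihood carries no direct information about the function value at the unobserved knot, so that value is informed only through the prior (15).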

The following theorem establishes the posterior propriety for NBQSS with unobserved knots.

Theorem 3.1

Under the likelihood (23) and the PIN prior (15) on $z$, the conditional posterior of $(z | σ, δ, y)$ or $(z | σ, η, y)$ after integrating out $({\tilde{z}}_{1}, \dots, {\tilde{z}}_{m})$ has the same structure as in the case where there is no additional knot.

By Theorem 3.1, it follows immediately that, if we specify the prior for $(σ, δ)$ or $(σ, η)$, the conditions for the propriety of the joint posterior of all the parameters are the same as for the regular NBQSS.

4. Numerical analysis

4.1. Sensitivity analyses of the hyperparameters

In this section, we perform sensitivity analyses for NBQSS with respect to its two candidate prior specifications: the Gamma prior (16) for δ and the scaled Pareto prior (20) for η. Assume the data are generated from $y_{i} = f (x_{i}) + ϵ_{i}, ϵ_{i} \overset{i . i . d .}{\sim} N (0, {0.06}^{2}),$ (24) with $f (x) = (1 + e^{4 (x - 0.3)})^{- 1} + (1 + e^{3 (x - 0.2)})^{- 1} + (1 + e^{4 (x - 0.7)})^{- 1} + (1 + e^{5 (x - 0.8)})^{- 1} .$ (25) We equally divide $x \in [- 2, 2]$ into 50 pieces and the knots are $\{x_{1}, \dots, x_{50}\}$. For each of the knots $\{x_{1}, \dots, x_{24}, x_{26}, \dots, x_{50}\}$, we generate one observation. For the knot $x_{25}$, we generate 2 observations to ensure the existence of the posterior of $(z, σ, δ | y)$. The function (25) was also used by Jullion and Lambert (2007) to check the sensitivity to the hyperparameter of the Gamma prior in Bayesian P-splines. In Figure 1, the fitted curves vary considerably with the choice of the hyperparameter $b_{1}$ in the Gamma prior (16), while they are quite robust to the choice of c in the scaled Pareto prior (20). Additionally, we compare the estimation performance of the Gamma prior for δ with that of the scaled Pareto prior for η using the integrated square error (ISE), defined as $ISE = \int_{- 2}^{2} (\hat{f} (x) - f (x))^{2} d x .$ (26) A smaller ISE indicates a better estimate of $f (x)$.
The results are summarised in Table 1. From Table 1, the scaled Pareto prior for η outperforms the Gamma prior for δ, especially when the hyperparameter is chosen to be small.

Figure 1. The curves are fitted when p = 0.5 and the true curves are the solid lines. Graph (a): Gamma prior, Ga( $a_{1}, b_{1}$ ), for δ with $a_{1} = 1$ and $b_{1} = 0.01$ (dash-dotted), $b_{1} = 0.1$ (dotted), $b_{1} = 0.5$ (dashed); Graph (b): Scaled Pareto prior for η with c = 0.05 (dash-dotted), c = 0.1 (dashed) and c = 0.5 (dotted).

Table 1. ISEs for Gamma prior on δ and scaled Pareto prior on η.

4.2. Comparison studies

Our comparison studies are conducted from two perspectives. First, we evaluate the performance of our method by comparing it with four other popular competing methods. Second, we investigate how the proposed method works when there are unobserved knots. We generate the data by $y_{i} = f (x_{i}) + a_{p} + ϵ_{i}, i = 1, \dots, n,$ (27) where $f (\cdot)$ is a prespecified function and $a_{p}$ is a constant chosen so that the pth quantile of $a_{p} + ϵ_{i}$ is zero. We consider the following three distributions for the error term:

Normal distribution, N $(0, {0.1}^{2})$ ;
ALD $(0, 0.05, p)$ , where p is the chosen quantile;
Cauchy distribution with the location parameter 0 and the scale parameter 0.013, C $(0, 0.013)$ .
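The shift $a_{p}$ in (27) can be written in closed form for the normal and Cauchy cases as minus the pth quantile of the error distribution; for the ALD $(0, 0.05, p)$ errors the pth quantile is already zero by construction, so no shift is needed. The sketch below uses only the Python standard library; the function names are illustrative assumptions and not part of the paper.

```python
import math
from statistics import NormalDist

def shift_normal(p, sd=0.1):
    """a_p for N(0, sd^2) errors: minus the p-th quantile of the error."""
    return -NormalDist(0.0, sd).inv_cdf(p)

def shift_cauchy(p, scale=0.013):
    """a_p for Cauchy(0, scale) errors, from the Cauchy quantile function."""
    return -scale * math.tan(math.pi * (p - 0.5))

def cauchy_cdf(x, loc, scale):
    """Distribution function of a Cauchy(loc, scale) variable."""
    return 0.5 + math.atan((x - loc) / scale) / math.pi

p = 0.1
a_norm = shift_normal(p)      # positive, since the 0.1 quantile of the error is negative
a_cauchy = shift_cauchy(p)
```

After the shift, $a_{p} + ϵ_{i}$ has distribution function equal to p at zero, so $f (x_{i})$ is the pth conditional quantile of $y_{i}$.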

Given a quantile p and one of the error distributions above, we simulate 200 datasets. Each dataset is split into one training dataset and one validation dataset. For the training dataset, we equally divide (0,1) into 20 spaces and simulate 5 observations at each knot. For the validation dataset, we equally divide (0,1) into 30 spaces and generate one observation at each knot. The validation dataset is used to obtain the optimal regularisation parameter. We compare the different procedures by the median of the integrated absolute error (MIAE) and the median of the integrated square error (MISE), $MIAE = \mathrm{median} (\int_{0}^{1} |\hat{f} (x) - f (x)| d x), \quad MISE = \mathrm{median} (\int_{0}^{1} (\hat{f} (x) - f (x))^{2} d x),$ where the median is taken over the 200 simulated datasets.
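The two criteria can be computed by numerical integration on a fine grid. The following sketch uses the trapezoidal rule; the grid size, the stand-in true curve and the toy fitted curves are assumptions made for illustration and are not the simulation settings of the paper.

```python
import numpy as np

def trapezoid(values, grid):
    """Trapezoidal rule on a (possibly uneven) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def integrated_errors(f_hat, f_true, grid):
    """Return (IAE, ISE) of one fitted curve against the truth over the grid."""
    diff = f_hat(grid) - f_true(grid)
    return trapezoid(np.abs(diff), grid), trapezoid(diff**2, grid)

def miae_mise(fits, f_true, grid):
    """Medians of IAE and ISE over a list of fitted curves, one per dataset."""
    errs = np.array([integrated_errors(f, f_true, grid) for f in fits])
    return float(np.median(errs[:, 0])), float(np.median(errs[:, 1]))

grid = np.linspace(0.0, 1.0, 1001)
f_true = lambda x: np.sin(2.0 * np.pi * x)                        # hypothetical truth
fits = [lambda x, b=b: f_true(x) + b for b in (0.1, -0.2, 0.3)]   # toy "fitted" curves
miae, mise = miae_mise(fits, f_true, grid)
```

In the toy example each fitted curve is the truth plus a constant offset, so the integrated errors are the absolute and squared offsets and the medians are taken over those three values.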

4.2.1. Comparison with candidate approaches

The Monte Carlo method is used to compare our method with other approaches including,

New Bayesian quantile smoothing spline (NBQSS);
Original Bayesian quantile smoothing spline (OBQSS) (Thompson et al., 2010);
Quantile smoothing spline (QSS) (Koenker et al., 1994);
Bayesian smoothing spline (BSS) (Tong et al., 2018);
Smoothing spline (SS) (Wahba, 1990).

We generate the data by (27) with the following two underlying curves: $\text{Curve I}: f (x) = 2 + 0.25 \sin (0.25 π x) + 0.25 \ln (x), 0 < x < 1,$ (28) $\text{Curve II}: f (x) = 2 + 0.2 \sin (5 π x) + 0.25 \ln (x), 0 < x < 1.$ (29) The underlying Curve I is monotone and Curve II is slightly twisted. For the smoothing parameter η in NBQSS, we adopt the scaled Pareto prior (20), $h (η) \propto \frac{c}{(c + η)^{2}}, η > 0, c > 0,$ and the hyperparameter is chosen by GACV. For the smoothing parameter η in OBQSS, a Gamma $(α, β)$ prior is used and its hyperparameters are selected in accordance with the suggestion of Thompson et al. (Citation2010). Specifically, $β = 0.1 / G C V$ (mean spline) and $α = G C V$ (mean spline) $/ β$ , where GCV(mean spline) is the value of η chosen by generalised cross-validation (Craven & Wahba, Citation1978). The numerical results for Curves I and II are listed in Tables 2 and 3, respectively.
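To make the scaled Pareto prior concrete, note that $c / (c + η)^{2}$ integrates to one on $(0, \infty)$, with c.d.f. $η / (c + η)$ and median c. The following Python sketch, with hypothetical function names and intended only as an illustration rather than as part of the authors' sampler, evaluates the density and draws from it by inverting the c.d.f.

```python
import random
from statistics import median


def scaled_pareto_pdf(eta, c):
    """Density h(eta) = c / (c + eta)^2 for eta > 0, c > 0."""
    return c / (c + eta) ** 2


def scaled_pareto_cdf(eta, c):
    """C.d.f. H(eta) = eta / (c + eta)."""
    return eta / (c + eta)


def sample_scaled_pareto(c, rng):
    """Inverse-c.d.f. draw: solve u = eta / (c + eta) for eta."""
    u = rng.random()          # u lies in [0, 1), so 1 - u > 0
    return c * u / (1.0 - u)
```

Because the median equals c, the hyperparameter can be read as the prior median of the smoothing parameter, which may help when comparing the choices c = 0.05, 0.1 and 0.5.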

Table 2. MIAEs and MISEs for Curve I with p = 0.1, 0.3 and 0.5.


Table 3. MIAEs and MISEs for Curve II with p = 0.1, 0.3 and 0.5.


From Tables 2 and 3, NBQSS uniformly outperforms OBQSS, which indicates that our method is an effective improvement over the original method. Among the three quantile methods (NBQSS, OBQSS and QSS), NBQSS performs best, with the lowest MIAE and MISE in 8 out of 9 comparisons. However, when the error is normal, the two mean regression methods (BSS and SS) perform better than the three quantile regression methods, with lower MIAE and MISE. In contrast, when the error is ALD or Cauchy, two quantile approaches (NBQSS and QSS) perform better for all the selected quantiles. In addition, OBQSS has the worst performance among the five candidate methods regardless of the quantile or the error distribution. OBQSS also tends to perform worse as p moves away from 0.5, which echoes the findings in Santos and Bolfarine (Citation2016). Although a similar trend appears in all quantile methods, our method still shows considerable advantages over the other candidate models.

We also show the behaviour of the fitted curves for Curves I and II with ALD error and p = 0.3. In each of the 200 simulation studies, we can obtain the fitted value at any point $x \in (0, 1)$ . We take the median of the 200 fitted values at the point x as the estimate and draw the fitted curves in Figures 2 and 3. We also plot the pointwise empirical 2.5% and 97.5% quantiles of the 200 estimates to display the variability in estimation.
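The pointwise summary described above can be sketched as follows. This is an illustrative Python helper with hypothetical names; the linear-interpolation rule for the empirical quantiles is an assumption, since the text does not say which quantile definition was used.

```python
from statistics import median


def percentile(values, q):
    """Empirical percentile with linear interpolation, q in [0, 100]."""
    v = sorted(values)
    pos = (len(v) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (v[hi] - v[lo]) * (pos - lo)


def pointwise_bands(fits):
    """fits[r][j] is the fitted value of replication r at grid point j.

    Returns the pointwise median curve and the 2.5% and 97.5% curves.
    """
    columns = list(zip(*fits))  # one tuple of replications per grid point
    centre = [median(col) for col in columns]
    lower = [percentile(col, 2.5) for col in columns]
    upper = [percentile(col, 97.5) for col in columns]
    return centre, lower, upper
```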

Figure 2. The curves are fitted for Curve I under p = 0.3 and ALD error. Graph (a) NBQSS method; Graph (b) OBQSS method; Graph (c) QSS method; Graph (d) BSS method; Graph (e) SS method. The solid lines are the true curve and the fitted curve. Two dashed lines are 2.5% and 97.5% pointwise empirical quantiles.

Figure 3. The curves are fitted for Curve II under p = 0.3 and ALD error. Graph (a) NBQSS method; Graph (b) OBQSS method; Graph (c) QSS method; Graph (d) BSS method; Graph (e) SS method. The solid lines are the true curve and the fitted curve. Two dashed lines are 2.5% and 97.5% pointwise empirical quantiles.

From Figures 2 and 3, we find that OBQSS tends to give a relatively flat curve, which fails to capture the true trend, especially when the curve is twisted. In our experience, OBQSS chooses large values of η in this case, which results in a flat fitted curve. For example, for Curve I with p = 0.3 and ALD random errors, the posterior median of η is 0.004 in NBQSS, while it is 1.135 in OBQSS. This phenomenon requires further investigation.

4.2.2. Comparison of NBQSS with and without unobserved knots

We investigate how NBQSS performs with unobserved knots. We generate the data by (27) with the underlying curve $f (x) = 0.8 \sin (π x), 0 < x < 1.$ (30) We divide (0,1) into 20 equal subintervals and regard the resulting knots as the complete knots. Two representative mechanisms for unobserved knots are considered:

Mechanism I: α% of the complete knots are unobserved in the middle.
Mechanism II: two knots around 0.5 have observations and $(α / 2)$ % of the complete knots are unobserved in each of two intervals, (0,0.47) and (0.52,1).

We take $α % =$ 30%, 50% and 70% to represent low, medium and high unobserved proportions. For each observed knot, we randomly generate 5 observations. The two mechanisms are illustrated in Figure 4. We implement the regular NBQSS (without unobserved knots) and NBQSS with unobserved knots, which we refer to as NBQSSK (‘K’ stands for knot). For Mechanism I, we add 6 uniformly distributed knots in total in the interval without knots. For Mechanism II, we add 3 uniformly distributed knots in each of the two intervals without knots, hence 6 unobserved knots in total. We consider p = 0.1, 0.5 and 0.9. MIAEs and MISEs are recorded in Tables 4 and 5.
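Mechanism I can be sketched in Python as below. The helper names are hypothetical and several details are assumptions made only for illustration: the complete knots are the interior points i/20, the unobserved block is centred, its size is rounded to the nearest integer, and "uniformly distributed" added knots are read as equally spaced points inside the gap.

```python
def mechanism_one(knots, alpha):
    """Drop a centred block holding a fraction `alpha` of the complete knots."""
    m = round(alpha * len(knots))        # number of unobserved knots (assumed rounding)
    start = (len(knots) - m) // 2        # first index of the centred block
    removed = knots[start:start + m]
    observed = knots[:start] + knots[start + m:]
    return observed, removed


def fill_gap(observed, removed, k=6):
    """Place k equally spaced extra knots strictly inside the unobserved gap."""
    lo = max(x for x in observed if x < removed[0])   # last observed knot before the gap
    hi = min(x for x in observed if x > removed[-1])  # first observed knot after the gap
    return [lo + (hi - lo) * (j + 1) / (k + 1) for j in range(k)]
```

For example, with the 19 interior knots and `alpha = 0.3`, six central knots are dropped and `fill_gap` returns six points between the two observed knots that border the gap.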

Figure 4. One illustration for Mechanisms I and II. Graphs (a)–(c) correspond to Mechanism I with $α % = 30 %, 50 %$ and 70%. Graphs (d)–(f) correspond to Mechanism II with $α % =$ 30%, 50% and 70%.

Table 4. MIAEs and MISEs for Mechanism I with quantiles p = 0.1, 0.5, 0.9 and unobserved proportions $α % = 30 %, 50 %, 70 %$ .


Table 5. MIAEs and MISEs for Mechanism II with quantiles p = 0.1, 0.5, 0.9 and unobserved proportions $α % = 30 %, 50 %, 70 %$ .


Tables 4 and 5 show two similarities between NBQSS and NBQSSK. First, as α increases, the prediction errors increase for both methods. Second, both methods perform best for p = 0.5 among the selected quantiles. In addition, based on Tables 4 and 5, Table 6 summarises, for brevity, the percentage of cases in which NBQSSK outperforms NBQSS in each cell across the combinations of unobserved proportions and error distributions. From Table 6, NBQSSK beats NBQSS in all the scenarios, especially for a large α, which substantiates that NBQSSK can provide better prediction than regular NBQSS.

Table 6. Summary of cases where NBQSSK outperforms NBQSS.


Furthermore, to offer a more intuitive comparison of NBQSS and NBQSSK, we present the fitted curves and their empirical 95% credible intervals for quantile p = 0.1 and ALD random error under Mechanisms I and II, with unobserved proportions $α % = 50 %$ and 70%. Figure 5 leads to several main findings. First, NBQSSK performs similarly to NBQSS in the intervals with observed knots. Second, both methods give wider empirical credible intervals for Mechanism I than for Mechanism II, which implies that the uncertainty surges when there is no guiding information in the middle. Third, for Mechanism I, the empirical 95% credible intervals of NBQSSK are narrower than those of NBQSS, especially in the middle interval, which shows that NBQSSK can provide a more efficient uncertainty evaluation for the unobserved knots. However, the advantage of NBQSSK is less obvious for Mechanism II. Besides, for Mechanism I, the advantage of NBQSSK over NBQSS is larger when $α % = 70 %$ than when $α % = 50 %$ , which echoes our findings in the preceding tables. Finally, for both mechanisms, $α % = 70 %$ yields wider empirical 95% credible intervals than $α % = 50 %$ .

Figure 5. The solid lines are the true curve and fitted curves. The curves are fitted with p = 0.1, ALD random error. Dashed lines are empirical 95% credible interval corresponding to NBQSSK and dotted lines are empirical 95% credible interval corresponding to NBQSS. Graph (a): Mechanism I, $α % = 50 %$ ; Graph (b): corresponds to Mechanism II, $α % = 50 %$ ; Graph (c): Mechanism I, $α % = 70 %$ ; Graph (d): Mechanism II, $α % = 70 %$ .

Remark 4.1

Note that the above simulation studies are based on 5 observations at each observed knot. We have also conducted simulation studies with unequal numbers of observations at the observed knots. The main findings are similar to those in Tables 4 and 5. This suggests that the relative performance of NBQSSK and NBQSS does not depend on the number of observations at each observed knot.

4.3. Real dataset

4.3.1. Motorcycle dataset

We analysed the well-known data used by Silverman (Citation1985) to demonstrate nonparametric regression curve fitting. It has been frequently used to motivate and demonstrate spline-based methodology, since the underlying curve makes polynomial modelling inappropriate. There are 113 observations in the data, consisting of accelerometer readings taken through time in an experiment on the efficacy of crash helmets. For NBQSS, we fit the quantile curves with p = 0.1, 0.5 and 0.9. For comparison, we fit the median curve with OBQSS and the mean curves with BSS and SS. Figure 6 shows the fitted curves of these methods. In Graph (a), with p = 0.5, NBQSS captures the general trend. With more quantiles, NBQSS offers an overview of the distribution. Interestingly, in Graph (b), the curve fitted using OBQSS jitters considerably and tends to overfit. Similar results are found for OBQSS when p = 0.1 and 0.9 (not shown here). For OBQSS, the estimate of the smoothing parameter is $8.167 \times 10^{- 6}$ , while it is 0.068 for NBQSS. To compare with the two mean curve fitting methods, we record for each method the median absolute deviation (MAD), defined as $\mathrm{MAD} = \mathrm{median} (|y_{i} - {\hat{y}}_{i}|), i = 1, \dots, 113.$ (31) The MAD for NBQSS with p = 0.5, BSS and SS is 8.755, 12.979 and 13.02781, respectively. This indicates that the median curve fitted by NBQSS is relatively robust to points with large deviations compared with the mean curve fitting methods.
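The criterion in (31) is straightforward to compute. The following Python sketch uses a hypothetical function name and made-up toy numbers (not the motorcycle data) to show that a single large residual barely moves the criterion.

```python
from statistics import median


def mad(y, y_hat):
    """Median absolute deviation between observations and fitted values, as in (31)."""
    return median(abs(a - b) for a, b in zip(y, y_hat))


# Toy illustration with invented numbers: one gross outlier among four points.
toy_y = [1.0, 2.0, 3.0, 100.0]
toy_fit = [1.0, 1.0, 1.0, 1.0]
toy_mad = mad(toy_y, toy_fit)   # residuals are 0, 1, 2 and 99
```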

Figure 6. Graph (a): NBQSS method with p = 0.1, 0.5 and 0.9; Graph (b): OBQSS method with p = 0.5; Graph (c): BSS method; Graph (d): SS method.

4.3.2. China bond medium term note

The term structure of interest rates is the series of interest rates ordered by time to maturity at a given time. It is a fundamental concept in economic and financial theory, with applications in fixed-income securities analysis, derivatives pricing and hedging operations. The ‘term structure of interest rates’ is also known as a yield curve. In financial markets, only a limited number of bonds are traded, so an interpolation method is necessary to estimate the yield curve across the whole maturity spectrum. Tong et al. (Citation2018) employed BSS to fit the China bond yield curve, and their simulation studies show that BSS outperforms traditional yield curve fitting methods such as the Nelson-Siegel model (Nelson & Siegel, Citation1987) and the Svensson extension model (Svensson, Citation1994). However, when fitting the yield curve for the China Bond Medium Term Note (CBMTN), which is a kind of bond issued by corporations or companies to raise capital, there are always some underlying outliers disturbing the signals. As an alternative to BSS, we employ NBQSS to obtain the quantile curves for CBMTN. The data set is downloaded from Wind (https://www.wind.com.cn/). This analysis focuses on the yield curve of CBMTN rated ‘CAAA’ on 25 September 2018, which is the highest rating of this bond and is associated with the lowest yield and the lowest risk. There are 282 transaction records in CBMTN, and some of them are obviously abnormal. Figure 7 shows the NBQSS curves for p = 0.1, 0.5 and 0.9 along with the BSS curve. The transaction data are marked as grey points. The BSS curve is lifted overall by the outliers compared with the NBQSS curve for p = 0.5, especially when the time to maturity belongs to [0,5]. The results imply that the quantile curves from NBQSS are robust to possible outliers while BSS is not. In addition, we record the MAD for NBQSS with p = 0.5 and for BSS. The MAD for NBQSS with p = 0.5 is 0.451 and for BSS is 0.79, which indicates that NBQSS provides more accurate estimation than BSS.

Figure 7. BSS curve (solid) and NBQSS with $p =$ 0.1 (dashed), 0.5 (dotted) and 0.9 (dotted-dash) curves on 25 September 2018.

5. Comments

In this paper, we numerically demonstrate the issue associated with a fixed σ in OBQSS. As a solution, we systematically investigate NBQSS with a random scale parameter. We establish conditions for the posterior propriety of NBQSS under two common prior options: conjugate priors for $(σ, δ)$ and one general prior $1 / σ^{a + 1} h (η)$ on $(σ, η)$ . These conditions are easy for practitioners to verify, and hence serve as a guide for specifying priors. We recommend imposing the prior on $(σ, η)$ rather than $(σ, δ)$ in order to ensure a proper joint posterior and a relatively robust hyperparameter specification. In practice, researchers often face unobserved knots when dealing with curve fitting. Therefore, we extend our theoretical results to NBQSS with unobserved knots. Finally, our simulation studies suggest that NBQSS performs best among the candidate quantile methods and that NBQSS with unobserved knots performs better than regular NBQSS in most cases. As with any simulation study, these cannot cover all possibilities, but we believe that they are sufficiently wide ranging to provide useful insights into the comparative performance of the different procedures.

Several unsolved issues in our work are still worth investigating. For example, as shown in the simulation studies, all quantile methods perform relatively worse for extreme quantiles, such as p = 0.1, and more work is encouraged to improve their behaviour in this regime under the Bayesian framework. Also, when there are unobserved knots, how to choose the positions and the number of added knots to obtain better fitted curves is of interest in NBQSS. Furthermore, we do not consider the crossing problem for different quantiles in this article. Without special restrictions, quantile regression functions estimated at different levels can cross each other, which violates the basic rules of probability. Therefore, another direction is to develop theoretical results for non-crossing Bayesian regularised regression quantiles (Liu & Wu, Citation2011).
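Whether a given set of fitted quantile curves crosses can at least be diagnosed after fitting. The Python sketch below, with a hypothetical function name and invented toy values, only detects crossings on a grid; it is not a non-crossing estimator and is not part of the method studied here.

```python
def crossing_points(grid, curves):
    """Return the grid points where fitted quantile curves are not ordered in p.

    `curves` maps a quantile level p to the list of fitted values on `grid`.
    """
    levels = sorted(curves)
    bad = []
    for j, x in enumerate(grid):
        vals = [curves[p][j] for p in levels]
        # A crossing occurs when a lower quantile exceeds a higher one.
        if any(vals[k] > vals[k + 1] for k in range(len(vals) - 1)):
            bad.append(x)
    return bad


# Toy values, invented for illustration: the 0.1 and 0.5 curves cross at x = 0.5.
toy_grid = [0.1, 0.5, 0.9]
toy_curves = {0.1: [1.0, 2.0, 3.0], 0.5: [2.0, 1.5, 4.0], 0.9: [3.0, 3.0, 5.0]}
toy_crossings = crossing_points(toy_grid, toy_curves)
```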

Acknowledgements

The authors sincerely thank two anonymous reviewers for the thoughtful and constructive suggestions they provided that led to considerable improvements in the presentation of our work.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The project was supported by the National Natural Science Foundation of China [Grant Number 11671146].

Notes on contributors

Zhongheng Cai

Zhongheng Cai is a PhD candidate in the school of statistics, East China Normal University, Shanghai, China. His research interests include Bayesian statistics.

Dongchu Sun

Dr. Dongchu Sun received his PhD. in 1991 from Department of Statistics, Purdue University, under the guidance of Professor James O. Berger.

References

Andrews, D. F., & Mallows, C. L. (1974). Scale mixtures of normal distributions. Journal of the Royal Statistical Society. Series B (Methodological), 36(3), 99–102. https://doi.org/10.1111/rssb.1974.36
Cheng, C. I., & Speckman, P. L. (2012). Bayesian smoothing spline analysis of variance. Computational Statistics and Data Analysis, 56(12), 3945–3958. https://doi.org/10.1016/j.csda.2012.05.020
Chernozhukov, V. (2005). Extremal quantile regression. Annals of Statistics, 33(2), 806–839. https://doi.org/10.1214/009053604000001165
Cole, T. J. (1988). Fitting smoothed centile curves to reference data. Journal of the Royal Statistical Society. Series A, 151(3), 385–418. https://doi.org/10.2307/2982992
Craven, P., & Wahba, G. (1978). Smoothing noisy data with spline functions. Numerische Mathematik, 31(4), 377–403. https://doi.org/10.1007/BF01404567
de Oliveira, V. (2007). Objective Bayesian analysis of spatial data with measurement error. The Canadian Journal of Statistics, 35(2), 283–301. https://doi.org/10.1002/cjs.v35:2
Gordon, P. (1941). Values of Mills' ratio of area to bounding ordinate and of the normal probability integral for large values of the argument. Annals of Mathematical Statistics, 12(3), 364–366. https://doi.org/10.1214/aoms/1177731721
Gu, C. (2013). Smoothing spline ANOVA models (2nd ed.). Springer.
Hobert, J. P., & Casella, G. (1996). The effect of improper priors on Gibbs sampling in hierarchical linear mixed models. Journal of the American Statistical Association, 91(436), 1461–1473. https://doi.org/10.1080/01621459.1996.10476714
Hu, Y., Zhao, K., & Lian, H. (2015). Bayesian quantile regression for partially linear additive models. Statistics and Computing, 25(3), 651–668. https://doi.org/10.1007/s11222-013-9446-9
Jones, M. C. (1988). Discussion of paper by T. J. Cole. Journal of the Royal Statistical Society. Series A, 151(3), 412–413. https://doi.org/10.2307/2982992
Jørgensen, B. (1982). Statistical properties of the generalized inverse Gaussian distribution. Springer-Verlag New York Inc.
Jullion, R., & Lambert, P. (2007). Robust specification of the roughness penalty prior distribution in spatially adaptive Bayesian P-splines models. Computational Statistics and Data Analysis, 51(5), 2542–2558. https://doi.org/10.1016/j.csda.2006.09.027
Koenker, R. (2004). Quantile regression for longitudinal data. Journal of Multivariate Analysis, 91(1), 74–89. https://doi.org/10.1016/j.jmva.2004.05.006
Koenker, R. (2005). Quantile regression. Cambridge University Press.
Koenker, R., & Bassett, G. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
Koenker, R., Ng, P., & Portnoy, S. (1994). Quantile smoothing splines. Biometrika, 81(4), 673–680. https://doi.org/10.1093/biomet/81.4.673
Kozumi, H., & Kobayashi, G. (2011). Gibbs sampling methods for Bayesian quantile regression. Journal of Statistical Computation and Simulation, 81(11), 1565–1578. https://doi.org/10.1080/00949655.2010.496117
Lang, S., & Brezger, A. (2004). Bayesian P-Splines. Journal of Computational and Graphical Statistics, 13(1), 183–212. https://doi.org/10.1198/1061860043010
Li, Y., & Zhu, J. (2008). L1-Norm quantile regression. Journal of Computational and Graphical Statistics, 17(1), 163–185. https://doi.org/10.1198/106186008X289155
Liu, Y., & Wu, Y. (2011). Simultaneous multiple non-crossing quantile regression estimation using kernel constraints. Journal of Nonparametric Statistics, 23(2), 415–437. https://doi.org/10.1080/10485252.2010.537336
Nelson, C. B., & Siegel, A. F. (1987). Parsimonious modeling of yield curves. Journal of Business, 60(4), 473–489. https://doi.org/10.1086/jb.1987.60
Nychka, D. (2000). Smoothing and regression: approaches, computation, and application. Wiley.
Santos, B., & Bolfarine, H. (2016). On Bayesian quantile regression and outliers. https://arxiv.org/pdf/1601.07344.pdf
Silverman, B. (1985). Some aspects of the spline smoothing approach to non-parametric regression curve fitting. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 47(1), 1–52. https://doi.org/10.1111/j.2517-6161.1985.tb01327.x
Speckman, P. L., & Sun, D. (2003). Fully Bayesian spline smoothing and intrinsic autoregressive prior. Biometrika, 90(2), 289–302. https://doi.org/10.1093/biomet/90.2.289
Sriram, K., Ramamoorthi, R. V., & Ghosh, P. (2013). Posterior consistency of Bayesian quantile regression based on the misspecified asymmetric Laplace density. Bayesian Analysis, 8(2), 479–504. https://doi.org/10.1214/13-BA817
Sun, D., & Speckman, P. L. (2008). Bayesian hierarchical linear mixed models for additive smoothing splines. Annals of the Institute of Statistical Mathematics, 60(3), 499–517. https://doi.org/10.1007/s10463-007-0127-3
Sun, D., Tsutakawa, R. K., & Speckman, P. L. (1999). Posterior distribution of hierarchical models using CAR(1) distributions. Biometrika, 86(2), 341–350. https://doi.org/10.1093/biomet/86.2.341
Svensson, L. E. (1994). Estimating and interpreting forward interest rates: Sweden 1992–1994. Technical Report, NBER Working Paper, 4871, 1–27. https://doi.org/10.3386/w4871
Thompson, P., Cai, Y., Moyeed, R., Reeve, D., & Stander, J. (2010). Bayesian nonparametric quantile regression using splines. Computational Statistics and Data Analysis, 54(4), 1138–1150. https://doi.org/10.1016/j.csda.2009.09.004
Tong, X., He, Z., & Sun, D. (2018). Estimating Chinese treasury yield curves with Bayesian smoothing splines. Econometrics and Statistics, 8, 94–124. https://doi.org/10.1016/j.ecosta.2017.10.001
van Dyk, D. A., & Park, T. (2008). Partially collapsed Gibbs samplers. Journal of the American Statistical Association, 103(482), 790–796. https://doi.org/10.1198/016214508000000409
Wahba, G. (1990). Spline models for observational data. Society for Industrial and Applied Mathematics.
Wang, H., Li, G., & Jiang, G. (2007). Robust regression shrinkage and consistent variable selection through the LAD-Lasso. Journal of Business and Economic Statistics, 25(3), 347–355. https://doi.org/10.1198/073500106000000251
Wu, Y., & Liu, Y. (2009). Variable selection in quantile regression. Statistica Sinica, 19(2), 801–817.
Yu, K., & Moyeed, R. A. (2001). Bayesian quantile regression. Statistics and Probability Letters, 54(4), 437–447. https://doi.org/10.1016/S0167-7152(01)00124-9
Yuan, M. (2006). GACV for quantile smoothing splines. Computational Statistics and Data Analysis, 50(3), 813–829. https://doi.org/10.1016/j.csda.2004.10.008
Yue, Y. R., & Rue, H. (2011). Bayesian inference for additive mixed quantile regression models. Computational Statistics and Data Analysis, 55(1), 84–96. https://doi.org/10.1016/j.csda.2010.05.006
Zhang, F. (2011). Matrix theory-basic results and techniques. Springer.

Appendices

Appendix 1. Proof of Theorem 2.1

Proof. With the likelihood (14) and priors (15) and (16), the posterior of $(z, σ, δ | y)$ is proportional to

$\begin{aligned} & σ^{N + a_{0} - 1} \exp \left(- σ ρ_{p} (y - C z) - \frac{1}{2 δ} z^{'} A z\right) \\ & \times \exp (- b_{0} σ) \frac{1}{δ^{a_{1} + \frac{n - 2}{2} + 1}} \exp \left(- \frac{b_{1}}{δ}\right) . \end{aligned}$ (A1)

Integrating out σ, (A1) is proportional to

$\frac{1}{{[ρ_{p} (y - C z) + b_{0}]}^{N + a_{0}}} \frac{1}{δ^{\frac{n - 2}{2} + a_{1} + 1}} \exp \left(- \frac{1}{2 δ} z^{'} A z - \frac{b_{1}}{δ}\right) .$ (A2)

We only prove the sufficiency; the necessity can be proved in a similar way.
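As a reminder of how (A2) follows from (A1), the integration over σ is a Gamma integral. The display below is only a worked restatement of that step, under the assumption that $N + a_{0} > 0$ and that $c = ρ_{p} (y - C z) + b_{0} > 0$ :

$\int_{0}^{+ \infty} σ^{N + a_{0} - 1} \exp (- c σ) d σ = \frac{Γ (N + a_{0})}{c^{N + a_{0}}}, c = ρ_{p} (y - C z) + b_{0},$

so that, up to the constant $Γ (N + a_{0})$ , the factors of (A1) that involve σ reduce to ${[ρ_{p} (y - C z) + b_{0}]}^{- (N + a_{0})}$ .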

Since $\min \{p, 1 - p\} | z |_{1} \leq ρ_{p} (z) \leq \max \{p, 1 - p\} | z |_{1}$ and $| | z | |_{2} \leq | z |_{1} \leq \sqrt{n} | | z | |_{2}, \forall z \in R^{n},$ there exists a positive number $M_{1}$ such that $ρ_{p} (y - C z) + b_{0} \geq \sqrt{M_{1} (y - C z)^{'} (y - C z)} + b_{0} .$ (A3) Moreover, there exists $M_{2} > 0$ such that $(A 3) \geq M_{2} \sqrt{M_{1} (y - C z)^{'} (y - C z) + b_{0}} .$ (A4) Hence, we just need to consider $\begin{aligned} & \frac{1}{{[\sqrt{M_{1} (y - C z)^{'} (y - C z) + b_{0}}]}^{N + a_{0}}} \frac{1}{δ^{\frac{n - 2}{2} + a_{1} + 1}} \\ & \times \exp \left(- \frac{1}{2 δ} z^{'} A z - \frac{b_{1}}{δ}\right) . \end{aligned}$ (A5) We consider $\begin{aligned} & \frac{1}{σ_{1}^{\frac{N + a_{0}}{2} + 1}} \frac{1}{δ^{\frac{n - 2}{2}}} \\ & \times \exp \left(- \frac{M_{1}}{σ_{1}} (y - C z)^{'} (y - C z) - \frac{b_{0}}{σ_{1}} - \frac{1}{2 δ} z^{'} A z\right) \\ & \times \frac{1}{δ^{a_{1} + 1}} \exp \left(- \frac{b_{1}}{δ}\right), \end{aligned}$ (A6) since its integral with respect to $σ_{1}$ is proportional to (A5). Now, setting $y_{1} = \sqrt{2 M_{1}} y, z_{1} = \sqrt{2 M_{1}} z, A_{1} = \frac{1}{2 M_{1}} A$ , (A6) becomes $\begin{aligned} & \frac{1}{σ_{1}^{\frac{N + a_{0}}{2} + 1}} \frac{1}{δ^{\frac{n - 2}{2}}} \\ & \times \exp \left(- \frac{1}{2 σ_{1}} (y_{1} - C z_{1})^{'} (y_{1} - C z_{1}) - \frac{b_{0}}{σ_{1}} - \frac{1}{2 δ} z_{1}^{'} A_{1} z_{1}\right) \\ & \times \frac{1}{δ^{a_{1} + 1}} \exp \left(- \frac{b_{1}}{δ}\right), \end{aligned}$ which agrees with the case of normal errors in Speckman and Sun (Citation2003), and the results follow.
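The two chains of inequalities that open this argument are easy to check numerically. The Python sketch below assumes the standard vector form of the check loss, $ρ_{p} (z) = \sum_{i} z_{i} (p - 1 \{z_{i} < 0\})$ , which is consistent with the bounds stated here; the function names are hypothetical and the code is only a sanity check, not part of the proof.

```python
import math
import random


def check_loss(z, p):
    """Check loss rho_p(z) = sum_i z_i * (p - 1{z_i < 0}) (assumed standard form)."""
    return sum(zi * (p - (1.0 if zi < 0 else 0.0)) for zi in z)


def bounds_hold(z, p, tol=1e-12):
    """Verify the check-loss bounds and the l1/l2 norm inequalities for one vector."""
    l1 = sum(abs(zi) for zi in z)
    l2 = math.sqrt(sum(zi * zi for zi in z))
    rho = check_loss(z, p)
    return (min(p, 1.0 - p) * l1 <= rho + tol
            and rho <= max(p, 1.0 - p) * l1 + tol
            and l2 <= l1 + tol
            and l1 <= math.sqrt(len(z)) * l2 + tol)
```

Combining the lower bounds gives $ρ_{p} (z) \geq \min \{p, 1 - p\} | | z | |_{2}$ , which is the form used in (A3) with $M_{1} = \min \{p, 1 - p\}^{2}$ as one admissible choice.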

Appendix 2. Proof of Theorem 2.2

For $(a)$ , the following lemma is needed,

Lemma A.1

Assume $Z \sim N (0, σ^{2})$ . Then, for any $t > 0, μ \in R$ ,

$\begin{aligned} & \sqrt{\frac{2}{π}} \frac{σ t}{1 + σ^{2} t^{2}} \exp \left(- \frac{μ^{2}}{2 σ^{2}}\right) \\ & \leq E (\exp (- t | Z - μ |)) \\ & \leq \min \left\{1, \sqrt{\frac{2}{e π}} \frac{1}{t | μ |}\right\} \leq \sqrt{\frac{2}{1 + \frac{π e}{2} μ^{2} t^{2}}}, \end{aligned}$ (A7)
$\begin{aligned} & E (\exp (- t | Z - μ |)) \\ & \leq \sqrt{\frac{2}{π}} \frac{1}{σ t} \exp \left(- \frac{μ^{2}}{4 σ^{2}}\right) + \exp \left(- \frac{| μ | t}{\sqrt{2}}\right) . \end{aligned}$ (A8)

Proof.

For (A7), the third inequality is trivial, so we focus on the other two inequalities. For the first one, we have $\begin{aligned} & E (\exp (- t | Z - μ |)) \\ & = \int_{- \infty}^{+ \infty} \frac{1}{σ \sqrt{2 π}} \exp \left(- t | z - μ | - \frac{z^{2}}{2 σ^{2}}\right) d z \\ & = \int_{- \infty}^{+ \infty} \frac{1}{σ \sqrt{2 π}} \exp \left(- t |z| - \frac{(z + μ)^{2}}{2 σ^{2}}\right) d z \\ & = \exp \left(- \frac{μ^{2}}{2 σ^{2}}\right) \int_{- \infty}^{+ \infty} \frac{1}{σ \sqrt{2 π}} \exp \left(- t | z | - \frac{z^{2}}{2 σ^{2}} - \frac{z μ}{σ^{2}}\right) d z \\ & = \exp \left(- \frac{μ^{2}}{2 σ^{2}}\right) \int_{0}^{+ \infty} \frac{1}{σ \sqrt{2 π}} \exp \left(- t | z | - \frac{z^{2}}{2 σ^{2}}\right) \\ & \quad \times \left(\exp \left(\frac{z μ}{σ^{2}}\right) + \exp \left(- \frac{z μ}{σ^{2}}\right)\right) d z . \end{aligned}$ By the arithmetic-geometric mean inequality, $\exp (z μ / σ^{2}) + \exp (- z μ / σ^{2}) \geq 2$ , and $E (\exp (- t | Z |)) = 2 \int_{0}^{+ \infty} \frac{1}{σ \sqrt{2 π}} \exp (- t z - \frac{z^{2}}{2 σ^{2}}) d z$ . Hence, $E (\exp (- t | Z - μ |)) \geq \exp \left(- \frac{μ^{2}}{2 σ^{2}}\right) E (\exp (- t | Z |)) .$ We can obtain $E (\exp (- t | Z |)) = \sqrt{\frac{2}{π}} \frac{1 - Φ (t σ)}{ϕ (t σ)},$ where Φ and ϕ are the standard normal c.d.f. and p.d.f., respectively. By the inequality (Gordon, Citation1941) $\frac{t}{1 + t^{2}} \leq \frac{1 - Φ (t)}{ϕ (t)} \leq \frac{1}{t}, \forall t > 0,$ the first inequality holds. For the second inequality, we need the hierarchical structure of the Laplace distribution. Andrews and Mallows (Citation1974) showed that $\begin{aligned} & \frac{t}{2} \exp (- t | Z - μ |) \\ & = \int_{0}^{+ \infty} \frac{1}{\sqrt{2 π} y} \exp \left(- \frac{(Z - μ)^{2}}{2 y^{2}}\right) t^{2} y \exp \left(- \frac{y^{2} t^{2}}{2}\right) d y . \end{aligned}$ (A9) Hence, by Fubini's theorem, $\begin{aligned} & \frac{t}{2} E (\exp (- t | Z - μ |)) \\ & = \int_{0}^{+ \infty} \left\{\int_{- \infty}^{+ \infty} \frac{1}{\sqrt{2 π} σ} \frac{1}{\sqrt{2 π} y} \exp \left(- \frac{(z - μ)^{2}}{2 y^{2}} - \frac{z^{2}}{2 σ^{2}}\right) d z\right\} \\ & \quad \times t^{2} y \exp \left(- \frac{y^{2} t^{2}}{2}\right) d y . \end{aligned}$ (A10) Since $μ | z, y \sim N (z, y^{2})$ and $z | σ^{2} \sim N (0, σ^{2})$ , the marginal distribution of $μ | y, σ^{2}$ is $N (0, σ^{2} + y^{2})$ . The inner integral therefore equals $I (μ, y) = \sqrt{\frac{1}{2 π (σ^{2} + y^{2})}} \exp \left(- \frac{μ^{2}}{2 σ^{2} + 2 y^{2}}\right) .$ (A11) Since the function $f (w) = w^{- 1 / 2} \exp (- μ^{2} / (2 w)), w = σ^{2} + y^{2},$ is unimodal with maximum value $| μ |^{- 1} \exp (- 1 / 2),$ we have $I (μ, y) \leq \frac{1}{\sqrt{2 π e} | μ |} .$ Hence, $E (\exp (- t | Z - μ |)) \leq \sqrt{\frac{2}{π e}} \frac{1}{| μ | t} .$ Clearly, $E (\exp (- t | Z - μ |)) \leq 1$ , so the second inequality holds.

For (A8), we start from $\begin{aligned} & \frac{t}{2} E (\exp (- t | Z - μ |)) \\ & = \int_{0}^{+ \infty} \sqrt{\frac{1}{2 π (σ^{2} + y^{2})}} t^{2} y \exp \left(- \frac{μ^{2}}{2 σ^{2} + 2 y^{2}} - \frac{y^{2} t^{2}}{2}\right) d y . \end{aligned}$ (A12) Setting $x = y / σ, μ^{'} = μ / σ, t^{'} = σ t,$ we have $\begin{aligned} (A 12) & = \sqrt{\frac{1}{2 π}} \frac{t^{' 2}}{σ} \int_{0}^{+ \infty} \frac{x}{\sqrt{1 + x^{2}}} \exp \left(- \frac{μ^{' 2}}{2 + 2 x^{2}} - \frac{t^{' 2} x^{2}}{2}\right) d x \\ & \leq \sqrt{\frac{1}{2 π}} \frac{t^{' 2}}{σ} \left[\int_{0}^{1} x \exp \left(- \frac{μ^{' 2}}{4} - \frac{t^{' 2} x^{2}}{2}\right) d x \right. \\ & \quad \left. + \int_{1}^{+ \infty} \exp \left(- \frac{μ^{' 2}}{4 x^{2}} - \frac{t^{' 2} x^{2}}{2}\right) d x\right] . \end{aligned}$ (A13) Since the following identity holds for positive a and b, $\int_{0}^{+ \infty} f \left({\left(a x - \frac{b}{x}\right)}^{2}\right) d x = \frac{1}{a} \int_{0}^{+ \infty} f (x^{2}) d x,$ we have $\begin{aligned} (A 13) & \leq \sqrt{\frac{1}{2 π}} \frac{1}{σ} \exp \left(- \frac{μ^{' 2}}{4}\right) + \frac{t^{'}}{2 σ} \exp \left(- \frac{| μ^{'} | t^{'}}{\sqrt{2}}\right) \\ & = \sqrt{\frac{1}{2 π}} \frac{1}{σ} \exp \left(- \frac{μ^{2}}{4 σ^{2}}\right) + \frac{t}{2} \exp \left(- \frac{| μ | t}{\sqrt{2}}\right) . \end{aligned}$ (A14) Multiplying both sides by $2 / t$ gives (A8).
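The upper bounds in Lemma A.1 can be checked numerically against a closed form for $E (\exp (- t | Z - μ |))$ . The closed form used below is not stated in the text; it follows from completing the square on each side of μ, and the function names are hypothetical. The sketch is a sanity check on a small grid, not a substitute for the proof.

```python
import math


def std_normal_cdf(x):
    """Standard normal c.d.f. via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def expected_exp_abs(t, mu, sigma):
    """Closed form of E exp(-t |Z - mu|) for Z ~ N(0, sigma^2) and t > 0."""
    s2t = sigma * sigma * t
    return math.exp(0.5 * s2t * t) * (
        math.exp(t * mu) * std_normal_cdf((-mu - s2t) / sigma)
        + math.exp(-t * mu) * std_normal_cdf((mu - s2t) / sigma)
    )


def upper_a7(t, mu):
    """Second bound in (A7): min{1, sqrt(2 / (e * pi)) / (t * |mu|)}."""
    if mu == 0:
        return 1.0
    return min(1.0, math.sqrt(2.0 / (math.e * math.pi)) / (t * abs(mu)))


def upper_a8(t, mu, sigma):
    """Bound (A8)."""
    return (math.sqrt(2.0 / math.pi) / (sigma * t)
            * math.exp(-mu * mu / (4.0 * sigma * sigma))
            + math.exp(-abs(mu) * t / math.sqrt(2.0)))
```

For example, `expected_exp_abs(1.0, 2.0, 1.0)` can be compared directly with `upper_a7(1.0, 2.0)` and `upper_a8(1.0, 2.0, 1.0)`.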

Assume SSE $= 0$. First, we consider $a > 0$. The joint posterior of $(z, σ, η | y)$ is proportional to

$σ^{\frac{3}{2} n - 2 - a} η^{\frac{n - 2}{2}} \exp (- σ ρ_{p} (y - C z) - σ η z^{'} A z) h (η) .$ (A15)

By the spectral decomposition,

$A = P Λ P^{'},$ (A16)

where $P$ is an orthogonal matrix and $Λ = diag (0, 0, η_{3}, \dots, η_{n})$. Let $D_{p} = \min (p, 1 - p)$. Then

$\text{(A15)} \leq σ^{\frac{3}{2} n - 2 - a} η^{\frac{n - 2}{2}} \exp (- D_{p} σ \| y - z \|_{2} - σ η z^{'} A z) h (η) .$ (A17)

Setting $z^{*} = P^{'} z$, $y^{*} = P^{'} y$,

$\begin{aligned} \text{(A17)} & = σ^{\frac{3}{2} n - 2 - a} η^{\frac{n - 2}{2}} \exp (- D_{p} σ \| y^{*} - z^{*} \|_{2} - σ η z^{*'} Λ z^{*}) h (η) \\ & \leq σ^{\frac{3}{2} n - 2 - a} η^{\frac{n - 2}{2}} \exp \left(- \frac{D_{p}}{\sqrt{n}} σ | y^{*} - z^{*} |_{1} - σ η z^{*'} Λ z^{*}\right) h (η) \\ & = σ^{\frac{3}{2} n - 2 - a} η^{\frac{n - 2}{2}} \exp \left(- \frac{D_{p}}{\sqrt{n}} σ \sum_{i = 1}^{2} | y_{i}^{*} - z_{i}^{*} | - \frac{D_{p}}{\sqrt{n}} σ \sum_{i = 3}^{n} | y_{i}^{*} - z_{i}^{*} | - σ η \sum_{i = 3}^{n} η_{i} z_{i}^{* 2}\right) h (η) . \end{aligned}$ (A18)

Integrating out $z_{1}^{*}, z_{2}^{*}$, the right-hand side of (A18) is proportional to

$σ^{\frac{3}{2} n - 4 - a} η^{\frac{n - 2}{2}} \exp \left(- \frac{D_{p}}{\sqrt{n}} σ \sum_{i = 3}^{n} | y_{i}^{*} - z_{i}^{*} | - σ η \sum_{i = 3}^{n} η_{i} z_{i}^{* 2}\right) h (η) .$ (A19)

Moreover,

$\text{(A19)} \propto σ^{n - 3 - a} \prod_{i = 3}^{n} E_{z_{i}^{*}} \left(\exp \left(- \frac{D_{p}}{\sqrt{n}} σ | y_{i}^{*} - z_{i}^{*} |\right)\right), \quad z_{i}^{*} \sim N \left(0, \frac{1}{2 η_{i} σ η}\right) .$ (A20)

By Lemma A.1(a), (A20) is smaller than or equal to an expression proportional to

$σ^{n - 3 - a} \prod_{i = 3}^{n} \sqrt{\frac{2}{1 + \frac{π e D_{p}^{2}}{2 n} y_{i}^{* 2} σ^{2}}} \, h (η) .$ (A21)

When $n \geq 3$, we know that $σ^{n - 3 - a} \prod_{i = 3}^{n} \sqrt{\frac{2}{1 + \frac{π e D_{p}^{2}}{2 n} y_{i}^{* 2} σ^{2}}}$ has a finite integral with respect to σ when $a > 0$. Hence, if $h (η)$ is proper, the posterior of $(z, σ, η | y)$ is proper.
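The two bounds behind (A17) and (A18) can be illustrated with a short numerical check. This is a sketch under stated assumptions: it takes the check loss to be the usual $ρ_{p} (u) = \sum_{i} u_{i} (p - 1 \{u_{i} < 0\})$ applied componentwise and summed, and the function names `check_loss` and `lower_bounds` are hypothetical. It returns the loss together with the lower bounds $D_{p} \| u \|_{2}$ and $D_{p} | u |_{1} / \sqrt{n}$, where $D_{p} = \min (p, 1 - p)$.

```python
import math


def check_loss(u, p):
    """Quantile check loss rho_p(u) = sum_i u_i * (p - 1{u_i < 0}) (assumed form)."""
    return sum(ui * (p - (1.0 if ui < 0 else 0.0)) for ui in u)


def lower_bounds(u, p):
    """Return (rho_p(u), D_p * ||u||_2, D_p * |u|_1 / sqrt(n)) with D_p = min(p, 1 - p)."""
    d_p = min(p, 1.0 - p)
    l2_norm = math.sqrt(sum(ui * ui for ui in u))
    l1_norm = sum(abs(ui) for ui in u)
    return check_loss(u, p), d_p * l2_norm, d_p * l1_norm / math.sqrt(len(u))


if __name__ == "__main__":
    # Each printed triple should be non-increasing from left to right.
    print(lower_bounds([1.0, -2.0, 0.5], 0.1))
    print(lower_bounds([-3.0, -1.0], 0.9))
```

The first inequality holds because each term of the check loss is at least $D_{p} | u_{i} |$ and $| u |_{1} \geq \| u \|_{2}$; the second is the Cauchy–Schwarz bound $| u |_{1} \leq \sqrt{n} \| u \|_{2}$.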

When $a = 0$, the joint posterior of $(z, σ, η | y)$ is proportional to

$σ^{\frac{3}{2} n - 2} η^{\frac{n - 2}{2}} \exp (- σ ρ_{p} (y - z) - σ η z^{'} A z) h (η) .$ (A22)

Following the same procedure as in the case $a > 0$, we only need to show that

$σ^{\frac{3}{2} n - 4} η^{\frac{n - 2}{2}} \exp \left(- D_{p} σ \sum_{i = 3}^{n} | y_{i}^{*} - z_{i}^{*} | - σ η \sum_{i = 3}^{n} η_{i} z_{i}^{* 2}\right) h (η)$ (A23)

has a finite integral. By Lemma A.1(b), there exists a constant $D_{1}$ such that (A23) is smaller than or equal to

$D_{1} σ^{n - 3} \prod_{i = 3}^{n} \left(\sqrt{\frac{4 η_{i} η}{π σ}} \exp \left(- \frac{y_{i}^{* 2} η_{i} σ η}{2}\right) + \exp \left(- \frac{y_{i}^{*} σ}{\sqrt{2}}\right)\right) h (η) .$ (A24)

There are $2^{n - 2}$ terms after expansion. For a term involving both $\sqrt{\frac{4 η_{i} η}{π σ}} \exp \left(- \frac{y_{i}^{* 2} η_{i} σ η}{2}\right)$ and $\exp \left(- \frac{y_{i}^{*} σ}{\sqrt{2}}\right)$, we apply the inequality

$y \exp \left(- \frac{μ^{2} y^{2}}{2}\right) \leq \frac{1}{\sqrt{e} | μ |}, \quad \text{for } y > 0,$

to $\sqrt{4 η_{i} η / (π σ)} \exp (- y_{i}^{* 2} η_{i} σ η / 2)$. The resulting bound does not depend on η and has a finite integral with respect to σ. Hence, we only need to consider the term

$σ^{n - 3} {\left(\frac{η}{σ}\right)}^{(n - 2) / 2} \exp \left(- σ η \sum_{i = 3}^{n} \frac{y_{i}^{* 2} η_{i}}{2}\right) .$ (A25)

Setting $σ_{1} = σ η$, the integral of (A25) with respect to σ equals that of $σ_{1}^{(n - 4) / 2} \exp \left(- σ_{1} \sum_{i = 3}^{n} \frac{y_{i}^{* 2} η_{i}}{2}\right)$ with respect to $σ_{1}$, which is finite when $n \geq 3$. Hence, if $h (η)$ is proper, the posterior of $(z, σ, η | y)$ is proper.
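The elementary inequality $y \exp (- μ^{2} y^{2} / 2) \leq 1 / (\sqrt{e} | μ |)$ follows from maximising the left-hand side over $y > 0$: the derivative vanishes at $y = 1 / | μ |$, where the value is exactly $1 / (\sqrt{e} | μ |)$. A minimal numerical sketch of this fact, with hypothetical helper names `bounded_term` and `upper_bound`:

```python
import math


def bounded_term(y, mu):
    """Left-hand side y * exp(-mu^2 * y^2 / 2) of the inequality."""
    return y * math.exp(-0.5 * mu * mu * y * y)


def upper_bound(mu):
    """Right-hand side 1 / (sqrt(e) * |mu|), attained at y = 1 / |mu|."""
    return 1.0 / (math.sqrt(math.e) * abs(mu))


if __name__ == "__main__":
    for mu in (0.3, 1.0, -4.0):
        grid_max = max(bounded_term(0.01 * k, mu) for k in range(1, 2001))
        print(mu, grid_max, upper_bound(mu))
```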

When SSE $> 0$, the joint posterior of $(z, σ, η | y)$ is proportional to

$σ^{N - 1 - a} σ^{(n - 2) / 2} η^{(n - 2) / 2} \exp (- σ ρ_{p} (y - C z) - σ η z^{'} A z) h (η) .$ (A26)

With the same argument as in Theorem 2.1, (A26) is smaller than or equal to

$σ^{N + \frac{n}{2} - 2 - a} η^{(n - 2) / 2} \exp \left(- \min \{p, 1 - p\} σ \sqrt{(y - C z)^{'} (y - C z)} - σ η z^{'} A z\right) h (η) .$ (A27)

Let $P_{y} = (C^{'} C)^{- 1} C^{'} y$ and $z^{*} = z - P_{y}$. The integral of (A27) with respect to $z$ equals the integral with respect to $z^{*}$ of

$\begin{aligned} & σ^{N - 1 - a} σ^{(n - 2) / 2} η^{(n - 2) / 2} \exp \left(- \min \{p, 1 - p\} σ \sqrt{z^{*'} W z^{*} + SSE} - σ η (z^{*} + P_{y})^{'} A (z^{*} + P_{y})\right) h (η) \\ & \quad \leq σ^{N - 1 - a} σ^{(n - 2) / 2} η^{(n - 2) / 2} \exp \left(- C_{1} σ \left[\sum_{i = 1}^{n} \sqrt{w_{i}} | z_{i}^{*} | + \sqrt{SSE}\right] - σ η (z^{*} + P_{y})^{'} A (z^{*} + P_{y})\right) h (η) . \end{aligned}$ (A28)

Since $σ^{N - n} \exp (- C_{1} σ \sqrt{SSE})$ is bounded, we only need to consider

$σ^{\frac{3}{2} n - 2 - a} η^{(n - 2) / 2} \exp \left(- C_{1} σ \sum_{i = 1}^{n} \sqrt{w_{i}} | z_{i}^{*} | - σ η (z^{*} + P_{y})^{'} A (z^{*} + P_{y})\right) h (η) .$ (A29)

Similarly, we can find a constant $C_{2}$ such that

$\text{(A29)} \leq σ^{\frac{3}{2} n - 2 - a} η^{(n - 2) / 2} \exp (- C_{2} σ ρ_{p} (z - P_{y}) - σ η z^{'} A z) h (η) .$ (A30)

Letting $z_{1} = C_{2} z$, $y_{1} = C_{2} P_{y}$, $A_{1} = A / C_{2}^{2}$, the right-hand side of (A30) is proportional to the joint posterior of $(z, σ, η | y)$ when SSE $= 0$. By Theorem 2.2(a), the result holds.

For (b), the posterior of $(z, σ, δ | y)$ is proportional to

$\begin{aligned} & \frac{σ^{N - a}}{δ^{(n - 2) / 2}} \exp \left(- σ ρ_{p} (y - C z) - \frac{1}{2 δ} z^{'} A z\right) \frac{1}{σ^{2} δ^{2}} h \left(\frac{1}{2 σ δ}\right) \\ & \quad \leq \frac{σ^{N - a}}{δ^{(n - 2) / 2}} \exp \left(- σ ρ_{p} (y - C z) - \frac{1}{2 δ} z^{'} A z\right) \frac{1}{σ^{2} δ^{2}} σ^{1 + b} δ^{1 + b} . \end{aligned}$ (A31)

The right-hand side of (A31) is a special case of Theorem 2.1. Hence, it has a finite integral with respect to $(z, σ, δ)$ if and only if $n > 2 + 2 b$ and $N > 2 + a + b$.

Appendix 3. Proof of Theorem 2.3

With an argument similar to that of Theorem 2.2, there exist $D_{4}, D_{5}$ such that

$\text{(A15)} \geq D_{4} σ^{\frac{3}{2} n - 2 - a} η^{\frac{n - 2}{2}} \exp \left(- D_{5} σ \sum_{i = 3}^{n} | y_{i}^{*} - z_{i}^{*} | - σ η \sum_{i = 3}^{n} η_{i} z_{i}^{* 2}\right) h (η) .$ (A32)

By Lemma A.1(a) and the inequality

$\frac{x y}{1 + x y} \geq \frac{x}{1 + x} \cdot \frac{y}{1 + y}, \quad x > 0, \; y > 0,$

and ignoring the constant of the product, the right-hand side of (A32) is larger than or equal to

$σ^{n - 3 - a} \frac{σ^{\frac{n - 2}{2}}}{\prod_{i = 3}^{n} \left(1 + \frac{σ}{2 η_{i}}\right)} \frac{η^{\frac{n - 2}{2}}}{(1 + η)^{n - 2}} \exp \left(- σ η \sum_{i = 3}^{n} η_{i} y_{i}^{* 2}\right) h (η) .$ (A33)

Setting $σ_{1} = σ η$, (A33) is larger than or equal to

$σ_{1}^{n - 3 - a} \frac{σ_{1}^{\frac{n - 2}{2}}}{\prod_{i = 3}^{n} \left(1 + \frac{σ_{1}}{2 η_{i}}\right)} \exp \left(- σ_{1} \sum_{i = 3}^{n} η_{i} y_{i}^{* 2}\right) \frac{η^{a}}{(1 + η)^{2 n - 4}} h (η) .$ (A34)

Note that $σ_{1}^{n - 3 - a} \frac{σ_{1}^{\frac{n - 2}{2}}}{\prod_{i = 3}^{n} \left(1 + \frac{σ_{1}}{2 η_{i}}\right)} \exp \left(- σ_{1} \sum_{i = 3}^{n} η_{i} y_{i}^{* 2}\right)$ has a finite integral if $n \geq 2 + \frac{2}{3} a$.
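The product inequality used above reduces, after cross-multiplying, to $(1 + x) (1 + y) \geq 1 + x y$, that is, $x + y \geq 0$. Equivalently, the gap between the two sides has the closed form $x y (x + y) / [(1 + x y) (1 + x) (1 + y)]$. The following sketch checks this algebra numerically; the function names are hypothetical and the grid of test points is arbitrary.

```python
def lhs(x, y):
    """Left-hand side x*y / (1 + x*y)."""
    return x * y / (1.0 + x * y)


def rhs(x, y):
    """Right-hand side (x / (1 + x)) * (y / (1 + y))."""
    return (x / (1.0 + x)) * (y / (1.0 + y))


def gap_closed_form(x, y):
    """lhs - rhs written as a single fraction; non-negative whenever x, y > 0."""
    return x * y * (x + y) / ((1.0 + x * y) * (1.0 + x) * (1.0 + y))


if __name__ == "__main__":
    for x in (0.01, 0.5, 1.0, 7.0, 100.0):
        for y in (0.01, 0.5, 1.0, 7.0, 100.0):
            print(x, y, lhs(x, y) - rhs(x, y), gap_closed_form(x, y))
```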
Hence, the necessary condition for the posterior propriety of $(z, σ, η | y)$ is

$\int_{0}^{+ \infty} \frac{η^{a}}{(1 + η)^{2 n - 4}} h (η) \, d η < + \infty .$

Since $1 / (1 + η)^{2 n - 4}$ is bounded and bounded away from zero on $(0, ϵ)$ for arbitrarily large ϵ, the condition

$\int_{0}^{ϵ} η^{a} h (η) \, d η < + \infty$ (A35)

is also necessary. Next, we focus on the integral over η in the interval $(ϵ, + \infty)$. Without loss of generality, assume $ϵ = 1$. Combining (A12) with (A19), the integral of (A32) with respect to $z$ is proportional to

$\begin{aligned} & \int_{0}^{+ \infty} \dots \int_{0}^{+ \infty} σ^{- 1 - a} \\ & \quad \times \prod_{i = 3}^{n} \left(\frac{σ^{2} t_{i}}{\sqrt{\frac{1}{2 η_{i} σ η} + t_{i}^{2}}} \exp \left(- \frac{y_{i}^{* 2}}{\frac{1}{η_{i} σ η} + 2 t_{i}^{2}} - \frac{D_{p}^{2} σ^{2} t_{i}^{2}}{2 n}\right)\right) d t_{3} \dots d t_{n} . \end{aligned}$ (A36)

Letting $t_{i} = s_{i} / \sqrt{σ}$,

$\begin{aligned} \text{(A36)} & \propto \int_{0}^{+ \infty} \dots \int_{0}^{+ \infty} σ^{\frac{3}{2} n - 4 - a} \exp \left(- σ \sum_{i = 3}^{n} \left(\frac{y_{i}^{* 2}}{\frac{1}{η_{i} η} + 2 s_{i}^{2}} + \frac{D_{p}^{2} s_{i}^{2}}{2 n}\right)\right) \\ & \quad \times \prod_{i = 3}^{n} \left(\frac{s_{i}}{\sqrt{\frac{1}{2 η_{i} η} + s_{i}^{2}}}\right) d s_{3} \dots d s_{n} . \end{aligned}$ (A37)

After integrating out σ in (A37), the result is proportional to

$\begin{aligned} f (η) & = \int_{0}^{+ \infty} \dots \int_{0}^{+ \infty} \frac{1}{{\left[\sum_{i = 3}^{n} \left(\frac{y_{i}^{* 2}}{\frac{1}{η_{i} η} + 2 s_{i}^{2}} + \frac{D_{p}^{2} s_{i}^{2}}{2 n}\right)\right]}^{\frac{3}{2} n - 3 - a}} \\ & \quad \times \prod_{i = 3}^{n} \left(\frac{s_{i}}{\sqrt{\frac{1}{2 η_{i} η} + s_{i}^{2}}}\right) d s_{3} \dots d s_{n} . \end{aligned}$ (A38)

Since $η \geq 1$, we have

$\begin{aligned} f (η) & \geq \int_{0}^{+ \infty} \dots \int_{0}^{+ \infty} \frac{1}{{\left[\sum_{i = 3}^{n} \left(\frac{y_{i}^{* 2}}{2 s_{i}^{2}} + \frac{D_{p}^{2} s_{i}^{2}}{2 n}\right)\right]}^{\frac{3}{2} n - 3 - a}} \\ & \quad \times \prod_{i = 3}^{n} \left(\frac{s_{i}}{\sqrt{\frac{1}{2 η_{i}} + s_{i}^{2}}}\right) d s_{3} \dots d s_{n} =: Q . \end{aligned}$ (A39)

We will show that the right-hand side of (A39), $Q$, is finite under $n > 2 + a$. Dividing the interval $(0, + \infty)$ into $(0, 1) \cup [1, + \infty)$ for each $s_{i}$, $Q$ becomes the sum of $2^{n - 2}$ integrals. All of them can be dealt with in a similar way, so we take one as an example. Consider $s_{3} \in (0, 1)$ with the others in $(1, + \infty)$.
The corresponding integral is

$\begin{aligned} & \int_{0}^{1} \int_{1}^{+ \infty} \dots \int_{1}^{+ \infty} \frac{1}{{\left[\sum_{i = 3}^{n} \left(\frac{y_{i}^{* 2}}{2 s_{i}^{2}} + \frac{D_{p}^{2} s_{i}^{2}}{2 n}\right)\right]}^{\frac{3}{2} n - 3 - a}} \\ & \quad \times \prod_{i = 3}^{n} \left(\frac{s_{i}}{\sqrt{\frac{1}{2 η_{i}} + s_{i}^{2}}}\right) d s_{3} \dots d s_{n} . \end{aligned}$ (A40)

Since $y_{3}^{* 2} / (2 s_{3}^{2}) + D_{p}^{2} s_{3}^{2} / (2 n) \geq D_{p} | y_{3}^{*} | / \sqrt{n}$ and $s_{i} / \sqrt{(2 η_{i})^{- 1} + s_{i}^{2}} \leq 1$, $i = 4, \dots, n$,

$\text{(A40)} \leq \int_{1}^{+ \infty} \dots \int_{1}^{+ \infty} \frac{1}{{\left[\frac{D_{p} | y_{3}^{*} |}{\sqrt{n}} + \frac{D_{p}^{2} \sum_{i = 4}^{n} s_{i}^{2}}{2 n}\right]}^{\frac{3}{2} n - 3 - a}} d s_{4} \dots d s_{n} .$ (A41)

We use the polar transformation for $(s_{4}, \dots, s_{n})$,

$\begin{cases} s_{4} = r \cos (ϕ_{1}), \\ s_{5} = r \sin (ϕ_{1}) \cos (ϕ_{2}), \\ \quad ⋮ \\ s_{n} = r \sin (ϕ_{1}) \sin (ϕ_{2}) \dots \sin (ϕ_{n - 4}), \end{cases}$ (A42)

where $r^{2} \geq n - 3$ and $0 \leq ϕ_{i} \leq π / 2$. The determinant of the Jacobian matrix is $r^{n - 4} \sin^{n - 5} (ϕ_{1}) \dots \sin (ϕ_{n - 4})$. Hence, the right-hand side of (A41) is proportional to

$\int_{\sqrt{n - 3}}^{+ \infty} \frac{r^{n - 4}}{{\left[\frac{D_{p} | y_{3}^{*} |}{\sqrt{n}} + \frac{D_{p}^{2} r^{2}}{2 n}\right]}^{\frac{3}{2} n - 3 - a}} d r ,$

which is finite when $n > 3 / 2 + a$. Similarly, all the $2^{n - 2}$ integrals can be handled in this way, and the case in which all the $s_{i} \in (1, + \infty)$ gives the strictest constraint on $n$, namely $n > 2 + a$.
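The two thresholds quoted above come from counting powers of $r$ in the radial integral. If $k$ of the variables $s_{i}$ range over $(1, + \infty)$, the polar Jacobian contributes $r^{k - 1}$ and the denominator grows like $r^{2 (\frac{3}{2} n - 3 - a)}$, so the tail converges exactly when the net exponent is below $- 1$. The sketch below does this bookkeeping with exact fractions; `radial_tail_exponent`, `tail_converges` and the parameter `k` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction


def radial_tail_exponent(n, a, k):
    """Net power of r for large r when k variables are moved to polar coordinates.

    The Jacobian gives r^(k - 1); the denominator behaves like
    r^(2 * (3n/2 - 3 - a)).
    """
    return (k - 1) - 2 * (Fraction(3, 2) * n - 3 - a)


def tail_converges(n, a, k):
    """The integral of r^c over (r0, inf) with r0 > 0 is finite iff c < -1."""
    return radial_tail_exponent(n, a, k) < -1


if __name__ == "__main__":
    a = Fraction(1, 2)
    for n in range(4, 8):
        # k = n - 3: one variable in (0, 1); k = n - 2: all variables in (1, inf).
        print(n, tail_converges(n, a, n - 3), tail_converges(n, a, n - 2))
```

With $k = n - 3$ the condition simplifies to $n > 3 / 2 + a$, and with $k = n - 2$ it simplifies to $n > 2 + a$, matching the constraints stated in the proof.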
Hence, we have

$\int_{ϵ}^{+ \infty} h (η) \, d η < + \infty .$ (A43)

Combining (A35) with (A43), the result holds.

Appendix 4. Proof of Theorem 3.1

The following lemma can be found in de Oliveira (2007).

Lemma A.2

There exists a full rank $n \times (n - 2)$ matrix $L$ satisfying $L^{'} T = 0$, $T = (1, x)$ and $L^{'} L = I_{n - 2}$, for which the following hold:

$A = R^{- 1} - R^{- 1} T (T^{'} R^{- 1} T)^{- 1} T^{'} R^{- 1} = L (L^{'} R L)^{- 1} L^{'},$
$L^{'} R L = D$, where $D$ is an $(n - 2) \times (n - 2)$ diagonal matrix with positive diagonal elements.

Owing to the special structure of $A$, we can prove that all its principal submatrices of order $n - 2$ are positive definite. Let $T^{*}$ be the orthogonalisation of $T$. We know that $(L, T^{*})$ is a unitary matrix. Without loss of generality, we take the upper-left principal submatrix of $A$ of order $n - 2$ as an example and denote it by $A_{sub}$. Its determinant is $| L_{sub} |^{2} | D |^{- 1},$ where $L_{sub}$ consists of the first $n - 2$ rows of $L$. In addition, since $(L, T^{*})$ is a unitary matrix, by Theorem 6.3 in Zhang (2011), we have $| L_{sub} | = c \left| \begin{pmatrix} 1 & x_{n - 1} \\ 1 & x_{n} \end{pmatrix} \right|,$ where $c$ is a positive number related to $x$. Hence, we have $| A_{sub} | = c^{2} {\left| \begin{pmatrix} 1 & x_{n - 1} \\ 1 & x_{n} \end{pmatrix} \right|}^{2} | D |^{- 1},$ which is positive. Hence, all the principal submatrices of $A$ of order $n - 2$ are positive definite. Now, we can prove the theorem.

Here, we only consider the conditional posterior of $(z | σ, δ, y)$. A similar argument can be applied to $(z | σ, η, y)$. The conditional posterior of $(z | σ, δ, y)$ is proportional to

$σ^{N} δ^{- \frac{n - 2}{2}} \exp \left(- σ ρ_{p} (y - C z_{o}) - \frac{1}{2 δ} z^{'} A z\right) h (η),$ (A44)

where $z_{o} = (z_{1}, \dots, z_{n - m})$. Partition the precision matrix $A$ as

$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}^{'} & A_{22} \end{pmatrix},$ (A45)

where $A_{11}$ is an $(n - m) \times (n - m)$ matrix and $A_{22}$ is an $m \times m$ matrix. In addition, $A_{22}$ is positive definite. Integrating out $({\tilde{z}}_{n - m + 1}, \dots, {\tilde{z}}_{n})$, (A44) is proportional to

$σ^{N} δ^{- \frac{n - m - 2}{2}} \exp \left(- σ ρ_{p} (y - C z_{o}) - \frac{1}{2 δ} z_{o}^{'} A_{11.2} z_{o}\right),$ (A46)

where $A_{11.2} = A_{11} - A_{12} A_{22}^{- 1} A_{12}^{'}$, whose rank is $n - m - 2$. Then, (A46) has the same form as in the case when $C^{'} C$ is full rank. Hence, the result holds.
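The matrix $A_{11.2}$ is the Schur complement of $A_{22}$ in $A$. Integrating the Gaussian kernel over the unobserved block amounts to completing the square, so the quadratic form that remains in $z_{o}$ equals the minimum of $z^{'} A z$ over the unobserved coordinates. The sketch below illustrates this for the simplest case $m = 1$ (a scalar $A_{22}$) with a small example matrix chosen purely for illustration; it is not the paper's $A$, and the function names are hypothetical.

```python
def schur_complement_last(A):
    """Return A11 - A12 * A22^{-1} * A12' when A22 is the bottom-right scalar."""
    k = len(A) - 1
    a22 = A[k][k]
    return [[A[i][j] - A[i][k] * A[k][j] / a22 for j in range(k)] for i in range(k)]


def quad_form(M, v):
    """Quadratic form v' M v for a square matrix given as nested lists."""
    size = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(size) for j in range(size))


def profile_quad_form(A, z_o):
    """Minimum of z' A z over the last coordinate, attained at -A22^{-1} A12' z_o."""
    k = len(A) - 1
    z_u = -sum(A[k][j] * z_o[j] for j in range(k)) / A[k][k]
    return quad_form(A, list(z_o) + [z_u])


if __name__ == "__main__":
    A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]  # illustrative symmetric matrix
    z_o = [1.0, -2.0]
    print(schur_complement_last(A))
    print(quad_form(schur_complement_last(A), z_o), profile_quad_form(A, z_o))
```

For general $m$ the same computation needs a linear solve with $A_{22}$, which is well defined because $A_{22}$ is positive definite.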
