
Locally R-optimal designs for a class of nonlinear multiple regression models

Lei He & Rong-Xian Yue
Pages 107-120 | Received 09 Jun 2022, Accepted 27 Nov 2022, Published online: 12 Dec 2022

Abstract

This paper is concerned with optimal designs for a wide class of nonlinear models with information driven by the linear predictor. The aim is to construct R-optimal designs, which minimize the product of the main diagonal entries of the inverse of the Fisher information matrix at given values of the parameters. An equivalence theorem for the locally R-optimal designs is provided in terms of the intensity function. Analytic solutions for locally saturated R-optimal designs are derived for models whose linear predictors include and exclude an intercept term, respectively. The particle swarm optimization method is employed to generate locally non-saturated R-optimal designs. Numerical examples illustrate the locally R-optimal designs for Poisson regression models and proportional hazards regression models.

1. Introduction

Generalized linear models (GLMs) have been used quite effectively in statistical modelling, but the associated design issues are undoubtedly challenging: the intensity function appearing in the information matrix depends on the value of the linear predictor, so the optimal designs depend on the unknown parameters. Following Konstantinou et al. (2014), we note that the information matrix of the proportional hazards regression models used in survival analysis has the same feature when type I and random censoring are considered; moreover, their intensity functions, like those of Poisson and negative binomial regression models, are strictly monotonic rather than symmetric, in contrast to the logistic and probit models. In this paper, we focus on such models with monotonic intensity functions.

In this direction, there is increasing interest in determining optimal designs under various criteria, especially for models with multiple covariates. Initial work was done by Konstantinou et al. (2014), who provided analytical results on D- and c-optimal designs for this class of models with a single covariate. Subsequently, Schmidt and Schwabe (2015) extended the D-optimality results to a one-dimensional discrete design space. For multiple regression, Schmidt and Schwabe (2017) determined D-optimal designs by identifying a complete subclass, which contains the results of Russell et al. (2009) for the Poisson regression model as a special case. Radloff and Schwabe (2019) gave a construction method for D-optimal designs when the design region is a k-dimensional ball. Recently, Schmidt (2019) systematically characterized c-, L- and Φk-optimal designs for models with a single covariate and for multiple regression with an arbitrary number of covariates.

It should be noted that the linear predictor in the literature mentioned above always includes an intercept term. The intercept in GLMs and in proportional hazards regression models with censoring characterizes, respectively, the expected mean and the expected survival time when all the explanatory variables are equal to zero. In this case, the intercept reflects the influence of all the unobserved fixed variables in these models. As pointed out by Idais and Schwabe (2021), when the intercept is significantly zero, i.e., the average impact of all the unobserved fixed variables is significantly zero, one may claim that the model probably includes most of the variables which explain the outcome. For gamma models without intercept, Idais and Schwabe (2021) obtained explicit solutions for D- and A-optimal designs in several multi-linear cases, including the two-factor model with interaction.

This paper aims to provide a characterization of R-optimality for multiple regression models with and without intercept. It is well known that the R-optimality criterion proposed by Dette (1997) has a nice statistical interpretation: it minimizes the volume of the Bonferroni rectangular confidence region for the regression parameters. Moreover, it satisfies an extremely useful invariance property, which allows easy calculation of optimal designs on many linearly transformed design spaces. The criterion has been applied frequently to multi-response experiments, multi-factor experiments and mixture experiments; see, e.g., X. Liu and Yue (2020), P. Liu et al. (2022) and Hao et al. (2021) for some recent references.

In general, nonlinear regression models lead to designs that depend on parameters whose values are unknown a priori. Hence, plugging in a pre-specified parameter value yields the so-called locally optimal designs in the sense of Chernoff (1953). In this paper, we concentrate on the construction of locally optimal designs and, for simplicity, write "R-optimal" for "locally R-optimal". Some general notation and a brief introduction to the R-criterion are presented in Section 2. In Sections 3 and 4, we analytically and numerically determine R-optimal designs for models whose intensity function depends only on the value of the linear predictor, with and without intercept, respectively. A brief discussion is given in Section 5. All proofs are included in the Appendix.

2. Model specification and R-optimality criterion

Throughout the paper, we focus on a class of nonlinear multiple regression models with information driven by the linear predictor defined on a given design region X, and consider approximate designs ξ of the form
\xi=\begin{pmatrix}x_1 & x_2 & \cdots & x_m\\ \omega_1 & \omega_2 & \cdots & \omega_m\end{pmatrix},\quad x_i\in X,\ 0\leqslant\omega_i\leqslant 1,\ \sum_{i=1}^{m}\omega_i=1,
or simply ξ = {x_i; ω_i}_{i=1}^{m} (see Silvey, 1980, p. 15). The Fisher information matrix of ξ with independent observations is assumed to be
(1) M(\xi,\beta)=\int_{X} Q(f(x)^{\top}\beta)\,f(x)f(x)^{\top}\,\mathrm{d}\xi(x)=\sum_{i=1}^{m}\omega_i\,Q(f(x_i)^{\top}\beta)\,f(x_i)f(x_i)^{\top},
where Q ⩾ 0 is the intensity/efficiency function (see Fedorov, 1972, p. 39), which depends only on the value of the linear predictor, f is a p×1 vector of known regression functions, and β ∈ R^p denotes the vector of p unknown parameters.
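For concreteness, the discrete sum in Equation (1) can be assembled in a few lines. The sketch below assumes the Poisson intensity Q(θ) = exp(θ) (used later in Section 3) as a default; the function name and interface are illustrative, not from the paper.

```python
import numpy as np

def fisher_information(points, weights, beta, Q=np.exp):
    """Fisher information matrix M(xi, beta) of Equation (1) for a design
    with support `points` (an m x p array whose rows are the regression
    vectors f(x_i)) and weights `weights`.  The intensity Q defaults to
    the Poisson case Q(theta) = exp(theta)."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    eta = points @ beta                  # linear predictors f(x_i)' beta
    lam = weights * Q(eta)               # omega_i * Q(f(x_i)' beta)
    # sum_i lam_i f(x_i) f(x_i)'
    return (points * lam[:, None]).T @ points
```

Any other intensity satisfying (A1)-(A4), e.g. the censoring intensities of Section 3, can be passed through the `Q` argument.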

This kind of information matrix is common in the widely used generalized linear models, and it also arises in other models such as exponential regression models in the proportional hazards parametrization with various types of censoring, including type I and random censoring (see Konstantinou et al., 2014) as well as other censoring distributions (see Schmidt & Schwabe, 2017). Following Konstantinou et al. (2014), we further assume that the intensity function Q satisfies the following conditions.

(A1)

Q(θ) is positive for all θ ∈ R and twice continuously differentiable.

(A2)

The first derivative Q′(θ) is positive for all θ ∈ R.

(A3)

The second derivative g″(θ) of the function g(θ) = 1/Q(θ) is injective.

(A4)

The function Q(θ)/Q′(θ) is an increasing function.

It is clear that the intensity functions induced by Poisson regression models, negative binomial regression models and proportional hazards regression models with type I and random censoring abide by all the conditions above.

In what follows, we concentrate on the R-optimality of designs, which minimizes the product of the diagonal entries of the inverse of the Fisher information matrix. A design ξ* ∈ Ξ is called R-optimal if it minimizes
(2) \psi(\xi,\beta)=\prod_{j=1}^{p}\big(M(\xi,\beta)^{-1}\big)_{jj}=\prod_{j=1}^{p}e_{j,p}^{\top}M(\xi,\beta)^{-1}e_{j,p}
over Ξ, where Ξ is the set of all designs with a non-singular information matrix on X, and e_{j,p} denotes the jth unit vector in R^p. An important tool in optimal design theory is the equivalence theorem, which not only characterizes the optimal design but also underlies many algorithms for its numerical construction (see, e.g., Yang et al., 2013; Freise et al., 2021). The following result gives the equivalence theorem for R-optimality.

Theorem 2.1

A design ξ* ∈ Ξ is R-optimal if and only if
(3) \phi(x,\beta)=Q(f(x)^{\top}\beta)\,f(x)^{\top}M(\xi^{*},\beta)^{-1}\Big(\sum_{j=1}^{p}\frac{e_{j,p}e_{j,p}^{\top}}{e_{j,p}^{\top}M(\xi^{*},\beta)^{-1}e_{j,p}}\Big)M(\xi^{*},\beta)^{-1}f(x)\leqslant p
holds for all x ∈ X. Moreover, equality holds at the support points of the design ξ*.
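The sensitivity function in Equation (3) is straightforward to evaluate numerically, since the middle factor is a diagonal matrix with entries 1/(M⁻¹)_jj. A minimal sketch, again assuming the Poisson intensity Q(θ) = exp(θ) as default:

```python
import numpy as np

def phi(x, beta, M, Q=np.exp):
    """Sensitivity function of Equation (3); x is a regression vector f(x)
    and M the information matrix of the candidate design.  The design is
    R-optimal iff phi(x, beta) <= p on the whole design region, with
    equality at the support points."""
    Minv = np.linalg.inv(M)
    # sum_j e_j e_j' / (e_j' M^{-1} e_j) is diagonal with entries 1/(M^{-1})_jj
    D = np.diag(1.0 / np.diag(Minv))
    v = Minv @ x
    return float(Q(x @ beta) * v @ D @ v)
```

Scanning `phi` over a grid on X gives a practical check of R-optimality for any candidate design.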

In Sections 3 and 4, minimally supported designs (the so-called saturated designs) will appear as candidates for R-optimal designs. A design ξ = {x_i; ω_i}_{i=1}^{m} has minimal support if the number of support points equals the number of parameters, i.e., m = p. Let D_ω = diag(ω_1, …, ω_m), X̃ = Q^{1/2} X with X = (f(x_1), …, f(x_m))^{\top} and Q^{1/2} = diag(Q^{1/2}(f(x_1)^{\top}β), …, Q^{1/2}(f(x_m)^{\top}β)). Accordingly, the information matrix (1) of the design ξ can be decomposed as M(ξ,β) = X̃^{\top} D_ω X̃. Furthermore, if the design ξ is saturated and the regression vectors in the rows of X are linearly independent, the following result gives the optimal weights of a design with minimal support under the R-optimality criterion.

Lemma 2.1

(Pukelsheim & Torsney, 1993)

The R-optimal weights for a saturated design ξ are given by
(4) \omega_j=\sqrt{s_{jj}/p},\quad j=1,\ldots,p,
where s_{11}, …, s_{pp} are the diagonal entries of the matrix
(5) S=(\tilde{X}^{-1})^{\top}\Big(\sum_{j=1}^{p}\frac{e_{j,p}e_{j,p}^{\top}}{e_{j,p}^{\top}M(\xi,\beta)^{-1}e_{j,p}}\Big)\tilde{X}^{-1}.

Remark 2.1

Since the diagonal entries s_{jj} of the matrix S depend on the weights, a fixed-point iteration can be used to determine the optimal weights. In step r + 1, the weight is updated as ω_j^{(r+1)} = \sqrt{s_{jj}(ω_1^{(r)}, …, ω_p^{(r)})/p}, starting from a given initial weight vector (ω_1^{(0)}, …, ω_p^{(0)})^{\top}.
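The fixed-point scheme of Remark 2.1 can be sketched as follows. The square-root weight update ω_j = (s_jj/p)^{1/2} follows Lemma 2.1 as stated above; the per-step renormalization of the weights and the stopping rule are our own implementation choices, not prescribed by the paper.

```python
import numpy as np

def r_weights_fixed_point(X_tilde, n_iter=500, tol=1e-12):
    """Fixed-point iteration for the R-optimal weights of a saturated
    design.  X_tilde is the p x p matrix whose i-th row is
    Q^{1/2}(f(x_i)' beta) f(x_i)'."""
    p = X_tilde.shape[0]
    w = np.full(p, 1.0 / p)                       # initial weight vector
    Xinv = np.linalg.inv(X_tilde)
    for _ in range(n_iter):
        M = X_tilde.T @ (w[:, None] * X_tilde)    # M = X~' D_w X~
        Minv = np.linalg.inv(M)
        D = np.diag(1.0 / np.diag(Minv))          # middle factor of (5)
        S = Xinv.T @ D @ Xinv
        w_new = np.sqrt(np.diag(S) / p)           # update of Remark 2.1
        w_new /= w_new.sum()                      # keep a proper design
        if np.max(np.abs(w_new - w)) < tol:
            return w_new
        w = w_new
    return w
```

In our trials on two-point saturated designs the iteration contracts geometrically towards the weights reported in the paper's examples.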

3. R-optimal designs for models with intercept

In this section, we describe how to construct R-optimal designs for multiple regression models with information matrices of the form (1) satisfying assumptions (A1)–(A4). More precisely, we consider multiple regression with additive linear effects of the covariates, whose linear predictor contains an intercept term.

We consider the multi-linear case f(x) = (1, x^{\top})^{\top} with x = (x_1, …, x_{p−1})^{\top} ∈ X ⊆ R^{p−1}, and denote the parameter vector by β = (β_0, β_1, …, β_{p−1})^{\top} for convenience. Suppose that the design region X is a multi-dimensional polyhedron. By the complete class results in Theorem 2 and Lemma 1 of Schmidt (2019), R-optimal designs can be found within a complete subclass which has at most two support points on each edge of X, provided the intersections of the hyperplanes H_η = {x ∈ R^{p−1} : f(x)^{\top}β = η} with X are bounded for all η ∈ R.

The following theorem provides an approach to generating R-optimal designs on a rectangular design region for the model under consideration. Its proof can be established by arguments similar to those of Theorem 9 in Schmidt (2019), using the fact that (\sum_{j=1}^{p}\sqrt{s_{jj}})^{2} = p. It should be pointed out that, by the same reasoning as in the proofs of Theorems 8 and 9 in Schmidt (2019), the uniqueness of the solution of the system of equations described in Theorem 3.1 follows. The case p = 2 with one covariate is also covered, so p ⩾ 2 throughout.

Theorem 3.1

Let X = [u_1, v_1] × ⋯ × [u_{p−1}, v_{p−1}] and let assumptions (A1)–(A4) be satisfied for a model with information matrix of the form (1). Let β_i ≠ 0 for i = 1, …, p−1. Define a_i = v_i if β_i > 0 and a_i = u_i if β_i < 0, and set a = (a_1, …, a_{p−1})^{\top}. Let s_{ij} (i, j = 1, …, p) denote the elements of the matrix S in (5) evaluated at a design with support points a − (x_1/β_1) e_{1,p−1}, …, a − (x_{p−1}/β_{p−1}) e_{p−1,p−1} and a. If a solution exists, let x_i ∈ (0, ∞) and the weights ω_i for i = 1, …, p be the unique solutions of the joint system of Equations (4) and (6), where
(6) x_i-2\,\frac{Q(f(a)^{\top}\beta-x_i)}{Q'(f(a)^{\top}\beta-x_i)}\Big[1-\frac{Q(f(a)^{\top}\beta-x_i)}{Q(f(a)^{\top}\beta)}\,\frac{s_{1,i+1}}{\sqrt{s_{11}s_{i+1,i+1}}}\Big]=0.
If x_i ⩽ |β_i|(v_i − u_i) holds for i = 1, …, p−1, then the design
ξ* = { a − (x_1/β_1) e_{1,p−1}, …, a − (x_{p−1}/β_{p−1}) e_{p−1,p−1}, a;  ω_1, …, ω_{p−1}, ω_p }
is the unique R-optimal design.

Remark 3.1

Let X = [u_1, v_1] × ⋯ × [u_{p−1}, v_{p−1}] and suppose that some of the parameters β_1, …, β_{p−1} are equal to zero. It follows from Lemma 1 of Schmidt (2019) that both endpoints of the corresponding edges must then be support points of the optimal design, and the number of support points may exceed p. In this case a numerical algorithm is needed to generate the R-optimal design, such as the commonly used particle swarm optimization (PSO) method; see Section 4.2 for details.

The following two examples illustrate the results in Theorem 3.1 and Remark 3.1, and compare the performance of different designs in terms of Bonferroni confidence intervals and relative efficiencies. Example 3.1 considers the first-order Poisson regression model with intercept and investigates, by numerical simulation, the averaged Bonferroni confidence intervals produced by the R- and D-optimal designs and by the balanced design ξ_b. Example 3.2 considers the Poisson regression model with two covariates discussed in Schmidt (2019) and compares the relative efficiency of the R-optimal design with other designs. In both examples, the intensity function is Q(θ) = exp(θ), θ ∈ R.

Example 3.1

Consider the first-order Poisson regression model with intercept and let X = [0, 5]. For β = (6, 1)^{\top} and β = (1, 1)^{\top}, the R- and D-optimal designs can be calculated numerically by Theorem 3.1 and by Theorem 2 of Konstantinou et al. (2014), respectively. For example, under the R-optimality criterion with β = (6, 1)^{\top} we obtain x_1 = 2.1886 and ω_2 = 0.5431 by solving Equations (4) and (6). The resulting optimal designs for both cases are reported in Table 1. Furthermore, we carry out a simulation study to assess the difference between the Bonferroni confidence intervals derived from the different designs. The averaged Bonferroni confidence intervals under different sample sizes, calculated from 10,000 simulation runs, are shown in Table 1. We find that for both cases the coverage probabilities behave similarly and come close to 0.95 as the sample size increases, and that the averaged Bonferroni confidence intervals obtained from the R-optimal designs are narrower than those from the D-optimal designs and from the balanced design ξ_b, which suffers a serious loss of R-efficiency.

Table 1. The simulation results obtained from the locally R-, D-optimal designs and the balanced design on X=[0,5] for the first-order Poisson regression model with intercept.

Example 3.2

Assume that the linear predictor of the Poisson regression model includes an intercept term. Let X = [0, 5]^2 and β = (0, −1, −1)^{\top}. A numerical calculation via Equations (4) and (6) immediately yields x_1 = x_2 = 2.1785, ω_1 = ω_2 = 0.3060 and ω_3 = 0.3880. The R-optimal design is given in Table 2. The PSO method was used to find the corresponding D-, R- and A-optimal designs when β = (0, −1, 0)^{\top}; the resulting designs are also listed in Table 2. In this case, we find that the endpoints (0, 0) and (0, 5) must be support points of the optimal designs. Figure 1 displays the function ϕ(x, β) defined in (3) for the cases β = (0, −1, −1)^{\top} and β = (0, −1, 0)^{\top}, and shows that ϕ attains its maximum 3 at each support point.

Figure 1. Plot of the function ϕ in (3) for the R-optimal design on X = [0, 5]^2 for the Poisson regression model discussed in Example 3.2: (a) for β = (0, −1, −1)^{\top} and (b) for β = (0, −1, 0)^{\top}.

Table 2. Comparison of R-optimal design with D- and A-optimal designs on X=[0,5]2 for the Poisson regression model discussed in Example 3.2.

To compare the performance of different designs under, e.g., the common D- and A-optimality criteria, we may calculate the relative efficiency, defined as the value of the criterion function at the optimal design relative to its value at the design under consideration, which therefore lies between 0 and 1. For instance, the R-efficiency is defined as Eff_R(ξ) = ψ(ξ*, β)/ψ(ξ, β). The results are summarized in Table 2. For the case β = (0, −1, −1)^{\top}, all the designs have relatively high efficiencies with respect to the other optimality criteria, primarily because the designs have a similar structure. For the case β = (0, −1, 0)^{\top}, the R-optimal design still has relatively high D- and A-efficiencies, but the A-optimal design suffers a large loss of R-efficiency.
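The efficiency computation above amounts to two evaluations of the criterion in Equation (2). A short sketch, again with the Poisson intensity Q(θ) = exp(θ) assumed; designs are encoded as (points, weights) pairs for illustration.

```python
import numpy as np

def r_criterion(points, weights, beta, Q=np.exp):
    """psi(xi, beta) of Equation (2): the product of the diagonal entries
    of the inverse information matrix."""
    pts = np.asarray(points, float)
    lam = np.asarray(weights, float) * Q(pts @ beta)
    M = (pts * lam[:, None]).T @ pts
    return float(np.prod(np.diag(np.linalg.inv(M))))

def r_efficiency(design, optimal_design, beta, Q=np.exp):
    """Eff_R(xi) = psi(xi*, beta) / psi(xi, beta); both designs are
    (points, weights) tuples."""
    return r_criterion(*optimal_design, beta, Q) / r_criterion(*design, beta, Q)
```

The same template gives D- or A-efficiencies by swapping the determinant or the trace of M⁻¹ for the product of its diagonal entries.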

Theorem 3.2

Let X = [0, ∞)^{p−1} and let assumptions (A1)–(A4) be satisfied for models with information matrix of the form (1). If β_i < 0 for all i = 1, …, p−1, then the solutions x_i and ω_i of the system of Equations (4) and (6) do not depend on the parameter vector (β_1, …, β_{p−1})^{\top}; that is, the R-optimal weights are unchanged for different (β_1, …, β_{p−1})^{\top} when β_0 is fixed.

Remark 3.2

In the same setting as Theorem 3.2, the optimal weights of saturated designs for the L- and Φ_k-optimality criteria, with the exception of D-optimality, do depend on the parameters (β_1, …, β_{p−1})^{\top}.

To further illustrate Theorem 3.2, we consider proportional hazards regression models with two types of censoring. Under type I censoring with a fixed censoring time c, the intensity function is Q(θ) = 1 − exp(−c exp(θ)). Under random censoring with censoring times following a uniform U(0, c) distribution, the intensity function is Q(θ) = 1 − [1 − exp(−c exp(θ))]/[c exp(θ)] (see Konstantinou et al., 2014). In both cases the variable θ ranges over R.
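The two censoring intensities are easily coded and can be checked numerically against condition (A2) (positivity of the first derivative, i.e., monotonicity); a minimal sketch:

```python
import numpy as np

def q_type1(theta, c):
    """Intensity under type I censoring at fixed time c:
    Q(theta) = 1 - exp(-c * exp(theta))."""
    return 1.0 - np.exp(-c * np.exp(theta))

def q_random(theta, c):
    """Intensity under uniform U(0, c) random censoring:
    Q(theta) = 1 - (1 - exp(-c * exp(theta))) / (c * exp(theta))."""
    ce = c * np.exp(theta)
    return 1.0 - (1.0 - np.exp(-ce)) / ce
```

Both functions map R into (0, 1] and are non-decreasing in θ, in line with the monotonic (rather than symmetric) intensity structure emphasized in the Introduction.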

Example 3.3

For the proportional hazards regression models with type I and random censoring, Schmidt (2019) discussed the behaviour of c- and Φ_k-optimal designs for different parameter values; those results are consistent with Remark 3.2. Analogously to Schmidt (2019), we investigate the performance of R-optimal designs on the design region X = [0, 3]^{p−1} by comparison with a balanced design ξ_b, supported at all vertices of X with equal weights. First, the censoring time c must be fixed in advance; it is closely related to the amount of censoring q, also called the overall probability of censoring, given by q = 1 − ∑_i ω_i P(Y_i < C), where Y_i is the survival time and C is the censoring variable (see Kalish & Harrington, 1988). For the two-covariate case, we choose c = 32 for type I censoring and c = 69 for random censoring, so that under the balanced design ξ_b on X = [0, 3]^2 the value of q equals 60% for β = (−3, −2.5, −2.5)^{\top}, 71% for β = (−3, −3, −3)^{\top} and 75% for β = (−3, −4, −4)^{\top}. The R-optimal designs on X = [0, 3]^2 and the R-efficiencies of ξ_b are summarized in Table 3. We observe that the R-optimal designs shift the support points on the edges towards the vertex (0, 0), and that the R-efficiencies of the balanced design ξ_b decrease with increasing amounts of censoring; indeed, the R-efficiency of ξ_b is quite low in these censoring scenarios. In addition, by Theorem 3.2 the R-optimal weights are the same for each model.

Table 3. R-optimal designs on X=[0,3]2 for proportional hazards regression models for different β and R-efficiencies of the balanced design ξb.

4. R-optimal designs for models without intercept

In the present section we turn to the multi-linear case without intercept, where the vector of regression functions is f(x) = x = (x_1, …, x_p)^{\top}, x ∈ X, p ⩾ 2, and the parameter vector is denoted by β = (β_1, …, β_p)^{\top}. Regression models without an intercept are common, usually arising from the physical characteristics of the variables measured, and the corresponding design problem has attracted considerable attention in the literature (see, e.g., Idais & Schwabe, 2021; Li et al., 2005, and the references cited there).

Let X = [u_1, v_1] × ⋯ × [u_p, v_p]. Then the support points of R-optimal designs again lie on the edges of X by Theorem 2 of Schmidt (2019). Moreover, from the proof of Theorem 8 in Schmidt (2019), the support points of an R-optimal design ξ* must be of the form a − (x_1/β_1) e_{1,p}, …, a − (x_p/β_p) e_{p,p} and a = (a_1, …, a_p)^{\top}, where a_i = v_i if β_i > 0 and a_i = u_i if β_i < 0 for i = 1, …, p. As a result, the optimality condition of Theorem 2.1 only needs to be verified on the boundary of the design region; that is, the design ξ* is R-optimal if and only if
(7) h_i(x_i,\xi^{*})=\ell_i(x_i,\xi^{*})-\frac{p}{Q(a^{\top}\beta+\beta_i(x_i-a_i))}\leqslant 0
holds for all x_i, i = 1, …, p, where the function ℓ_i(x_i, ξ*) is given by
\ell_i(x_i,\xi^{*})=(a+(x_i-a_i)e_{i,p})^{\top}M(\xi^{*},\beta)^{-1}\Big(\sum_{j=1}^{p}\frac{e_{j,p}e_{j,p}^{\top}}{e_{j,p}^{\top}M(\xi^{*},\beta)^{-1}e_{j,p}}\Big)M(\xi^{*},\beta)^{-1}(a+(x_i-a_i)e_{i,p}).

4.1. Some theoretical results on saturated designs

In this subsection, we provide two types of saturated R-optimal designs for the model under consideration, distinguished by whether the vertex a is a support point. The first result covers designs which include the support point a; the condition a_τ ≠ 0, τ ∈ {1, …, p}, in Theorem 4.1 guarantees that the design ξ*_τ in (9) has a non-singular information matrix.

Theorem 4.1

Let assumptions (A1)–(A4) be satisfied for models with information matrix of the form (1) and without intercept. Let s_{ij} (i, j = 1, …, p) be the elements of the matrix S in (5) evaluated at a design of the form {a − (x_i/β_i) e_{i,p}; ω_i}_{i=1}^{p}. For a given index τ ∈ {1, …, p} with a_τ ≠ 0, if a solution exists, let x_i ∈ (0, ∞) and the weights ω_i be the unique solutions of the joint system of Equations (4) and (8), where
(8) x_i-2\,\frac{Q(a^{\top}\beta-x_i)}{Q'(a^{\top}\beta-x_i)}\Big[1-\frac{Q(a^{\top}\beta-x_i)}{Q(a^{\top}\beta)}\,\frac{s_{\tau i}}{\sqrt{s_{\tau\tau}s_{ii}}}\Big]=0
for all i ≠ τ. If 0 < x_i ⩽ |β_i|(v_i − u_i) holds, then the design
(9) ξ*_τ = { a − (x_1/β_1) e_{1,p}, …, a − (x_τ/β_τ) e_{τ,p}, …, a − (x_p/β_p) e_{p,p};  ω_1, …, ω_τ, …, ω_p }
with x_τ = 0 is R-optimal on X, provided that the condition h_τ(x_τ, ξ*_τ) < 0 (or, equivalently, ϕ(a + (x_τ − a_τ) e_{τ,p}, β) < p) is satisfied for all x_τ ∈ [u_τ, v_τ] \ {a_τ}, where h_τ(·,·) is defined as in (7) and ϕ(·,·) as in (3).

Example 4.1

Consider the Poisson regression model with two covariates and without intercept, with design region X = [0, 5]^2. To find the R-optimal design we first fix β = (−0.5, 0.5)^{\top}. According to Theorem 4.1 we can only take τ = 2, i.e., x_2 = 0. Solving the joint system of Equations (4) and (8) gives x_1 = 2.1886, and the design ξ*_2 is of the form
ξ*_2 = { (4.3772, 5), (0, 5);  0.4569, 0.5431 }.
It is easily verified that the condition h_2(x_2, ξ*_2) < 0 is satisfied for x_2 ∈ [0, 5), so the design ξ*_2 is R-optimal (see also Figure 2(a)). If instead we choose β = (1, 1)^{\top}, it follows from Theorem 4.1 that the designs ξ*_1 and ξ*_2 are given by
ξ*_1 = { (2.4678, 5), (5, 5);  0.8234, 0.1766 },  ξ*_2 = { (5, 2.4678), (0, 5);  0.8234, 0.1766 },
respectively. In this case the conditions h_i(x_i, ξ*_i) < 0, i = 1, 2, are not satisfied, so neither design is R-optimal. Accordingly, one may instead look for a saturated R-optimal design for this model that does not contain the support point a. This type of R-optimal design is elaborated in Theorem 4.2.

Figure 2. Plots of the function ϕ in (3) for the R-optimal designs on X = [0, 5]^2 for the Poisson regression models discussed in Examples 4.1 and 4.2: (a) for β = (−0.5, 0.5)^{\top} and (b) for β = (1, 1)^{\top}.

Theorem 4.2

Let assumptions (A1)–(A4) be satisfied for models with information matrix of the form (1) and without intercept. Let s_{ij} (i, j = 1, …, p) be the elements of the matrix S in (5) evaluated at a design of the form {a − (x_i/β_i) e_{i,p}; ω_i}_{i=1}^{p}. If a solution exists, let x_i ∈ (0, ∞) and the weights ω_i be the unique solutions of the joint system of Equations (4) and (10), where
(10) x_i-2\,\frac{Q(a^{\top}\beta-x_i)}{Q'(a^{\top}\beta-x_i)}\Big[1-\frac{1}{z^{\top}1_p}\sum_{j=1}^{p}z_j\,\frac{Q(a^{\top}\beta-x_i)}{Q(a^{\top}\beta-x_j)}\Big(\frac{s_{ji}}{\sqrt{s_{ii}s_{jj}}}+1\Big)\Big]=0
for all i, where z = (z_1, …, z_p)^{\top} with z_j = a_j β_j / x_j, and 1_p = (1, …, 1)^{\top} ∈ R^p. If 0 < x_i ⩽ |β_i|(v_i − u_i) holds, then the design
(11) ξ* = { a − (x_1/β_1) e_{1,p}, …, a − (x_p/β_p) e_{p,p};  ω_1, …, ω_p }
is R-optimal on X, provided that the condition h_i(a_i, ξ*) < 0 (or, equivalently, ϕ(a, β) < p) is satisfied.

Example 4.2

Consider the same model as in Example 4.1, now with β = (1, 1)^{\top}. The joint system of Equations (4) and (10) has the solutions x_1 = x_2 = 1.8755 and ω_1 = ω_2 = 0.5. The design
ξ* = { (3.1245, 5), (5, 3.1245);  0.5, 0.5 }
is then R-optimal since ϕ(a, β) < 2 (see also Figure 2(b)).
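The optimality claim of Example 4.2 can be checked numerically against the equivalence theorem by evaluating ϕ of Equation (3) over a grid on the design region (a sketch assuming the Poisson intensity Q(θ) = exp(θ)):

```python
import numpy as np

# Design of Example 4.2 on X = [0,5]^2 for beta = (1,1)' (no intercept).
beta = np.array([1.0, 1.0])
pts = np.array([[3.1245, 5.0], [5.0, 3.1245]])
w = np.array([0.5, 0.5])

lam = w * np.exp(pts @ beta)                 # omega_i * Q(f(x_i)' beta)
M = (pts * lam[:, None]).T @ pts             # information matrix (1)
Minv = np.linalg.inv(M)
D = np.diag(1.0 / np.diag(Minv))             # middle factor of (3)

def phi(x):
    """Sensitivity function of Equation (3)."""
    v = Minv @ x
    return float(np.exp(x @ beta) * v @ D @ v)

# Equivalence check on a grid over the design region X = [0,5]^2.
grid = np.linspace(0.0, 5.0, 101)
vals = [phi(np.array([u, t])) for u in grid for t in grid]
max_phi = max(vals)
```

In this check ϕ stays below p = 2 away from the two support points, where the maximum 2 is attained, and in particular ϕ(a, β) < 2 at the vertex a = (5, 5).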

Corollary 4.1

Let X = [0, ∞)^p and let assumptions (A1)–(A4) be satisfied for models with information matrix of the form (1) and without intercept. Then, for every β with β_i < 0, i = 1, …, p, the design
(12) ξ* = { (ψ^{-1}(0)/β_1) e_{1,p}, …, (ψ^{-1}(0)/β_p) e_{p,p};  1/p, …, 1/p }
is R-optimal, where the function ψ(·) is defined by ψ(x) = x − 2Q(x)/Q′(x) for x > 0.

Remark 4.1

It is worth mentioning that the design (12) in Corollary 4.1 is also D- and A-optimal.

Remark 4.2

For the case p = 1, the one-point design ξ* = {a; 1} is saturated R-optimal whenever it satisfies the additional conditions in Theorem 4.1. If the design ξ* = {a; 1} is not R-optimal, the method of Theorem 4.2 can be used to search for a saturated R-optimal design.

4.2. PSO-generated R-optimal designs

Searching only among saturated designs may not suffice to determine an R-optimal design in the class Ξ, since the optimal designs depend on the unknown parameters. For example, let β = (0.5, 0.5)^{\top} in the Poisson regression model discussed in Example 4.1. The R-optimal design (see Figure 3) is then given by
ξ* = { (1.4321, 5), (5, 1.4321), (5, 5);  0.4666, 0.4666, 0.0668 }.
This means that the number of support points of an R-optimal design for models without intercept may exceed the number of regression parameters. In that case the equation-solving approach above cannot determine an R-optimal design in Ξ, and an effective algorithm is required. Here we employ the PSO algorithm to find optimal designs for the models under consideration; its pseudo code is described below.

Figure 3. Plot of the function ϕ in (3) for the R-optimal design on X = [0, 5]^2 for β = (0.5, 0.5)^{\top} in the Poisson regression model without intercept.

In Section 4.2, the support points of each particle position ξ_i are taken of the form a − (x_1/β_1) e_{1,p}, …, a − (x_p/β_p) e_{p,p} and a.

The iteration index is t = 0, 1, …, and two stopping criteria can be used: reaching a maximum number of iterations, or the equivalence condition attaining a pre-specified threshold.

The notations v_i(t) and ξ_i(t) denote, respectively, the current velocity and position of the ith particle. θ_t is the inertia weight modulating the influence of the previous velocity; it can be a constant or a decreasing function with values between 0 and 1. α_1 and α_2 are both random variables drawn from U(0, 1). γ_1 and γ_2 are two constants reflecting the cognitive learning level and the social learning level, respectively.
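The update rules just described can be sketched as a basic global-best PSO. The objective shown below is the R-criterion ψ of Equation (2) for a Poisson model without intercept; the tuning constants, the softmax encoding of the weights, and the clipping of positions to the design region are all our illustrative choices, not prescriptions from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def r_criterion(points, weights, beta):
    """R-criterion psi (Poisson intensity): product of diag(M^{-1})."""
    lam = weights * np.exp(points @ beta)
    M = (points * lam[:, None]).T @ points
    return float(np.prod(np.diag(np.linalg.inv(M))))

def pso(obj, dim, lo, hi, n_particles=40, n_iter=200,
        theta=0.7, gamma1=1.5, gamma2=1.5):
    """Global-best PSO with inertia theta, cognitive weight gamma1 and
    social weight gamma2, following the update rules described in the text."""
    x = rng.uniform(lo, hi, (n_particles, dim))   # positions xi_i(0)
    v = np.zeros_like(x)                          # velocities v_i(0)
    pbest = x.copy()
    pval = np.array([obj(p) for p in x])
    gbest = pbest[pval.argmin()].copy()
    for _ in range(n_iter):
        a1, a2 = rng.random((2, n_particles, 1))  # alpha_1, alpha_2 ~ U(0,1)
        v = theta * v + gamma1 * a1 * (pbest - x) + gamma2 * a2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        val = np.array([obj(p) for p in x])
        improved = val < pval
        pbest[improved], pval[improved] = x[improved], val[improved]
        gbest = pbest[pval.argmin()].copy()
    return gbest, float(pval.min())

# Illustration: 3-point designs on [0,5]^2 for beta = (0.5, 0.5)'.  A particle
# encodes 3 support points plus 3 raw weights (softmax maps them to the simplex).
beta = np.array([0.5, 0.5])
def design_obj(vec):
    pts = vec[:6].reshape(3, 2)
    w = np.exp(vec[6:]); w /= w.sum()
    try:
        return r_criterion(pts, w, beta)
    except np.linalg.LinAlgError:
        return np.inf
lo = np.array([0.0] * 6 + [-5.0] * 3)
hi = np.array([5.0] * 6 + [5.0] * 3)
best, val = pso(design_obj, dim=9, lo=lo, hi=hi)
```

With a moderate swarm this sketch recovers designs of the three-point structure shown above; in practice the equivalence condition of Theorem 2.1 should still be used to certify the result.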

Example 4.3

Consider the proportional hazards regression models with three covariates, whose linear predictor does not include an intercept term. Let X = [0, 3]^3 and β = (−2.5, 0.5, 0.5)^{\top}. To assess the effect of the amount of censoring q on R-optimal designs, we adjust the censoring time c to achieve overall censoring probabilities of 20%, 40%, 60% and 80% under the balanced design ξ_b; for instance, we choose c = 60 for type I censoring and c = 133 for random censoring when q = 0.4. In this example, the PSO algorithm with 150 particles and 100 iterations finds the R-optimal design with the required accuracy, implemented in the R software in under 20 seconds on a standard PC. Table 4 summarizes the numerical results, including the R-efficiencies of the balanced design ξ_b and the amount of censoring q under the corresponding R-optimal design. Several points emerge from the numerical results.

  • For the three-covariate case, the number of support points of the R-optimal designs for type I censoring and random censoring exceeds the number of parameters.

  • With increasing amounts of censoring, the R-optimal designs shift the support point on the edge (x_1, 3, 3) towards the vertex a = (0, 3, 3), and the R-efficiency of the balanced design ξ_b gradually decreases.

  • The overall probability of censoring q under the R-optimal design is smaller than under the balanced design ξ_b.

Table 4. The locally R-optimal designs on X = [0, 3]^3 for β = (−2.5, 0.5, 0.5)^{\top} in the proportional hazards regression models, the R-efficiencies of the balanced design ξ_b, and the overall probability of censoring under the R-optimal design.

5. Discussion

The present paper investigates the construction of locally R-optimal designs for a large class of nonlinear multiple regression models. For models with intercept, the R-optimal designs on a rectangular design region can be determined by arguments similar to those of Schmidt (2019), although finding the optimal weights differs. We note that the structure of the R-optimal designs is similar to that reported for other criteria in Schmidt (2019), especially regarding the location of the support points. For models without intercept, however, the same method determines saturated R-optimal designs only within the two design subclasses addressed in Section 4.1. Moreover, the PSO algorithm has been used to generate non-saturated R-optimal designs in both cases. The conditions in Theorems 3.1, 4.1 and 4.2 ensure that the support points lie within the design region; if these conditions are not satisfied, the optimal designs may have more support points. In addition, a nonlinear system of equations must be solved numerically in order to obtain the saturated R-optimal designs. Although the existence of a solution to these equations has not been proved theoretically, in all the examples considered the solution was found numerically to exist.

It is worth mentioning that the locally optimal designs discussed so far are derived for a given value of the model parameter vector β. Such a value of β might be chosen from an initial guess or estimated from historical observations when these are available. It is therefore of interest to study how the design is affected by misspecified parameters. For illustration, we consider the locally R-optimal designs for the first-order Poisson regression models with intercept on the design region X = [0, 5]^{p−1} for p = 2, 3, 4. Specifically, we assume that the true parameter vector is β = (0, 1)^{\top} for p = 2, β = (0, 1, 1)^{\top} for p = 3, and β = (0, 1, 1, 1)^{\top} for p = 4. We generate the locally R-optimal designs for various misspecified values of β and calculate their R-efficiencies relative to the locally R-optimal design at the true parameter vector for each p = 2, 3, 4; the results are shown in Table 5. We observe that the efficiency of the locally R-optimal design decreases as the misspecified value of β diverges from the true value, and that the loss of R-efficiency is severe when the difference between the true and the misspecified value is large. Numerical results for other examples yield similar conclusions and are omitted to save space.

Table 5. The R-efficiencies of the locally R-optimal designs on X=[0,5]p1 for various misspecified β for the first-order Poisson regression models with intercept.

To overcome the parameter dependence of the locally optimal design, a commonly used approach is based on weighted (Bayesian-type) criteria (see, e.g., Atkinson et al., 2007, Chap. 18), where a prior distribution for β, either discrete or continuous, is assumed in advance. Another method is the computation of maximin efficient designs, i.e., designs maximizing the minimal efficiency over the parameters (see Dette, 1997; Konstantinou et al., 2014).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

Lei He's work is supported by the National Natural Science Foundation of China [Grant Number 12101013] and the Natural Science Foundation of Anhui Province [Grant Number 2008085QA15]. Rong-Xian Yue's work is supported by the National Natural Science Foundation of China [Grant Numbers 11971318, 11871143].

References

  • Atkinson, A. C., Donev, A. N., & Tobias, R. D. (2007). Optimum experimental designs, with SAS. Oxford University Press.
  • Chernoff, H. (1953). Locally optimal designs for estimating parameters. The Annals of Mathematical Statistics, 24(4), 586–602. https://doi.org/10.1214/aoms/1177728915
  • Dette, H. (1997). Designing experiments with respect to ‘standardized’ optimality criteria. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59(1), 97–110. https://doi.org/10.1111/rssb.1997.59.issue-1
  • Fedorov, V. V. (1972). Theory of optimal experiments. Academic Press.
  • Freise, F., Gaffke, N., & Schwabe, R. (2021). The adaptive Wynn algorithm in generalized linear models with univariate response. The Annals of Statistics, 49(2), 702–722. https://doi.org/10.1214/20-AOS1974
  • Hao, H., Zhu, X., Zhang, X., & Zhang, C. (2021). R-optimal design of the second-order Scheffé mixture model. Statistics & Probability Letters, 173(C), 109069. https://doi.org/10.1016/j.spl.2021.109069
  • Idais, O., & Schwabe, R. (2021). Analytic solutions for locally optimal designs for gamma models having linear predictors without intercept. Metrika, 84(1), 1–26. https://doi.org/10.1007/s00184-019-00760-3
  • Kalish, L. A., & Harrington, D. P. (1988). Efficiency of balanced treatment allocation for survival analysis. Biometrics, 44(3), 815–821. https://doi.org/10.2307/2531593
  • Konstantinou, M., Biedermann, S., & Kimber, A. (2014). Optimal designs for two-parameter nonlinear models with application to survival models. Statistica Sinica, 24(1), 415–428. https://doi.org/10.5705/ss.2011.271
  • Li, K. H., Lau, T. S., & Zhang, C. (2005). A note on D-optimal designs for models with and without an intercept. Statistical Papers, 46(3), 451–458. https://doi.org/10.1007/BF02762844
  • Liu, P., Gao, L. L., & Zhou, J. (2022). R-optimal designs for multi-response regression models with multi-factors. Communications in Statistics – Theory and Methods, 51(2), 340–355. https://doi.org/10.1080/03610926.2020.1748655
  • Liu, X., & Yue, R.-X. (2020). Elfving's theorem for R-optimality of experimental designs. Metrika, 83(4), 485–498. https://doi.org/10.1007/s00184-019-00728-3
  • Pukelsheim, F., & Torsney, B. (1991). Optimal weights for experimental designs on linearly independent support points. The Annals of Statistics, 19(3), 1614–1625. https://doi.org/10.2307/2241966
  • Radloff, M., & Schwabe, R. (2019). Locally D-optimal designs for non-linear models on the k-dimensional ball. Journal of Statistical Planning and Inference, 203, 106–116. https://doi.org/10.1016/j.jspi.2019.03.004
  • Russell, K. G., Woods, D. C., Lewis, S. M., & Eccleston, J. A. (2009). D-optimal designs for Poisson regression models. Statistica Sinica, 19(2), 721–730. https://doi.org/10.2307/24308852
  • Schmidt, D. (2019). Characterization of c-, L- and ϕk-optimal designs for a class of non-linear multiple-regression models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 81(1), 101–120. https://doi.org/10.1111/rssb.12292
  • Schmidt, D., & Schwabe, R. (2015). On optimal designs for censored data. Metrika, 78(3), 237–257. https://doi.org/10.1007/s00184-014-0500-1
  • Schmidt, D., & Schwabe, R. (2017). Optimal design for multiple regression with information driven by the linear predictor. Statistica Sinica, 27(3), 1371–1384. https://doi.org/10.5705/ss.202015.0385
  • Silvey, S. D. (1980). Optimal design. Chapman and Hall.
  • Whittle, P. (1973). Some general points in the theory of optimal experimental designs. Journal of the Royal Statistical Society: Series B (Methodological), 35(1), 123–130. https://doi.org/10.1111/j.2517-6161.1973.tb00944.x
  • Yang, M., Biedermann, S., & Tang, E. (2013). On optimal designs for nonlinear models: A general and efficient algorithm. Journal of the American Statistical Association, 108(504), 1411–1420. https://doi.org/10.1080/01621459.2013.806268

Appendix

Proof of Theorem 2.1

Let $f_{\beta}(x)=\sqrt{Q(f^{\top}(x)\beta)}\,f(x)$. The Fisher information matrix of an approximate design $\xi$ given by (1) can then be written as
\[
M(\xi,\beta)=\int_{\mathcal{X}}f_{\beta}(x)f_{\beta}^{\top}(x)\,\mathrm{d}\xi(x).
\]
For any designs $\xi,\bar{\xi}\in\Xi$ and $\alpha\in(0,1)$, define $\xi_{\alpha}=(1-\alpha)\xi+\alpha\bar{\xi}$. We have $M(\xi_{\alpha},\beta)=(1-\alpha)M(\xi,\beta)+\alpha M(\bar{\xi},\beta)$ and
\[
\frac{\partial\psi(\xi_{\alpha},\beta)}{\partial\alpha}
=\frac{\partial}{\partial\alpha}\Big[\prod_{j=1}^{p}e_{j,p}^{\top}M^{-1}(\xi_{\alpha},\beta)e_{j,p}\Big]
=\sum_{j=1}^{p}\Big(\prod_{i=1,\,i\neq j}^{p}e_{i,p}^{\top}M^{-1}(\xi_{\alpha},\beta)e_{i,p}\Big)\,\frac{\partial}{\partial\alpha}\big(e_{j,p}^{\top}M^{-1}(\xi_{\alpha},\beta)e_{j,p}\big),
\]
where
\[
\frac{\partial}{\partial\alpha}\big(e_{j,p}^{\top}M^{-1}(\xi_{\alpha},\beta)e_{j,p}\big)
=e_{j,p}^{\top}\big[M^{-1}(\xi_{\alpha},\beta)\big(M(\xi,\beta)-M(\bar{\xi},\beta)\big)M^{-1}(\xi_{\alpha},\beta)\big]e_{j,p}.
\]
The directional derivative of $\psi$ at $\xi$ in the direction of $\bar{\xi}$, denoted by $\partial\psi(\xi,\bar{\xi})$, is then given by
\[
\partial\psi(\xi,\bar{\xi})=\lim_{\alpha\to 0^{+}}\frac{\partial\psi(\xi_{\alpha},\beta)}{\partial\alpha}
=\sum_{j=1}^{p}\frac{\psi(\xi)}{e_{j,p}^{\top}M^{-1}(\xi,\beta)e_{j,p}}\,e_{j,p}^{\top}\big[M^{-1}(\xi,\beta)\big(M(\xi,\beta)-M(\bar{\xi},\beta)\big)M^{-1}(\xi,\beta)\big]e_{j,p}
\]
\[
=\psi(\xi)\sum_{j=1}^{p}\Big(1-\frac{e_{j,p}^{\top}M^{-1}(\xi,\beta)M(\bar{\xi},\beta)M^{-1}(\xi,\beta)e_{j,p}}{e_{j,p}^{\top}M^{-1}(\xi,\beta)e_{j,p}}\Big)
=\psi(\xi)\Big(p-\operatorname{tr}\Big\{M^{-1}(\xi,\beta)M(\bar{\xi},\beta)M^{-1}(\xi,\beta)\sum_{j=1}^{p}\frac{e_{j,p}e_{j,p}^{\top}}{e_{j,p}^{\top}M^{-1}(\xi,\beta)e_{j,p}}\Big\}\Big).
\]
Note that the directional derivative $\partial\psi(\xi,\bar{\xi})$ is linear in $\bar{\xi}$ for any fixed $\xi\in\Xi$, i.e.,
\[
\partial\psi(\xi,\bar{\xi})=\int_{\mathcal{X}}\partial\psi(\xi,\delta_{x})\,\mathrm{d}\bar{\xi}(x),
\]
where $\delta_{x}$ is the Dirac measure at the point $x$. Following Whittle (1973), the design $\xi^{*}\in\Xi$ is R-optimal if and only if $\inf_{x\in\mathcal{X}}\partial\psi(\xi^{*},\delta_{x})=0$. The assertion of Theorem 2.1 then follows.
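The trace formula for the directional derivative can be checked numerically by comparing it with a finite-difference quotient of $\psi$ along $\xi_{\alpha}=(1-\alpha)\xi+\alpha\delta_{x}$. The sketch below assumes the Poisson intensity Q(t) = exp(t) and a single-factor model with f(x) = (1, x)^⊤; the design and the point x in the test are arbitrary.

```python
import numpy as np

def info_matrix(points, weights, beta):
    # M(xi, beta) = sum_k w_k exp(f_k' beta) f_k f_k', with f(x) = (1, x)'
    M = np.zeros((len(beta), len(beta)))
    for x, w in zip(points, weights):
        f = np.array([1.0, x])
        M += w * np.exp(f @ beta) * np.outer(f, f)
    return M

def psi(points, weights, beta):
    # R-criterion: product of the diagonal entries of M^{-1}
    return float(np.prod(np.diag(np.linalg.inv(info_matrix(points, weights, beta)))))

def dir_deriv(points, weights, beta, x):
    """Closed form psi(xi) * (p - tr{M^{-1} M(delta_x) M^{-1} W}), where
    W = sum_j e_j e_j' / (e_j' M^{-1} e_j) = diag(1 / diag(M^{-1}))."""
    Minv = np.linalg.inv(info_matrix(points, weights, beta))
    p = len(beta)
    f = np.array([1.0, x])
    M_bar = np.exp(f @ beta) * np.outer(f, f)  # information matrix of the Dirac design
    W = np.diag(1.0 / np.diag(Minv))
    return psi(points, weights, beta) * (p - np.trace(Minv @ M_bar @ Minv @ W))
```

At an R-optimal design the directional derivative is nonnegative for every x, with infimum 0 attained at the support points, which is exactly the equivalence condition above.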

Proof of Theorem 3.2

From Theorem 3.1, the locally R-optimal design on $\mathcal{X}=[0,\infty)^{p-1}$ for $\beta_i<0$, $i=1,\dots,p-1$, has the form
\[
\xi^{*}=\begin{pmatrix} (x_1^{*}/\beta_1)e_{1,p-1} & \cdots & (x_{p-1}^{*}/\beta_{p-1})e_{p-1,p-1} & 0_{p-1}\\ \omega_1^{*} & \cdots & \omega_{p-1}^{*} & \omega_p^{*}\end{pmatrix},\tag{A1}
\]
where $0_{p-1}$ is the $(p-1)$-dimensional zero vector, and the $x_i^{*}$ and $\omega_i^{*}$, $i=1,\dots,p-1$, are determined by the system of Equations (4) and (6). According to the support points of the design (A1), denote $Q_i=Q(\beta_0+x_i^{*})$ for $i=1,\dots,p-1$ and $Q_p=Q(\beta_0)$. Employing the decomposition approach mentioned previously and letting $y=(\beta_1/x_1^{*},\dots,\beta_{p-1}/x_{p-1}^{*})^{\top}$ and $\Lambda=\operatorname{diag}\big((Q_p\omega_p^{*})^{-1},(Q_1\omega_1^{*})^{-1},\dots,(Q_{p-1}\omega_{p-1}^{*})^{-1}\big)$, we obtain the inverse of the information matrix of $\xi^{*}$:
\[
M(\xi^{*},\beta)^{-1}=X^{-1}Q^{-1/2}D_{\omega}^{-1}Q^{-1/2}(X^{\top})^{-1}
=\begin{pmatrix} 1 & 0_{p-1}^{\top}\\ -y & \operatorname{diag}(y)\end{pmatrix}\Lambda\begin{pmatrix} 1 & -y^{\top}\\ 0_{p-1} & \operatorname{diag}(y)\end{pmatrix}
=\begin{pmatrix} \frac{1}{Q_p\omega_p^{*}} & -\frac{1}{Q_p\omega_p^{*}}\,y^{\top}\\[2pt] -\frac{1}{Q_p\omega_p^{*}}\,y & \frac{1}{Q_p\omega_p^{*}}\,yy^{\top}+\operatorname{diag}\big(y_1^{2}/(Q_1\omega_1^{*}),\dots,y_{p-1}^{2}/(Q_{p-1}\omega_{p-1}^{*})\big)\end{pmatrix},
\]
whose lower-right block has main diagonal $\tilde{y}=\big(\beta_1^{2}\tilde{\lambda}_{1p}/x_1^{*2},\dots,\beta_{p-1}^{2}\tilde{\lambda}_{p-1,p}/x_{p-1}^{*2}\big)$ with $\tilde{\lambda}_{ip}=(Q_i\omega_i^{*})^{-1}+(Q_p\omega_p^{*})^{-1}$, $i=1,\dots,p-1$. Then the matrix $S$ defined in (5) is given by
\[
S=\begin{pmatrix}
\omega_p^{*}+\dfrac{1}{Q_p}\displaystyle\sum_{i=1}^{p-1}\dfrac{1}{\tilde{\lambda}_{ip}} & -\dfrac{1}{\sqrt{Q_1Q_p}\,\tilde{\lambda}_{1p}} & \cdots & -\dfrac{1}{\sqrt{Q_{p-1}Q_p}\,\tilde{\lambda}_{p-1,p}}\\[4pt]
-\dfrac{1}{\sqrt{Q_1Q_p}\,\tilde{\lambda}_{1p}} & \dfrac{1}{Q_1\tilde{\lambda}_{1p}} & & 0\\
\vdots & & \ddots & \\
-\dfrac{1}{\sqrt{Q_{p-1}Q_p}\,\tilde{\lambda}_{p-1,p}} & 0 & & \dfrac{1}{Q_{p-1}\tilde{\lambda}_{p-1,p}}
\end{pmatrix},
\]
which is free of the parameters $(\beta_1,\dots,\beta_{p-1})$. Hence, the solutions $x_i^{*}$ and $\omega_i^{*}$, $i=1,\dots,p-1$, do not depend on the parameters $(\beta_1,\dots,\beta_{p-1})$.
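The closed form of $M(\xi^{*},\beta)^{-1}$ above can be sanity-checked numerically for an illustrative design of the form (A1), assuming the Poisson intensity Q(t) = exp(t); the parameter values, support points and weights below are arbitrary (with $\beta_i<0$), not solutions of Equations (4) and (6).

```python
import numpy as np

# illustrative saturated design of form (A1): support (x_i*/beta_i) e_i plus the origin
beta0, beta = 0.5, np.array([-1.0, -2.0])   # beta_i < 0, so p = 3
xstar = np.array([-1.0, -2.0])              # chosen so that x_i*/beta_i >= 0
omega = np.array([0.3, 0.3, 0.4])           # weights (omega_1*, omega_2*, omega_p*)

p = len(beta) + 1
M = np.zeros((p, p))
for i in range(p - 1):
    # support point (x_i*/beta_i) e_i has linear predictor beta_0 + x_i*
    f = np.concatenate(([1.0], (xstar[i] / beta[i]) * np.eye(p - 1)[i]))
    M += omega[i] * np.exp(beta0 + xstar[i]) * np.outer(f, f)
f0 = np.concatenate(([1.0], np.zeros(p - 1)))   # the origin, predictor beta_0
M += omega[-1] * np.exp(beta0) * np.outer(f0, f0)

Minv = np.linalg.inv(M)
Q = np.exp(beta0 + xstar)                   # Q_1, ..., Q_{p-1}
Qp = np.exp(beta0)                          # Q_p
lam = 1.0 / (Q * omega[:-1]) + 1.0 / (Qp * omega[-1])   # lambda-tilde_{ip}
ytilde = beta**2 * lam / xstar**2           # claimed trailing diagonal of M^{-1}
```

The first diagonal entry of `Minv` should equal $1/(Q_p\omega_p^{*})$, the remaining diagonal entries should equal $\tilde{y}$, and the first row should equal $-y^{\top}/(Q_p\omega_p^{*})$.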

In order to prove Theorems 4.1 and 4.2 below, the extended design region with $x_i\in(-\infty,v_i]$ for $\beta_i>0$ and $x_i\in[u_i,\infty)$ for $\beta_i<0$ will be considered.

Proof of Theorem 4.1

We prove only the case $\tau=1$; the other cases can be treated similarly. With the decomposition strategy used previously, the information matrix of $\xi_1^{*}$ can be written as
\[
M(\xi_1^{*},\beta)=\tilde{X}^{\top}D_{\omega}\tilde{X}=X^{\top}Q^{1/2}D_{\omega}Q^{1/2}X,
\quad\text{where}\quad
X=\begin{pmatrix} a_1 & a_{-1}^{\top}\\ a_1 1_{p-1} & 1_{p-1}a_{-1}^{\top}+\operatorname{diag}(x_{-1}^{*})\end{pmatrix},
\]
with $a_{-1}=(a_2,\dots,a_p)^{\top}$, $1_{p-1}=(1,\dots,1)^{\top}$ and $x_{-1}^{*}=(x_2^{*}/\beta_2,\dots,x_p^{*}/\beta_p)^{\top}$. Denote $y_{-1}=(\beta_2/x_2^{*},\dots,\beta_p/x_p^{*})^{\top}$ and $z_{-1}=(a_2\beta_2/x_2^{*},\dots,a_p\beta_p/x_p^{*})^{\top}$; then
\[
X^{-1}=\begin{pmatrix} \dfrac{1+z_{-1}^{\top}1_{p-1}}{a_1} & -\dfrac{1}{a_1}z_{-1}^{\top}\\[4pt] -y_{-1} & \operatorname{diag}(y_{-1})\end{pmatrix}.
\]
It follows that $(X^{\top})^{-1}a=e_{1,p}$ and $(X^{\top})^{-1}\big(a+(x_i^{*}/\beta_i)e_{i,p}\big)=e_{i,p}$, $i=2,\dots,p$.

Now write $\ell_i(x_i)=\ell_i(x_i,\xi_1^{*})$ and $h_i(x_i)=h_i(x_i,\xi_1^{*})$, as defined in (7), to simplify notation. Using the formulas (4) and (5) we obtain
\[
\ell_i(a_i)=e_{1,p}^{\top}Q^{-1/2}D_{\omega}^{-1}SD_{\omega}^{-1}Q^{-1/2}e_{1,p}=\frac{s_{11}}{\omega_1^{*2}Q(a^{\top}\beta)}=\frac{p}{Q(a^{\top}\beta)}=\ell_1(a_1),
\]
and $\ell_i(a_i+x_i^{*}/\beta_i)=p/Q(a^{\top}\beta+x_i^{*})$ for $i=2,\dots,p$. Thus we have $h_1(a_1)=h_i(a_i)=h_i(a_i+x_i^{*}/\beta_i)=0$ for $i=2,\dots,p$. As in the proof of Theorem 9 in Schmidt (2019), it is easily shown that $h_i(x_i)$ has at most two roots, with $\lim_{x_i\to\pm\infty}h_i(x_i)=\pm\infty$ for $\beta_i>0$ and $\lim_{x_i\to\pm\infty}h_i(x_i)=\mp\infty$ for $\beta_i<0$. To prove the R-optimality of $\xi_1^{*}$, it therefore suffices to show that $h_i'(a_i+x_i^{*}/\beta_i)=0$ holds, since $h_1(x_1)<0$ for $x_1\in[u_1,v_1]\setminus\{a_1\}$ and $h_i(a_i)=0$ for all $i=2,\dots,p$. We have
\[
\ell_i'(x)=2e_{i,p}^{\top}M^{-1}(\xi_1^{*},\beta)\Big(\sum_{j=1}^{p}\frac{e_{j,p}e_{j,p}^{\top}}{e_{j,p}^{\top}M^{-1}(\xi_1^{*},\beta)e_{j,p}}\Big)M^{-1}(\xi_1^{*},\beta)\big(a+(x-a_i)e_{i,p}\big).
\]
For $i=2,\dots,p$, utilizing $e_{i,p}^{\top}X^{-1}=(\beta_i/x_i^{*})\big(-1,e_{i-1,p-1}^{\top}\big)$ we obtain
\[
\ell_i'\Big(a_i+\frac{x_i^{*}}{\beta_i}\Big)=\frac{2\beta_i}{x_i^{*}}\big(-1,e_{i-1,p-1}^{\top}\big)Q^{-1/2}D_{\omega}^{-1}SD_{\omega}^{-1}Q^{-1/2}e_{i,p}
=-\frac{2(\beta_i/x_i^{*})\,s_{1i}}{\omega_1^{*}\omega_i^{*}\sqrt{Q(a^{\top}\beta)Q(a^{\top}\beta+x_i^{*})}}+\frac{2(\beta_i/x_i^{*})\,s_{ii}}{\omega_i^{*2}Q(a^{\top}\beta+x_i^{*})}.
\]
Hence
\[
h_i'\Big(a_i+\frac{x_i^{*}}{\beta_i}\Big)=p\Big[-\frac{2(\beta_i/x_i^{*})}{\sqrt{Q(a^{\top}\beta)Q(a^{\top}\beta+x_i^{*})}}\,\frac{s_{1i}}{\sqrt{s_{11}s_{ii}}}+\frac{2(\beta_i/x_i^{*})}{Q(a^{\top}\beta+x_i^{*})}+\frac{\beta_i\,Q'(a^{\top}\beta+x_i^{*})}{\big(Q(a^{\top}\beta+x_i^{*})\big)^{2}}\Big].
\]
The equations $h_i'(a_i+x_i^{*}/\beta_i)=0$ are equivalent to the equations
\[
x_i^{*}+\frac{2Q(a^{\top}\beta+x_i^{*})}{Q'(a^{\top}\beta+x_i^{*})}\Big[1-\sqrt{\frac{Q(a^{\top}\beta+x_i^{*})}{Q(a^{\top}\beta)}}\,\frac{s_{1i}}{\sqrt{s_{11}s_{ii}}}\Big]=0
\]
for $i=2,\dots,p$. If the $x_i^{*}$ and $\omega_i^{*}$ are the solutions of this system of equations combined with Equation (4), then the design $\xi_1^{*}$ is R-optimal.
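The partitioned form of $X^{-1}$ used in this proof is pure matrix algebra and can be verified numerically. The values of $a$, $\beta$ and $x^{*}$ below are arbitrary illustrative choices; nothing about the model is assumed.

```python
import numpy as np

p = 3
a = np.array([0.7, -0.4, 1.2])        # a = (a_1, a_{-1}')'
beta = np.array([1.0, -2.0, 0.5])     # only beta_2, ..., beta_p enter X
xstar = np.array([0.8, -1.5])         # x_2*, ..., x_p*
xm = xstar / beta[1:]                 # x_{-1}* = (x_2*/beta_2, ..., x_p*/beta_p)

# X has rows a' and a' + (x_i*/beta_i) e_i' for i = 2, ..., p
X = np.vstack([a] + [a + xm[i] * np.eye(p)[i + 1] for i in range(p - 1)])

# claimed inverse: [[(1 + z'1)/a_1, -z'/a_1], [-y, diag(y)]]
y = beta[1:] / xstar                  # y_{-1}
z = a[1:] * y                         # z_{-1}
Xinv = np.zeros((p, p))
Xinv[0, 0] = (1.0 + z.sum()) / a[0]
Xinv[0, 1:] = -z / a[0]
Xinv[1:, 0] = -y
Xinv[1:, 1:] = np.diag(y)
```

Multiplying `X` by `Xinv` recovers the identity matrix, confirming the partitioned inverse.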

Proof of Theorem 4.2

With the same decomposition method we have
\[
M(\xi^{*},\beta)=\tilde{X}^{\top}D_{\omega}\tilde{X}=X^{\top}Q^{1/2}D_{\omega}Q^{1/2}X,
\quad\text{where}\quad
X=1_p a^{\top}+\operatorname{diag}(x^{*})
\]
and $x^{*}=(x_1^{*}/\beta_1,\dots,x_p^{*}/\beta_p)^{\top}$. Let $y=(\beta_1/x_1^{*},\dots,\beta_p/x_p^{*})^{\top}$ and $z=(a_1\beta_1/x_1^{*},\dots,a_p\beta_p/x_p^{*})^{\top}$; then we get
\[
X^{-1}=\operatorname{diag}(y)-\frac{yz^{\top}}{1+z^{\top}1_p}
\]
and $(X^{\top})^{-1}\big(a+(x_i^{*}/\beta_i)e_{i,p}\big)=e_{i,p}$. We again write $\ell_i(x_i)=\ell_i(x_i,\xi^{*})$ and $h_i(x_i)=h_i(x_i,\xi^{*})$ to simplify notation. It follows from the formulas (4) and (5) that
\[
\ell_i(a_i+x_i^{*}/\beta_i)=e_{i,p}^{\top}Q^{-1/2}D_{\omega}^{-1}SD_{\omega}^{-1}Q^{-1/2}e_{i,p}=\frac{s_{ii}}{\omega_i^{*2}Q(a^{\top}\beta+x_i^{*})}=\frac{p}{Q(a^{\top}\beta+x_i^{*})}.
\]
Thus we have $h_i(a_i+x_i^{*}/\beta_i)=0$ for $i=1,\dots,p$. The function $h_i(x_i)$ has at most two roots, with $\lim_{x_i\to\pm\infty}h_i(x_i)=\pm\infty$ for $\beta_i>0$ and $\lim_{x_i\to\pm\infty}h_i(x_i)=\mp\infty$ for $\beta_i<0$. To prove the R-optimality of $\xi^{*}$, it therefore suffices to show that $h_i'(a_i+x_i^{*}/\beta_i)=0$ holds, since $h_i(a_i)<0$ for all $i=1,\dots,p$. We have
\[
\ell_i'(x)=2e_{i,p}^{\top}M^{-1}(\xi^{*},\beta)\Big(\sum_{j=1}^{p}\frac{e_{j,p}e_{j,p}^{\top}}{e_{j,p}^{\top}M^{-1}(\xi^{*},\beta)e_{j,p}}\Big)M^{-1}(\xi^{*},\beta)\big(a+(x-a_i)e_{i,p}\big).
\]
With $e_{i,p}^{\top}X^{-1}=(\beta_i/x_i^{*})\big(e_{i,p}^{\top}-z^{\top}/(1+z^{\top}1_p)\big)$ we have
\[
\ell_i'\Big(a_i+\frac{x_i^{*}}{\beta_i}\Big)=\frac{2\beta_i}{x_i^{*}}\Big(e_{i,p}^{\top}-\frac{z^{\top}}{1+z^{\top}1_p}\Big)Q^{-1/2}D_{\omega}^{-1}SD_{\omega}^{-1}Q^{-1/2}e_{i,p}
=\frac{2(\beta_i/x_i^{*})\,s_{ii}}{\omega_i^{*2}Q(a^{\top}\beta+x_i^{*})}-\frac{2(\beta_i/x_i^{*})}{(1+z^{\top}1_p)\sqrt{Q(a^{\top}\beta+x_i^{*})}}\sum_{j=1}^{p}\frac{z_j\,s_{ij}}{\omega_i^{*}\omega_j^{*}\sqrt{Q(a^{\top}\beta+x_j^{*})}}.
\]
Hence
\[
h_i'\Big(a_i+\frac{x_i^{*}}{\beta_i}\Big)=p\Big[\frac{2(\beta_i/x_i^{*})}{Q(a^{\top}\beta+x_i^{*})}-\frac{2(\beta_i/x_i^{*})}{(1+z^{\top}1_p)\sqrt{Q(a^{\top}\beta+x_i^{*})}}\sum_{j=1}^{p}\frac{z_j}{\sqrt{Q(a^{\top}\beta+x_j^{*})}}\,\frac{s_{ij}}{\sqrt{s_{ii}s_{jj}}}+\frac{\beta_i\,Q'(a^{\top}\beta+x_i^{*})}{\big(Q(a^{\top}\beta+x_i^{*})\big)^{2}}\Big].
\]
The equations $h_i'(a_i+x_i^{*}/\beta_i)=0$, $i=1,\dots,p$, are equivalent to
\[
x_i^{*}+\frac{2Q(a^{\top}\beta+x_i^{*})}{Q'(a^{\top}\beta+x_i^{*})}\Big[1-\frac{1}{1+z^{\top}1_p}\sum_{j=1}^{p}z_j\sqrt{\frac{Q(a^{\top}\beta+x_i^{*})}{Q(a^{\top}\beta+x_j^{*})}}\,\frac{s_{ij}}{\sqrt{s_{ii}s_{jj}}}\Big]=0,\quad i=1,\dots,p.
\]
If the $x_i^{*}$ and $\omega_i^{*}$ are the solutions of this system of equations combined with Equation (4), then the design $\xi^{*}$ is R-optimal.
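The inverse $X^{-1}=\operatorname{diag}(y)-yz^{\top}/(1+z^{\top}1_p)$ used in this proof is the Sherman-Morrison formula applied to the rank-one update $X=1_p a^{\top}+\operatorname{diag}(x^{*})$, and can be checked numerically; the values of $a$, $\beta$ and $x^{*}$ below are arbitrary illustrative choices.

```python
import numpy as np

p = 3
a = np.array([0.7, -0.4, 1.2])
beta = np.array([1.0, -2.0, 0.5])
xstar = np.array([0.8, -1.5, 2.0])

# X = 1_p a' + diag(x*) with x* = (x_1*/beta_1, ..., x_p*/beta_p)
X = np.outer(np.ones(p), a) + np.diag(xstar / beta)

# Sherman-Morrison: X^{-1} = diag(y) - y z' / (1 + z'1_p)
y = beta / xstar
z = a * y
Xinv = np.diag(y) - np.outer(y, z) / (1.0 + z.sum())
```

The formula is valid whenever $1+z^{\top}1_p\neq 0$, i.e., whenever $X$ is nonsingular given a nonsingular diagonal part.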