Publication Cover
Journal of Quality Technology
A Quarterly Journal of Methods, Applications and Related Topics
Volume 55, 2023 - Issue 3
2,094
Views
2
CrossRef citations to date
0
Altmetric
Articles

Analyzing dispersion effects from replicated order-of-addition experiments

ORCID Icon

Abstract

Dispersion effects may play a vital role, in addition to location effects, in exploring optimal addition orders of several materials in some chemical, industrial and pharmaceutical studies. Two replication-based statistical methods developed using frequentist and fiducial probability arguments are introduced in this paper to identify active dispersion effects from replicated order-of-addition experiments. Simulation results show that both approaches can maintain empirical sizes sufficiently close to the nominal level while their finite-sample performances are very similar. From a statistical perspective, the fiducial method can provide a unified probability framework to analyze dispersion effects as well as location effects. However, it is computationally more expensive than the frequentist method. Consequently, the frequentist method is recommended for real-world applications due to its low computational cost. A drug combination study is used to illustrate these two approaches. In addition, some eligible order-of-addition designs are collected in a catalogue for future work.

1. Introduction

Physical and/or chemical properties of process outputs can be very different when materials are added in different orders in some chemical, industrial and pharmaceutical studies. To explore the optimal addition order, researchers may conduct several trials in which the materials are deliberately added in different orders. Often, these trials are called order-of-addition experiments, and the materials of interest are called the components. After analyzing the observed data, active order effects can be first identified to study how the process output depends on the addition order of these components. Next, based on the identified effects, an empirical model can be built to predict the process outputs of those untested addition orders for the purpose of determining the optimal addition order. Efficient experimental designs and analysis methods for order-of-addition experiments have received considerable attention in recent years. Van Nostrand (Citation1995) first established the relationship between order-of-addition designs and two-level factorial designs using the concept of pairwise order factors. Based on this key concept, Voelkel (Citation2019) defined order-of-addition orthogonal arrays for designing such experiments. By definition, a pairwise order matrix, whose entries are either +1 or −1 determined by all component addition orders, is called an order-of-addition orthogonal array of strength two if, for any two columns, the frequencies of the ordered pairs (1,1), (+1,1), (1,+1) and (+1,+1) are proportional to those of the full design. Furthermore, Peng, Mukerjee, and Lin (Citation2019) developed several theoretical results to characterize and construct optimal order-of-addition designs. Most remarkably, they showed that an order-of-addition design is optimal for estimating the overall mean and all main effects of pairwise order factors with respect to a wide range of optimality criteria if its pairwise order matrix is an order-of-addition orthogonal array of strength two. Later, many combinatorial and computational methods were used to construct optimal or near-optimal order-of-addition designs. Recent proposals include those by Chen, Mukerjee, and Lin (Citation2020), Winker, Chen, and Lin (Citation2020), Zhao, Lin, and Liu (Citation2021), Tsai (Citation2022), Wang and Mee (Citation2022) and Zhao, Lin, and Liu (Citation2022). The reader is referred to Lin and Peng (Citation2019) for a comprehensive review of order-of-addition experiments.

Conventionally, active effects can be classified into two types. If an effect has a significant impact on the response mean, then it is called an active location effect. After characterizing the relationship between response mean and all active location effects, the expected process output can be adjusted by a practitioner toward a target value to address a nominal-the-best problem. Similarly, a practitioner can determine a setting such that the expected process output will be as large/small as possible to address a larger-/smaller-the-better problem. On the other hand, an effect is said to be an active dispersion effect if it has a significant impact on the response variance. To address a nominal-the-best problem, after relating the response variance to all active dispersion effects, a practitioner can reduce process variability around a target value. Those null effects can be set to their most economical levels for the purpose of reducing the cost. With these two kinds of important effects in hand, a practitioner can optimize the process to address a nominal-the-best problem, that is, the expected process output is arbitrarily close to a target value with minimum run-to-run variability. By modeling the response mean and response variance as linear functions of pairwise order factors, Chen, Zhang, and Lin (Citation2021) developed a systematic procedure to determine the optimal addition order in the presence of active dispersion effects. However, it is not clear how to identify these active dispersion effects. As far as I know, statistical methods for identifying active dispersion effects from order-of-addition experiments have not yet been studied systematically. The following is a brief review of some existing methods for factorial experiments. Box and Meyer (Citation1986) introduced a residual-based test to identify active dispersion effects using unreplicated two-level factorial designs. Bergman and Hynén (Citation1997) and McGrath and Lin (Citation2001a) proposed further refinements. Pan (Citation1999) and McGrath and Lin (Citation2001b) noted that location effects could be confounded with dispersion effects in unreplicated experiments. This problem can be readily solved by conducting a replicated experiment. Nair and Pregibon (1988) compared three different statistical methods for estimating dispersion effects using replicated two-level factorial designs. Nelder and Lee (Citation1991) and Lee and Nelder (Citation1998) used different mean and variance models to achieve the same goal. Additionally, related inference procedures for these generalized linear models have been developed using asymptotic theory. Consequently, their performance can be unsatisfactory when the sample size is relatively small. Based on the concept of fiducial generalized pivotal quantities, Tsai, Liao, and Chai (Citation2015) proposed a replication-based method to identify active dispersion effects from fully replicated and partially replicated two-level factorial experiments. A necessary prerequisite for applying their approach is that the experimental design must be a two-level orthogonal array of strength two. Specifically, a matrix, whose entries are either +1 or −1, is called a two-level orthogonal array of strength two if, for any two columns, the ordered pairs (1,1), (+1,1), (1,+1) and (+1,+1) occur equally often. Orthogonal arrays are important combinatorial tools that have been widely used to design multifactorial experiments. The reader is referred to Hedayat, Sloane, and Stufken (Citation1999) for a comprehensive introduction to orthogonal arrays and their applications. However, when pairwise order factors are used to decode component addition orders into real-valued levels, the pairwise order matrix cannot be column-orthogonal, that is, it cannot be a two-level orthogonal array of strength two. As a result, the fiducial method developed by Tsai, Liao, and Chai (Citation2015) cannot be applied directly. This paper aims to provide feasible solutions to identify active dispersion effects from replicated order-of-addition experiments.

The remainder of this paper is organized as follows. Section 2 presents the pairwise order mean and variance models. Section 3 introduces two replication-based statistical methods to test whether or not a dispersion effect has a significant impact on the response variance. These two methods are evaluated in Section 4 via a series of simulation studies. Section 5 uses a drug combination study to show that active dispersion effects may exist in real-world order-of-addition experiments. Some eligible order-of-addition designs are provided in Section 6 for future work. Concluding remarks are given in the final section.

2. Pairwise order mean and variance models

Suppose that an n-treatment order-of-addition experiment is conducted to study m components of interest. Every treatment has r replicates, so the total run size is equal to N=n×r. Specifically, a treatment can be represented by a permutation of the component labels 1,2,,m. To implement the hth treatment th, a practitioner must add these m components sequentially according to the order specified by th. Assume that the responses Yh1,Yh2,,Yhr are independent and identically distributed normal random variables, denoted by Yh1,Yh2,,YhriidN(μh,σh2). The observations of different treatments are also assumed to be independent. To characterize the potential impact of location effects on the response mean, assume that [1] μh=γ0+i=1m1j=i+1mγijzh,ij,[1] where γ0 represents the overall mean, γij denotes the pairwise order location effect of components i and j, and zh,ij represents the pairwise order factor given by zh,ij={+1 if i precedes j in th;1 if j precedes i in th. The concept of pairwise order factors was proposed by Van Nostrand (Citation1995). Based on this key concept, the pairwise order mean model in [1] was first used by Voelkel (Citation2019) to construct order-of-addition orthogonal arrays. Later, Peng, Mukerjee, and Lin (Citation2019) proposed further generalizations of this model to take component distances into account. In fact, pairwise order factors are pseudo factors which are designed for the purpose of decoding component addition orders into real-valued levels. This intuitive and simple decoding scheme can be particularly useful for identifying active order effects. Given the number of components m, there are q=m(m1)/2 pairwise order factors in [1], with the result that the pairwise order mean model has p=q+1 parameters in total. Suppose that the pairwise order location effect γij is positive. The expected response μh may increase by γij if component i is added before component j in th. Similarly, the expected response μh may decrease by γij if component i is added after component j in th. Because all parameters have straightforward interpretations, this model is often used in practice. Some real-world applications can be found in the papers by Voelkel and Gallagher (Citation2019) and Mee (Citation2020).

To establish the relationship between response variance and pairwise order factors, assume that [2] σh2=δ0i=1m1j=i+1mδijzh,ij2,[2] where δ0 represents the baseline variance, and δij denotes the pairwise order dispersion effect of components i and j. Accordingly, if some pairwise order dispersion effects are not equal to one, then the response variance would be treatment-dependent. Suppose that δij=4 and the remaining pairwise order dispersion effects are all equal to one. The response variance σh2 would be twice as large as the baseline variance δ0 if component i is added before component j in th. Similarly, if component i is added after component j in th, then the response variance σh2 would be half as large as the baseline variance δ0. The pairwise order variance model in [2] also provides straightforward interpretations of all parameters. A similar multiplicative model was used by McGrath and Lin (Citation2001a) and Tsai, Liao, and Chai (Citation2015) to analyze two-level factorial experiments. In particular, the pairwise order mean and variance models in [1] and [2] are both main effects models. They take only the main effects of pairwise order factors into account. In practice, incorporating interaction effects into these main effects models may yield better fitting results. This analysis strategy will be briefly discussed in the final section. In addition to statistical models, another key to the success of identifying active effects is the experimental design. A well-designed experiment enables us to extract as much information as possible from the observed data. Related design issues will be addressed in Section 6.

3. Two replication-based statistical methods

Two replication-based statistical methods developed using different probability arguments are introduced in this section to identify active dispersion effects from replicated order-of-addition experiments.

3.1. Frequentist method

Let Wh denote the hth logarithmic sample variance given by Wh=logSh2, where Sh2 represents the sample variance of all observations of the hth treatment th. Based on the first-order approximation, the logarithmic sample variance Wh=logSh2 has approximately mean logσh2. Furthermore, based on Bartlett and Kendall (Citation1946) approximation, the random variable Wh approximately follows a normal distribution, denoted by [3] WhN(logσh2,ψ(r12)),[3] where ψ(·) represents the trigamma function. Note that the R function trigamma can be used to compute the trigamma function. A similar approximation to [3] can be found in (4.29) of Wu and Hamada (Citation2021) in which the variance of Wh is approximated by 2/(r1). Taking the logarithm of the multiplicative model in [2] yields [4] logσh2=α0+i=1m1j=i+1mαijzh,ij,[4] where α0=logδ0 and αij=(logδij)/2. The logarithmic transformation conducted in [4] is a frequently used approach to induce additivity and variance stabilization. Let W represent the n×1 vector consisting of all logarithmic sample variances W1,W2,,Wn. Based on the normal approximation in [3] and additive model in [4], the distribution of W can be approximated using a multivariate normal distribution, denoted by WN(,ψ(r12)I), where X represents the n × p model matrix, α denotes the p×1 vector consisting of all parameters on the right-hand-side of [4], and I represents the identity matrix of order n. Specifically, the first column of X is an n×1 vector of ones, and the last q columns of X constitute the n × q pairwise order matrix. Accordingly, the least squares estimator of α can be expressed in the following matrix form: α̂=(XX)1XW, where its variance-covariance matrix has the form Var(α̂)=(XX)1ψ(r12). Given a pairwise order dispersion effect δst, where 1s<tm, because the null and alternative hypotheses given by H0:δst=1 and H1:δst1 can be reformulated as H0:αst=0andH1:αst0, the frequentist test variable given by Zαst=α̂stcst,stψ(r12) can be used to evaluate whether or not αst is significantly different from zero, where α̂st denotes the least squares estimator of αst, and cst,st represents the diagonal entry of (XX)1 corresponding to α̂st. The distribution of Zαst can be approximated by the standard normal distribution. According to Theorem 2 of Peng, Mukerjee, and Lin (Citation2019), when the pairwise order matrix is an order-of-addition orthogonal array of strength two, the information matrix has the form XX=[n00nI+n3V]. Specifically, 0 is the q×1 zero vector, I is the identity matrix of order q, and V is a q × q symmetric matrix with rows and columns indexed by ij and kl, where 1i<jm, 1k<lm, and an entry of V is vij,kl={+1if i=k,jl or ik,j=l;1if i=l or j=k;0otherwise. Furthermore, it is straightforward to verify that the inverse of information matrix has the form (XX)1=[1n003(m1)n(m+1)I3n(m+1)V]. Therefore, the constant term cst,st can be explicitly characterized, so the frequentist test variable Zαst can be further simplified as [5] Zαst=α̂st3(m1)n(m+1)ψ(r12).[5] Sometimes, a research question can be addressed by testing whether or not the baseline variance δ0 is equal to a particular constant. To achieve this goal, a frequentist test variable for α0=logδ0 is provided in Appendix A. Bartlett and Kendall (Citation1946) noted that the normal approximation in [3] would be inaccurate if the number of replicates r is smaller than five. However, it is straightforward to see that α̂st is a linear combination of all logarithmic sample variances W1,W2,,Wn. The normal approximation for the distribution of Zαst in [5] can be further strengthened by a central limit theorem-like argument. Further details are shown in Appendix B. Based on the simulation results in Section 4, this approximation often yields satisfactory results. Most importantly, the frequentist test variable Zαst in [5] is easy to compute.

3.2. Fiducial method

Let Y denote a random sample obtained from a distribution indexed by a parameter vector ξ, where a realization of Y is denoted by y. Suppose that a research question can be addressed by estimating and/or testing a complicated function of ξ, denoted by θ=g(ξ).

Definition 1.

A random quantity Rθ=R(Y,y,ξ) is called a fiducial generalized pivotal quantity, abbreviated as FGPQ hereinafter, for θ if it satisfies the following two conditions.

  1. The distribution of Rθ is free of all unknown parameters.

  2. R(y,y,ξ)=θ for every allowable y.

Given an FGPQ Rθ, a fiducial test variable given by Tθ=Rθθ or Tθ=Rθ/θ can be used to evaluate hypotheses about θ. The concept of FGPQs has been widely used to develop inference procedures for complicated functions of parameters. Recent applications include those by Wang and Wu (Citation2018), Al-Sarraj, von Brömssen, and Forkman (Citation2019), Malekzadeh and Kharrati-Kopaei (Citation2020), and Wang et al. (Citation2021). The reader is referred to Hannig, Iyer, and Patterson (Citation2006) for a comprehensive introduction to FGPQs. Before presenting the fiducial method for testing whether or not δst is significantly different from one, the replication-based statistical method proposed by Tsai, Liao, and Chai (Citation2015) is introduced first. Let Pst={h:zh,st=+1}, and Nst={h:zh,st=1}, where P represents “positive” and N represents “negative.” The numbers of elements in Pst and Nst are denoted by pst and nst, respectively. These two index sets Pst and Nst can be used to classify the n sample variances into two groups according to the levels of pairwise order factor zst. Under the assumption of normality, when the pairwise order variance model in [2] is used, one has [6] Vh=(r1)Sh2σh2=(r1)Sh2δ0i=1m1j=i+1mδijzh,ij2χr12.[6] Tsai, Liao, and Chai (Citation2015) verified that the random quantity [7] Rδst=[hPst(r1)sh2Vh]1pst[hNst(r1)sh2Vh]1nst[7] would be an FGPQ for δst if [8] hPstzh,ij=hNstzh,ij=0[8] for every ijst. Based on the FGPQ Rδst, the fiducial test variable given by [9] Tδst=Rδstδst[9] can be used to identify active dispersion effects from fully replicated and partially replicated two-level factorial experiments. Although the pairwise order factors are two-level factors, the conditions in [8] can never be satisfied for an order-of-addition design due to the transitive property of pairwise order factors. The reader can consult Lin and Peng (Citation2019) for a brief discussion regarding the transitive property. Consequently, when the pairwise order factors are used to decode an order-of-addition design into a two-level design, the random quantity Rδst in [7] does not satisfy condition (B) of Definition 1, so some nuisance parameters may have a substantial impact on the fiducial test variable Tδst in [9]. This negative impact will be observed later from the simulation results in Section 4. Actually, the fiducial test variable Tδst in [9] is not recommended for analyzing real-world order-of-addition experiments. It is introduced only for comparative purposes. Below, another two sets are proposed to replace Pst and Nst, respectively, to eliminate the potential impact of nuisance parameters.

Definition 2.

Given two treatments ta and tb, their pairwise order vectors za and zb are called a pair of positive quasi-foldover vectors with respect to a pairwise order factor zst if za,st=zb,st=+1 and za,ij+zb,ij=0 for all ijst. Similarly, za and zb are called a pair of negative quasi-foldover vectors with respect to zst if za,st=zb,st=1 and za,ij+zb,ij=0 for all ijst.

Given a pair of positive quasi-foldover vectors za and zb with respect to zst, component s is added before component t both in ta and tb, and the remaining pairs of components are added in reverse order. A pair of negative quasi-foldover vectors with respect to zst has similar symmetric properties. Therefore, when an inference procedure for δst is developed based on positive and negative quasi-foldover vectors, the potential impact of nuisance parameters can be eliminated systematically. Let Pst and Nst denote the sets consisting of all index pairs of positive and negative quasi-foldover vectors with respect to zst. Formally, Pst and Nst can be written as Pst={(a,b):za,st=zb,st=+1 and za,ij+zb,ij=0 for every ijst}, and Nst={(a,b):za,st=zb,st=1 and za,ij+zb,ij=0 for every ijst}. Given an order-of-addition design, the two sets Pst and Nst can be uniquely determined for every pairwise order factor zst, where all index pairs (a, b) in Pst and Nst are unordered pairs. The following example is given to illustrate positive and negative quasi-foldover vectors.

Example 1.

Suppose that the full design is conducted to explore the optimal addition order of three components of interest. All treatments and their pairwise order vectors are listed in . It is straightforward to see that the two vectors z1 and z5 constitute a pair of positive quasi-foldover vectors with respect to z12. Additionally, another two vectors z3 and z6 constitute a pair of negative quasi-foldover vectors with respect to z12. Consequently, one has P12={(1,5)} and N12={(3,6)}. Similarly, the sets P13={(2,3)}, N13={(4,5)}, P23={(1,4)} and N23={(2,6)} can be obtained by searching for positive or negative quasi-foldover vectors over all pairwise order vectors.

Table 1. Treatments and pairwise order vectors of the three-component full design.

Assume that an order-of-addition design with nonempty sets Pst and Nst is conducted, where every treatment has r replicates. The two sets Pst and Nst would be nonempty if the experiment is designed carefully. This design issue will be addressed in Section 6. By replacing Pst and Nst in [7] with Pst and Nst, respectively, and replacing h with either a or b, a random quantity can be obtained as [10] Rδst=(a,b)Pst[(r1)sa2Va(r1)sb2Vb]12pst(a,b)Nst[(r1)sa2Va(r1)sb2Vb]12nst,[10] where pst and nst denote the numbers of index pairs in Pst and Nst, respectively. First, it is straightforward to see that the distribution of Rδst is free of all parameters. Therefore, condition (A) of Definition 1 is satisfied. Next, replacing Sh2 in [6] with its realization sh2 and replacing h with either a or b. Plugging the resulting term into [10] yields (a,b)Pst[(r1)sa2Va(r1)sb2Vb]12pst(a,b)Nst[(r1)sa2Va(r1)sb2Vb]12nst=(a,b)Pst[(δ0i=1m1j=i+1mδijza,ij2)(δ0i=1m1j=i+1mδijzb,ij2)]12pst(a,b)Nst[(δ0i=1m1j=i+1mδijza,ij2)(δ0i=1m1j=i+1mδijzb,ij2)]12nst=δstijst[δij12(a,b)Pst(za,ij+zb,ij)]12pstijst[δij12(a,b)Nst(za,ij+zb,ij)]12nst=δst, because (a,b)Pst(za,ij+zb,ij)=(a,b)Nst(za,ij+zb,ij)=0 for every ijst. That is, condition (B) of Definition 1 is also satisfied. Consequently, Rδst in [10] is an FGPQ for δst. Based on the FGPQ Rδst, the fiducial test variable given by [11] Tδst=Rδstδst[11] can be used to evaluate H0:δst=1 versus H1:δst1. Because the distribution of Tδst in [11] has no explicit form, the following Monte-Carlo algorithm is designed to estimate the required p-value.

  • Step 1: Choose a Monte-Carlo sample size K, e.g., set K = 5,000. Generate two independent chi-square random variables Va,k and Vb,k with r − 1 degrees of freedom for every index pair (a, b) in Pst and Nst for 1kK, respectively.

  • Step 2: Calculate the fiducial test variable Tδst,k in [11], where δst=1.

  • Step 3: The required p-value can be estimated using Pδst=2Qδst/K, where

Qδst=min[k=1KI(Tδst,k>1),k=1KI(Tδst,k<1)],

and I(·) represents the indicator function. Given a nominal level α, the null hypothesis H0:δst=1 should be rejected if the corresponding p-value Pδst is smaller than α. Specifically, a fiducial test variable for δ0 is presented in Appendix A to infer the baseline variance.

4. Simulation studies

To evaluate the frequentist and fiducial methods, the eight scenarios in with different combinations of active effects are chosen for simulation studies. These scenarios include all possible combinations of two mean models and four variance models. The four-component full design with n = 24 treatments is used to simulate experimental data, where the number of replicates r is set to either 2 or 3. Consequently, the run sizes N are equal to 48 and 72, respectively. The overall mean γ0 is fixed at 100, and the effect sizes of all active location effects γ are set to 25 over all scenarios. Additionally, the baseline variance δ0 is fixed at 10, and the effect sizes of all active dispersion effects δ are set to 4, 6 and 8, respectively. After specifying all parameter values, an N×1 response vector Y can be simulated from a multivariate normal distribution, denoted by YN(μ,Σ), where the entries of mean vector μ are determined by the pairwise order mean model in [1], the diagonal entries of variance-covariance matrix Σ are determined by the pairwise order variance model in [2], and all off-diagonal entries of Σ are equal to zero. Specifically, each entry of Y is generated via [12] Yh=μh+σhZh,[12] where Zh represents a standard normal random variable. Based on this data-generating scheme, 5,000 simulated response vectors are generated for each set of simulation parameters consisting of scenario, r and δ. All simulated data are then analyzed using the test variables Zαst in [5], Tδst in [9] and Tδst in [11], respectively. In fact, as noted in Section 3.2, the fiducial test variable Tδst in [9] is not valid for analyzing order-of-addition experiments. It is included only for comparative purposes. The empirical powers are calculated for all pairwise order dispersion effects, where the nominal level is set to 0.05. The empirical power for each effect is defined as the proportion of the 5,000 p-values that are smaller than the nominal level. In addition, the empirical power of a null effect is called the empirical size. All simulation results are presented in Appendix C.

Table 2. Active effects in pairwise order mean and variance models for simulation studies.

By comparing the empirical powers of the first four and last four scenarios in , the finite-sample performances of these test variables are observed to be independent of the pairwise order mean models because they are all replication-based methods. Although the fiducial test variable Tδst in [9] provides larger empirical powers for active effects, as shown in , it cannot maintain the empirical sizes for null effects. Because some nuisance parameters may have a substantial impact on Tδst in [9], many null effects are mistakenly identified as active effects, such that their empirical sizes are considerably larger than the nominal level. Most often, such high false positive rates are not acceptable. To obtain more reliable results, the fiducial test variable Tδst in [9] should be superseded by another two test variables. On the other hand, as shown in , the frequentist test variable Zαst in [5] and the fiducial test variable Tδst in [11] can maintain empirical sizes sufficiently close to the nominal level. Moreover, as expected, their empirical powers increase as the dispersion effect size δ increases and as the number of replicates r increases. These two test variables have very similar finite-sample behaviors in all scenarios in . In conclusion, both test variables are feasible for identifying active dispersion effects.

To evaluate whether or not the frequentist and fiducial methods are robust to the assumption of normality, a similar data-generating scheme to [12] given by [13] Yh=μh+σh[UhE(Uh)Var(Uh)][13] is used to simulate non-normal data. In the brackets of [13], the non-normal random variable Uh is standardized to ensure that the simulated data generated by [12] and [13] have the same mean and variance, such that the additional simulation results can be compared with those presented in . Five different distributions are assumed for the non-normal random variable Uh in [13], including the t-distribution with three degrees of freedom, denoted by T3, and four gamma distributions, denoted by Γ1,2, Γ2,2, Γ3,2 and Γ6,2, respectively, where the subscripts indicate the shape and scale parameters. These gamma distributions, with excess kurtosis parameters equal to 6, 3, 2 and 1, respectively, are chosen to explore the potential impact of the excess kurtosis parameter on these two methods. Given a non-normal distribution for Uh, 5,000 simulated response vectors are generated for each set of simulation parameters. All simulated data are then analyzed using the test variables Zαst in [5] and Tδst in [11], respectively, to calculate their average sizes and average powers. As shown in , when the assumption of normality is violated, the empirical sizes are all larger than the nominal level 0.05, and the frequentist and fiducial methods have very similar behaviors. The differences between empirical sizes and nominal level are relatively large when the excess kurtosis parameter is equal to 6. However, the empirical size approaches the nominal level as the excess kurtosis parameter is decreased in every scenario in . In particular, when the simulated data are generated from the gamma distribution Γ6,2, where the excess kurtosis parameter is equal to 1, the differences between empirical sizes and nominal level can be negligible. Additionally, as shown in , these two methods have similar empirical powers for analyzing non-normal data. Overall, both frequentist and fiducial methods appear to be robust to the assumption of normality.

5. Drug combination data

Wang, Xu, and Ding (Citation2020) conducted a series of in vitro experiments to study whether or not treatment order of chemotherapeutics and their dosages have a significant impact on lymphoma cell growth. Three FDA-approved chemotherapeutics for clinical trials, namely, paclitaxel, doxorubicin and mitoxantrone, denoted by 1, 2 and 3, respectively, were sequentially added into lymphoma cell cultures in the first experiment. In particular, chemotherapeutics 1 and 2 were set at two different concentration levels and the level of chemotherapeutic 3 was held fixed throughout the experiment. Accordingly, six treatment orders and four level combinations were taken into account. A 24-run cross-product array was used to design this experiment, that is, all order-level combinations were conducted, where each treatment had three replicates. All responses were then measured by a tumor inhibition index. Based on Table S3 of the original paper, the dosage effects are negligible. Therefore, observations with the same treatment order are further merged, such that each treatment has 12 replicates. In addition, this experiment had some observations generated by adding these three chemotherapeutics into lymphoma cell cultures simultaneously. These observations are not included in the later analysis. The adapted design and related summary statistics are shown in .

Table 3. Adapted drug combination data.

As shown in , when the three chemotherapeutics are added in different orders, sample averages and sample variances are quite different. This interesting finding indicates that treatment order may have a substantial impact on both the response mean and response variance. To understand how the response mean and response variance depend on the treatment order, identifying active effects can be practical. In particular, it is straightforward to see that the three sample variances s12, s32 and s42 are larger than three other sample variances s22, s52 and s62. One might suspect that some active dispersion effects may affect the response variance. The frequentist test variable Zαst in [5] and fiducial test variable Tδst in [11] are used to analyze the drug combination data. The positive and negative quasi-foldover vectors required by the fiducial method are given in Example 1, and the corresponding p-values are estimated using the Monte-Carlo algorithm introduced in Section 3.2. When the nominal level is set to 0.05, as shown in , both methods indicate that the pairwise order dispersion effect δ23 differs significantly from one. To reduce patient-to-patient variability in tumor cell inhibition, chemotherapeutic 3 should be administered before chemotherapeutic 2.

Table 4. p-values for adapted drug combination data

After exploring potentially active dispersion effects, two other methods are used to identify active location effects under the heterogeneous variance assumption, where the response variances are related to the levels of pairwise order factor z23. The first method is the Student’s t-test, where the degrees of freedom is determined according to the Satterthwaite’s (Citation1946) approximation. The second method is the fiducial test developed by Tsai and Liao (Citation2020), where the reduced pairwise order variance model is set to σh2=δ0δ23zh,232. Under the pairwise order mean and variance models in [1] and [2], this fiducial test remains valid for identifying active location effects from fully replicated order-of-addition experiments. The reader is referred to the original paper for more details regarding this approach. According to the p-values in , the pairwise order location effects γ12 and γ23 are identified as active effects by both methods when the nominal level is set to 0.05. To maximize the expected response, chemotherapeutic 1 should be administered before chemotherapeutic 2 and chemotherapeutic 3 should be administered before chemotherapeutic 2. By integrating all screening results in , conducting the second treatment is expected to yield optimal efficacy of tumor cell inhibition with minimum patient-to-patient variability. This statistical analysis provides a possible explanation of the adapted drug combination data in . The analysis results may be useful for researchers to develop a novel combination therapy for treating lymphoma. Most interestingly, this example shows that active dispersion effects may exist in order-of-addition experiments. Therefore, related statistical methods can be practical for real-world applications.

6. Eligible designs

Given an order-of-addition design, if either Pst or Nst is empty, then the dispersion effect δst would not be testable by the fiducial test variable Tδst in [11]. Therefore, an order-of-addition design is said to be eligible for testing all dispersion effects if these two sets Pst and Nst are nonempty for every pairwise order factor zst. From a practical perspective, conducting an eligible design allows us to analyze the observed data using both frequentist and fiducial methods, such that more insights can be gained by comparing their results. To construct eligible designs, an intuitive but important result is introduced first.

Theorem 1.

Given the number of components m, the full design is eligible for testing all pairwise order dispersion effects using the fiducial test variable Tδst in [11], where pst=nst=(m1)!/2 for every pairwise order factor zst.

Proof.

Given a treatment ta, its pairwise order vector za belongs to a pair of positive quasi-foldover vectors with respect to a pairwise order factor zst only if component s is added right before component t in ta. Given such a treatment ta, let tb denote the treatment obtained by reversing the treatment ta except for the two components s and t. It is straightforward to verify that their pairwise order vectors za and zb constitute a pair of positive quasi-foldover vectors with respect to zst. There are (m1)! such treatments in the full design for every pairwise order factor zst, with the result that pst=(m1)!/2. By similar arguments, one gets nst=(m1)!/2. This completes the proof. □

According to Theorem 1, when analyzing dispersion effects using the full design, the number of treatments involving in the fiducial test variable Tδst in [11] is equal to 2(m1)!, with the result that 2/m of the treatments are used to test a particular dispersion effect. The proportion 2/m decreases as the number of components m increases. However, as noted by Voelkel (Citation2019) and Peng, Mukerjee, and Lin (Citation2019), when m is moderately large, the full design can be expensive and/or time-consuming due to its excessively large run size. For example, when there are seven components to be investigated, i.e., m = 7, a practitioner must add these seven components sequentially in 7!=5,040 different orders to conduct the full design. Because all treatments must be replicated two or more times to apply a replication-based method, at least 10,080 runs must be conducted. Such a large-scale experiment is rarely feasible in practice. Given resource constraints, an eligible fractional design consisting of a subset of selected treatments is a cost-effective alternative. The following example is given to illustrate ineligible fractional designs.

Example 2.

Suppose that an eligible fractional design is required to study four components of interest. Each treatment will be replicated several times for the purpose of identifying active dispersion effects. Designs 1 and 2 in of Voelkel (Citation2019) are considered for this experiment, where their treatments and pairwise order vectors are listed in . In particular, the pairwise order matrices of these two designs are order-of-addition orthogonal arrays of strength two. After searching for positive and negative quasi-foldover vectors with respect to every pairwise order factor, Design 2 is found to have empty P14 and N14. Accordingly, the pairwise order dispersion effect δ14 would not be testable by the fiducial test variable Tδst in [11] if Design 2 is chosen for the experiment. On the other hand, Design 1 is found to be eligible for testing all dispersion effects. Most interestingly, as noted by Voelkel (Citation2019), Design 1 is superior to Design 2 according to some other optimality criteria. Therefore, Design 1 is recommended for this experiment.

Given an arbitrary fractional design, as shown by Example 2, it can be ineligible for testing certain dispersion effects. Therefore, generating a series of eligible fractional designs can be practical. Because order-of-addition orthogonal arrays are well-studied in the current literature, they are a good starting point when searching for eligible fractional designs. Below, two software packages are used to generate eligible fractional designs. Given the number of components m and the number of treatments n, the R function optFederov in the package AlgDesign developed by Wheeler (Citation2022) is used to generate 10,000 candidate designs whose pairwise order matrices are order-of-addition orthogonal arrays of strength two. Typically, optFederov is a time-consuming procedure to generate candidate designs for those cases with more than five components. Another program Latinoa developed by Tsai (Citation2022) is implemented for such cases. Next, an eligible design is identified from these candidate designs by searching for positive and negative quasi-foldover vectors with respect to every pairwise order factor. All obtained designs are collected in the supplementary data of this paper and their parameters are shown in . Based on this design catalogue, researchers can choose eligible fractional designs with the most suitable run sizes for their experiments.

Table 5. Designs 1 and 2 in of Voelkel (Citation2019).

Table 6. Parameters of eligible order-of-addition orthogonal arrays of strength two.

The numbers of quasi-foldover vector pairs in each design in are also reported in the supplementary data. Because pst and nst are often not equal over all pairwise order factors, the minimum number of quasi-foldover vector pairs, denoted by s, is reported in . Specifically, s indicates that at least s pairs of positive and negative quasi-foldover vectors can be used to test a particular dispersion effect using the fiducial test variable Tδst in [11]. Given an order-of-addition orthogonal array of strength two, quasi-foldover vector pairs can be very sparse, especially when m is moderately large and n is relatively small. A theoretical upper bound for the number of quasi-foldover vector pairs can be practical in searching for eligible fractional designs. This topic is beyond the scope of this paper. However, it deserves further investigation.

7. Concluding remarks

Because both frequentist and fiducial methods require an n-treatment order-of-addition design to be replicated r times, the run size is equal to N=n×r. The run size N can be quite large when n is moderately large, with the result that it can be infeasible to conduct such a large-scale experiment. As pointed out by an anonymous paper reviewer, a practitioner can project an unreplicated design onto a subset of components, such that all reduced treatments have two or more replicates. According to Theorem 3 of Zhao, Lin, and Liu (Citation2022), after projecting onto any two or three components of an order-of-addition orthogonal array of strength two, every reduced treatment has two or more replicates. By fitting a reduced pairwise order variance model to the observed data, the frequentist method is then applicable, where the observations of each reduced treatment must be assumed to have the same mean. On the other hand, the fiducial method is also applicable under such an assumption. To see this, consider the two four-component treatments t1 and t2 in . First, it is straightforward to see that the two vectors z1 and z2 constitute a pair of positive quasi-foldover vectors with respect to z12. Next, assume that δ14=δ24=δ34=1, that is, all dispersion effects involving component 4 are inactive. After deleting the three pairwise order factors z14, z24 and z34, the two reduced vectors z1 and z2 constitute a pair of positive quasi-foldover vectors with respect to z12. Note that the symbol · in is used to indicate that the pairwise order factor is deleted. Based on this observation, it is straightforward to verify that an eligible design for m components is still eligible for m components after leaving out mm components. Consequently, the fiducial method can be used when some components are left out of the original design.

Table 7. Full and reduced positive quasi-foldover vectors.

Mee (Citation2020) noted that adding interaction effects to the pairwise order mean model in [1] may yield better fitting results. Some real-world examples can be found in his paper. Because the frequentist and fiducial methods are based on sample variances of replicated treatments which are free of the location parameters μ1,μ2,,μn, these two methods can still be used to identify active dispersion effects when some interaction effects are incorporated into the pairwise order mean model in [1]. Sometimes, when conducting order-of-addition experiments, different treatments may have different numbers of replicates due to resource constraints. The fiducial method can be easily extended to analyze such unequi-replicated data. Suppose that the treatment th now has rh replicates. Under the assumption of normality, based on the pairwise order variance model in [2], one has [14] Vh=(rh1)Sh2σh2=(rh1)Sh2δ0i=1m1j=i+1mδijzh,ij2χrh12.[14] The key difference between [14] and [6] is that the constant term r is replaced with rh, with the result that the degrees of freedom of Vh can now be varied according to the number of replicates. Based on the chi-square random variables V1,V2,,Vn having the form in [14], when r1,r2,,rn are different, the random quantity Rδst in [10] can be verified as an FGPQ for δst using similar arguments. That is, the fiducial method remains valid for analyzing unequi-replicated data. On the other hand, when r1,r2,,rn are not equal, the distributions of all logarithmic sample variances W1,W2,,Wn can be first approximated by normal distributions with heterogeneous variances. Next, the weighted least squares method can be used to estimate all parameters on the right-hand-side of [4]. However, when the pairwise order matrix is an order-of-addition orthogonal array of strength two, related frequentist test variables have no compact form such as that in [5]. In my view, the fiducial method appears to be more conceptually systematic for analyzing both equi-replicated and unequi-replicated data.

In order-of-addition experiments, as pointed out by another anonymous paper reviewer, active dispersion effects could be due to noise factors whose values are hard to control. For example, when exploring the optimal addition order of several food ingredients, certain chemical properties of these food ingredients can be very different in various difficult-to-control environments, with the result that the response variances can be heterogeneous. To address such order-of-addition problems, robust parameter design methodology with manipulated noise factors can be applied. Two replication-based statistical methods developed using frequentist and fiducial probability arguments are introduced in this paper to identify active dispersion effects from replicated order-of-addition experiments. Simulation results show that both approaches can maintain empirical sizes sufficiently close to the nominal level while their finite-sample performances are very similar. Because the fiducial test proposed by Tsai and Liao (Citation2020) remains valid to identify active location effects under the pairwise order mean and variance models in [1] and [2], it can be incorporated into the experimental data analysis as shown in Section 5. From a statistical perspective, the fiducial method can provide a unified probability framework to analyze dispersion effects as well as location effects. However, its computational cost is larger than that of the frequentist method. Therefore, the frequentist method is further recommended for real-world applications due to its low computational cost.

Supplemental material

Supplemental Material

Download Zip (68.4 KB)

Acknowledgements

I would like to express my thanks to the editor and two anonymous referees for their valuable comments and constructive suggestions. I am grateful to Computer and Information Networking Center of National Taiwan University for the support of high-performance computing facilities. Thanks also go to Xianting Ding (Professor, Shanghai Jiao Tong University) for providing the original drug combination data.

Disclosure statement

No potential conflict of interest was reported by the author.

Additional information

Funding

Shin-Fu Tsai’s research was supported in part by National Science and Technology Council of Taiwan (Grant Number NSTC 111-2118-M-002-005-MY2).

Notes on contributors

Shin-Fu Tsai

Shin-Fu Tsai is an associate professor in the Department of Agronomy at National Taiwan University. His email address is [email protected].

References

  • Al-Sarraj, R., C. von Brömssen, and J. Forkman. 2019. Generalized prediction intervals for treatment effects in random-effects models. Biometrical Journal. Biometrische Zeitschrift 61 (5):1242–57. doi: 10.1002/bimj.201700255.
  • Bartlett, M. S., and D. G. Kendall. 1946. The statistical analysis of variance-heterogeneity and the logarithmic transformation. Supplement to the Journal of the Royal Statistical Society 8 (1):128–38. doi: 10.2307/2983618.
  • Bergman, B., and A. Hynén. 1997. Dispersion effects from unreplicated designs in the 2k−p series. Technometrics 39:191–8.
  • Box, G. E. P., and D. Meyer. 1986. Dispersion effects from fractional designs. Technometrics 28 (1):19–27. doi: 10.1080/00401706.1986.10488094.
  • Chen, J., R. Mukerjee, and D. K.-J. Lin. 2020. Construction of optimal fractional order-of-addition designs via block designs. Statistics and Probability Letters 161:108728. doi: 10.1016/j.spl.2020.108728.
  • Chen, J., X.-R. Zhang, and D. K.-J. Lin. 2021. Analysis of replicated order-of-addition experiments. Statistics and Applications 19:453–66.
  • Hannig, J., H. Iyer, and P. Patterson. 2006. Fiducial generalized confidence intervals. Journal of the American Statistical Association 101 (473):254–69. doi: 10.1198/016214505000000736.
  • Hedayat, A. S., N. J. A. Sloane, and J. Stufken. 1999. Orthogonal arrays: Theory and applications. New York, NY: Springer-Verlag.
  • Lee, Y., and J. A. Nelder. 1998. Generalized linear models for the analysis of quality-improvement experiments. Canadian Journal of Statistics 26 (1):95–105. doi: 10.2307/3315676.
  • Lin, D. K.-J., and J. Peng. 2019. Order-of-addition experiments: A review and some new thoughts. Quality Engineering 31 (1):49–59. doi: 10.1080/08982112.2018.1548021.
  • Malekzadeh, A., and M. Kharrati-Kopaei. 2020. Simultaneous confidence intervals for the quantile differences of several two-parameter exponential distributions under the progressive type II censoring scheme. Journal of Statistical Computation and Simulation 90 (11):2037–56. doi: 10.1080/00949655.2020.1762084.
  • McGrath, R. N., and D. K.-J. Lin. 2001a. Testing multiple dispersion effects in unreplicated fractional factorial designs. Technometrics 43 (4):406–14. doi: 10.1198/00401700152672492.
  • McGrath, R. N., and D. K.-J. Lin. 2001b. Confounding of location and dispersion effects in unreplicated fractional factorials. Journal of Quality Technology 33 (2):129–39. doi: 10.1080/00224065.2001.11980062.
  • Mee, R. W. 2020. Order-of-addition modeling. Statistica Sinica 30:1543–59. doi: 10.5705/ss.202018.0210.
  • Nelder, J. A., and Y. Lee. 1991. Generalized linear models for the analysis of Taguchi-type experiments. Applied Stochastic Models and Data Analysis 7 (1):107–20. doi: 10.1002/asm.3150070110.
  • Pan, G. 1999. The impact of unidentified location effects on dispersion-effects identification from unreplicated factorial designs. Technometrics 41 (4):313–26. doi: 10.1080/00401706.1999.10485931.
  • Peng, J., R. Mukerjee, and D. K.-J. Lin. 2019. Design of order-of-addition experiments. Biometrika 106 (3):683–94. doi: 10.1093/biomet/asz025.
  • Satterthwaite, F. E. 1946. An approximate distribution of estimates of variance components. Biometrics Bulletin 2 (6):110–4. doi: 10.2307/3002019.
  • Tsai, S.-F. 2022. Generating optimal order-of-addition designs with flexible run sizes. Journal of Statistical Planning and Inference 218:147–63. doi: 10.1016/j.jspi.2021.11.001.
  • Tsai, S.-F., and C.-T. Liao. 2020. Detection of location and dispersion effects from partially replicated two-level factorial designs. Journal of Quality Technology 52 (3):281–92. doi: 10.1080/00224065.2019.1571349.
  • Tsai, S.-F., C.-T. Liao, and F.-S. Chai. 2015. Identification of dispersion effects from partially replicated two-level factorial designs. Journal of Quality Technology 47 (1):43–53. doi: 10.1080/00224065.2015.11918105.
  • Van Nostrand, R. C. 1995. Design of experiments where the order of addition is important. In ASA Proceedings of the Section on Physical and Engineering Sciences, 155–60. Alexandria, VA: American Statistical Association.
  • Voelkel, J. G. 2019. The design of order-of-addition experiments. Journal of Quality Technology 51 (3):230–41. doi: 10.1080/00224065.2019.1569958.
  • Voelkel, J. G., and K. P. Gallagher. 2019. The design and analysis of order-of-addition experiments: An introduction and case study. Quality Engineering 31 (4):627–38. doi: 10.1080/08982112.2019.1578374.
  • Wang, A., H. Xu, and X. Ding. 2020. Simultaneous optimization of drug combination dose-ratio sequence with innovative design and active learning. Advanced Therapeutics 3 (4):1900135. doi: 10.1002/adtp.201900135.
  • Wang, B. X., and F. Wu. 2018. Inference on the gamma distribution. Technometrics 60 (2):235–44. doi: 10.1080/00401706.2017.1328377.
  • Wang, C., and R. W. Mee. 2022. Saturated and supersaturated order-of-addition designs. Journal of Statistical Planning and Inference 219:204–15. doi: 10.1016/j.jspi.2021.12.006.
  • Wang, X., B. X. Wang, Y. Hong, and P. H. Jiang. 2021. Degradation data analysis based on gamma process with random effects. European Journal of Operational Research 292 (3):1200–8. doi: 10.1016/j.ejor.2020.11.036.
  • Wheeler, B. 2022. AlgDesign: Algorithmic experimental design. R package version 1.2.1. https://CRAN.R-project.org/package=AlgDesign.
  • Winker, P., J. Chen, and D. K.-J. Lin. 2020. The construction of optimal design for order-of-addition experiment via threshold accepting. In Contemporary Experimental Design, Multivariate Analysis and Data Mining, ed. J. Fan and J. Pan, 93–109. Cham, Switzerland: Springer-Verlag.
  • Wu, C. F. J., and M. S. Hamada. 2021. Experiments: Planning, analysis, and optimization. 3rd ed. Hoboken, NJ: John Wiley and Sons.
  • Zhao, Y., D. K.-J. Lin, and M.-Q. Liu. 2021. Designs for order-of-addition experiments. Journal of Applied Statistics 48 (8):1475–95. doi: 10.1080/02664763.2020.1801607.
  • Zhao, Y., D. K.-J. Lin, and M.-Q. Liu. 2022. Optimal designs for order-of-addition experiments. Computational Statistics and Data Analysis 165:107320. doi: 10.1016/j.csda.2021.107320.

Appendix A:

Test variables for the baseline variance

Based on the inverse of information matrix in Section 3.1, when the pairwise order matrix is an order-of-addition orthogonal array of strength two, one has c0,0=1/n. The frequentist test variable given by Zα0=α̂0e1nψ(r12) can be used to evaluate H0:α0=e versus H1:α0e, where e represents an arbitrary constant. Similarly, the distribution of Zα0 can be approximated by the standard normal distribution. On the other hand, a fiducial test variable for δ0 is introduced to address this testing problem. Let F denote the set consisting of all index pairs of full-foldover vectors over all pairwise order vectors. Formally, F can be written as F={(a,b):za,ij+zb,ij=0 for every ij}. Define [A1] Rδ0=(a,b)F[(r1)sa2Va(r1)sb2Vb]12f,[A1] where f denotes the number of index pairs in F. First, it is straightforward to see that the distribution of Rδ0 is free of all parameters. Therefore, condition (A) of Definition 1 is satisfied. Next, replacing Sh2 in [6] with its realization sh2 and replacing h with either a or b. Plugging the resulting term into [A1] yields (a,b)F[(r1)sa2Va(r1)sb2Vb]12f=(a,b)F[(δ0i=1m1j=i+1mδijza,ij2)(δ0i=1m1j=i+1mδijzb,ij2)]12f=δ0[i=1m1j=i+1mδij12(a,b)F(za,ij+zb,ij)]12f=δ0, because (a,b)F(za,ij+zb,ij)=0 for every ij. That is, condition (B) of Definition 1 is also satisfied. Consequently, Rδ0 is an FGPQ for δ0. Based on the FGPQ Rδ0, the fiducial test variable given by

Tδ0=Rδ0δ0 can be used to evaluate H0:δ0=d versus H1:δ0d, where d denotes an arbitrary constant. Similarly, the corresponding p-value can be estimated using a Monte-Carlo algorithm.

Appendix B:

Non-matrix forms of least squares estimators

Based on the inverse of information matrix in Section 3.1, when the pairwise order matrix is an order-of-addition orthogonal array of strength two, one has α̂=(XX)1XW=[1n003(m1)n(m+1)I3n(m+1)V][1WZW], where Z denotes the n × q pairwise order matrix. First, it is straightforward to see that α̂0=1nh=1nWh. The intercept estimator can be expressed in non-matrix form. Next, let U=ZW. Given a pairwise order factor zst, its corresponding entry in U can be expressed as Ust=h{h:zh,st=+1}Whh{h:zh,st=1}Wh. Define Ast={kl:k=s,lt or ks,l=t} and Bst={kl:k=t or l=s}. Based on these two index sets, the least squares estimator of αst can be written as α̂st=3(m1)n(m+1)Ust3n(m+1)(klAstUklklBstUkl). The least squares estimator of αst can also be expressed in non-matrix form. Furthermore, if the number of components m is very large, the term 3n(m+1) would be very small, with the result that all logarithmic sample variances W1,W2,,Wn would be nearly equally weighted up to sign in the least squares fit. Therefore, as n goes to infinity, the normal approximation for the distribution of Zαst in [5] can be further strengthened by a central limit theorem-like argument.

Appendix C:

Simulation results

Table C1. Empirical powers of the frequentist test variable Zαst.

Table C2. Empirical powers of the fiducial test variable Tδst.

Table C3. Empirical powers of the fiducial test variable Tδst.

Table C4. Average sizes of frequentist and fiducial methods under non-normal distributions.

Table C5. Average powers of frequentist and fiducial methods under non-normal distributions.