Research Article

Rank-based regression with bootstrapping and the least square regression for analysis of small sample interrupted time-series data: simulation studies and application to illegal organ transplants

Received 29 Jun 2023, Accepted 08 Apr 2024, Published online: 05 May 2024

Abstract

For interrupted time series with small samples and outliers, the standard least squares regression for estimating intervention effects can be biased when the assumption of normality is violated. This article proposes a rank-based regression method combined with bootstrapping to address outliers and non-normality. The proposed method minimizes the Jaeckel dispersion function based on ranked residuals, offering robustness against outliers, skewed error distributions, and correlated errors. Simulation studies and a real-world example compare the performance of the proposed method with the least squares method under various conditions. Both produce unbiased estimates of the intervention effects under normality. However, with non-normal errors, outliers, and correlated errors, the rank-based bootstrapping method outperforms least squares, producing narrower confidence intervals, higher efficiency, lower type I error rates, and greater power. Application to the Istanbul Declaration on illegal organ activities confirms the improved robustness and precision of the rank-based method with bootstrapping over least squares.

1. Introduction

In various fields, such as public health, clinical practice, economics, political sciences and road safety analysis, it is often necessary to assess the success of an intervention or a program [Citation1–5]. Evaluating policy interventions involves analyzing interrupted time series data (ITS) – data collected before and after the interventions. In public health, medicine and road safety analysis, ITS analysis is commonly used to evaluate the effectiveness of policy interventions [Citation1–4,Citation6].

Various statistical approaches are used to measure these effects in ITS [Citation2,Citation3,Citation5], but the two most frequently used approaches are either regression-based methods, such as standard regression and generalized least squares, or ARIMA (autoregressive, integrated, moving average) models [Citation5]. Among the regression methods, segmented regression analysis is the most widely used [Citation2]. For a detailed description of segmented regression analysis, see [Citation2–4].

Studies have often compared segmented regression models with other models. Wang et al. [Citation3] compared the segmented regression model to both a log-linear binomial regression model and a full Bayesian regression model to estimate the effect of the mandatory helmet law on the number of head injuries. Turner et al. [Citation7] compared six different commonly used statistical models on 180 published data series in the public health area to see how the choice of statistical model impacted statistical significance and estimation. Models examined in the Turner et al. [Citation7] study included ordinary least squares (OLS) without adjustment for autocorrelation, OLS with Newey-West standard errors, Prais-Winsten generalized least squares, restricted maximum likelihood and autoregressive integrated moving average (ARIMA).

Through simulation, Turner et al. [Citation8] also investigated the effect of autocorrelation, one of the important features of interrupted time series data [Citation9], on inferences for the intervention effect. Although the impact of autocorrelation on standard regression individual parameter tests is minimal for small sample ITS data [Citation10,Citation11], methods that do not account for the autocorrelation provide less-reliable standard errors of the estimates, specifically downwardly biased standard errors [Citation10]. Turner et al. [Citation8] showed that autocorrelation is difficult to detect in small samples, so even if a formula-based method is employed to adjust for autocorrelation (OLS with Newey-West standard errors, Prais-Winsten generalized least squares) it won’t be effective if there is not an accurate estimate of the true autocorrelation. Short time series are common in public health, ecological and environmental studies [Citation12], and Turner et al. [Citation13] also showed that public health ITS data often have small samples.

The accuracy of segmented regression depends on the absence of outliers in the data and the normality of the error distribution [Citation1–3,Citation14,Citation15]. When the error distribution deviates from normality or outliers are present in the data, confidence intervals and hypothesis tests for the intervention effects from a segmented regression model become inefficient and biased [Citation14,Citation16]. With large samples, however, segmented regression is robust to violations of the normality assumption, and valid results can be obtained using the normal approximation [Citation14]. In small sample studies this robustness does not hold, and the normal approximation is not appropriate.

To address the influence of outliers, various robust regression techniques are used [Citation17–20], including the least absolute deviation (LAD) method, Huber’s M-estimator, least median square, least trimmed square estimators (LTS), S-estimator, MM-estimators [Citation21] and Rank-based robust regression [Citation22–28]. Robust regression analysis in ITS is limited [Citation7] and rank-based regression using the Jaeckel dispersion score function, which is highly robust to outliers and distributional anomalies [Citation5,Citation16,Citation24–27,Citation29–37], is currently in limited use in ITS [Citation7].

In this article, we propose and explore a rank-based regression with bootstrapping to address the problems of small sample sizes, outliers in the data, non-normality in the error distribution and any potential autocorrelation in ITS data analysis. Through simulation, we investigate the effectiveness of our proposed statistical procedure for estimating the intervention effects, their confidence intervals and type I errors when there are deviations from the normality assumption, autocorrelated errors are present and outliers exist in the data. We also analyze a real dataset on illegal organ transplants to examine the effect of the Istanbul Declaration on reducing such transplants using both the least squares (LS) method and the rank-based (RB) method with bootstrapping.

The article is organized as follows. Section 2 introduces the estimation and hypothesis testing procedures for intervention effects in interrupted time-series data using both the least squares (LS) method and the method of rank-based (RB) regression with bootstrapping. In Section 3, comprehensive simulation studies are conducted under various conditions to compare the performance of the LS method and the RB with bootstrapping method. The simulations assess the estimates of the intervention effects, the efficiency of the estimates via mean square error, type I error and power. In Section 4, the LS and RB methods are used to analyze the real dataset on illegal organ transplants to evaluate the effectiveness of the 2008 Istanbul Declaration in reducing illegal organ trafficking. Finally, Section 5 presents the discussion, practical implications of the study and conclusions. All analyses were conducted using R software version 4.3.0, including the Rfit package [Citation36] for estimating the intervention effects using rank-based regression.

2. Methods

2.1. Estimation of intervention effects

The segmented regression model for interrupted time-series data is defined as
(1)   $Y_t = \beta_0 + \beta_1 T_t + \beta_2 X_t + \beta_3 X_t (T_t - n_2) + \varepsilon_t$,   $t = 1, 2, \ldots, n$.
In this model, $Y_t$ represents the outcome variable measured at equally spaced time points $t$. $T_t$ denotes the time elapsed since the start of the study. The variable $X_t$ is a dummy variable that takes a value of 0 during pre-intervention periods and 1 otherwise, indicating the presence of the intervention. The term $X_t(T_t - n_2)$ represents the interaction between $X_t$ and $T_t$, with $T_t$ centered at $n_2$ to prevent multicollinearity.

The validity of the model depends on some assumptions, such as (i) there must be time points before and after an intervention, with evenly spaced intervals between time points, (ii) the relationship between the outcome and time is linear within the pre-intervention and post-intervention segments, (iii) the errors $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent and identically normally distributed with mean zero and constant variance, (iv) there are no outliers or missing data points, (v) the intervention is expected to change the level or slope immediately, not gradually over time.

Model (1) can be decomposed into two parts: $Y_t = \beta_0 + \beta_1 T_t + \varepsilon_t$ for the pre-intervention outcomes, and $Y_t = (\beta_0 + \beta_2 - \beta_3 n_2) + (\beta_1 + \beta_3) T_t + \varepsilon_t$ for the outcomes after the intervention. Consequently, $\beta_0$ represents the initial level of the outcome variable and $\beta_1$ corresponds to the slope or trend of the outcome variable before the intervention. The term $\beta_2 - \beta_3 n_2$ denotes the change in the intercept between the pre- and post-intervention models, while $\beta_3$ represents the difference in slopes or trends between the pre- and post-intervention periods. Generally, $\beta_2$ does not possess a direct interpretation but contributes to adjusting the intercept in the post-intervention model. However, in the special case where $\beta_3 = 0$ but $\beta_2 \neq 0$, $\beta_2$ represents a flat and steady increase or decrease in $Y_t$, in addition to the constant trend.
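The decomposition above can be made concrete by constructing the design matrix for model (1). The following Python sketch (the paper's own analyses use R and the Rfit package) builds the three non-intercept columns; the function name `its_design` is ours, and taking the centering constant $n_2$ to be the last pre-intervention time point is an assumption for illustration, not the paper's definition.

```python
import numpy as np

def its_design(n, n_pre):
    """Columns of model (1): T_t, X_t, X_t * (T_t - n_pre).

    T_t = 1..n is time since the start of the study, X_t is the
    intervention dummy, and n_pre plays the role of the centering
    constant n_2 (assumed here to be the last pre-intervention
    time point).
    """
    T = np.arange(1, n + 1, dtype=float)
    X = (T > n_pre).astype(float)           # 0 pre-intervention, 1 post
    return np.column_stack([T, X, X * (T - n_pre)])
```

With $X_t = 1$, a fit on these columns plus an intercept gives the post-intervention line $(\beta_0 + \beta_2 - \beta_3 n_2) + (\beta_1 + \beta_3) T_t$, matching the decomposition above.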

For convenience, we can express model (1) in matrix notation as follows: $Y = \beta_0 \mathbf{1} + X\beta + \varepsilon$, where $Y = (y_1, y_2, \ldots, y_n)^T$ is the $n \times 1$ vector of responses, $X$ is an $n \times 3$ design matrix, $\beta = (\beta_1, \beta_2, \beta_3)^T$, and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^T$ is an $n \times 1$ vector of error terms.

The least squares (LS) estimate of $\beta$ is obtained by minimizing the sum of squared errors $L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - x_i\beta)^2$. The least squares estimate is given by $\hat\beta = (X^T X)^{-1} X^T y$, where $X$ is here augmented by a leading column of ones and $\hat\beta^T = (\hat\beta_0, \hat\beta_1, \hat\beta_2, \hat\beta_3)$. This method performs well when the data contain no outliers and the errors follow a normal distribution.

Alternatively, for the rank-based estimates of the parameter vector $\beta$, the Jaeckel dispersion function [Citation16] is minimized using Wilcoxon scores. The dispersion function, denoted $D(\beta)$, is defined as:
(2)   $D(\beta) = \dfrac{\sqrt{12}}{n+1} \sum_{i=1}^{n} \left[ R(y_i - x_i\beta) - \dfrac{n+1}{2} \right] (y_i - x_i\beta)$,
where $R(\cdot)$ represents the rank of $y_i - x_i\beta$ among the residuals $y_1 - x_1\beta, y_2 - x_2\beta, \ldots, y_n - x_n\beta$. The dispersion function $D(\beta)$ is a non-negative, piecewise linear, continuous convex function of $\beta$. Minimizing this dispersion function provides an estimate of $\beta$, which generally does not have a closed-form expression. Therefore, iterative numerical methods are typically required to obtain solutions.
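As a concrete illustration, the Wilcoxon-score dispersion in (2) can be minimized with a general-purpose optimizer. This is a Python sketch under our own naming (`jaeckel_dispersion`, `rank_fit`); the paper itself uses the Rfit package in R, which implements this minimization directly.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import rankdata

def jaeckel_dispersion(beta, X, y):
    """Wilcoxon-score Jaeckel dispersion:
    D(beta) = sqrt(12)/(n+1) * sum_i [R(e_i) - (n+1)/2] * e_i."""
    e = y - X @ beta                  # residuals (X has no intercept column)
    n = len(e)
    r = rankdata(e)                   # ranks R(e_i) of the residuals
    return np.sqrt(12.0) / (n + 1) * np.sum((r - (n + 1) / 2.0) * e)

def rank_fit(X, y):
    """Minimize D(beta) numerically; estimate the intercept afterwards
    as the median of the full-model residuals (see Section 2.1)."""
    start = np.linalg.lstsq(X, y, rcond=None)[0]   # LS starting values
    res = minimize(jaeckel_dispersion, start, args=(X, y),
                   method="Nelder-Mead")
    beta_hat = res.x
    intercept = np.median(y - X @ beta_hat)        # D is invariant to beta0
    return intercept, beta_hat
```

Because $D(\beta)$ is piecewise linear, a derivative-free method such as Nelder-Mead is a reasonable choice for this sketch.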

The rank-based estimation approach offers robustness to outliers and deviations from normality in the error distribution. By minimizing a linear function of the residuals with rank-based coefficients, the effects of outliers are incorporated in a linear rather than quadratic manner. This robustness makes the rank-based estimates more reliable compared to the least squares (LS) method.

It is important to note that the Jaeckel dispersion function [Citation16] is invariant to the intercept term $\beta_0$. Consequently, $\beta_0$ cannot be estimated simultaneously with the other elements of $\beta$. Hettmansperger and McKean [Citation32] proposed using the full-model residuals, $\hat\varepsilon_i = y_i - x_i\hat\beta$, $i = 1, 2, \ldots, n$, to estimate $\beta_0$. The estimator for $\beta_0$ is the median of these residuals, $\hat\beta_0 = \mathrm{median}\{\hat\varepsilon_i, i = 1, 2, \ldots, n\}$.

2.2. Inference

The parameters $\beta_2$ and $\beta_3$ in model (1) quantify the intervention effect. If there is no intervention effect, their values will be zero. Thus, the hypothesis of interest can be stated as $H_0: \beta_2 = \beta_3 = 0$ versus $H_a$: at least one of $\beta_2$ and $\beta_3$ is not zero.

Under the null hypothesis, model (1) reduces to a simplified form: (3)   $Y_t = \beta_0 + \beta_1 T_t + \varepsilon_t$,   $t = 1, 2, \ldots, n$.

Let $\hat\beta_F$ and $\hat\beta_R$ be the values of $\beta$ that minimize the Jaeckel dispersion function $D(\beta)$ in (2) under the full model (1) and the reduced model (3), respectively. Consequently, $D(\hat\beta_F)$ and $D(\hat\beta_R)$ represent the overall minima of the Jaeckel dispersion measure under the full model and the reduced model, respectively. We define $D^* = D(\hat\beta_R) - D(\hat\beta_F)$ as the drop, or reduction, in Jaeckel dispersion when fitting the full model (intervention effect) rather than the reduced model (no intervention effect).

The Jaeckel-Hettmansperger-McKean test statistic [Citation16,Citation32] is
$HM = \dfrac{D^*/2}{\hat\tau/2} \sim F(2, n-4)$,
where $\tau$ is the scale parameter, and for the Wilcoxon score, $\tau = \left[ \sqrt{12} \int f^2(t)\,dt \right]^{-1}$. We use the consistent estimator of $\tau$ proposed by Koul et al. [Citation30] to calculate the HM test. The estimation of $\tau$ can be based on a window- or kernel-type estimate of the probability density function $f(\cdot)$ of the errors [Citation7,Citation10].

The decision rule is to reject $H_0$ if $HM \geq F_{2, n-4, \alpha}$, where $F_{2, n-4, \alpha}$ is the upper $\alpha$ percentile of an F distribution with 2 and $n-4$ degrees of freedom. The estimate of $\beta$ is consistent and asymptotically normal [Citation35] for large sample sizes and no autocorrelation: $\hat\beta \sim N(\beta, \tau^2 (X^T X)^{-1})$.

We can approximate a $(1-\alpha)100\%$ confidence interval for $\beta_i$ as $\hat\beta_i \pm t_{\alpha/2,\, n-4}\, se(\hat\beta_i)$, $i = 2, 3$, where $se(\hat\beta_i) = \hat\tau \sqrt{(X^T X)^{-1}_{ii}}$, and $(X^T X)^{-1}_{ii}$ represents the $i$th diagonal element of $(X^T X)^{-1}$.

Similarly, under the least squares (LS) method with large sample sizes and no autocorrelation, the estimate of $\beta$ is also consistent and normally distributed: $\hat\beta \sim N(\beta, \sigma^2 (X^T X)^{-1})$. For the parameters $\beta_2$ and $\beta_3$, the $(1-\alpha)100\%$ confidence interval is $\hat\beta_i \pm t_{\alpha/2,\, n-4}\, se(\hat\beta_i)$, $i = 2, 3$, where $se(\hat\beta_i) = \hat\sigma \sqrt{(X^T X)^{-1}_{ii}}$ and $(X^T X)^{-1}_{ii}$ is the $i$th diagonal element of $(X^T X)^{-1}$.

Moreover, the test procedure for testing $H_0: \beta_2 = \beta_3 = 0$ versus $H_a$: at least one of $\beta_2$ and $\beta_3$ is not zero under the LS method involves the F-test statistic:
$F = \dfrac{[SSE(R) - SSE(F)] / [df_E(R) - df_E(F)]}{SSE(F) / df_E(F)} \sim F(2, n-4)$,
where $SSE(R)$ and $SSE(F)$ are the error sums of squares of the reduced and full models, and $df_E(R)$ and $df_E(F)$ are the corresponding error degrees of freedom.

We reject $H_0$ if $F \geq F_{2, n-4, \alpha}$; otherwise, we do not reject $H_0$.

The confidence interval and hypothesis testing procedures defined above are based on the normality assumption or large sample sizes, and no autocorrelation. However, when only small sample sizes are available, the approximation may not work well. We recommend using bootstrapping to construct confidence intervals and carry out hypothesis testing for the intervention effect (parameters $\beta_2$ and $\beta_3$) for both the rank-based and standard LS regression approaches when there are small samples and deviations from normality. If outliers and correlated errors exist as well, we recommend only the rank-based method with bootstrapping. A brief description of the general algorithm for the bootstrap confidence intervals for $\beta_2$ and $\beta_3$ and the bootstrap p-value for the linear hypothesis $H_0: \beta_2 = \beta_3 = 0$ is presented below. For more details about bootstrapping, see [Citation38–42].

To construct $(1-\alpha)100\%$ bootstrap confidence intervals for $\beta_2$ and $\beta_3$, the following steps are followed:

  1. Fit the full model $Y_t = \beta_0 + \beta_1 T_t + \beta_2 X_t + \beta_3 X_t (T_t - n_2) + \varepsilon_t$, $t = 1, 2, \ldots, n$, using the LS and RB methods. Then calculate the fitted response values $\hat{Y}_i$ and the estimated residuals $\hat\varepsilon_i$ from the full model.

  2. Obtain a bootstrap sample of size $n$ with replacement from the estimated residuals $\hat\varepsilon_i$. Denote the bootstrap sampled residuals as $\varepsilon_i^*$.

  3. Compute $Y_i^* = \hat{Y}_i + \varepsilon_i^*$ and fit the full model again to $Y_i^*$ using the LS and RB methods. Extract the estimates of $\beta_2$ and $\beta_3$ from the full-model results and label them $\hat\beta_{2i}^*$ and $\hat\beta_{3i}^*$.

  4. Repeat steps 2 and 3 $R$ times and construct a confidence interval from the associated percentiles of the bootstrap distributions of $\hat\beta_2^*$ and $\hat\beta_3^*$.
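The steps above can be sketched in a few lines of Python. For brevity this sketch uses only the LS fit; in practice the same resampling loop is also run with the rank-based fit. The function name `residual_bootstrap_ci` and the default settings are ours.

```python
import numpy as np

def residual_bootstrap_ci(X1, y, R=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CIs for the intervention effects beta2, beta3.

    X1 is the full design matrix including an intercept column; its last
    two columns are X_t and X_t*(T_t - n_2).  The LS fit is used here for
    concreteness; the rank-based fit slots into the same loop.
    """
    rng = np.random.default_rng(seed)
    beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)      # step 1: fit once
    fitted = X1 @ beta_hat
    resid = y - fitted
    boot = np.empty((R, 2))
    for r in range(R):
        e_star = rng.choice(resid, size=len(y), replace=True)       # step 2
        b_star, *_ = np.linalg.lstsq(X1, fitted + e_star, rcond=None)  # step 3
        boot[r] = b_star[-2:]                              # beta2*, beta3*
    lo = np.percentile(boot, 100 * alpha / 2, axis=0)      # step 4
    hi = np.percentile(boot, 100 * (1 - alpha / 2), axis=0)
    return np.column_stack([lo, hi])   # rows: beta2, beta3; cols: lower, upper
```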

Similarly, to calculate the bootstrap p-value for the linear hypothesis $H_0: \beta_2 = \beta_3 = 0$, the general algorithm is as follows:

  1. Fit the full model in (1) and the null model in (3) using the LS and RB methods. Then calculate the observed values of the test statistics HM and F. Denote the observed value generically as $F_0$.

  2. Take a bootstrap sample with replacement from the residuals obtained under the null model. Denote the bootstrap sampled residuals from the null model as $\varepsilon_i^*$, and define $Y_i^* = \hat{Y}_i + \varepsilon_i^*$, where $\hat{Y}_i$ is the fitted value from the null model. Fit the full and null models using $Y_i^*$ and calculate the Jaeckel-Hettmansperger-McKean test statistic and the F-statistic. Denote the resulting value as $F^*$. Because resampling is done from the null-model residuals, fitting and testing are performed under the assumption that $H_0$ is true.

  3. Repeat step 2 $R$ times. The bootstrap p-value is defined as $p = \dfrac{\sum I(F^* \geq F_0)}{R}$, where $I(\cdot)$ is the indicator function.
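A minimal Python sketch of this algorithm for the F statistic follows; the HM version replaces the LS fits and the extra-sum-of-squares statistic with the rank-based fit and the drop in dispersion. The function names `f_stat` and `bootstrap_pvalue` are ours.

```python
import numpy as np

def f_stat(X_full, X_null, y):
    """Extra-sum-of-squares F statistic for H0: beta2 = beta3 = 0."""
    def sse(X):
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(np.sum((y - X @ b) ** 2))
    q = X_full.shape[1] - X_null.shape[1]      # number of constraints (2)
    df_full = len(y) - X_full.shape[1]
    sse_f, sse_r = sse(X_full), sse(X_null)
    return ((sse_r - sse_f) / q) / (sse_f / df_full)

def bootstrap_pvalue(X_full, X_null, y, R=499, seed=0):
    """Resample residuals of the NULL fit so the bootstrap world obeys H0,
    then recompute the statistic on each resampled response."""
    rng = np.random.default_rng(seed)
    f0 = f_stat(X_full, X_null, y)             # step 1: observed statistic
    b_null, *_ = np.linalg.lstsq(X_null, y, rcond=None)
    fitted = X_null @ b_null
    resid = y - fitted
    exceed = 0
    for _ in range(R):                         # steps 2-3
        y_star = fitted + rng.choice(resid, size=len(y), replace=True)
        exceed += f_stat(X_full, X_null, y_star) >= f0
    return exceed / R
```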

3. Simulation study

This section begins with a simulation study designed to evaluate the performance of two methods, RB and LS regression with formula-based asymptotic SEs, for estimating intervention effects in time series data. This study examines their performance under different error distributions, with and without the presence of outliers, and different levels of autocorrelation.

To calculate the bias, mean squared error (MSE) and relative mean squared error (RMSE) of the estimates of the intervention effect parameters $\beta_2$ and $\beta_3$, the response variable $y_i$, $i = 1, 2, \ldots, n$, is generated from the model $Y_t = \beta_0 + \beta_1 T_t + \beta_2 X_t + \beta_3 X_t (T_t - n_2) + \varepsilon_t$ with the parameter vector $\beta = (\beta_0, \beta_1, \beta_2, \beta_3)^T = (1, 1, 0, 0)^T$ and the design matrix implied by model (1).

To mimic the autocorrelation commonly seen in time series data, the errors $\varepsilon_t$ follow the first-order autoregressive AR(1) process $\varepsilon_t = \rho \varepsilon_{t-1} + u_t$, where the $u_t$ are independent and identically distributed with mean zero and finite variance $\sigma^2$, and $\rho$ is the lag-1 autocorrelation of the errors. The variance-covariance matrix of the error vector $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^T$ is
$V(\varepsilon) = \sigma^2 \Omega = \dfrac{\sigma^2}{1-\rho^2} \begin{pmatrix} 1 & \rho & \cdots & \rho^{n-1} \\ \rho & 1 & \cdots & \rho^{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \cdots & 1 \end{pmatrix}.$
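A sketch of this error generator in Python (normal innovations shown; the t and Laplace cases slot in the same way, and the function name `ar1_errors` is ours):

```python
import numpy as np

def ar1_errors(n, rho, sigma=1.0, rng=None):
    """AR(1) errors e_t = rho * e_{t-1} + u_t with marginal variance sigma^2.

    The innovations are scaled by sqrt(1 - rho^2), as in the simulation
    design, so that the error variance is the same at every level of rho.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.normal(0.0, sigma, size=n) * np.sqrt(1.0 - rho ** 2)
    e = np.empty(n)
    e[0] = rng.normal(0.0, sigma)   # start from the stationary distribution
    for t in range(1, n):
        e[t] = rho * e[t - 1] + u[t]
    return e
```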

The white noise terms $u_t$ are generated from three different symmetric distributions: (i) a normal distribution with mean 0 and standard deviation 1, (ii) a t distribution with 5 degrees of freedom, and (iii) a double exponential (Laplace) distribution with location parameter 0 and scale parameter 1. The distributions are scaled by $\sqrt{1-\rho^2}$ to achieve the specified variance at each level of $\rho = (0, 0.2, 0.4, 0.6, 0.8)$.

To introduce outliers, approximately 13% of the sample observations are generated from the uniform distribution over $(0, 30)$, and the corresponding responses are replaced with these uniform draws. The RB and LS regression methods are then applied to the generated response variable $y_i$, $i = 1, 2, \ldots, n$, to estimate the parameter vector $\beta$ and its standard error. The analytical form of the variance of the estimates obtained through both LS and RB is $\sigma^2 (X^T X)^{-1} X^T \Omega X (X^T X)^{-1}$. The standard errors of the estimates of $\beta_2$ and $\beta_3$ obtained through both bootstrapping and the asymptotic formula under LS and RB are then compared to the true value obtained from the analytical formula under a given known scenario.
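The contamination step can be sketched as follows (Python; `add_outliers` is our name for this helper):

```python
import numpy as np

def add_outliers(y, frac=0.13, low=0.0, high=30.0, rng=None):
    """Replace roughly `frac` of the responses with Uniform(low, high)
    draws, mimicking the outlier contamination used in the simulations."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float).copy()      # leave the input untouched
    k = max(1, int(round(frac * len(y))))
    idx = rng.choice(len(y), size=k, replace=False)  # positions to contaminate
    y[idx] = rng.uniform(low, high, size=k)
    return y
```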

The bias of an estimate is defined as the difference between the expected value of the estimator and the true parameter value: $\mathrm{Bias} = \frac{1}{R} \sum_{i=1}^{R} \hat\beta_{ji} - \beta_j$, $j = 2, 3$. The MSE of an estimate is the average squared difference between the estimated values and the true parameter value: $\mathrm{MSE} = \frac{1}{R} \sum_{i=1}^{R} (\hat\beta_{ji} - \beta_j)^2$, $j = 2, 3$.

The relative mean squared error (RMSE) is the ratio of the mean squared error of the RB method to that of the LS method for the intervention effects $\beta_2$ and $\beta_3$. The gain in efficiency is defined as $(1 - \mathrm{RMSE}) \times 100\%$.
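These summaries are straightforward to compute from the simulation replicates; a small sketch (function names are ours):

```python
import numpy as np

def bias_mse(estimates, true_value):
    """Bias and MSE of one parameter over R simulation replicates."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_value
    mse = np.mean((est - true_value) ** 2)
    return bias, mse

def efficiency_gain(mse_rb, mse_ls):
    """RMSE = MSE_RB / MSE_LS; the gain of RB over LS as a percentage."""
    return (1.0 - mse_rb / mse_ls) * 100.0
```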

For this simulation study, two different sample sizes are considered – a relatively small sample ($n = 16$) and a larger sample ($n = 50$) – and the simulation is repeated 1000 times.

Tables 1 and 2 present the gain in efficiency for $\beta_2$ and $\beta_3$ of the RB method with bootstrapping with respect to the LS method under $H_0: \beta_2 = \beta_3 = 0$. Instead of formula-based SEs, bootstrap SEs are calculated under both the LS and RB methods. Comparisons are made for different error distributions, two sample sizes, different levels of autocorrelation, and with and without outliers in the data.

Table 1. Efficiency gain when no outliers exist in the data and errors follow AR(1).

Table 2. Efficiency gain when outliers exist in the data and errors follow AR (1).

In Table 1, the results indicate that when there are no outliers in the data and the normality assumption for the errors holds, the LS method demonstrates modest gains in efficiency relative to the RB method, but this relative efficiency for both $\beta_2$ and $\beta_3$ decreases as autocorrelation and sample size increase. On the other hand, when errors are non-normal, the gain in efficiency of the RB method relative to the LS method is positive for both $\beta_2$ and $\beta_3$. The gain in efficiency for $\beta_3$ decreases with autocorrelation but increases with sample size. For $\beta_2$, the gain in efficiency is negative when the autocorrelation is in the range 0.6–0.8.

Table 2 shows the most significant gains in efficiency for the RB method. It reveals that whenever outliers are present in the data, no matter how much autocorrelation exists, the RB method achieves remarkably high efficiency gains, ranging between 71% and 96% for $\beta_2$ and $\beta_3$. These substantial improvements are consistent across all error structures, with the efficiency gain under normality even higher than under the long-tailed error distributions. The gain decreases slightly with autocorrelation but increases as the sample size increases.

Bias and MSE of the estimates $\hat\beta_2$ and $\hat\beta_3$ under the null hypothesis $H_0: \beta_2 = \beta_3 = 0$ are presented in Tables 3 and 4. Table 3 indicates that in the absence of outliers, across different levels of autocorrelation and sample sizes, the MSE is relatively similar for the two methods, consistent with the efficiency comparisons in Table 1. It is slightly lower for the LS method under normality, while the RB method exhibits slightly lower MSE under non-normality and lower levels of autocorrelation (0 to 0.4). For higher levels of autocorrelation (0.6 to 0.8) and non-normality, the MSE is slightly lower for the LS method when estimating $\beta_2$; both methods produce almost the same MSE when estimating $\beta_3$ under these conditions. As the sample size increases, the MSE of both methods decreases for both $\beta_2$ and $\beta_3$, with a markedly larger decrease for $\beta_3$. The bias is quite comparable for both methods and remains close to 0 across all error structures.

Table 3. Bias of the estimate and MSE when no outliers exist, and errors follow AR (1).

Table 4. Bias of the estimate and MSE when outliers exist, and errors follow AR (1).

However, when outliers are present in the data, Table 4 shows that the RB method performs far better than the LS method in terms of MSE for both $\beta_2$ and $\beta_3$. The MSE is much lower for the RB method under all three error structures and all levels of autocorrelation. As expected, the MSE declines as the sample size increases. The bias is again near 0 for both methods under all three error structures, and the results are consistent with the efficiency comparisons in Table 2.

Tables 5 and 6 show the type I errors of the global HM and F tests, as well as the individual t-tests under both methods, for testing $H_0: \beta_2 = \beta_3 = 0$ and $H_0: \beta_i = 0$, $i = 2, 3$, respectively. The type I error measures the rate at which intervention effects are detected when, in fact, there are no such effects. In the absence of outliers, both methods have type I error rates almost equal to the nominal level $\alpha = 5\%$ when errors are independent. The type I error is slightly better for the LS method under normality, but slightly better for the RB method under non-normality, when there are no outliers in the data. Interestingly, when outliers were present and there was no autocorrelation, the type I error rates decreased for both methods under all three error structures. The decrease is more evident for the RB method, with rates as low as about 3%. When autocorrelation is present, with or without outliers, the type I error inflates in both methods and becomes more pronounced as the autocorrelation increases.

Table 5. Type I error of the asymptotic HM and F tests for $H_0: \beta_2 = \beta_3 = 0$ when errors follow AR(1).

Table 6. Type I error of the individual t-tests for $H_0: \beta_2 = 0$ and $H_0: \beta_3 = 0$ under AR(1).

When there is no autocorrelation, with or without outliers, the empirical type I error rates are equal to the nominal level of 0.05 for both the HM and F statistics for the general hypothesis $H_0: \beta_2 = \beta_3 = 0$. When outliers were present and autocorrelation increased from 0 to 0.2, the empirical type I error remained around the nominal level of 0.05 for the F test under the LS method. For higher levels of autocorrelation (0.4 to 0.8) with outliers in the data, across all error structures, type I error rates for the F test increased only slightly, to 0.09. In contrast, the performance of the HM test under the RB method is not satisfactory when autocorrelation increases and outliers are present. Type I error rates for the HM test reached 0.53 when the autocorrelation was 0.8 and the sample size was 50. Even with a small sample size ($n = 16$), the type I error rate of the HM test at an autocorrelation of 0.8 was 0.52. This shows that the asymptotic approximation of the HM test does not perform well compared to the F test under LS in these settings. Specifically, since the HM test involves the scale parameter $\tau$, which is estimated from the error distribution, the estimate of $\tau$ may be affected if the errors are not independent. Our study found the performance of the HM test to be unsatisfactory in the presence of autocorrelated errors. Similar results hold for the individual t-tests of $H_0: \beta_i = 0$, $i = 2, 3$, as can be seen in Table 6.

When no outliers are present but autocorrelated errors exist, neither the HM test nor the F test performs well. Both methods have similar, extremely inflated type I error rates at all levels of autocorrelation, reaching about 0.68 when the autocorrelation is 0.8.

Additional simulations were conducted to compare SEs calculated through bootstrapping and through the asymptotic formula, under both the RB and LS methods, to the analytical values of the SEs of $\hat\beta_2$ and $\hat\beta_3$ when errors are AR(1). The SEs of the estimates of $\beta_2$ and $\beta_3$ obtained through bootstrapping, the asymptotic formula, and the analytical formula under different autocorrelation levels, with and without outliers, and different sample sizes are shown in Tables 7 and 8. When errors are not autocorrelated ($\rho = 0$), there are no outliers in the data, and under all three error distributions, the formula-based asymptotic SEs and the bootstrap SEs for the estimates of $\beta_2$ and $\beta_3$ are approximately the same for both the LS and RB regression methods, and they equal the analytical values of the SEs. It is also important to note that if the errors are normally distributed, the bootstrap SEs for RB regression are almost as good as the bootstrap SEs for LS regression, even under these ideal conditions.

Table 7. Comparison of SEs estimated through the asymptotic and bootstrap approaches when no outliers exist.

Table 8. Comparison of SEs estimated through the asymptotic and bootstrap approaches when outliers exist.

However, as the autocorrelation increases, the formula-based asymptotic SEs, which are not adjusted for autocorrelation under either approach, decrease. In contrast, the bootstrap SEs increase as autocorrelation increases. This indicates that the formula-based asymptotic SEs for both LS and RB are downwardly biased, as shown in Table 7. Bootstrap estimates of the standard errors should be used if there are any signs of autocorrelation in the data.

When there are outliers in the data, the formula-based asymptotic SEs under the LS method are almost twice the asymptotic SEs under the RB method. As before, the asymptotic SEs under both methods remain the same as autocorrelation increases, which again produces downwardly biased SEs. However, the bootstrap SEs for $\beta_2$ and $\beta_3$ in the RB method again increase as autocorrelation increases, and they are smaller than the bootstrap SEs obtained through the LS method (Table 8). The bootstrap SEs under the RB method are almost equal to the true values of the SEs, which indicates that the bootstrap SEs capture the autocorrelation in the data and that the SE estimates through RB are more reliable. The bootstrap SEs through the LS method, by contrast, are not close to the true values, indicating that the SE estimates obtained through the LS method are less reliable.

The asymptotic SEs from LS regression are the best under normality with no outliers or autocorrelation. However, the bootstrapping methods are almost as good under these conditions and the difference might be considered negligible. Computing formula-based asymptotic SEs using either LS or RB regression is not appropriate in the presence of autocorrelation, and some type of bootstrapping is recommended. If no outliers are present, then bootstrap SEs from LS and RB regression will be very similar. If outliers are present, then much better results are obtained using RB with bootstrapping.

For completeness, power comparisons were also made between the LS and RB methods using asymptotic SEs. The power of a test is the probability of rejecting the null hypothesis when it is false. The power of the HM and F tests, described in Section 2, is estimated through Monte Carlo simulation when there is no autocorrelation in the error structure. The power simulations consider two error distributions: a standard normal distribution without outliers, and a non-normal distribution (t with 5 degrees of freedom) with outliers. Approximately 13% of the sample observations are generated from the uniform distribution over $(0, 30)$ and added to the response variable as outliers. The response variable $Y_t$ is generated using model (1) with different effect sizes for $\beta_2$ and $\beta_3$: the null configuration of regression coefficients $(\beta_0, \beta_1, \beta_2, \beta_3) = (1, 1, 0, 0)$ and effect sizes $(0, 0.2, 0.4, \ldots, 1.4)$ for $\beta_2$ and $\beta_3$ are considered.

Figure 1 shows the power comparisons in two situations: when errors follow a normal distribution without outliers (left panel) and when errors deviate from normality and outliers are present (right panel). Under the ideal conditions, the LS method outperforms the RB method as expected, but the difference is very small. However, when the error distribution deviates from normality and outliers are present, the RB method significantly outperforms the LS method. The LS method has very little power, even at very large effect sizes, when outliers are present.

Figure 1. Power with normally distributed errors and no outliers (left panel) and non-normality for errors and outliers existing (right panel).


If there is no autocorrelation, the simulation study shows that the LS method outperforms the RB method only under a normal error structure with no outliers in the data. If correlated errors, non-normal errors and/or outliers exist in the data, then the RB method outperforms the LS method, and the difference is drastic when outliers are present. If autocorrelated errors are present and the sample size is small, a bootstrap hypothesis-testing technique is appropriate, because the type I errors associated with the HM, F or individual t-tests of the intervention effects can become drastically inflated for both methods in the presence of autocorrelation.

4. Application to illegal organ transplants data

In this section, we use a dataset collected and analyzed by Islam et al. [Citation14,Citation43] that consists of incidences of illegal organ transplants from 11 countries over the period 2002–2015. This dataset includes organ transplantation-related crimes reported on the Internet from 11 countries, which were randomly selected from among the original participants of the Istanbul Declaration. The selected countries are Brazil, Colombia, Egypt, India, Iran, Mexico, Nigeria, Pakistan, the Philippines, Thailand and Turkey. Additionally, the dataset contains each country's population size, obtained from the World Bank database [Citation44]. To adjust for population size, the number of illegal organ transplants is normalized by the population size of each country.

The 2008 Istanbul Declaration is the consensus of the participants to develop a legal and professional framework that governs organ donation and transplantation activities [Citation45,Citation46]. It also establishes a transparent regulatory oversight system that ensures donor and recipient safety, enforces standards and prohibits unethical practices [Citation45,Citation46]. The signatory countries are encouraged to take actions to protect the poorest and most vulnerable groups from transplant tourism and the sale of tissues and organs. Additionally, the declaration seeks to reduce the prevalence of international trafficking in human tissues and organs [Citation45–48]. The signatory countries are obligated to introduce programs that control and reduce unethical practices related to organ trafficking within the Istanbul Declaration framework [Citation49].

Despite these efforts, numerous reports indicate that organ trafficking and transplant tourism continue to increase [Citation43]. Therefore, it is of interest to determine whether the Declaration has had a significant impact. In this study, we estimate the intervention effects of the 2008 Istanbul Declaration on illegal organ transplants using both the RB regression and the LS regression approaches, as described in Section 2. The estimated intervention effects are then used to test H0:β2=β3=0 (i.e. the Declaration had no impact on organ trafficking) versus Ha: at least one of β2 and β3 is not zero (i.e. the Declaration had a significant impact on organ trafficking).
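As a concrete illustration of the segmented model underlying this hypothesis, the sketch below builds a standard ITS design matrix with a level-change term (β2) and a slope-change term (β3) and fits it by least squares. The parameterization is the common segmented-regression form and is an assumption here; the exact coding in Section 2 may differ in details.

```python
import numpy as np

def its_design(n_pre, n_post):
    """Design matrix for a segmented (interrupted) time-series model
    y_t = b0 + b1*t + b2*D_t + b3*(t - t0)*D_t + e_t,
    where D_t = 1 after the intervention at time t0 = n_pre."""
    n = n_pre + n_post
    t = np.arange(1, n + 1)
    D = (t > n_pre).astype(float)
    return np.column_stack([np.ones(n), t, D, (t - n_pre) * D])

# Noise-free check: least squares recovers the generating coefficients,
# including the level change b2 and slope change b3, exactly.
X = its_design(n_pre=7, n_post=7)
beta_true = np.array([1.0, 0.30, 0.25, -0.15])   # b0, b1, b2, b3
y = X @ beta_true
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Testing H0: β2 = β3 = 0 then amounts to comparing this fit against the reduced model containing only the intercept and trend columns.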

Before estimating the effects of the Istanbul Declaration and conducting hypothesis testing, it is necessary to check the linearity condition of the RB and LS regression techniques described in Section 2. To do so, we present the time series plots for the number of organ transplants reported over the period 2002–2015 for each country in Figure 2. The data reveal a clear upward trend in organ trafficking over the period 2002 to 2015. In some countries the increase is exponential over time, resulting in a non-linear trend, while in other countries the upward trend is linear. To achieve linearity, we took the logarithm of the rate of incidences per million population. Figure 3 displays the nearly linear trend after the transformation of the data. In addition, the Durbin-Watson test at the 5% level of significance was used to check for serial correlation in each country’s data. The test results indicated no evidence of serial correlation in the data for any country. Based on results from the simulation (Table ), this would imply that inflated type I errors should not be a major concern for this data set. We estimated β2 and β3 for the organ transplant data using both LS and RB methods and computed 95% confidence intervals using procedures based on normality, as well as bootstrap procedures as described in Section 2.
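The Durbin-Watson check mentioned above is straightforward to reproduce. A minimal sketch on synthetic data (not the organ transplant series) follows; values near 2 are consistent with no first-order serial correlation, which is what the test indicated for every country here.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic: values near 2 suggest no first-order
    serial correlation; values near 0 (or 4) suggest positive
    (or negative) autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(1)
white = rng.normal(size=500)        # independent errors
dw_white = durbin_watson(white)     # close to 2

ar1 = np.empty(500)                 # strongly autocorrelated AR(1) errors
ar1[0] = rng.normal()
for i in range(1, 500):
    ar1[i] = 0.8 * ar1[i - 1] + rng.normal()
dw_ar1 = durbin_watson(ar1)         # well below 2
```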

Figure 2. Time series plot for the rate of organ trafficking per million population.


Figure 3. Time series plot for the log-rate of organ trafficking per million population.


Tables 9 and 10 present LS estimates, standard errors and 95% confidence intervals for the impact of the Istanbul Declaration on illegal organ transplants. Specifically, Table 9 shows the results computed using the standard LS techniques and Table 10 shows the LS results using the bootstrap procedures described in Section 2. The small negative values of β3 in Tables 9 and 10 show that while the number of illegal transplants has continued to increase over time, the Declaration has at least been successful in slowing the rate of increase in illegal transplants in most countries. Except for Colombia, Nigeria and Turkey, almost all countries experienced a statistically significant reduction in the rate of increase in organ transplants after the Declaration. Most of the values for β2 are positive, but a direct interpretation is difficult; it is best to think of β2 as part of the adjustment to the intercept for the post-intervention model. For Turkey, however, the coefficient β3 was almost 0 while β2 was positive, indicating a modest flat increase in organ trafficking post-intervention in addition to the continuing steady trend.

Table 9. Results of standard LS regression analysis.

Table 10. Bootstrap results of standard LS regression analysis.

The estimates obtained through the asymptotic and bootstrap methods under LS are consistent with each other, but the bootstrap method produced slightly narrower confidence intervals. Notably, the impact of the intervention on the change in trend (β3) in India is significant according to the bootstrap method, in contrast to the formula-based LS result.

Similarly, Tables 11 and 12 present the findings from RB regression: Table 11 presents the results using the asymptotic normality procedures and Table 12 presents the results using the bootstrap procedures. The estimate of the differential change in the slope of the linear model before and after the declaration, β3, suggests that all countries except Turkey, Colombia and Nigeria experienced a reduction in the rate of increase in illegal organ transplants following the 2008 Istanbul Declaration. Iran, Mexico, Pakistan, Philippines and Thailand were the only countries with a statistically significant decline in the rate of increase in illegal organ transplants. Consistent with LS regression, all countries had positive estimates of β2. Table 12 shows that RB regression using bootstrap methods produced shorter confidence intervals than RB regression based on asymptotic normality, which resulted in one more country, India, being identified as having a statistically significant reduction in the rate of increase in illegal organ transplants. The biggest difference between Tables 11 and 12 concerns β2: all estimated coefficients are still positive and similar in magnitude, but the smaller standard errors from bootstrapping resulted in nine statistically significant results instead of only five.

Table 11. Results of rank-based regression analysis.

Table 12. Bootstrap results of rank-based regression analysis.

To assist with the interpretation of the estimated parameters in the context of organ trafficking, the data for Thailand and Turkey are discussed as examples. Figure 4 shows the fitted model for Thailand both pre- and post-intervention. The trend is clearly still increasing, but at a lower rate post-intervention, which coincides with the estimate β3 = −0.15. The estimate β2 = 0.25 should be interpreted as part of the overall adjustment to the intercept for the post-intervention model: β2 − 7β3 = 0.25 − 7(−0.15) = 1.30. While β2 will normally not be interpreted directly, there is an exception when β3 = 0: in that case β2 represents a flat and consistent increase in addition to the original and continuing trend. This is the case in Figure 4 for Turkey, where the estimates are β2 = 0.37 and β3 = 0.
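The intercept adjustment for Thailand can be checked with one line of arithmetic. This assumes the adjustment formula β2 − 7β3, with the intervention seven time points into the series and β3 negative, as the slowing trend implies:

```python
# Post-intervention intercept adjustment, using the Thailand estimates
# quoted in the text (b2 = 0.25, b3 = -0.15) and an intervention seven
# time points into the series.  Both the formula and t0 = 7 are
# assumptions inferred from the surrounding discussion.
b2, b3, t0 = 0.25, -0.15, 7
adjustment = b2 - t0 * b3   # 0.25 + 1.05 = 1.30
```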

Figure 4. Thailand and Turkey fitted model.


Table 13 shows the p-values obtained through the HM, F and bootstrap tests for the general linear hypothesis H0:β2=β3=0. The results show that under RB regression, Colombia, India and Turkey experienced a non-significant change in illegal organ transplants, while under the standard LS regression method Brazil and Turkey were the non-significant countries.

Table 13. Comparisons of HM and F-test for testing H0:β2=β3=0.

LS and RB regression procedures, as outlined in Section 2, yielded mixed results with the organ trafficking data (Tables 9 and 11): in some cases LS regression produced smaller confidence intervals, while in others RB regression did. Based on the simulation results of Section 3, this suggests that the error distributions were approximately normal, or close to it, and that there were no extreme outliers. Interestingly, the best results were obtained using RB regression with bootstrapping, which produced the smallest confidence intervals for β2 and β3 and the most statistically significant results.

5. Discussions

Over the past two decades, interrupted time-series (ITS) designs have emerged as a significant approach in healthcare research and social sciences [Citation5,Citation7,Citation8], though there is considerable variation in the statistical analysis of ITS designs. For large-sample ITS studies, common methods include ordinary least squares with standard error adjustments, generalized least squares and ARIMA models. These models account for autocorrelation and provide better inferences through accurate calculation of standard errors and reliable hypothesis tests. Large-sample studies are also relatively robust to violations of normality, and valid results can be obtained through the normal approximation. For small-sample ITS studies, however, normal approximations are not appropriate when errors are non-normal or contain outliers. Confidence intervals and hypothesis tests become inefficient and biased under existing standard regression-based methods, even with autocorrelation corrections, indicating that alternative methods are required. To address issues with small samples, particularly outliers and non-normality in the error distribution, this article introduces rank-based (RB) regression with bootstrapping to estimate intervention effects and make inferences in ITS analysis.
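To make the rank-based idea concrete, the sketch below minimizes Jaeckel's dispersion function with Wilcoxon scores on a toy contaminated dataset. This is a bare-bones illustration only; the toy data and the crude grid-search optimizer are assumptions, and a full implementation is provided by the Rfit package cited in the references.

```python
import numpy as np

def jaeckel_dispersion(beta, X, y):
    """Jaeckel's dispersion: Wilcoxon-scored sum of ranked residuals.
    The scores a(i) = sqrt(12) * (i/(n+1) - 1/2) sum to zero, so an
    intercept shifts all residuals without changing the dispersion;
    the intercept is estimated afterwards from the residuals."""
    e = y - X @ beta
    n = len(e)
    ranks = np.argsort(np.argsort(e)) + 1            # ranks 1..n
    scores = np.sqrt(12) * (ranks / (n + 1) - 0.5)
    return np.sum(scores * e)

# Toy data: straight line with one gross outlier at the last point.
rng = np.random.default_rng(2)
x = np.arange(20.0)
y = 2.0 + 0.5 * x + rng.normal(scale=0.2, size=20)
y[-1] += 25.0

X = x[:, None]                                       # slope-only design
slopes = np.linspace(-1.0, 2.0, 3001)                # crude grid search
disp = [jaeckel_dispersion(np.array([s]), X, y) for s in slopes]
slope_rb = slopes[int(np.argmin(disp))]
intercept_rb = np.median(y - slope_rb * x)           # robust location of residuals
slope_ls = np.polyfit(x, y, 1)[0]                    # LS slope, pulled up by the outlier
```

The rank-based slope stays close to the true value of 0.5 despite the outlier, while the LS slope is pulled noticeably upward, which is the behavior the simulations report.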

Simulations compared LS regression with RB regression using relative efficiency and MSE. The performance of both methods depended on the error distribution and outliers. If no outliers are present, LS regression is slightly better if errors are normally distributed, while RB regression performs slightly better when the errors are non-normal. However, in the presence of outliers, RB regression significantly outperforms LS regression, regardless of the error distribution. Both methods produce low bias for trend and intervention effects under all error structures, and with and without outliers. Without outliers, LS is more efficient than RB, but with outliers, RB achieves substantially higher efficiency gains. These gains range from 71% to 96% across all error distributions, including normality as well as in the presence of autocorrelation.

Type I error rates were also examined for formula-based LS and RB regression. When no outliers are present, both methods have similar type I error rates near the nominal level of 0.05 across all error structures. When outliers are present in the data, the type I error rates fall below the nominal rate for both methods under all three error structures. However, the RB tests for the intervention effect are more conservative than the LS tests when outliers exist, regardless of the underlying error distribution. This suggests the LS tests are more likely to incorrectly reject the null hypothesis when outliers are present, while the RB method provides more rigorous control of false positives.

Note that the standard formula-based methods of LS and RB regression do not account for autocorrelation in the error structure. If autocorrelation is present, standard errors can be severely underestimated, leading to inflated type I errors. To that end, additional simulations included bootstrap estimation of the standard errors for both LS and RB regression under autocorrelation levels of 0, 0.2, 0.4, 0.6 and 0.8. As expected, the standard errors computed with the formula-based LS and RB regression models stayed relatively constant even as autocorrelation increased. The bootstrap standard errors for both LS and RB, however, correctly accounted for the autocorrelation present and increased as the autocorrelation increased. With no outliers present, LS bootstrapping and RB bootstrapping produced similar results: LS bootstrapping was slightly better under normality or at autocorrelation levels of 0.6 and 0.8, while RB bootstrapping was better under non-normality and at all levels of autocorrelation. When outliers were present, RB bootstrapping had much lower standard errors than LS bootstrapping, regardless of the error structure and the level of autocorrelation.
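One way bootstrap standard errors can respond to autocorrelation is a moving-block resampling of residuals, sketched below. This is an illustrative scheme under stated assumptions, not necessarily the exact bootstrap used in the study; block length, replicate count and the toy AR(1) data are all choices made here.

```python
import numpy as np

def block_bootstrap_se(X, y, block_len=4, n_boot=500, seed=0):
    """Moving-block bootstrap of regression residuals: resampled blocks
    preserve short-range autocorrelation, so the resulting standard
    errors grow with the autocorrelation, unlike formula-based ones."""
    rng = np.random.default_rng(seed)
    n = len(y)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    starts = np.arange(n - block_len + 1)
    n_blocks = int(np.ceil(n / block_len))
    boot_betas = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        picks = rng.choice(starts, size=n_blocks)
        e_star = np.concatenate([resid[s:s + block_len] for s in picks])[:n]
        y_star = X @ beta_hat + e_star
        boot_betas[b], *_ = np.linalg.lstsq(X, y_star, rcond=None)
    return boot_betas.std(axis=0, ddof=1)

# Toy series with AR(1) errors (rho = 0.6) around a linear trend.
rng = np.random.default_rng(4)
t = np.arange(1, 61, dtype=float)
X = np.column_stack([np.ones_like(t), t])
e = np.empty(60)
e[0] = rng.normal()
for i in range(1, 60):
    e[i] = 0.6 * e[i - 1] + rng.normal()
y = 1.0 + 0.1 * t + e
ses = block_bootstrap_se(X, y)   # one standard error per coefficient
```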

The simulation results suggest the RB method with bootstrapping should be recommended for ITS data analysis on small samples, unless the error structure appears normally distributed with no outliers. The RB method provides more efficient estimation of intervention effects and inferences compared to the least squares (LS) method under non-normality, and especially if any outliers exist. In addition, the presence of autocorrelation is often underestimated in ITS studies, in particular for small sample sizes [Citation8]. However, this is not a concern when using RB with bootstrapping because it will automatically adjust for any autocorrelation when calculating standard errors. The RB method with bootstrapping provides more robust and accurate estimates of intervention effects and statistical tests when assumptions for standard LS regression are violated, which is common with small ITS sample sizes.

The findings of the simulation are supported by analysis of a real dataset evaluating the effectiveness of the Istanbul Declaration on illegal organ transplants. The organ trafficking data was analyzed using the formula-based methods of LS and RB regression, as well as both LS and RB methods with bootstrapping. Results indicate that while organ trafficking continues to rise following the Declaration, the rate of increase has decreased in most countries.

Residual analysis, presented in supplemental figures, showed potential outliers in the data for Brazil, Iran, Nigeria, Pakistan, Philippines and Thailand. The remaining countries had no outliers, and near or approximately normal residuals. However, regardless of the error structure in the organ trafficking data, RB with bootstrapping gave the best results for all eleven countries. It produced the smallest confidence intervals and the most significant results for the general linear hypothesis.

The simulation showed that under perfect normality and no outliers, standard formula-based LS regression is slightly better than RB with bootstrapping, but not by much: even if a researcher used RB with bootstrapping under ideal conditions, the difference from the LS results would be negligible (Table ). How often do errors follow a normal distribution exactly? The organ trafficking example showed several countries with residual plots that appeared ‘approximately normal’, which would normally suggest using standard LS regression. Even in these cases, however, RB with bootstrapping performed better than standard LS regression on real data. It always had the lowest standard errors and narrowest confidence intervals, and while this ultimately made little difference in terms of statistical significance, it did yield two more statistically significant results. The real data demonstrate the robustness of the RB method and its superiority to LS regression for analyzing interrupted time series data when sample sizes are small and any deviation from normality or outliers are present in the error structure.

Though the formula-based procedure for RB regression did not perform well on the real data in this study, simulations showed that it can outperform standard LS regression in the presence of outliers, or non-normality in the error distribution. However, for the real data in this study, the outliers were probably not extreme enough and the departures from normality not severe enough to show its effectiveness. Even so, if extreme outliers are present and departures from normality are severe, this study suggests through simulation that RB with bootstrapping will perform just as well or better than formula-based RB regression when sample sizes are small.

There are limitations to the applicability of our findings for RB with bootstrapping. The simulations assumed equal error variances, no time-varying variables and linearity. A transformation to achieve linearity was conducted on real data in this study, but including time-varying covariates and investigating procedures for unequal error variances could be valuable.

Another issue in ITS analysis is missing data. Dropping observations with missing observations can significantly impact estimation and inference with small samples. A potential solution is to impute missing pre-intervention data using prior observations, and impute missing post-intervention data using subsequent observations, before fitting the intervention model. This study also does not account for external factors influencing the outcome variable. However, ITS analysis uses full longitudinal data to model pre-intervention trends, inherently accounting for external effects [Citation50]. If time-varying impacts like seasonality or population changes are known, they can be controlled by including them as covariates.

The analysis of real ITS data using the RB method with bootstrapping relies on a computational implementation of the bootstrapping procedure. In this study, percentile intervals were constructed for the trend and intervention effect parameters, assuming symmetry in the distribution of the estimates, an assumption supported by the data. However, the distribution of the intervention effect may in general be asymmetric, and the use of percentile intervals under an assumption of symmetry is a limitation: it may produce inaccurate confidence intervals if the true distribution is skewed. Further research on the shape of the sampling distribution of the intervention effect obtained through RB with bootstrapping could inform whether percentile intervals are appropriate, or whether bias-corrected and accelerated (BCa) intervals should be preferred to account for potential asymmetry.
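The percentile interval described here is simple to state in code. A minimal sketch on mock bootstrap replicates follows; the numbers are illustrative, not the study's:

```python
import numpy as np

def percentile_ci(boot_estimates, level=0.95):
    """Percentile bootstrap confidence interval: the empirical
    alpha/2 and 1 - alpha/2 quantiles of the bootstrap replicates.
    Accurate when the bootstrap distribution is roughly symmetric;
    for skewed distributions BCa intervals are usually preferred."""
    alpha = 1.0 - level
    lo, hi = np.quantile(boot_estimates, [alpha / 2, 1 - alpha / 2])
    return lo, hi

rng = np.random.default_rng(3)
boot = rng.normal(loc=0.25, scale=0.05, size=2000)  # mock replicates of b2
lo, hi = percentile_ci(boot)
```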

Further investigation of the performance of RB with bootstrapping under changing error variances, missing data and additional covariates would provide valuable insight into its robustness for real-world ITS analyses. Such extensions will be important in future work to establish guidelines for properly applying this method. While this simulation study has limitations in replicating more diverse ITS scenarios, the current findings demonstrate the potential of RB regression with bootstrapping to improve the validity of ITS studies with small samples, outliers, non-normal errors and even hard-to-detect autocorrelation. Additional research on its properties under various assumptions will help translate these benefits into practice.

5.1. Practical implications of the study

Some practical implications of this study for practitioners analyzing interrupted time series data include:

  1. Exploratory data analysis is crucial. Before fitting a model, practitioners should check for outliers, non-normal errors, autocorrelation and other assumption violations using plots and statistical tests. This will inform the choice of analysis methods.

  2. If outliers or non-normal errors are present, which is common in real-world data, rank-based regression with bootstrapping is recommended over standard linear regression. It provides more unbiased, efficient and powerful analysis as shown in the simulations. In fact, this study showed using real data that if even modest departures from normality exist, RB with bootstrapping will outperform LS regression.

  3. Bootstrapping should be used for inference when sample sizes are small, which is typical in ITS studies. Normal approximations can be unreliable, as shown in the simulations, and bootstrap confidence intervals and hypothesis tests outperformed asymptotic methods. In addition, RB with bootstrapping automatically adjusts for any autocorrelation present in the error structure. Formula-based methods such as Newey-West and Prais-Winsten only account for an AR(1) autocorrelation structure, and there is no guarantee that the true autocorrelation structure is AR(1). Even if it is, the value of ρ must be estimated as well, and simulation studies have shown that ρ is often underestimated or even undetected with small samples [Citation8]. Bootstrapping eliminates these difficulties.

  4. The rank-based method with bootstrapping is easy to implement in available software like R. This makes adopting these robust techniques accessible even without statistical expertise.

  5. This study provides guidance to practitioners on checking assumptions and choosing appropriate statistical models for reliable ITS analysis in fields like public health, medicine, policy evaluation and environmental studies.

6. Conclusion

This study demonstrates that rank-based regression with bootstrapping is advantageous for interrupted time series analysis when standard regression assumptions are violated and sample sizes are small. The simulations and real-data application show that this method provides more precise effect estimates, rigorous statistical inference and higher power, especially with non-normal errors and outliers. As small samples and assumption violations are common in ITS studies evaluating policy impacts over time, rank-based modelling with bootstrapping has the potential to promote more valid analyses in fields like public health and policy research. The choice of regression method should be guided by exploratory data analysis: by checking assumptions and applying appropriate methods, practitioners can improve the reliability of quantitative evaluations assessing the effectiveness of interventions through interrupted time series data.

Supplemental material

Supplemental Material

Download Zip (479.3 KB)

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • Wang J, Walter S, Grzebieta R, et al. A comparison of statistical methods in interrupted time series analysis to estimate an intervention effect; 2013.
  • Wagner AK, Soumerai SB, Zhang F, et al. Segmented regression analysis of interrupted time series studies in medication use research. J Clin Pharm Ther. 2002;27(4):299–309.
  • Wang JJ, Scott W, Raphael G, et al. A comparison of statistical methods in interrupted time series analysis to estimate an intervention effect. In: Australasian Road Safety Research, Policing, and Education Conference; 2013.
  • Linden A. Conducting interrupted time-series analysis for single-and multiple-group comparisons. Stata J. 2015;15(2):480–500.
  • Zhang S, McKean JW, Huitema BE. Least squares and robust rank-based double bootstrap analyses for time-series intervention designs. Eval Health Prof. 2022;45(4):362–376.
  • Box GE, Tiao GC. Intervention analysis with applications to economic and environmental problems. J Am Stat Assoc. 1975;70(349):70–79.
  • Turner SL, Karahalios A, Forbes AB, et al. Comparison of six statistical methods for interrupted time series studies: empirical evaluation of 190 published series. BMC Med Res Methodol. 2021;21(1):1–19.
  • Turner SL, Forbes AB, Karahalios A, et al. Evaluation of statistical methods used in the analysis of interrupted time series studies: a simulation study. BMC Med Res Methodol. 2021;21(1):1–18.
  • Gebski V, Ellingson K, Edwards J, et al. Modelling interrupted time series to evaluate prevention and control of infection in healthcare. Epidemiol Infect. 2012;140(12):2131–2141.
  • Huitema BE, McKean JW. Identifying autocorrelation generated by various error processes in interrupted time-series regression designs: a comparison of AR1 and portmanteau tests. Educ Psychol Meas. 2007;67(3):447–459.
  • Huitema BE, Mckean JW, Mcknight S. Autocorrelation effects on least-squares intervention analysis of short time series. Educ Psychol Meas. 1999;59(5):767–786.
  • Bence JR. Analysis of short time series: correcting for autocorrelation. Ecology. 1995;76(2):628–639.
  • Turner SL, Karahalios A, Forbes AB, et al. Design characteristics and statistical methods used in interrupted time series studies evaluating public health interventions: a review. J Clin Epidemiol. 2020;122:1–11.
  • Islam MM, Heiny EL. A robust statistical method to estimate the intervention effect with longitudinal data. Stat Optim Inform Comput. 2020;8(1):318–327.
  • Nistal-Nuño B. Segmented regression analysis of interrupted time series data to assess outcomes of a south American road traffic alcohol policy change. Public Health. 2017;150:51–59.
  • Jaeckel LA. Estimating regression coefficients by minimizing the dispersion of the residuals. Ann Math Stat. 1972;43(5):1449–1458.
  • Berk R. A primer on robust regression; 1990.
  • De Menezes DQF, Prata DM, Secchi AR, et al. A review on robust M-estimators for regression analysis. Comput Chem Eng. 2021;147:107254.
  • Stuart C. Robust regression. Durham, UK: Department of Mathematical Sciences, Durham University; 2011. p. 169.
  • Alma ÖG. Comparison of robust regression methods in linear regression. Int J Contemp Math Sci. 2011;6(9):409–421.
  • Yohai VJ. High breakdown-point and high efficiency robust estimates for regression. Ann Stat. 1987;15(2):642–656.
  • Rousseeuw PJ. Least median of squares regression. J Am Stat Assoc. 1984;79(388):871–880.
  • Huber PJ. Robust regression: asymptotics, conjectures and Monte Carlo. Ann Stat. 1973;1(5):799–821.
  • Kloke JD, McKean JW. Rfit: rank-based estimation for linear models. R J. 2012;4(2):57.
  • Jaeckel LA. Estimating regression coefficients by minimizing the dispersion of the residuals. Ann Math Stat. 1972;43(5):1449–1458.
  • Kloke JD, McKean JW, Rashid MM. Rank-based estimation and associated inferences for linear models with cluster correlated errors. J Am Stat Assoc. 2009;104(485):384–390.
  • Pollard D. Asymptotics for least absolute deviation regression estimators. Econ Theory. 1991;7(2):186–199.
  • Narula SC, Wellington JF. Algorithm AS 108: multiple linear regression with minimum sum of absolute errors. Appl Stat. 1977;26(1):106–111.
  • Hollander M, Wolfe DA, Chicken E. Nonparametric statistical methods. New York: John Wiley & Sons; 2013.
  • Koul HL, Sievers GL, McKean J. An estimator of the scale parameter for the rank analysis of linear models under general score functions. Scand J Stat. 1987;14(2):131–141.
  • Chen T, Tang W, Lu Y, et al. Rank regression: an alternative regression approach for data with outliers. Shanghai Arch Psychiatry. 2014;26(5):310.
  • Hettmansperger TP, McKean JW. A robust alternative based on ranks to least squares in analyzing linear models. Technometrics. 1977;19(3):275–284.
  • Schweder T. Window estimation of the asymptotic variance of rank estimators of location. Scand J Stat. 1975;2(3):113–126.
  • Schuster EP. On the rate of convergence of an estimate of a functional of a probability density. Scand Actuar J. 1974;1974(2):103–107.
  • Hettmansperger TP, McKean JW. Robust nonparametric statistical methods. Boca Raton: CRC Press; 2010.
  • Kloke JD, McKean JW. Rfit: rank-based estimation for linear models. R J. 2012;4(2):57.
  • Wang JJ, Scott W, Raphael G, et al. A comparison of statistical methods in interrupted time series analysis to estimate an intervention effect. In: Australasian road safety research, policing and education conference, August. Australasian College of Road Safety; 2013.
  • Efron B. The jackknife, the bootstrap, and other resampling plans. Philadelphia: Society for Industrial and Applied Mathematics; 1982.
  • Efron B. Bootstrap methods: another look at the jackknife. Ann Stat. 1979;7:1–26.
  • Efron B, Tibshirani R. An introduction to the bootstrap. New York: Chapman and Hall; 1994.
  • Simon JL, Bruce P. Resampling: a tool for everyday statistical work. Chance. 1991;4(1):22–32.
  • Davison AC, Hinkley DV. Bootstrap methods and their applications. New York: Cambridge University Press, Cambridge; 1997.
  • Islam MM, Webb B, Palais R, et al. Assessing the potential impact of the declaration of Istanbul 2008 on internet reporting of human organ transplantation-related crimes using interrupted time series analysis and meta-analysis approaches. In: Transplantation proceedings (Vol. 52, No. 1). Elsevier; 2020. p. 12–19.
  • World Bank Open Data. http://data.worldbank.org/.
  • Muller E, Dominguez-Gil B, Martin D. The declaration of Istanbul on organ trafficking and transplant tourism (2018 edition) introduction. Transplantation. 2019;103(2):217.
  • United Nations General Assembly. Strengthening and promoting effective measures and international cooperation on organ donation and transplantation to prevent and combat trafficking in persons for the purpose of organ removal and trafficking in human organs: resolution adopted by the General Assembly (75th session: 2020–2021).
  • Abboud O, Abbud-Filho M, Abdramanov K, et al. The Declaration of Istanbul on organ trafficking and transplant tourism. Clin J Am Soc Nephrol. 2008;3(5):1227–1231.
  • World Health Assembly. WHO guiding principles on human cell, tissue and organ transplantation. Cell Tissue Bank. 2010;11(4):413–419.
  • Organ trafficking and transplant tourism and commercialism: the Declaration of Istanbul. Lancet. 2008;372(9632):5–6.
  • Kontopantelis E, Doran T, Springate DA, et al. Regression based quasi-experimental approach when randomisation is not an option: interrupted time series analysis. BMJ. Jun 9, 2015;350:h2750.