Case Studies

Case study on applying sequential analyses in operational testing

Monica Ahrens, Rebecca Medlin, Keyla Pagán-Rivera & John W. Dennis

Abstract

Sequential analysis concerns statistical evaluation in which the number, pattern, or composition of the data is not determined at the start of the investigation, but instead depends on the information acquired during the investigation. Although sequential analysis originated in ballistics testing for the Department of Defense (DoD) and is widely used in other disciplines, it is underutilized in the DoD. Expanding the use of sequential analysis may save money and reduce test time. In this paper, we introduce sequential analysis, describe its current and potential uses in operational test and evaluation (OT&E), and present a method for applying it to the test and evaluation of defense systems. We evaluate the proposed method by performing simulation studies and applying the method to a case study. Additionally, we discuss challenges to address for sequential analysis in OT&E. Lastly, while operational testing is the focus of this paper, the methodology presented is applicable to campaigns of experimentation and general testing across numerous disciplines.

Notes

1 Up and Down Method (UD), Langlie Method (LM), Delayed Robbins-Monro Method (DRM), Wu’s three-phased approach (3POD), Neyer’s Method (NM), the Robbins-Monro-Joseph Method (RMJ), and K-in-a-row (KR).

2 The Statistical Research Group was an Office of Scientific Research and Development activity at Columbia University during the Second World War.

3 The T&E community may be more familiar with the SDOE planning approach described by Box and Wilson (1951) and Montgomery (2020), which is the focus of this paper, but sequential design problems are more generally those that involve a sequential search for informative experiments (Chernoff 1959).

4 Failure rate = proportion of times radar fails to detect the incoming projectile.

5 Δ>0 is the effect size, which can be thought of as a statement regarding an acceptable level of risk; experimental setups are optimized for detection of deviations from the null hypothesis of at least Δ. Detecting large deviations from the null hypothesis is easier than detecting small deviations, so smaller effect sizes require larger sample sizes to reach a decision. Deviations less than Δ will still be detected, but the error rate for detecting small deviations will be larger than our chosen error rate thresholds. This is a necessary trade-off in test planning, and testers should use subject matter expertise to determine an acceptable value of Δ.

6 Simulation settings: true failure proportions p = p0 and p = p1, with the type I and type II error rates both set to 20%.

7 Note, because the SPRT does not have a fixed sample size, we present the average sample size and standard deviation (SD) for each simulation scenario in .
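
As an illustration of notes 5 to 7, the following is a minimal sketch of such a simulation, assuming Wald's SPRT for a binomial failure proportion with hypotheses H0: p = p0 and H1: p = p1. The values p0 = 0.1 and p1 = 0.3, the function names, and the cap `max_n` are illustrative assumptions and are not taken from the paper; only the 20% error rates come from note 6. The sketch reports the rejection rate together with the average sample size and its SD, since the stopping time is random.

```python
import math
import random
import statistics


def sprt_bernoulli(draw, p0, p1, alpha, beta, max_n=10_000):
    """Wald's SPRT for H0: p = p0 against H1: p = p1, with p1 > p0.

    draw() returns 1 for a failure and 0 for a success.
    Returns (decision, n), where n is the number of trials used.
    """
    upper = math.log((1 - beta) / alpha)   # at or above: reject H0
    lower = math.log(beta / (1 - alpha))   # at or below: accept H0
    step_fail = math.log(p1 / p0)                # log-likelihood ratio of a failure
    step_ok = math.log((1 - p1) / (1 - p0))      # log-likelihood ratio of a success
    llr = 0.0
    for n in range(1, max_n + 1):
        llr += step_fail if draw() else step_ok
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    return "undecided", max_n   # hypothetical safety cap, rarely reached


def simulate(true_p, p0, p1, alpha, beta, reps, seed):
    """Repeat the SPRT and summarize rejection rate and sample size."""
    rng = random.Random(seed)

    def draw():
        return 1 if rng.random() < true_p else 0

    results = [sprt_bernoulli(draw, p0, p1, alpha, beta) for _ in range(reps)]
    sizes = [n for _, n in results]
    reject_rate = sum(d == "reject H0" for d, _ in results) / reps
    return reject_rate, statistics.mean(sizes), statistics.stdev(sizes)


if __name__ == "__main__":
    P0, P1 = 0.1, 0.3        # assumed values for illustration only
    ALPHA = BETA = 0.2       # the 20% error rates from note 6
    for true_p in (P0, P1):
        rate, mean_n, sd_n = simulate(true_p, P0, P1, ALPHA, BETA, reps=5000, seed=1)
        print(f"true p={true_p}: reject rate={rate:.3f}, "
              f"mean n={mean_n:.1f}, SD={sd_n:.1f}")
```

Running the script with the true proportion set to p0 and then to p1 gives an empirical type I error rate and power, respectively, which can be compared against the nominal 20% settings.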

8 In this example we used the exact binomial test, which is commonly used in T&E for analyzing binomial data. However, a more natural comparison would use the Neyman-Pearson likelihood ratio test (LRT), because its hypotheses are identical to those of the SPRT. The exact binomial test is mathematically identical to the Neyman-Pearson LRT (see Appendix A).

9 By characterization we mean describing the system performance under different operational conditions.

10 We encourage using subject matter experts (SMEs) in the test planning and design process. For example, a SME may be able to eliminate, based on their knowledge of the system, the need to plan for certain two-way interactions and quadratic effects.

11 The letters correspond to the factor labels in Table 2, and the colon (:) between letters represents a two-way interaction between those factors.

12 The type I error rate does not equal α.

13 For example, a test of whether a single parameter differs from zero, run at a planned significance level α=0.3, will produce an actual type I error rate of approximately 0.97 for the null hypothesis that all parameters are jointly zero when 10 different parameters are tested separately for inclusion in the model.
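
The 0.97 figure in note 13 can be checked with a short calculation, assuming the 10 tests are independent (an assumption made here for illustration; the note does not state it). The symbols m, for the number of tests, and α_joint, for the resulting joint type I error rate, are notation introduced for this sketch.

```latex
\alpha_{\text{joint}}
  = 1 - (1 - \alpha)^{m}
  = 1 - (1 - 0.3)^{10}
  = 1 - 0.7^{10}
  \approx 1 - 0.0282
  \approx 0.97
```

In words, the chance of no false rejection across all 10 tests is only about 3%, so at least one false rejection occurs about 97% of the time.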

14 Discussion of this and other related considerations is beyond the scope of this paper; we will instead discuss these considerations in more detail in a separate paper. White (2000), Hansen (2005), and Romano and Wolf (2005) each provide procedures that account for the multiple hypothesis problem in model selection.

Additional information

Notes on contributors

Monica Ahrens

Dr. Monica Ahrens, PhD is a Research Scientist at Virginia Tech in the Center for Biostatistics and Health Data Science. She completed a summer internship with the Institute for Defense Analyses in 2022 and received her PhD from the University of Iowa Department of Biostatistics in 2022.

Rebecca Medlin

Dr. Rebecca Medlin, PhD is a research staff member in the Operational Evaluation Division at the Institute for Defense Analyses. She supports the Director, Operational Test and Evaluation on training, research and applications of statistical methods for the planning and evaluation of military systems. She received her PhD in Statistics from Virginia Tech in 2014.

Keyla Pagán-Rivera

Dr. Keyla Pagán-Rivera has a Ph.D. in Biostatistics from The University of Iowa and serves as a Research Staff Member in the Operational Evaluation Division at the Institute for Defense Analyses. She supports the Director, Operational Test and Evaluation on training, research and applications of statistical methods for the planning and evaluation of military systems.

John W. Dennis

Dr. John W. Dennis, PhD is a research staff member focusing on Econometrics, Statistics, and Data Science with the Institute for Defense Analyses' Human Capital Group. He received his PhD in Economics from the University of North Carolina at Chapel Hill in 2019.
