
Introduction to the Theory and Methods Special Issue on Precision Medicine and Individualized Policy Discovery

Michael R. Kosorok, Eric B. Laber, Dylan S. Small, & Donglin Zeng
Pages 159-161 | Received 30 Nov 2020, Accepted 08 Dec 2020, Published online: 09 Mar 2021

Abstract

We introduce the Theory and Methods Special Issue on Precision Medicine and Individualized Policy Discovery. The issue consists of four discussion papers, grouped into two pairs, and sixteen regular research papers that cover many important lines of research on data-driven decision making. We hope that the many provocative and original ideas presented herein will inspire further work and development in precision medicine and personalization.

This article is related to:
Rejoinder: New Objectives for Policy Learning
Discussion of Kallus (2020) and Mo, Qi, and Liu (2020): New Objectives for Policy Learning

1 Introduction

The Theory and Methods Special Issue on Precision Medicine and Individualized Policy Discovery emphasizes the use of data to improve decision making. This includes the development of robust and efficient methods for estimation of personalized intervention recommendation systems as well as the generation of new knowledge about a particular decision process under study. While many of the papers in this special issue focus on biomedical applications, the ideas presented here have the potential to inform decision making across a wide range of areas, including business, engineering, public policy, and education. We are appreciative of the many authors who contributed to this project. Working on this special issue and seeing the myriad new and exciting ideas from a diverse set of researchers has made us optimistic about the future of this important area of statistics, and we are grateful to have been a part of it. We wish to thank co-editors Regina Liu and Hongyu Zhao, who instigated this project, invited Michael Kosorok to serve as guest editor, and agreed to lend associate editors Eric Laber, Dylan Small, and Donglin Zeng to assist with this endeavor. We also thank Jamie Hutchens and Eric Sampson, as well as all of the other editors and editorial staff who have supported this project since its initiation, and we appreciate the many reviewers who contributed numerous hours of thoughtful and helpful review. Finally, to avoid any potential conflicts of interest, we (Drs. Kosorok, Laber, Small, and Zeng) excluded our own submissions from consideration as discussion papers and did not participate in reviews or decisions regarding papers on which we were authors.

The basic premise of precision medicine is to leverage data and clinical knowledge to optimize health care by tailoring treatment to the uniquely evolving health status of each patient. The data describing a patient’s health status can be complex, comprising irregular and sparse clinic data, genetic and genomic information, imaging data, billing and insurance information, and so on. The challenges associated with synthesizing such data to identify if, when, and how best to treat patients are numerous. Some of the most pressing challenges are considered in this special issue and briefly outlined below. An overview of the concepts and recent developments in this area can be found in Kosorok and Laber (2019) and Tsiatis et al. (2019). The paradigm of precision medicine can be expanded to precision public health, which involves treatments, policies, and actions for families, clinics, hospitals, and communities, while also considering issues of cost and scalability. A timely application within precision public health is real-time resource allocation for the management of an infectious disease (Rasmussen, Khoury, and Del Rio 2020).

Of course, optimal intervention assignment based on accumulating information is not unique to applications in human health. There are rich lines of research in operations research, engineering, and computer science focused on optimal sequential decision making and resource allocation, with applications in nearly every sector of science and society. Over the last 10 years, there has been increasing cross-pollination between these lines of research and precision medicine. Especially fruitful has been the integration of ideas from reinforcement learning with statistics (Murphy 2005; Qian and Murphy 2011; Shortreed et al. 2011; Zhao et al. 2011; Goldberg and Kosorok 2012; Chakraborty 2013; Sugiyama 2015; Tewari and Murphy 2017; Ertefaie and Strawderman 2018; Clifton and Laber 2020; Kallus and Uehara 2020); introductory texts on reinforcement learning include Sutton and Barto (1998), Si (2004), Powell (2007), Busoniu et al. (2010), and Szepesvari (2010). Much of the early research on statistical reinforcement learning focused on (statistical) efficiency, inference, and the advancement of clinical and domain knowledge, whereas the focus of early computer science research in reinforcement learning was on computational efficiency, generalization, and the engineering of systems capable of learning to perform complex tasks. However, the distinction between statistics and computer science research in reinforcement learning is now less clearly delineated, with statisticians publishing more regularly in computer science journals and conferences and vice versa (indeed, many of the authors contributing to this special issue have appointments in computer science departments). We hope that more researchers outside of statistics will take an interest in precision medicine and that this special issue will serve to highlight important open problems and identify possible entry points to this exciting field.

2 Overview of Papers

We now introduce the papers in this issue. To begin with, the issue contains four discussion papers that are grouped into two pairs. All four papers consider estimation of optimal policies in the presence of unmeasured confounders, and they are grouped by their approach. The first pair of papers focuses on the use of an instrumental variable to address potential unmeasured confounding in dynamic treatment regime estimation. In Cui and Tchetgen Tchetgen’s paper, “A Semiparametric Instrumental Variable Approach to Optimal Treatment Regime Estimation,” the authors develop a general approach that leverages an interesting and practical assumption about patient compliance. In Qiu, Carone, Sadikova, Petukhova, Kessler, and Luedtke’s paper, “Optimal Individualized Decision Rules using Instrumental Variable Methods,” the authors consider both optimal treatment recommendations and encouragement strategies that identify patients who will benefit from changing their current course of treatment.

The second pair of discussion papers focuses on generalization to new populations and the use of weighting to address unmeasured confounding. The paper by Kallus, “More Efficient Policy Learning via Optimal Retargeting,” proposes an efficient reweighting scheme that ensures a policy estimated for one population remains valid in another, targeted population. In Mo, Qi, and Liu’s paper, “Learning Optimal Distributionally Robust Individualized Treatment Rules,” the authors also address the challenge of retargeting policies, but pursue policies that perform well over a range of new target distributions.

All four of these papers advance the theory and methodology for estimation and evaluation of intervention strategies. The importance and scope of these advances are made clear by the lively and insightful comments provided by the discussants. These discussions also serve to chart new and important lines of research.

The remaining papers focus broadly on design and analysis tools for decision support, and we group them according to whether they address single- or multiple-decision problems. In the first group of papers, which focus on a single decision point, a number of important issues are addressed, including trial design, precision dosing, interpretability, and assessment under unmeasured confounding. In Chen, Huling, and Smith’s paper, “A Two-Part Framework for Estimating Individualized Treatment Rules from Semi-Continuous Outcomes,” the authors develop a method for estimating an optimal dynamic treatment regime when the outcome is semi-continuous, as happens, for example, with health-care payments. In “Efficient Estimation of Optimal Regimes Under a No Direct Effect Assumption,” by Liu, Shahn, Robins, and Rotnitzky, the authors consider the setting where the decision to test has no direct effect on the patient outcome except through the choice of treatment to administer. The authors leverage this structure, in the context of an interesting HIV-infection application, to develop rules for deciding when to conduct a diagnostic test.

In “Statistical Inference for Online Decision Making via Stochastic Gradient Descent,” by Song, Chen, and Lu, the authors develop an online approach to policy estimation based on stochastic gradient descent that allows interval estimation and hypothesis testing. The method is demonstrated with an interesting application to news article recommendation systems. In “Doubly Robust Estimation of Optimal Dosing Strategies,” by Schultz and Moodie, the authors develop new doubly robust methods, based on weighted least squares, for discovering optimal dynamic treatment regimes when the treatment is continuous, as happens, for example, in precision dosing. In “Learning Individualized Treatment Rules for Multi-Domain Latent Outcomes,” by Wang, Chen, and Zeng, the authors use a restricted Boltzmann machine to develop methods for estimating individualized treatment rules that optimize latent outcomes, such as latent mental health status based on multiple-domain psychological or clinical symptoms.

In Zhao and Pan’s paper, “Improved Doubly Robust Estimation in Learning Optimal Individualized Treatment Rules,” the authors develop a new doubly robust estimator of the optimal individualized treatment rule that achieves minimum variance among all doubly robust estimators when the propensity score model is correctly specified, regardless of the specification of the outcome model. In “Selecting and Ranking Individualized Treatment Rules with Unmeasured Confounding,” Zhao, Zhang, Weiss, and Small develop a paired testing procedure to compare and rank different individualized treatment rules in the presence of unmeasured confounders. In Zhou, Guo, and Ma’s paper, “Estimation of Optimal Individualized Treatment Rule using the Covariate-Specific Treatment Effect Curve with High-Dimensional Covariates,” the authors use a single-index model to facilitate estimation of, and inference for, an optimal treatment rule in the presence of high-dimensional covariates.
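Several of the papers above build on doubly robust estimation; for readers less familiar with this construction, a standard and deliberately simplified version is the augmented inverse-probability-weighted estimator of the value of a fixed rule $d$, which combines an estimated propensity model $\hat{\pi}$ and an estimated outcome regression $\hat{\mu}$:

$$\hat{V}_{\mathrm{DR}}(d) \;=\; \frac{1}{n}\sum_{i=1}^{n}\left[\frac{\mathbf{1}\{A_i = d(X_i)\}}{\hat{\pi}(A_i \mid X_i)}\left\{Y_i - \hat{\mu}\bigl(X_i, d(X_i)\bigr)\right\} + \hat{\mu}\bigl(X_i, d(X_i)\bigr)\right].$$

This estimator is consistent if either $\hat{\pi}$ or $\hat{\mu}$ is correctly specified, which is the sense in which the methods discussed above are doubly robust; it is shown here only for orientation and is not the specific estimator proposed in any of these papers.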

In Lin, Thall, and Yuan’s paper, “BAGS: A Bayesian Adaptive Group Sequential Design with Subgroup-Specific Survival Comparisons,” the authors develop a Bayesian clinical trial design for comparing survival outcomes within patient subgroups when treatment-subgroup interactions may be present. In Yadlowsky, Pellegrini, Lionetto, Braun, and Tian’s paper, “Estimation and Validation of Ratio-Based Conditional Average Treatment Effects using Observational Data,” the authors develop a doubly robust approach to conditional average treatment effect estimation based on the ratio, rather than the more typical difference, of the potential outcomes, in order to mitigate potentially spurious treatment-covariate interactions that can arise in observational data. In Meng and Li’s paper, “A Multi-resolution Theory for Approximating Infinite-p-Zero-n: Transitional Inference, Individualized Predictions, and a World Without Bias-Variance Tradeoff,” the authors develop a multi-resolution-based approach to improve performance and inference when developing individual-level predictions.
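To fix ideas about the ratio-based estimand in Yadlowsky et al.’s paper, recall that the conditional average treatment effect is usually defined on the difference scale; a generic ratio-based counterpart (written here only for orientation, and not necessarily in the exact form studied in the paper) replaces the difference of conditional mean potential outcomes with their ratio:

$$\tau_{\mathrm{diff}}(x) = E\{Y(1) - Y(0) \mid X = x\}, \qquad \tau_{\mathrm{ratio}}(x) = \frac{E\{Y(1) \mid X = x\}}{E\{Y(0) \mid X = x\}},$$

where $Y(1)$ and $Y(0)$ denote the potential outcomes under treatment and control.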

We now outline the second group of papers, those considering multiple decision points, which address issues of time-varying confounding, nonstationarity, and online updating. In “Robust Q-Learning,” by Ertefaie, McKay, Oslin, and Strawderman, the authors develop a robust version of Q-learning that provides efficient estimation and inference while allowing the use of flexible models for nuisance functions. In Liao, Klasnja, and Murphy’s paper, “Off-Policy Estimation of Long-Term Average Outcomes with Applications to Mobile Health,” the authors develop an off-policy reinforcement learning method for estimating an optimal policy in the average reward setting, with application to mobile health.
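As background for the Q-learning terminology used here, the standard two-stage version proceeds by backward induction over stage-specific regressions; the sketch below is generic and is not the robust procedure of Ertefaie et al., which modifies this construction to allow flexible modeling of nuisance functions while retaining valid inference:

$$Q_2(h_2, a_2) = E(Y \mid H_2 = h_2, A_2 = a_2), \qquad d_2^{\mathrm{opt}}(h_2) = \arg\max_{a_2} Q_2(h_2, a_2),$$
$$Q_1(h_1, a_1) = E\Bigl\{\max_{a_2} Q_2(H_2, a_2) \Bigm| H_1 = h_1, A_1 = a_1\Bigr\}, \qquad d_1^{\mathrm{opt}}(h_1) = \arg\max_{a_1} Q_1(h_1, a_1),$$

where $H_t$ denotes the patient history available at stage $t$, $A_t$ the treatment assigned at stage $t$, and $Y$ the final outcome.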

In “Learning When-to-Treat Policies,” by Nie, Brunskill, and Wager, the authors develop an “advantage doubly robust” estimator of a dynamic treatment regime for deciding when to treat without requiring a Markov assumption. They also derive welfare regret bounds for the estimated policy and demonstrate promising performance of the new method in a number of contexts. In Qian, Hu, Cheng, and Cheung’s paper, “Personalized Policy Learning using Longitudinal Mobile Health Data,” the authors use a generalized linear mixed model to synthesize population- and individual-level effects in the context of an mHealth application, with a group lasso penalty used to prevent overfitting. In Wang and Sun’s paper, “Stochastic Tree Search for Estimating Optimal Dynamic Treatment Regimes,” the authors develop a stochastic tree-based reinforcement learning approach for estimating dynamic treatment regimes that applies to both randomized and observational data and yields an interpretable sequence of policies.

3 Concluding Comments

The wide variety of topics covered in this special issue is evidence of the many open and interesting problems in precision medicine and individualized policy discovery and of the vibrant research community that is driving this field forward. Although we sought breadth in our planning for this issue, it was not intended to be comprehensive in its coverage of all relevant facets of the area, and the final selection of topics was guided in part by the interests of the contributing authors. The articles in this special issue show that data-driven decision support holds enormous potential to improve the quality and efficiency of health care. This issue was assembled in the midst of the COVID-19 pandemic, which underscores the need for adaptive strategies for testing and monitoring, resource allocation and triage decisions, the timing and severity of nontherapeutic interventions, economic recovery, and so on. Several authors delayed their submissions to meet urgent requests from policymakers wanting to use data on the outbreak to inform their decision making. We hope that researchers in precision medicine and related areas will similarly find ways to contribute to the fight against COVID-19 as well as to advance human health and well-being generally.

References

  • Busoniu, L., Babuska, R., De Schutter, B., and Ernst, D. (2010), Reinforcement Learning and Dynamic Programming Using Function Approximators, New York: CRC Press.
  • Chakraborty, B. (2013), Statistical Methods for Dynamic Treatment Regimes, New York: Springer.
  • Clifton, J., and Laber, E. (2020), “Q-Learning: Theory and Applications,” Annual Review of Statistics and Its Application, 7, 279–301. DOI: 10.1146/annurev-statistics-031219-041220.
  • Ertefaie, A., and Strawderman, R. L. (2018), “Constructing Dynamic Treatment Regimes Over Indefinite Time Horizons,” Biometrika, 105, 963–977. DOI: 10.1093/biomet/asy043.
  • Goldberg, Y., and Kosorok, M. R. (2012), “Q-Learning With Censored Data,” The Annals of Statistics, 40, 529–560. DOI: 10.1214/12-AOS968.
  • Kallus, N., and Uehara, M. (2020), “Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes,” Journal of Machine Learning Research, 21, 1–63.
  • Kosorok, M. R., and Laber, E. B. (2019), “Precision Medicine,” Annual Review of Statistics and Its Application, 6, 263–286. DOI: 10.1146/annurev-statistics-030718-105251.
  • Murphy, S. A. (2005), “A Generalization Error for Q-Learning,” Journal of Machine Learning Research, 6, 1073–1097.
  • Powell, W. B. (2007), Approximate Dynamic Programming: Solving the Curse of Dimensionality, New York: Wiley.
  • Qian, M., and Murphy, S. A. (2011), “Performance Guarantees for Individualized Treatment Rules,” The Annals of Statistics, 39, 1180. DOI: 10.1214/10-AOS864.
  • Rasmussen, S. A., Khoury, M. J., and Del Rio, C. (2020), “Precision Public Health as a Key Tool in the Covid-19 Response,” JAMA, 324, 933–934. DOI: 10.1001/jama.2020.14992.
  • Shortreed, S. M., Laber, E., Lizotte, D. J., Stroup, T. S., Pineau, J., and Murphy, S. A. (2011), “Informing Sequential Clinical Decision-Making Through Reinforcement Learning: An Empirical Study,” Machine Learning, 84, 109–136.
  • Si, J., ed. (2004), Handbook of Learning and Approximate Dynamic Programming (Vol. 2), New York: Wiley.
  • Sugiyama, M. (2015), Statistical Reinforcement Learning: Modern Machine Learning Approaches, Boca Raton, FL: CRC Press.
  • Sutton, R. S., and Barto, A. G. (1998), Reinforcement Learning: An Introduction, Cambridge, MA: MIT Press.
  • Szepesvari, C. (2010), Algorithms for Reinforcement Learning, Williston, VT: Morgan and Claypool.
  • Tewari, A., and Murphy, S. A. (2017), “From Ads to Interventions: Contextual Bandits in Mobile Health,” in Mobile Health, eds. J. Rehg, S. Murphy, and S. Kumar, Cham: Springer, pp. 495–517.
  • Tsiatis, A. A., Davidian, M., Holloway, S., and Laber, E. (2019), Dynamic Treatment Regimes: Statistical Methods for Precision Medicine, Boca Raton, FL: CRC Press.
  • Zhao, Y., Zeng, D., Socinski, M. A., and Kosorok, M. R. (2011), “Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer,” Biometrics, 67, 1422–1433. DOI: 10.1111/j.1541-0420.2011.01572.x.
