Review Article

Reflections on estimands for patient-reported outcomes in cancer clinical trials

Received 09 Mar 2023, Accepted 27 Oct 2023, Published online: 19 Nov 2023

ABSTRACT

It is common and important to include the patient’s perspective of the impact of treatment on health-related quality of life (HRQoL) outcomes. In this commentary, we focus on applying the estimand framework introduced in the ICH E9 (R1) addendum to Patient-Reported Outcomes (PROs) collected in cancer clinical trials, from a statistician’s viewpoint. Currently, common practice for the statistical analysis of PRO endpoints in published cancer clinical trials is ambiguous and leaves critical questions unspecified, hindering conclusions about the effect of treatment on PRO endpoints as well as comparability between clinical trials. To avoid this scenario, we advocate the systematic use of the estimand framework, which requires the prospective definition of clear PRO research questions. Among the five attributes of the estimand framework, the definition of the endpoint (what is the right PRO measure and timeframe to target and why?), the intercurrent event identification and management (what happens with PRO data post-disease progression, what is the impact of death?) and the population-level summary (what is an acceptable statistical summary for PRO data?) require the most attention for PRO estimands. We identify good practice and highlight discussion points including the challenges of statistical analysis in the presence of missing and/or unobservable data and in relation to death. Through this discussion we highlight that there is no “statistical magic”, but that the estimand framework will help you find out what you really want to know when quantifying the benefit of treatments from the patients’ perspective.

1. Introduction

Patient-Reported Outcomes (PROs) are broadly used in registrational oncology trials submitted to European Medicines Agency (EMA) (Teixeira et al. Citation2022) and US Food and Drug Administration (FDA) (Gnanasakthy et al. Citation2022). PRO endpoints are typically secondary or exploratory endpoints used to evaluate clinical benefits in terms of disease-related symptoms, physical and role functioning, but also tolerability (Kluetz et al. Citation2016) or more general concepts such as Health-Related Quality of Life. The growing use of PROs to support high-stakes decisions, such as treatment approval and pricing or reimbursement, calls for better standards for PRO evidence generation. Best practices call for careful planning, involving cross-functional teams of trialists, clinicians and statisticians, and thoughtful execution, but also engagement of patients throughout the process (Addario et al. Citation2020; Brundage et al. Citation2022). A series of international initiatives have been conducted over the past years to define good general practice for PRO planning and execution in clinical trials, from protocol writing (Standard Protocol Items: Recommendations for Interventional Trials PRO extension – SPIRIT-PRO) (Calvert et al. Citation2018, Citation2021), to statistical analysis (Setting International Standards in Analyzing Patient-Reported Outcomes and Quality of Life Endpoints Data – SISAQOL) (Coens et al. Citation2020) and reporting (Consolidated Standards of Reporting Trials-PRO extension – CONSORT-PRO) (Calvert et al. Citation2013). A common feature raised by these international good practice recommendations for PRO research in clinical trials is the critical importance of the clear definition of a PRO objective when designing a clinical trial and alignment of the execution of the trial (design, analysis, reporting) with this objective.

Despite these initiatives to improve the quality of PRO endpoints in clinical trials, protocols including PRO endpoints and reports of PRO results from clinical trials still rarely include clear statements of the PRO objective(s) of the study (Kyte et al. Citation2019). In parallel, statistical analyses of PRO endpoints in published cancer clinical trials are inconsistent between studies, and the way in which the results are reported is often ambiguous, particularly in addressing the statistical assumptions of the models in the presence of missing or unobservable data and in relaying important information for the interpretation of results (Hamel et al. Citation2017). The lack of a systematic approach to the evidence related to PRO endpoints in cancer clinical trials may hinder conclusions about the effect of treatment on PRO endpoints as well as comparability between clinical trials.

Even if the same PRO concept is reported, the chosen endpoint can lead to different conclusions. This potential for confusion can be illustrated by the case of two clinical trials published in 2014 in the New England Journal of Medicine, evaluating bevacizumab added to temozolomide and radiotherapy as first-line treatment in patients with newly diagnosed glioblastoma; the article reporting the results of the study RTOG 0825 concluded that “increased symptom severity and decline in health-related quality of life were found over time among patients who were treated with bevacizumab” whereas the AVAglio study concluded that “baseline health-related quality of life [was] maintained longer in the bevacizumab group”. These seemingly contradictory conclusions may be explained by the different statistical approaches (in RTOG 0825 the endpoint was the between-group difference in change from baseline at 46 weeks; in AVAglio it was deterioration-free survival) that reflect different underlying PRO research questions/objectives. This example highlights the importance of the clear specification of the initial research objective for the demonstration of PRO benefits in cancer clinical trials.

The concept of estimands has gained traction with the publication of the US National Research Council’s report on missing data in clinical trials in 2010 (National Research Council Citation2010) and the formalization in 2019 of the estimand framework in an addendum to the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) E9 Guidance document “Statistical Principles for Clinical Trials” fully dedicated to this notion (Center for Drug Evaluation and Research and Center for Biologics Evaluation and Research Citation2021). The estimand framework provides a systematic approach to ensure alignment among objectives, trial execution, statistical analyses, and interpretation of results from a clinical trial. It is therefore a compelling tool to address the challenges of good practice for the inclusion of PRO endpoints in the context of cancer clinical trials. More recently the application of the estimand framework to PRO endpoints has been considered, with some illustrative examples (Bell et al. Citation2019; Fiero et al. Citation2020; Lawrance et al. Citation2020); these recent publications emphasized the promising role of the estimand framework in supporting an effective discussion around the demonstration of PRO benefits of new treatments among multidisciplinary teams, by forcing the development of a clear and explicit statement of the PRO objectives.

This commentary reinforces many of the key points described in these previous publications by providing a collective statistician’s perspective on some of the challenges related to an estimand framework construction for PRO endpoints in cancer clinical trials.

2. Estimand attributes with PRO considerations

Five attributes must be defined for the full specification of an estimand (Center for Drug Evaluation and Research and Center for Biologics Evaluation and Research Citation2021): Treatment, target study population, variable of interest (endpoint), intercurrent event management, and population-level summary. All the attributes of the estimand need to be considered together to describe a full, and unambiguous estimand. The definition of the endpoint (variable) (what is the specific PRO measure and timeframe to target?), the intercurrent event identification and management (what happens with PRO data post-disease progression, what is the impact of death?) and the population-level summary (what is an acceptable statistical summary for PRO data?) typically require the most attention for PRO estimands.

2.1. Treatment

This attribute usually reflects the randomised treatment arms. It aligns directly with the treatment attribute specified for the primary endpoint, which is also relevant to the PRO endpoints.

2.2. Targeted study population

The population attribute has typically not been considered a challenge for PRO endpoints: it has been assumed to be the same as for other endpoints and is, in general, defined as “all randomised” participants. However, results are frequently reported in a PRO population, defined as patients with both baseline and post-baseline PRO datapoints, usually when an endpoint is based on a change from baseline or on individual improvement/deterioration. If a PRO population is used, it must be clearly justified.

2.3. Variable (endpoint) of interest

“What is the specific PRO measure and timeframe to target?”

The definition of a PRO endpoint typically involves several steps. At first, the concept of interest (COI) that will be targeted by the demonstration should be identified as clearly and explicitly as possible. As opposed to death or, to a lesser extent, to endpoints based on biological or imaging parameters (e.g., tumour progression), PRO endpoints are meant to capture concepts that are not necessarily obvious to clinicians and that therefore need to be explicitly defined in the first place. The identification of the target COI requires extensive consideration of the experience of patients in the specific context of the study, but also clear hypotheses on the intended treatment benefit (Walton et al. Citation2015). Core COIs for cancer clinical trials are disease-related symptoms, physical functioning, role functioning, symptomatic adverse events (US Food Drug Administration Core patient-reported outcomes in cancer clinical trials, draft guidance for industry Citation2021; Oncology Center of Excellence, Center for Biologics Evaluation and Research and Center for Drug Evaluation and Research, Citation2021) and Health-related quality of life. These generic concepts often require refinement to be used effectively in PRO endpoints (e.g., defining as specifically as possible what symptoms related to a cancer type will be targeted in each study). Once the COI is clearly defined, it is then necessary to identify what PRO instrument will be used to measure it. The choice of the PRO instrument to be used will be informed by the consideration of its content and measurement properties. The variable of interest will then typically be a numerical outcome from the application of the PRO instrument (a score) that is assumed to reflect the underlying COI. 
Hence, for example, if the hypothesis is that a new treatment may impact the physical functioning of patients, physical functioning could be measured in all the participants of the study at various scheduled timepoints using the Physical Functioning (PF) domain score of the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Core 30 items (QLQ-C30). This example illustrates the notion of COI (Physical functioning) and measurement instrument (the PF score of the QLQ-C30). In the case of a PRO endpoint, both need to be clearly and explicitly identified for the variable of interest to be defined.

Another aspect of the definition of the variable of interest for PRO endpoints, which has probably been less commonly addressed, is the timeframe used for the analysis. The definition of the timepoint or timeframe of interest provides the context for consideration of relevant intercurrent events. Various timeframes could be relevant depending on the specific PRO objective: consideration of the PRO variable until death, or until disease progression or until a clinically relevant timepoint (which would be disease-dependent e.g., 3 months, 6 months, one year). Identification of relevant timeframe presents challenges for PRO data because in cancer trials the PRO endpoints are typically secondary to assessment of disease progression. With a long timeframe censoring becomes a key consideration – patients may be lost to follow-up (at random) or terminate the study which can lead to increased missing data. However, with a short timeframe data collected from some patients is unused, which also may not be ideal. It has recently been recommended to pre-define a specific timepoint of interest for PRO endpoints (Fiero et al. Citation2022); however, in practice, in cancer clinical trials it is often challenging to identify a clinically meaningful timepoint at which inference should be restricted.

2.4. Intercurrent events

“What happens with PRO data post-disease progression? What is the impact of death?”

Events that may happen during the clinical trial, within the timeframe of the variable (endpoint) of interest, and that may alter the relationship between treatment and the variable of interest are termed intercurrent events (ICEs). ICEs may also impact the ability to collect relevant data or render data non-existent (e.g., post-mortem). It is important to pre-identify these potential events and to clarify how data that are collected, not collected or not available will be managed. The most relevant events to consider for PRO endpoints are typically discontinuation of treatment, disease progression and death, depending on the timeframe of interest, as defined in the variable. When considering treatment discontinuation and disease progression events, it is important to be clear on the reasons for discontinuation of treatment, for example clinical disease progression, adverse events or other reasons. As these intercurrent events pose considerable challenges for PRO endpoints, they are discussed further in this paper.

2.5. Summary measure (population-level summary)

“What is an acceptable statistical summary for PRO data?”

This attribute is probably the most obviously relevant to statisticians and as such in the past has not had as much cross-discipline discussion as it requires. It may be seen as straightforward to define initially, for example, as a difference in mean scores or a hazard ratio of time to deterioration. However, the handling of missing data and the assumptions made in the statistical analysis, both of which are needed to calculate the summary measure, have traditionally been poorly reported in methods sections describing statistical analysis (Bell and Fairclough Citation2014; Fielding et al. Citation2008, Citation2016). The estimand framework helps to ensure that clarity on these points is not limited to just the statistical methods. As a general rule, the choice of the summary measure is closely tied to the type of endpoint and comes with underlying hypotheses related to its estimation. PRO endpoints in cancer clinical trials are in most cases related to one of the three following types: magnitude of change, time-to-event, proportion of responders (Coens et al. Citation2020). For PRO endpoints of magnitude of change, mean change in the PRO score or mean (or median) PRO scores at a predefined timepoint would typically be the summary measure. For PRO time-to-event endpoints, summary measures would typically be hazard ratios or median time-to-event. Finally, for responder analysis, percentage of responders or odds ratio would typically be the summary measures. The type of endpoint and the choice of the corresponding summary measure rely on the PRO hypothesis as well as the statistical assumptions that are made for the intended estimation of the summary measure. For example, the use of time-to-event methods for PRO endpoints has been recently questioned given the assumptions required for their estimation (especially their legitimacy in practice) and the relevance of the PRO objective reflected by such endpoints (Fiero et al. Citation2022). 
The nature of the variable of interest (PRO score) is also important to consider when selecting a summary measure: PRO scores can often be considered close to continuous, but they are frequently ordinal in nature. Hence, when the variable can only take a few values (e.g., when it is obtained using a single item that can only take four or five values), the legitimacy and consequences of choosing a summary measure that typically applies to continuous variables, such as the mean, or that results from models that assume a normal distribution, should be carefully considered (e.g., EORTC QLQ-C30 disease specific symptom modules).
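The link between endpoint type and summary measure can be made concrete with a minimal sketch. The scores, the arm names and the 10-point responder threshold below are all invented for illustration and are not taken from any trial or guideline; the sketch also assumes complete data and ignores intercurrent events, so it only shows how the same change-from-baseline data yield a different summary depending on the endpoint chosen.

```python
from statistics import mean

# Hypothetical change-from-baseline scores at one predefined timepoint
# (positive = improvement). Invented numbers, for illustration only.
change_experimental = [12.0, 5.0, -3.0, 15.0, 8.0, 0.0, 20.0, -10.0]
change_control = [4.0, -6.0, 10.0, -12.0, 0.0, 2.0, -8.0, 14.0]

# Assumed responder threshold; a real trial would have to justify its own.
RESPONDER_THRESHOLD = 10.0


def mean_difference(arm_a, arm_b):
    """Magnitude of change: difference in mean change from baseline."""
    return mean(arm_a) - mean(arm_b)


def responder_proportion(changes, threshold):
    """Responder analysis: share of patients improving by at least `threshold`."""
    return sum(c >= threshold for c in changes) / len(changes)


def odds_ratio(arm_a, arm_b, threshold):
    """Odds of response in arm_a relative to arm_b.

    Raises ZeroDivisionError if an arm has no responders or no non-responders.
    """
    p_a = responder_proportion(arm_a, threshold)
    p_b = responder_proportion(arm_b, threshold)
    return (p_a / (1 - p_a)) / (p_b / (1 - p_b))


print("Difference in mean change:",
      mean_difference(change_experimental, change_control))
print("Responder proportions:",
      responder_proportion(change_experimental, RESPONDER_THRESHOLD),
      responder_proportion(change_control, RESPONDER_THRESHOLD))
print("Odds ratio of response:",
      odds_ratio(change_experimental, change_control, RESPONDER_THRESHOLD))
```

The responder summaries depend entirely on the assumed threshold, which is one reason the choice of summary measure cannot be separated from the PRO objective.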

2.6. Example PRO estimand

Given the five attributes of the estimand, we can now apply this framework to a PRO estimand; an example can be seen in Table 1 (Bell et al. Citation2021). Consideration of the patient’s journey for potential scenarios also often provides a useful tool to help reflect on relevant components of the estimand and focus on the timing of PRO assessments and potential intercurrent events (Figure 1).

Figure 1. Illustration of Patient Journeys with PRO assessments and clinical events.

Table 1. PRO estimand attributes.

It is worth noting that although the estimand framework was primarily intended for use in confirmatory clinical trial settings for estimating the treatment effect on primary and secondary endpoints, its structure can remain extremely useful for all clinical trial endpoints, even more exploratory, descriptive ones.

2.7. Specific challenges and considerations for PRO endpoints

2.7.1. Missing data

The estimand framework has required statisticians to think differently about missing data in clinical trial analytics. Missing PRO data is an ongoing challenge: the rates of missing data are often high, and handling of missing data is often unclear (Fielding et al. Citation2016). The lack of clarity could be because it is sometimes difficult to know what is missing data and what is not. Historically, Rubin’s seminal work (Rubin Citation1976) defining types of missing data allowed statisticians to understand the extent to which analytic methods relied on assumptions about the values that were unobserved. For example, data were considered missing at random if the probability of missingness did not depend on the unobserved outcomes, after conditioning on covariates and previously observed outcomes in a longitudinal study. The estimand framework shifts the focus from missing data to the reason why the data are unobserved. Stated differently, missing data is most often the result of an intercurrent event. In the patient journeys in Figure 1, some patients had PRO assessments post-progression (e.g., ID_05 and ID_08) and some did not (e.g., ID_01 and ID_02); but, depending on the estimand, a patient’s PRO values after disease progression may not be relevant, in which case they are not missing.

All studies may experience missing data where the reason for the missingness is unknown, e.g., a missed assessment. Similarly, all analytic approaches that include incomplete data are subject to untestable assumptions about the missingness. When a hypothetical approach to intercurrent events is defined, data post-event are often considered not relevant to the estimand. This is seemingly a convenient way of accounting for not collecting PRO assessments after disease progression. However, high proportions of uncollected or unused data make estimation (using hypothetical or other strategies for intercurrent events) more reliant on untestable assumptions (Ratitch et al. Citation2020). The scheduling of PRO assessments in oncology trials may become less frequent or even cease after an intercurrent event. This can make the implementation of a treatment policy estimand heavily reliant on missing data assumptions if, despite the trial design, researchers are still interested in a treatment policy approach. In Figure 1, patient ID_05 shows that PRO assessment frequency changes from every 4 weeks to every 4 months after disease progression, and thus the ability to assess a temporal change in patient-reported outcomes is greatly reduced.

Intercurrent event strategies will drive the analytic approach of an estimand. For example, in a responder analysis of PROs, the intercurrent event of disease progression or death can be defined as an outcome (composite strategy). While this strategy reduces the rates of missingness, estimation still relies on many assumptions, both statistical assumptions and fundamental assumptions about the outcome, which can cloud the interpretation of the results, particularly when the rates of ICEs are not balanced across treatment groups. For example, results from a composite endpoint measuring deterioration in physical function or use of physical therapy (considered to be a “rescue treatment”) may be more heavily influenced by the rates of physical therapy than by actual functional decline. It is important to recognise that the endpoint becomes a comparison of response on the PRO or occurrence of the ICE and is no longer a direct comparison of improvement on the PRO. If considered as a direct comparison of PRO response rates, estimates of treatment effects can be biased when an ICE is designated as non-response (Floden and Bell Citation2019).
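A small numerical sketch may help to show why a composite responder endpoint should not be read as a comparison of PRO response alone. The two arms below are entirely hypothetical and were constructed so that the PRO response rate among patients without an ICE is identical in both arms while the ICE rates differ; the function names are illustrative and do not correspond to any published method.

```python
def make_arm(n_ice, n_responders, n_non_responders):
    """Build a hypothetical arm; `ice` marks progression or death before the timepoint."""
    return (
        [{"ice": True, "pro_response": None} for _ in range(n_ice)]
        + [{"ice": False, "pro_response": True} for _ in range(n_responders)]
        + [{"ice": False, "pro_response": False} for _ in range(n_non_responders)]
    )


def composite_response_rate(patients):
    """Composite strategy: a patient with an ICE is counted as a non-responder."""
    responders = sum(1 for p in patients if not p["ice"] and p["pro_response"])
    return responders / len(patients)


def observed_response_rate(patients):
    """PRO response rate among patients without an ICE (shown for contrast only)."""
    observed = [p for p in patients if not p["ice"]]
    return sum(1 for p in observed if p["pro_response"]) / len(observed)


# Invented data: 10 patients per arm, with unbalanced ICE rates.
arm_a = make_arm(n_ice=2, n_responders=4, n_non_responders=4)
arm_b = make_arm(n_ice=6, n_responders=2, n_non_responders=2)

for name, arm in (("A", arm_a), ("B", arm_b)):
    print(name, composite_response_rate(arm), observed_response_rate(arm))
```

In this constructed example the composite response rates are 0.4 and 0.2, while the response rate among patients without an ICE is 0.5 in both arms, so the entire between-arm difference on the composite endpoint comes from the imbalance in ICEs. The contrast rate is not itself a valid estimate of any estimand; it is shown only to make the source of the difference visible.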

If the intended estimand should relate to all randomised patients, steps should be taken in statistical analysis and interpretation to ensure that the level of missing data at baseline is explicitly explained and that robust methods for estimating the treatment effect given missing baseline data are used and stated, such as analysis comparing treatment arms based on PRO score at a specific timepoint (rather than change from baseline).

Regardless of the ICE strategy, missing data are likely to still exist in longitudinal trials. Good practices for preventing missing PRO data, including developing clear collection procedures, collecting auxiliary data that may inform missingness, training the sites and patients appropriately, and using electronic capture, should be implemented when possible (Basch et al. Citation2012; Coens et al. Citation2020; Mercieca-Bebber et al. Citation2016). Efforts to ensure maximum compliance should be put in place, e.g., reminders, not allowing patients to miss items, etc. Principled analytic methods to address random missingness include the popular mixed-model repeated measures approach, other applications of maximum-likelihood modelling, and multiple imputation (Bell et al. Citation2019; Fairclough Citation2010). Intermittent missingness may be observed in patient-reported data. While this may be truly random, i.e., MAR, it may also be related to the patient’s current experience. Consider an assessment of patient-reported pain. A patient who is in great pain may not come to the clinic for their assessment on that day or may not open their device if collection is decentralized. When this type of influence is suspected, methods for handling missingness should accommodate missing not at random (MNAR) mechanisms, because the probability of missingness may depend on the uncollected outcome. Strategies to handle MNAR include the family of control-based imputation methods. Differential completion rates between treatment arms raise particular concerns, and the application of a principal stratum approach using information from the post-treatment completion rate observed has been proposed as a useful additional analysis (Roydhouse et al. Citation2019).
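The pain example can be illustrated with a short simulation. Everything in it is an assumption made for illustration: the pain scores are drawn from an arbitrary distribution on a 0-100 scale, the first mechanism drops assessments with a fixed 30% probability (a simplified, completely random case rather than MAR conditional on covariates), and the second mechanism makes the probability of a missed assessment equal to the pain score divided by 100.

```python
import random
from statistics import mean


def simulate_observed_means(n_patients=5000, seed=2023):
    """Return (true mean, mean under random missingness, mean under MNAR)."""
    rng = random.Random(seed)

    # Hypothetical pain scores on a 0-100 scale (higher = worse pain).
    pain = [min(100.0, max(0.0, rng.gauss(50.0, 15.0)))
            for _ in range(n_patients)]

    # Mechanism 1: each assessment has the same 30% chance of being missed,
    # whatever the pain level.
    observed_random = [y for y in pain if rng.random() >= 0.30]

    # Mechanism 2 (MNAR): the chance of a missed assessment increases with
    # the pain score that would have been reported.
    observed_mnar = [y for y in pain if rng.random() >= y / 100.0]

    return mean(pain), mean(observed_random), mean(observed_mnar)


true_mean, mean_random, mean_mnar = simulate_observed_means()
print("All scores:", round(true_mean, 1))
print("Observed, random missingness:", round(mean_random, 1))
print("Observed, MNAR missingness:", round(mean_mnar, 1))
```

Under these assumptions the mean of the observed scores stays close to the full-data mean when missingness is unrelated to pain, but understates pain when patients in greater pain are more likely to miss the assessment. An analysis restricted to observed scores cannot distinguish the two situations, which is why the assumption has to be argued rather than tested.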

2.7.2. Handling death

In therapeutic areas such as oncology or cardiovascular disease, death is usually an important endpoint. Trials of drugs for these indications usually include overall survival as an important endpoint, with a high position in the hierarchy, either primary or key secondary. For PRO endpoints in these trials, death represents a terminal event after which observation is impossible, i.e., no observed values of the PRO variable are available after this event. The analysis of PRO endpoints requiring these values is therefore technically challenging; no method which imputes values post-mortem (either directly or indirectly) is satisfactory, as it raises fundamental questions from both philosophical and statistical angles. Therefore, careful consideration should be given to the clinical question of interest for PRO endpoints in the presence of a non-negligible proportion of patients who die. To date, death has not been treated any differently from regular unobserved values after an ICE, with a hypothetical approach being quite common. For longitudinal change from baseline analysis, a mixed model for repeated measures (MMRM) approach has become common for PRO endpoints and is included in the SISAQOL recommendations. This approach applies a hypothetical strategy to data that are not present (either due to disease progression or death) and therefore seeks to estimate what the outcome would have been had the patients still been alive and on the randomized treatment arm at the timepoint of interest. Whether death should ever be handled with a “hypothetical” approach in oncology studies still requires some reflection from a statistical point of view, due to the potential approaches to imputation or assumptions post-mortem, and whether it really reflects a scenario of clinical interest.

However, alternative analytical approaches also have limitations or other biases. Possible alternative approaches are discussed in a commentary by Ratitch et al. (Ratitch et al. Citation2020), who propose that presenting the treatment benefit “while alive” may be of interest in settings such as palliative care but may be less appropriate for earlier disease settings. The application of a principal stratum of patients who would not die under either randomized arm, i.e., survivors, has been proposed and may have useful application (Roydhouse et al. Citation2019). This approach may be appealing because using baseline covariates to partition data into latent strata makes use of information that may be related to a patient dying and helps to differentiate whether the intervention is effective for a particular group, strengthening the causal interpretation. One drawback to this strategy is the difficulty of explaining the complex statistical concepts and multiple assumptions needed to various stakeholders. A composite approach incorporating death into the endpoint is common, but composite approaches also come with their own limitations, which will be discussed further below.

2.7.3. Treatment policy strategy challenges

The treatment policy strategy has been embraced as the approach closest to the ITT principle among the different strategies that ICH E9 (R1) has proposed. Regulators and payers seem to feel comfortable with it and request it; however, since its introduction, the lack of estimators that align with such an estimand has been revealed. When complete data exist up to and including the timepoint of interest, treatment policy can be applied easily. However, this is hardly ever the case, despite the efforts sponsors are making to improve data collection (see also section Missing data above). The EFSPI Estimands Implementation Working Group (EIWG Citation2023) (European Federation of Pharmaceutical Industries and Associations) is working on establishing appropriate estimators in the presence of missing data that can be applied to any longitudinally collected data in clinical trials, including PROs. The specific challenge with PROs is around collection of data after the ICE to properly target this strategy (Lundy et al. Citation2021). Regulators, payers, sponsors and clinicians are all aware of the value of collecting PRO data for as long as possible in the patient’s journey through their disease and treatment course. While sponsors are making substantial efforts to collect as much patient experience data as possible, collecting PROs after treatment discontinuation can become burdensome for the patient, and also challenging (and costly) from an operational perspective. In large clinical trials that are conducted in diverse populations (e.g., multi-national clinical trials), it also raises the question of the heterogeneity of the post-progression data (e.g., a typical question is that of the therapeutic options available to participants after treatment discontinuation), and therefore of the generalizability of the estimates obtained using post-progression data. 
This leads to protocol schedules that often stop the collection after the patient discontinues treatment, rendering estimation using the treatment policy strategy reliant on strong assumptions. While statisticians are making efforts to establish methods that will “target” treatment policy in the presence of data that were not collected because of protocol limitations, or were partially collected in the best of cases, these will inevitably rely on modelling assumptions and the availability of some data from which information will be borrowed. If no post-ICE data exist, then other approaches, again making strong assumptions, may be used, which may be closer to a composite approach than to a treatment policy (ongoing work by EFSPI EIWG) (European Federation of Pharmaceutical Industries and Associations).

2.7.4. Composite strategy challenges

As mentioned above, the occurrence of death in certain therapeutic areas may bring the need for composite approaches when there is a designated timepoint of interest at which inference on PRO endpoints is sought. Several options exist in the literature (Mallinckrodt et al. Citation2020), each with its own limitations. When the PRO endpoint of interest is binary or time-to-event (i.e., based on a dichotomization of the change from baseline), a non-responder approach for the ICE of interest is simple and attractive. It comes with its own limitations, though, as the PRO event and the ICE contribute equally to the variable (endpoint) attribute, and the approach relies on a threshold for dichotomization of the PRO component. These and other limitations have led authors to suggest that the focus be shifted to continuous/ordinal change from baseline endpoints at specific timepoints of interest (Fiero et al. Citation2022). In this case, ranking approaches may provide useful statistical analysis methods: a ranking scheme must be devised, e.g., patients with unobserved values at the timepoint of interest due to death are assigned the worst ranking, then patients with unobserved values due to disease progression are assigned the second worst ranking, etc. Lastly, the observed values are ranked, e.g., from best to worst change from baseline at the timepoint of interest. However, the ranking method needs to be carefully considered; while ranking death as the worst outcome may be acceptable to many, ranking the rest of the patients (with observed or missing data due to other reasons) requires careful consideration and potential sensitivity analyses to be defined. Rank-based methods (e.g., rank-based ANCOVA, win ratio/win odds) provide inference on the ranks rather than on score values, which makes these results difficult to communicate to clinicians and patients. 
The win ratio method specifically seems to have gained some attention lately, with increasing application in the cardiovascular area (Abdalla et al. Citation2016; Ferreira et al. Citation2020; Kosiborod et al. Citation2021; Pocock et al. Citation2012; Redfors et al. Citation2020). This method is an appealing analysis model for composite endpoints combining death with non-fatal events (Mao and Kim Citation2021), as it prioritizes death over non-fatal events in a natural, hierarchical way. It can therefore be very relevant when considering PRO-related events together with death. Another proposed approach to address a composite endpoint is quantile regression, which is an extension of ordinary linear regression and is used to model the conditional median, in contrast to the conditional mean. Missing data are imputed with a very poor value, e.g., the worst score or even lower, which does not impact the estimation of the difference in medians, and as a result the obtained treatment effect is on the original scale, which can be appealing (Mehrotra et al. Citation2017). To date, these composite methods have not been extensively published on for PRO endpoints and are therefore a new area of research which may be of interest in the future.
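The ranking scheme and the pairwise logic behind the win ratio and win odds can be sketched as follows. This is a deliberately simplified illustration, not the published win ratio procedure: it assumes a single timepoint, treats all deaths as tied (time of death is ignored), places every patient with a value unobserved because of progression in the same second-worst rank, and uses invented patients. The `status` and `change` field names are hypothetical.

```python
def rank_key(patient):
    """Map a patient to a sortable key; a larger key means a better outcome."""
    if patient["status"] == "death":
        return (0, 0.0)              # worst rank
    if patient["status"] == "progression":
        return (1, 0.0)              # second-worst rank, PRO value unobserved
    return (2, patient["change"])    # observed change from baseline


def pairwise_results(treatment, control):
    """Compare every treatment patient with every control patient."""
    wins = losses = ties = 0
    for t in treatment:
        for c in control:
            key_t, key_c = rank_key(t), rank_key(c)
            if key_t > key_c:
                wins += 1
            elif key_t < key_c:
                losses += 1
            else:
                ties += 1
    return wins, losses, ties


def win_ratio(wins, losses, ties):
    """Wins divided by losses; ties are ignored."""
    return wins / losses


def win_odds(wins, losses, ties):
    """Like the win ratio, but each tie counts as half a win and half a loss."""
    return (wins + 0.5 * ties) / (losses + 0.5 * ties)


# Invented patients at the timepoint of interest.
treatment = [
    {"status": "death"},
    {"status": "progression"},
    {"status": "observed", "change": 10.0},
    {"status": "observed", "change": 5.0},
    {"status": "observed", "change": -2.0},
]
control = [
    {"status": "death"},
    {"status": "death"},
    {"status": "progression"},
    {"status": "observed", "change": 3.0},
    {"status": "observed", "change": -5.0},
]

results = pairwise_results(treatment, control)
print("wins, losses, ties:", results)
print("win ratio:", win_ratio(*results))
print("win odds:", win_odds(*results))
```

Changing the order of the tiers in `rank_key`, or splitting the progression tier further, changes the wins and losses, which is the sensitivity to the ranking scheme that the text cautions about. The output is a statement about pairs of patients, not about points on the PRO scale.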

3. Conclusion

This commentary aimed to highlight the relevance of the estimand framework when generating evidence of the benefits of new therapies on PRO endpoints in cancer clinical trials, and to provide a collective statistician’s perspective on the main challenges in applying estimands in this context. Applying the estimand framework to PRO endpoints in cancer clinical trials raises many statistical questions related to the definition of the variable of interest, the management of intercurrent events and the definition of the population-level summary, some of which remain largely unaddressed (management of death and disease progression, exploration of missing data, or definition of the best sensitivity and supplementary analyses). We showed that there is no statistical “magic bullet” and that, while statisticians are important influencers for the estimand specification, they cannot achieve this alone.

We recommend that the estimand framework be thought of as a tool for communication among stakeholders from various backgrounds to better discuss how to define and estimate treatment effects in clinical trials. This is particularly critical for PRO endpoints in cancer clinical trials, where substantive and methodological challenges are interrelated. In this context, the estimand framework will enable a fruitful back-and-forth between statisticians and other stakeholders to set up studies with more relevant PRO objectives and their corresponding statistical analyses.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The authors have not received any direct funding in preparation of this manuscript. RL is an employee of Adelphi Values, KS is an employee of IQVIA, AR is an employee of Modus Outcomes, LF is an employee of Clinical Outcome Solutions.

References

  • Abdalla, S., M. E. Montez-Rath, P. S. Parfrey, and G. M. Chertow. 2016. The win ratio approach to analyzing composite outcomes: An application to the EVOLVE trial. Contemporary Clinical Trials 48:119–124.
  • Addario, B., J. Geissler, M. K. Horn, L. U. Krebs, D. Maskens, K. Oliver, A. Plate, E. Schwartz, and N. Willmarth. 2020. Including the patient voice in the development and implementation of patient‐reported outcomes in cancer clinical trials. Health Expectations 23 (1):41–51.
  • Basch, E., A. P. Abernethy, C. D. Mullins, B. B. Reeve, M. L. Smith, S. J. Coons, J. Sloan, K. Wenzel, C. Chauhan, and W. Eppard. 2012. Recommendations for incorporating patient-reported outcomes into clinical comparative effectiveness research in adult oncology. Journal of Clinical Oncology 30 (34):4249–4255.
  • Bell, M. L., and D. L. Fairclough. 2014. Practical and statistical issues in missing data for longitudinal patient-reported outcomes. Statistical Methods in Medical Research 23 (5):440–459.
  • Bell, M. L., L. Floden, B. A. Rabe, S. Hudgens, H. M. Dhillon, V. J. Bray, and J. L. Vardy. 2019. Analytical approaches and estimands to take account of missing patient-reported data in longitudinal studies. Patient Related Outcome Measures 10:129–140. doi:10.2147/PROM.S178963.
  • Bell, J., A. Hamilton, O. Sailer, and F. Voss. 2021. The detailed clinical objectives approach to designing clinical trials and choosing estimands. Pharmaceutical Statistics 20 (6):1112–1124.
  • Brundage, M. D., N. L. Crossnohere, J. O’Donnell, S. Cruz Rivera, R. Wilson, A. W. Wu, D. Moher, D. Kyte, B. B. Reeve, and A. Gilbert. 2022. Listening to the patient voice adds value to cancer clinical trials. Journal of the National Cancer Institute 114 (10):1323–1332.
  • Calvert, M., J. Blazeby, D. G. Altman, D. A. Revicki, D. Moher, M. D. Brundage, and CONSORT PRO Group. 2013. Reporting of patient-reported outcomes in randomized trials: The CONSORT PRO extension. JAMA 309 (8):814–822.
  • Calvert, M., M. King, R. Mercieca-Bebber, O. Aiyegbusi, D. Kyte, A. Slade, A.-W. Chan, E. Basch, J. Bell, and A. Bennett. 2021. SPIRIT-PRO extension explanation and elaboration: Guidelines for inclusion of patient-reported outcomes in protocols of clinical trials. British Medical Journal Open 11 (6):e045105.
  • Calvert, M., D. Kyte, R. Mercieca-Bebber, A. Slade, A. W. Chan, M. T. King, A. Hunn, A. Bottomley, A. Regnault, and C. Ells. 2018. Guidelines for inclusion of patient-reported outcomes in clinical trial protocols: The SPIRIT-PRO extension. JAMA 319 (5):483–494.
  • Center for Drug Evaluation and Research and Center for Biologics Evaluation and Research. 2021. E9(R1) statistical principles for clinical trials: Addendum: Estimands and sensitivity analysis in clinical trials. Accessed February 2023. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/e9r1-statistical-principles-clinical-trials-addendum-estimands-and-sensitivity-analysis-clinical.
  • Coens, C., M. Pe, A. C. Dueck, J. Sloan, E. Basch, M. Calvert, A. Campbell, C. Cleeland, K. Cocks, and L. Collette. 2020. International standards for the analysis of quality-of-life and patient-reported outcome endpoints in cancer randomised controlled trials: Recommendations of the SISAQOL consortium. The Lancet Oncology 21 (2):e83–e96.
  • European Federation of Pharmaceutical Industries and Associations. 2023. EFSPI/EFPIA estimands implementation WG (EIWG). Accessed February 2023. https://efspi.org/EFSPI/Working_Groups/EFSPI_EFPIA_EIWG.aspx.
  • Fairclough, D. L. 2010. Design and analysis of quality of life studies in clinical trials. Boca Raton, FL USA: Chapman and Hall/CRC.
  • Ferreira, J. P., P. S. Jhund, K. Duarte, B. L. Claggett, S. D. Solomon, S. Pocock, M. C. Petrie, F. Zannad, and J. J. McMurray. 2020. Use of the win ratio in cardiovascular trials. JACC: Heart Failure 8 (6):441–450.
  • Fielding, S., G. Maclennan, J. A. Cook, and C. R. Ramsay. 2008. A review of RCTs in four medical journals to assess the use of imputation to overcome missing data in quality of life outcomes. Trials 9:1–6.
  • Fielding, S., A. Ogbuagu, S. Sivasubramaniam, G. MacLennan, and C. Ramsay. 2016. Reporting and dealing with missing quality of life data in RCTs: Has the picture changed in the last decade? Quality of Life Research 25:2977–2983.
  • Fiero, M. H., M. Pe, C. Weinstock, B. L. King-Kallimanis, S. Komo, H. D. Klepin, S. W. Gray, A. Bottomley, P. G. Kluetz, and R. Sridhara. 2020. Demystifying the estimand framework: A case study using patient-reported outcomes in oncology. The Lancet Oncology 21 (10):e488–e494.
  • Fiero, M. H., J. K. Roydhouse, V. Bhatnagar, T.-Y. Chen, B. L. King-Kallimanis, S. Tang, and P. G. Kluetz. 2022. Time to deterioration of symptoms or function using patient-reported outcomes in cancer trials. The Lancet Oncology 23 (5):e229–e234.
  • Floden, L., and M. L. Bell. 2019. Imputation strategies when a continuous outcome is to be dichotomized for responder analysis: A simulation study. BMC Medical Research Methodology 19 (1):1–11.
  • Gnanasakthy, A., J. Russo, K. Gnanasakthy, N. Harris, and C. Castro. 2022. A review of patient-reported outcome assessments in registration trials of FDA-approved new oncology drugs (2014–2018). Contemporary Clinical Trials 120:106860.
  • Hamel, J.-F., P. Saulnier, M. Pe, E. Zikos, J. Musoro, C. Coens, and A. Bottomley. 2017. A systematic review of the quality of statistical methods employed for analysing quality of life data in cancer randomised controlled trials. European Journal of Cancer 83:166–176.
  • Kluetz, P. G., A. Slagle, E. J. Papadopoulos, L. L. Johnson, M. Donoghue, V. E. Kwitkowski, W.-H. Chen, R. Sridhara, A. T. Farrell, and P. Keegan. 2016. Focusing on core patient-reported outcomes in cancer clinical trials: Symptomatic adverse events, physical function, and disease-related symptoms. Clinical Cancer Research 22 (7):1553–1558.
  • Kosiborod, M. N., R. Esterline, R. H. Furtado, J. Oscarsson, S. B. Gasparyan, G. G. Koch, F. Martinez, O. Mukhtar, S. Verma, and V. Chopra. 2021. Dapagliflozin in patients with cardiometabolic risk factors hospitalised with COVID-19 (DARE-19): A randomised, double-blind, placebo-controlled, phase 3 trial. The Lancet Diabetes & Endocrinology 9 (9):586–594.
  • Kyte, D., A. Retzer, K. Ahmed, T. Keeley, J. Armes, J. M. Brown, L. Calman, A. Gavin, A. W. Glaser, and D. M. Greenfield. 2019. Systematic evaluation of patient-reported outcome protocol content and reporting in cancer trials. JNCI Journal of the National Cancer Institute 111 (11):1170–1178.
  • Lawrance, R., E. Degtyarev, P. Griffiths, P. Trask, H. Lau, D. D’Alessio, I. Griebsch, G. Wallenstein, K. Cocks, and K. Rufibach. 2020. What is an estimand & how does it relate to quantifying the effect of treatment on patient-reported quality of life outcomes in clinical trials? Journal of Patient-Reported Outcomes 4 (1):1–8.
  • Lundy, J. J., C. D. Coon, A.-C. Fu, and V. Pawar. 2021. Collection of post-treatment PRO data in oncology clinical trials. Therapeutic Innovation & Regulatory Science 55:111–117.
  • Mallinckrodt, C., J. Bell, G. Liu, B. Ratitch, M. O’Kelly, I. Lipkovich, P. Singh, L. Xu, and G. Molenberghs. 2020. Aligning estimators with estimands in clinical trials: Putting the ICH E9 (R1) guidelines into practice. Therapeutic Innovation & Regulatory Science 54:353–364.
  • Mao, L., and K. Kim. 2021. Statistical models for composite endpoints of death and nonfatal events: A review. Statistics in Biopharmaceutical Research 13 (3):260–269.
  • Mehrotra, D. V., F. Liu, and T. Permutt. 2017. Missing data in clinical trials: Control‐based mean imputation and sensitivity analysis. Pharmaceutical Statistics 16 (5):378–392.
  • Mercieca-Bebber, R., M. J. Palmer, M. Brundage, M. Calvert, M. R. Stockler, and M. T. King. 2016. Design, implementation and reporting strategies to reduce the instance and impact of missing patient-reported outcome (PRO) data: A systematic review. British Medical Journal Open 6 (6):e010938.
  • National Research Council. 2010. The prevention and treatment of missing data in clinical trials. Committee on National Statistics, Division of Behavioral and Social Sciences and Education.
  • Oncology Center of Excellence, Center for Biologics Evaluation and Research and Center for Drug Evaluation and Research. 2021. Core patient-reported outcomes in cancer clinical trials.
  • Oncology estimand working group. 2023, February. http://www.oncoestimand.org/.
  • Pocock, S. J., C. A. Ariti, T. J. Collier, and D. Wang. 2012. The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal 33 (2):176–182.
  • Ratitch, B., J. Bell, C. Mallinckrodt, J. W. Bartlett, N. Goel, G. Molenberghs, M. O’Kelly, P. Singh, and I. Lipkovich. 2020. Choosing estimands in clinical trials: Putting the ICH E9 (R1) into practice. Therapeutic Innovation & Regulatory Science 54:324–341.
  • Redfors, B., J. Gregson, A. Crowley, T. McAndrew, O. Ben-Yehuda, G. W. Stone, and S. J. Pocock. 2020. The win ratio approach for composite endpoints: Practical guidance based on previous experience. European Heart Journal 41 (46):4391–4399.
  • Roydhouse, J. K., R. Gutman, V. Bhatnagar, P. G. Kluetz, R. Sridhara, and P. S. Mishra‐Kalyani. 2019. Analyzing patient‐reported outcome data when completion differs between arms in open‐label trials: An application of principal stratification. Pharmacoepidemiology and Drug Safety 28 (10):1386–1394.
  • Rubin, D. B. 1976. Inference and missing data. Biometrika 63 (3):581–592.
  • Teixeira, M. M., F. C. Borges, P. S. Ferreira, J. Rocha, B. Sepodes, and C. Torre. 2022. A review of patient-reported outcomes used for regulatory approval of oncology medicinal products in the European Union between 2017 and 2020. Frontiers in Medicine 9:968272.
  • US Food and Drug Administration. 2021, June. Core patient-reported outcomes in cancer clinical trials. Draft guidance for industry.
  • Walton, M. K., J. H. Powers, J. Hobart, D. Patrick, P. Marquis, S. Vamvakas, M. Isaac, E. Molsen, S. Cano, and L. B. Burke. 2015. Clinical outcome assessments: Conceptual foundation—report of the ISPOR clinical outcomes assessment–emerging good practices for outcomes research task force. Value in Health 18 (6):741–752.