Open Peer Commentaries

Machine Learning Algorithms in the Personalized Modeling of Incapacitated Patients’ Decision Making—Is It a Viable Concept?

This article refers to:
A Personalized Patient Preference Predictor for Substituted Judgments in Healthcare: Technically Feasible and Ethically Desirable

New informatics technologies are becoming increasingly important in medical practice. Machine learning (ML) and deep learning (DL) systems enable data analysis and the formulation of medical recommendations. Furthermore, they are increasingly being used as medical decision support systems (MDSS). Earp and colleagues (2024) propose using MDSS for the personalized modeling of incapacitated patients' decision-making processes. One such system, the Personalized Patient Preference Predictor (P4), would be used when a patient has lost the ability to make decisions and the doctor has not obtained clear instructions from their family. In such cases, physicians could use P4 to reconstruct the patient's individual therapeutic preferences and formulate a surrogate decision consistent with their personality profile. This proposal, however, raises many reservations, both ethical and methodological. Among the latter, we address the following problems: P4's predictions being made on the basis of non-causal knowledge, P4's training data, and the validation of P4.

In modern medicine, the term "personalization" refers to adapting medical activities to the patient's profile. A necessary condition for implementing such personalized actions is having causal knowledge. Gene therapy medicinal products (GTMP) are an example of personalized therapies: they are designed on the basis of causal knowledge about the molecular mechanisms of genetic diseases and about the mechanism of action of genetic engineering products. Thanks to accurate predictions based on causal knowledge, the role of clinical trials and statistical inferences in assessing the effectiveness and safety of GTMP can be reduced (Rzepiński 2024). The situation is different in the case of MDSS.

As MDSS, both ML and DL systems can be based on different architectures using different computational procedures: rough sets, fuzzy sets, and others (Bello and Falcon 2017; Rzepiński 2014). Broadly speaking, these are systems in which data analysis is carried out on the basis of inference by analogy. MDSS compare the objects under investigation to certain reference objects, which may be established in advance (supervised systems) or generated by the systems themselves (unsupervised systems). MDSS can compare X-ray, ultrasound, or histopathological images of pathophysiological changes, identify similarity relationships between these objects, and generate appropriate recommendations.
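
To make the idea of inference by analogy more concrete, the following minimal sketch shows a nearest-neighbor comparison of a new case against labeled reference cases, one simple form of the similarity-based inference that MDSS rely on. The feature vectors and labels are hypothetical and purely illustrative; they do not reproduce any particular MDSS.

```python
# Illustrative sketch only: similarity-based (analogical) inference of the kind
# MDSS rely on, here a nearest-neighbor comparison against reference cases.
# The feature vectors and labels are hypothetical.
import numpy as np

reference_cases = np.array([
    [0.8, 0.1, 0.3],   # hypothetical image-derived feature vectors
    [0.2, 0.9, 0.7],
    [0.7, 0.2, 0.4],
])
reference_labels = ["benign", "malignant", "benign"]  # labels fixed in advance (supervised)

def recommend(new_case):
    """Return the label of the most similar reference case (inference by analogy)."""
    distances = np.linalg.norm(reference_cases - new_case, axis=1)
    return reference_labels[int(np.argmin(distances))]

print(recommend(np.array([0.75, 0.15, 0.35])))  # prints "benign"
```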

Inference by analogy is a type of inductive inference and is therefore logically invalid. Inductive inference is successfully used in statistical analyses of clinical trial data. Nevertheless, when applying guidelines formulated on the basis of statistical analyses, we are aware of the uncertainty of the findings and use concepts such as the confidence interval, relative or absolute risk reduction, and the likelihood ratio. The accuracy of the indications is therefore not limited to a specific patient but applies to the entire target population. The goal of analyses using inductive inference in statistics is thus different from the goal we would like to achieve by modeling a specific patient's preferences with MDSS. When making decisions based on the patient's personality profile reconstructed by P4, we do not want the decision to be characterized merely by statistically satisfactory accuracy; we want to be certain that it is consistent with the patient's preferences. However, we will never have such certainty, because inference by analogy is logically invalid, regardless of whether it is performed by humans or by MDSS.
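
As a worked illustration of the population-level character of these measures, the following sketch computes the absolute and relative risk reduction from hypothetical trial counts. The numbers are invented and serve only to show that such quantities describe a target population rather than guarantee anything for an individual patient.

```python
# Hypothetical trial counts, purely illustrative.
events_treatment, n_treatment = 30, 200   # events in the treatment arm
events_control, n_control = 50, 200       # events in the control arm

risk_treatment = events_treatment / n_treatment   # 0.15
risk_control = events_control / n_control         # 0.25

arr = risk_control - risk_treatment   # absolute risk reduction: 0.10
rrr = arr / risk_control              # relative risk reduction: 0.40

# These are population-level summaries; they license no certainty about
# the outcome (or the preference) of any single patient.
print(f"ARR = {arr:.2f}, RRR = {rrr:.2f}")
```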

Earp and colleagues claim that P4 should be trained on writing produced by the patient in the past, such as emails, blog posts, and social media posts, supplemented by information reflecting the individual's past choices or behavior, treatment decisions, electronic health records, and explicit responses to questions about hypothetical treatments (Earp et al. 2024). However, it seems unlikely that the amount of data obtained from an individual patient will be sufficient to create a digital mirror of their personality. Meanwhile, the guidelines generated by MDSS are highly effective only when these systems are trained on very large amounts of data; the reliability of their recommendations decreases without a Big Data background.

An additional problem with P4 is the type of data used in its training. The P4 system is supposed to be a kind of large language model (LLM). Earp and colleagues therefore assume that, in order to adequately model the patient's preferences, the model can be limited to verbalized data. From the perspective of psychological research, this is an oversimplification. Our preferences can be communicated by various nonverbal means: gestures, facial expressions, emoticons used on social media, and so on. This, however, poses a challenge to the P4 model, because it means that a large part of the nonverbalized beliefs that potentially influence the patient's decision making would not be included in the data set analyzed by P4.

The patient's value system is constantly changing under the influence of new experiences, books read, new acquaintances, social media, and world news. Even the choice of a theater performance can induce completely new preferences. Can we therefore be sure that preferences reconstructed on the basis of past declarations can be applied to the patient's radically different health situation? After all, it is very often the difficult health situation itself that leads the patient to reevaluate their system of preferences. Furthermore, there is a gap in knowledge and there are various misconceptions about end-of-life issues among the general public (Canny, Mason, and Boyd 2023). If the patient has not been in such a situation before, no data from their daily linguistic behavior will justify conclusions about what decision they would make in a situation involving a risk to their quality of life.

Another methodological caveat concerns validating the P4 system. MDSS are very often perceived as black box artificial intelligence (AI) because they generate decisions based on operations that we cannot understand or explain (Durán and Jongsma 2021; London 2019). As Sparrow and Hatherley write, even their designers are unable to understand and explain why an MDSS produces the output that it does (2020, 16). This situation poses a particularly serious challenge to all concepts of personalized prediction. The problem is that validation of the P4 system cannot be performed: no method allows us to check whether the therapeutic decision generated by the system is consistent with the preferences of the incapacitated patient. The indications of the P4 system are therefore essentially unfalsifiable.

Our last two comments concern ethical issues; however, they follow from the methodological reservations. Earp and colleagues state that the P4 system increases the patient's autonomy, because the individual should be asked how, if at all, they would like a P4 to be used in case they lose decisional capacity (Earp et al. 2024). However, if, based on the course of the disease, the doctor anticipates that the patient may lose their decision-making capacities, would it not be better to ask the patient about their decision, rather than to ask them to decide about the decision generated by P4? How will a patient perceive their dignity in a situation where they are asked about the possibility of submitting their fate to the decisions of a computational system whose recommendations cannot even be explained, let alone validated? This certainly does not increase patients' autonomy.

Moreover, it can be argued that systems like P4 fail to respect an incapacitated individual's autonomy because, as machines, they cannot appreciate the reasons and values that underpin patients' preferences (John 2018; Sharadin 2018). Earp and colleagues reject this objection, stating that it assumes a higher ethical standard for MDSS than for human surrogates, for example, family members. However, it seems that the objection is valid. The LLM on which P4 would be based is a machine that generates sequences of expressions according to the syntactic rules identified in the training process. We assign meanings to these expressions, but we have no basis for claiming that the LLM understands them. Moreover, we can safely conclude that, when combining linguistic expressions in accordance with the rules of grammar, the LLM is not conscious of, and cannot be aware of, feeling pain, the inability to move, the loss of independence and intellectual capacities, or the fear of nonexistence. Therefore, although human surrogates may make decisions that fail to take into account the patient's changing preferences, it is at least easier for them to imagine the patient's feelings and the dilemmas related to the choice being made. Thus, it seems that, among the various applications of MDSS, their use in modeling the personalized decisions of incapacitated patients raises the most methodological and ethical concerns.

DISCLOSURE STATEMENT

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The author(s) reported there is no funding associated with the work featured in this article.

REFERENCES

  • Bello, R., and R. Falcon. 2017. Rough sets in machine learning: A review. In Thriving rough sets. Studies in computational intelligence, vol 708, eds. G. Wang, A. Skowron, Y. Yao, D. Ślęzak, and L. Polkowski. Cham: Springer. doi: 10.1007/978-3-319-54966-8_5.
  • Canny, A., B. Mason, and K. Boyd. 2023. Public perceptions of advance care planning (ACP) from an international perspective: A scoping review. BMC Palliative Care 22 (1):107. doi: 10.1186/s12904-023-01230-4.
  • Durán, J. M., and K. Jongsma. 2021. Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI. Journal of Medical Ethics 47:329–35. doi: 10.1136/medethics-2020-106820.
  • Earp, B., S. Mann, J. Allen, S. Salloch, V. Suren, K. Jongsma, M. Braun, D. Wilkinson, W. Sinnott-Armstrong, S. Rid, et al. 2024. A personalized patient preference predictor for substituted judgments in healthcare: Technically feasible and ethically desirable. The American Journal of Bioethics 24 (7):13–26. doi: 10.1080/15265161.2023.2296402.
  • John, S. D. 2018. Messy autonomy: Commentary on patient preference predictors and the problem of naked statistical evidence. Journal of Medical Ethics 44 (12):864. doi: 10.1136/medethics-2018-104941.
  • London, A. J. 2019. Artificial intelligence and black-box medical decisions: Accuracy versus explainability. The Hastings Center Report 49 (1):15–21. doi: 10.1002/hast.973.
  • Rzepiński, T. 2024. Problems and limitations of clinical trials for advanced therapy medicinal products (ATMP). Accessed April 15, 2024. https://etykabadan.komisja.uj.edu.pl/documents/149614951/151135720/03.+Ekspertyza+ATMP+%28wersja+angielska%29/57f106a0-c504-4734-b386-628b3ff7442c
  • Rzepiński, T. 2014. Randomized controlled trials versus rough set analysis: Two competing approaches for evaluating clinical data. Theoretical Medicine and Bioethics 35 (4):271–88. doi: 10.1007/s11017-014-9283-7.
  • Sharadin, N. P. 2018. Patient preference predictors and the problem of naked statistical evidence. Journal of Medical Ethics 44 (12):857–62. doi: 10.1136/medethics-2017-104509.
  • Sparrow, R., and J. Hatherley. 2020. High hopes for “deep medicine”? AI, economics, and the future of care. The Hastings Center Report 50 (1): 14–7. doi: 10.1002/hast.1079.