Review Articles

Interventional probability of causation (IPoC) with epidemiological and partial mechanistic evidence: benzene vs. formaldehyde and acute myeloid leukemia (AML)

Pages 252-289 | Received 20 Feb 2024, Accepted 25 Mar 2024, Published online: 16 May 2024

Abstract

Introduction

Causal epidemiology for regulatory risk analysis seeks to evaluate how removing or reducing exposures would change disease occurrence rates. We define interventional probability of causation (IPoC) as the change in probability of a disease (or other harm) occurring over a lifetime or other specified time interval that would be caused by a specified change in exposure, as predicted by a fully specified causal model. We define the closely related concept of causal assigned share (CAS) as the predicted fraction of disease risk that would be removed or prevented by a specified reduction in exposure, holding other variables fixed. Traditional approaches used to evaluate the preventable risk implications of epidemiological associations, including population attributable fraction (PAF) and the Bradford Hill considerations, cannot reveal whether removing a risk factor would reduce disease incidence. We argue that modern formal causal models, coupled with causal artificial intelligence (CAI) and realistically partial and imperfect knowledge of underlying disease mechanisms, show great promise for determining and quantifying IPoC and CAS for exposures and diseases of practical interest.

Methods

We briefly review key CAI concepts and terms and then apply them to define IPoC and CAS. We present steps to quantify IPoC using a fully specified causal Bayesian network (BN) model. Useful bounds for quantitative IPoC and CAS calculations are derived for a two-stage clonal expansion (TSCE) model for carcinogenesis and illustrated by applying them to benzene and formaldehyde based on available epidemiological and partial mechanistic evidence.

Results

Causal BN models for benzene and risk of acute myeloid leukemia (AML) incorporating mechanistic, toxicological and epidemiological findings show that prolonged high-intensity exposure to benzene can increase risk of AML (IPoC of up to 7e-5, CAS of up to 54%). By contrast, no causal pathway leading from formaldehyde exposure to increased risk of AML was identified, consistent with much previous mechanistic, toxicological and epidemiological evidence; therefore, the IPoC and CAS for formaldehyde-induced AML are likely to be zero.

Conclusion

We conclude that the IPoC approach can differentiate between likely and unlikely causal factors and can provide useful upper bounds for IPoC and CAS for some exposures and diseases of practical importance. For causal factors, IPoC can help to estimate the quantitative impacts on health risks of reducing exposures, even in situations where mechanistic evidence is realistically incomplete and individual-level exposure-response parameters are uncertain. This illustrates the strength that can be gained for causal inference by using causal models to generate testable hypotheses and then obtaining toxicological data to test the hypotheses implied by the models—and, where necessary, refine the models. This virtuous cycle provides additional insight into causal determinations that may not be available from weight-of-evidence considerations alone.

Introduction

An enduring challenge for quantitative health risk assessments used in occupational and public health regulations, worker compensation, and some toxic tort litigation has been how to quantify the extent to which changing (e.g. reducing or eliminating) a specific exposure would change the probabilities of specific adverse health outcomes occurring. This has remained a challenge not only for theorists but also for practitioners. One major obstacle is that most epidemiological evidence is purely observational and limited to detecting and characterizing observed associations between exposure and response variables, perhaps using regression models to “control” for levels of other observed variables, especially confounders. Unlike experiments in which the impact of an intervention can be more directly evaluated, methods traditionally used in epidemiology (e.g. Bradford Hill’s nine viewpoints) typically rely on critical inspection of these observed associations while trying to rule out alternative non-causal explanations (e.g. chance, biases and confounding) to support causal inference. However, these methods and their respective measures (e.g. estimated “burden of disease,” “population attributable fraction,” and other quantities derived from relative risk estimators) cannot demonstrate whether removing any risk factor prevents disease, and therefore cannot quantify the causal contribution of any risk factor.

Regulators may want to know by how much a contemplated reduction in exposure to a noxious substance would change the future probabilities of adverse health effects over time in exposed workers or members of the public. Litigants argue about whether and by how much a plaintiff’s lung cancer would have been delayed (or, perhaps, prevented) had a specific past occupational exposure not occurred, i.e. “but for” such exposure, would the cancer not have occurred when it did, or at all? Congressional committees have resorted to tables of “probability of causation” to allocate compensation payments for cancers of types that are associated with occupational exposures such as ionizing radiation. In each case, the technical challenge is to quantify an interventional causal effect on risk, i.e. the extent to which changing exposure would change the probabilities of adverse health outcomes. This indicates the need for robust methods that use data and causal knowledge (or assumptions) to estimate, or at least put quantitative bounds on, the change in probability of harm caused by a change in exposure, a quantity that we propose to call the interventional probability of causation (IPoC). IPoC is intended to be useful for risk assessment practitioners; accordingly, we demonstrate its application to assessing the effects on acute myeloid leukemia (AML) risks of changing or intervening on exposures to benzene and formaldehyde, substances considered by the International Agency for Research on Cancer (IARC) and other institutions including regulatory agencies to be “known” (i.e. Group 1) causes of human leukemias including AML (IARC 2018, IARC 2012, USEPA 2000, USEPA 2022).

PART ONE: Interventional probability of causation (IPoC)—concepts and methods

I. IPoC definitions and concepts

Intuitively, it is tempting to define the causal effect of an exposure on risk of disease or other harm in an individual as the change in probability of harm (over a stated period, such as a lifetime) caused by the exposure. This, in turn, might be equated with the difference in probability of harm with and without the exposure, which contrasts with the probability of difference in outcomes, harm or no harm, with and without the exposure, as emphasized in some “potential outcomes” conceptualizations of causal effects. For example, Imbens and Rubin (2015, Chapter 1) state that “the causal effect is the comparison of potential outcomes, for the same unit, at the same moment in time post-treatment.” Such informal constructs raise important conceptual and practical difficulties involving isolating the effect of the exposure from other effects such as confounding and effect modification (i.e. conceptually, the difference in probability of harm caused by an exposure versus its absence may depend on the levels of other variables). In this case, it is not clear how much of the difference should be attributed to the exposure, as opposed to the covariates with which exposure interacts in affecting risk (Land and Gefeller 2000). It also may not be clear whether the exposure simply correlates with causal factors and effect modifiers but plays no direct causal role, in which case the true probability of causing harm should be zero. Pragmatically, neither probabilities nor changes (or differences) in probabilities for different levels or histories of exposure can be observed for any individual, making it challenging to estimate them from observations or other data. Concepts from causal artificial intelligence (CAI), however, can help resolve many of these difficulties; they are summarized next.

Literature review and methodology background on causal artificial intelligence (CAI) concepts and terminology used in epidemiology

This section summarizes and illustrates key CAI concepts and terms that also are used in epidemiology.

  • A causal model is a collection of variables (typically including random variables) in which the values or probability distributions of some of the variables are determined by the values of others. These dependency relations are often shown by arrows between nodes in a directed acyclic graph (DAG) model. For example, if the risk of acute myeloid leukemia (AML) in an individual’s lifetime depends on his or her benzene exposure and karyotype (or genetic makeup), then a corresponding fragment of a DAG model reflecting these dependencies would be as follows: benzene exposure history → AML ← karyotype.

  • To create a well-defined model, each variable must be clearly (ideally, operationally) defined; for example, we must specify whether AML refers to diagnosis with AML or to death with AML. Similarly, exposures must be defined precisely and ideally quantified, as agents capable of causing a disease at high concentrations may not cause the same disease at very low levels, and are unlikely to do so at endogenous levels. These variable-definition issues are the same for causal modeling as for statistical modeling.

  • In a causal DAG network, also called a causal Bayesian network (BN) model, the nodes represent variables (Bielza and Larrañaga 2014). Each variable’s value (in the special case of deterministic causal relationships) or probability distribution of values (in the more general case of random variables) depends on the values of the variables that point into it, i.e. its “parents” in the DAG.

  • A conditional probability table (CPT) for a variable is a model that specifies the conditional probability distribution of the variable for any combination of values of its parents, i.e. of the direct causes that point to it in a causal BN. In the simplest cases, this model is an explicit table giving the probability of each possible value of the variable for each possible combination of values of its parents. This is practical only when variables have few discrete levels. More generally, a CPT is a statistical or machine-learning model such as classification and regression tree (CART), regression model, or random forest ensemble that specifies the conditional probability distribution of the variable for any combination of values of its parents. For example, in the BN model: benzene exposure history → AML ← karyotype, suppose that there are only two levels for benzene exposure history, “Occupational” and “Non-occupational”; and only two relevant equivalence classes of karyotypes, which we will call “Susceptible” and “Non-susceptible”. Then a hypothetical CPT for AML probability could be displayed as in . The notation P(AML = 1 | Pa(AML)) in the right column stands for the conditional probability that AML occurs (where the random variable AML is an indicator variable with value AML = 1 if AML occurs in the individual’s lifetime and AML = 0 otherwise), given the values of its parents, as shown in the first two columns. For example, this hypothetical CPT shows that the conditional probability that AML occurs in someone with benzene exposure history = Non-occupational and karyotype = Susceptible is 0.006. A CPT generalizes the concept of a dose-response or exposure-response function by including other causally relevant variables for predicting individual response probabilities (such as karyotype, in this example) in addition to exposure.

  • Input variables, meaning variables with no parents and only outward-directed arrows (i.e. arrows pointing away from themselves toward other variables), also have probability tables. These tables assign probability “1” to any known (or assumed) input value. If an input value is uncertain, then its table gives probabilities for its different possible values. In the terminology of the field, these tables are called marginal probability distributions of the input variables, but we will use the term “CPT” generically to include the marginal probabilities of input nodes as well as the conditional probabilities of nodes with parents. For health risk analysis applications, exposure is usually an input variable of interest, and death or other health harm is often an output variable (i.e. one with only inward-pointing arrows) of interest.

  • A causal CPT is one in which the conditional probability distribution of a variable is completely determined by the values of its parents. The property of invariant causal prediction (ICP) states that a causal CPT is the same (i.e. “invariant”) for all contexts and interventions (Mooij et al. 2020). A causal CPT may be thought of roughly as a probabilistic causal law that is the same for everyone and under all conditions. If we assume that the CPT in is a causal CPT, then the response probability P(AML = 1 | Pa(AML)) is the same for everyone with the same values of the parents Pa(AML), and hence conditionally independent of all other variables, such as sex, age, co-exposures, co-morbidities, and so forth, given the values of benzene exposure history and karyotype.

  • If, to the contrary, CART analysis or other methods reveal that these other considerations help to predict response probabilities even after conditioning on the values of benzene exposure history and karyotype, then would not be a causal CPT. Rather, it would be a mixture of underlying causal CPTs, with mixing weights reflecting the prevalence of these other relevant conditions in the data from which it is derived. As a simplified example, suppose that shows the true causal CPT. Here, sex has been added as a parent of AML. is formed from by aggregating over (i.e. ignoring or, in BN jargon, “marginalizing out”) sex in a population where the distribution of sex values is half men or “M” and half women or “W”. The CPT in describes the exposure-karyotype-response-probability specifically for a population with equal numbers of men and women. A different CPT would result (i.e. the one in would not be invariant) for populations with different sex ratios.

  • With sufficient data, CPTs estimated from one population can be generalized, adjusted, or “transported” to apply to different target populations by conditioning on covariates to make the original and new populations exchangeable (Degtiar and Rose 2023; Robertson et al. 2022). For example, to transport the CPT in to a target population consisting of all men, the numbers in it would be adjusted to coincide with the top half of . For a target population of women, the transported CPT would coincide with the bottom half of . For target populations with other sex ratios, the transported CPTs would consist of corresponding mixtures (i.e. weighted averages) of the men-specific and women-specific CPTs.

  • Widely used measures of risk contrast between exposed and unexposed groups (or more-exposed and less-exposed groups or, in epidemiological case-control studies, the exposure prevalence between groups with and without the disease outcome) include odds ratios, hazard ratios, rate differences, logistic regression coefficients, and proportional hazards model regression coefficients. These measures cannot be transported across populations, calling into question much applied work that uses meta-analysis to contrast or combine results from different study populations (Didelez and Stensrud 2022). These measures also lack clear causal interpretations (except in rare special cases of “collapsibility”) when they are based on data that average over distributions of omitted variables that help to predict risk, even in the absence of confounding (Daniel et al. 2021; Whitcomb and Naimi 2021). Their numerical values depend on the distributions of the omitted covariates, which are typically unknown.

  • By contrast, a causal CPT such as that contains all of the causal parents of AML is the same (“invariant”) for all populations, since it does not average over omitted variables that would affect prediction of the output variable, AML. This invariant causal prediction (ICP) property can be taken as a defining property of a causal CPT. It can be used to help identify causal CPTs from many types of data from multiple studies, with or without interventions (Gamella and Heinze-Deml 2020). (In reality, of course, the hypothetical example in is oversimplified: the true causal CPT for AML is likely to depend on many things other than just sex, benzene exposure history, and karyotype. Practical work requires dealing with CPTs that are mixtures of underlying causal CPTs. But the concept of an invariant causal CPT clarifies what we seek to quantify from data.)

  • If the uncertain values of different variables are correlated with each other, then this dependency is expressed by paths linking them in the causal network. Undirected links may be used when there is no clear causal direction for the dependence between two variables. For example, if the probability of the sensitive karyotype depends on sex, then the corresponding causal BN network structure could be shown as in .

  • If it is not clear that sex actually directly determines probabilities of karyotypes, for example if both sex and karyotype might have an unobserved common cause such as genotype, as shown in , then the arrow from sex to karyotype in can be replaced by a simple undirected line (i.e. no arrow) to indicate that sex and karyotype are correlated (or, more generally, not statistically independent of each other) while remaining agnostic about the underlying causes and direction of this statistical dependence between them.

  • Although we will not pursue them here, distinctions among direct, indirect, and total effects and concepts of mediation and moderation (Agler and De Boeck 2017; Blair 2023) can be clarified with the help of DAG models such as . In , sex has both a direct effect on AML, indicated by the arrow from sex to AML, and an indirect effect mediated by karyotype, indicated by the arrow from sex to karyotype and another from karyotype to AML. These distinctions can be refined, e.g. to distinguish between the direct effect of sex on AML when all other variables are held fixed at the levels they currently have for each individual (“natural direct effect”) and the direct effect of sex on AML when all other variables are held fixed at specified nominal levels (“controlled direct effect”). We will generalize these concepts by considering scenarios that specify the inputs to a causal model, including the levels of covariates that are held fixed at user-specified values. Input scenarios will be discussed further shortly.

  • A common causal interpretation of a CPT for a variable is that if the value of any parent is changed, then the value (in deterministic cases) or conditional probability distribution of the value (in more general cases) of the variable changes in response (Kossakowski et al. 2021). Pearl (2009) emphasized the importance of what he termed the basic distinction between statistical associations, which do not describe how counterfactual changes in the values of some variables would change the probability distributions of others, and causal relationships, which do address changes. This process of change propagation from parents to children (i.e. the nodes into which they point) is an abstraction of a dynamic process that usually is not described in detail. The new value or probability distribution for the dependent variable is implicitly assumed to be reached as the outcome of an underlying equilibration process that is not explicitly modeled (Weinberger 2023). (Dynamic causal models and simulation models do explicitly model the dynamic adjustment processes, e.g. using systems of algebraic and differential equations, but causal DAG models and more general causal network models usually do not.) Unless otherwise specified, the causal models considered in this paper assume such an equilibration interpretation of causal CPTs. We consider causal models with inputs, including exposure; outputs, including death or other adverse health effects; and networks of variables linked by causal arrows and interpreted so that if x → y, then a change in the value of x leads to a change in the conditional probability distribution of y. Changes in exposure propagate through the network along directed paths to change outcome probabilities (for binary outputs such as AML/no AML) or outcome probability distributions (more generally).

  • Intervening on a specific variable such as exposure to set it to a new value makes it an input variable: inward-directed arrows from any parents that might otherwise have determined its value or probability distribution are severed, and the new value becomes an input to the model. The process of setting a variable X to a specific value x is often denoted by do(X = x) (the basis for Pearl’s “do calculus” for causal diagrams) and the process of disconnecting X from its parents, Pa(X), is called “surgery” on the diagram (Pearl 2009; Jacobs et al. 2019).

  • An input scenario is a specification of values for all of the inputs in a causal model, including exposure in the models we consider. An intervention is a change in input scenarios: it specifies changes in the values of one or more inputs from their initial values to stated final values. Interventions also modify the causal network structure by disconnecting the intervened-on variables from their parents, if any, i.e. surgery.

  • The causal effect of an intervention on an output (such as AML in ) is the difference in the probability distribution of the output between the final and initial input scenarios for the intervention. As noted by Greenland et al. (1999), “Causal effects are defined only for comparisons of treatment levels.” We generalize this by saying that the causal effects of an intervention on one or more outputs are defined only for comparisons of specified input scenarios. In most applications, the comparison is between two input scenarios describing the world with and without the intervention. At least one of these scenarios is counterfactual. In toxic tort and regulatory applications (if not most “real-world” settings), the question of interest is usually how lower levels of exposure would change the probability of harm for exposed individuals. In this context, the input scenarios specify the values of the exposure, covariates and other input variables needed to calculate output probabilities from their causal CPTs.

  • A fully specified causal model is one in which both the qualitative part of the model (e.g. the network structure summarizing dependence and conditional independence relationships between variables in a causal Bayesian network model) and its quantitative part (e.g. causal CPTs for all nodes with parents, and values or probability distributions for all input nodes) are specified.
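The mixture and transport ideas in the list above can be illustrated numerically. The following is a minimal sketch, not the authors' code. The men-specific probabilities (0.0085, 0.0055, 0.0005) are the values quoted later in this article for the worked example; the women-specific probabilities are assumed values, chosen so that a 50/50 mixture gives the 0.006 quoted above for Non-occupational exposure with the Susceptible karyotype. The names `CAUSAL_CPT` and `mixture_cpt` are hypothetical.

```python
# Sex-specific causal CPT: P(AML = 1 | exposure, karyotype, sex).
# Men's entries are values quoted in the text; women's entries are
# ASSUMED for illustration (the original tables are not reproduced here).
CAUSAL_CPT = {
    "M": {
        ("Occupational", "Susceptible"): 0.0085,
        ("Occupational", "Non-susceptible"): 0.0005,
        ("Non-occupational", "Susceptible"): 0.0055,
        ("Non-occupational", "Non-susceptible"): 0.0005,
    },
    "W": {
        ("Occupational", "Susceptible"): 0.0095,
        ("Occupational", "Non-susceptible"): 0.0015,
        ("Non-occupational", "Susceptible"): 0.0065,
        ("Non-occupational", "Non-susceptible"): 0.0015,
    },
}


def mixture_cpt(fraction_men):
    """Average the sex-specific CPTs, with weight fraction_men on men.

    Simplifying assumption: sex is independent of exposure and karyotype
    in the population, so the mixing weight is just the sex ratio.
    """
    men, women = CAUSAL_CPT["M"], CAUSAL_CPT["W"]
    return {
        key: fraction_men * men[key] + (1.0 - fraction_men) * women[key]
        for key in men
    }


if __name__ == "__main__":
    key = ("Non-occupational", "Susceptible")
    # 0.5 marginalizes sex out of a half-men population; 1.0 and 0.0
    # "transport" the CPT to all-men and all-women target populations.
    for fraction_men in (0.5, 1.0, 0.0, 0.8):
        print(fraction_men, round(mixture_cpt(fraction_men)[key], 6))
```

The aggregated table changes with the sex ratio, which is why it is not invariant, whereas the sex-specific entries stay fixed across target populations.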

Figure 1. A Simple causal Bayesian network (BN).


Figure 2. A Modified causal network with genotype as a common cause of sex and karyotype.


Table 1. Example of a hypothetical CPT.

Table 2. Example of a hypothetical causal CPT.

These concepts of input scenario, intervention, and causal effect, and the discussion of equilibration are useful for defining IPoC and for quantifying the effects of changes in exposure on changing health risks. The terminology and conceptual infrastructure introduced in this section up through and including the definition of a fully specified causal model are to a large extent standard in the causal artificial intelligence and parts of the epidemiology research communities (e.g. Greenland et al. 1999; Pearl 2000; Pearl 2009).

Definition and interpretation of IPoC

With the foregoing concepts and terms, we now define IPoC for a specific intervention, harm, and individual simply as the causal effect of an intervention on the probability of harm for the individual, as predicted by a fully specified causal model. In other words, it is the change or difference that the intervention makes in the probability of harm for that individual, as predicted by a fully specified causal model. We say “change or difference” because the two concepts are interchangeable in causal models that assume equilibration of causal networks in response to exogenous changes or interventions. In such models, an intervention that changes exposure causes new probability distributions for the descendants of exposure in the network. Continuing with the AML example, the difference in conditional probabilities of AML between the new and old conditions coincides with the change in AML probability caused by the intervention. If “harm” is a continuous variable rather than a binary indicator, then “probability” should be replaced by “probability distribution”: IPoC is the change or difference that the intervention makes in the probability distribution of harm, as predicted by a fully specified causal model. Thus, the IPoC for an individual depends on the specific causal model used to calculate it, as well as on the specific intervention (such as removal or reduction of an exposure) whose effect on risk of harm is to be calculated.

II. Using a fully specified causal Bayesian network model for IPoC calculations

A fully specified causal Bayesian network model can be used to predict how interventions change probability distributions of output variables. These predictions are calculated by Bayesian network inference algorithms that generalize Bayes’ rule to apply to networks of variables. BN inference algorithms have been developed extensively over the past four decades. They are now available in many free and commercial software packages. lists some of the best-developed ones and provides references to seminal works, largely in artificial intelligence. The fundamental functionality of any BN inference algorithm is to condition the probability distributions of unobserved variables on the known (or assumed) values of other variables, thereby obtaining their posterior (i.e. updated prior probability) distributions after conditioning on observations or assumptions.

Table 3. Some popular Bayesian network (BN) inference algorithms.

Given a BN inference algorithm, a fully specified causal BN, and two input scenarios, each input scenario can be propagated through the network via the inference algorithm to calculate resulting output probabilities. The difference in harm probabilities for the two different input scenarios is the IPoC.

To illustrate, shows a simplified hypothetical example of a BN model built using the popular BN modeling package Netica (https://www.norsys.com/netica.html). In this model, Exposure has a statistical dependence on Sex, with men being more likely to be occupationally exposed than women; likewise, Exposure affects Karyotype, with occupational exposure increasing the probability of the Susceptible karyotype. shows the CPT for AML. The remaining CPTs are as follows.

Figure 3. A Simplified hypothetical example Bayesian network model.


CPT for Karyotype:

P(Karyotype = Susceptible | Exposure = NonOccupational) = 0.15

P(Karyotype = Susceptible | Exposure = Occupational) = 0.30

CPT for Exposure:

P(Exposure = Occupational | Sex = M) = 0.2

P(Exposure = Occupational | Sex = F) = 0.1

CPT (more strictly, the marginal probability distribution) for Sex:

P(Sex = M) = P(Sex = F) = 0.5.

The bars in the nodes of indicate the relative frequencies (in %) of the possible levels for each random variable. Such a BN might be used to describe these variables and their statistical dependencies for a population of patients at a hospital or for the residents of an industrial town, for example. Given this fully specified BN model, the BN inference algorithms in the Netica software application (Norsys Software Corp.) can calculate answers to questions about posterior probabilities of variables conditioned on observations or assumptions. For example, suppose our research question is: What is the difference in AML probability between a randomly selected member of the population with occupational exposure (e.g. to benzene or formaldehyde) and a randomly selected member of the population without such occupational exposure? shows the results of the BN inference calculations. The probability of AML according to this hypothetical BN model is 0.0018 for a random individual without occupational exposure and 0.0032 for a random individual with occupational exposure. Of course, the difference between these two probabilities, 0.0032 − 0.0018 = 0.0014, is not a representation of what was caused by the difference in exposure. Furthermore, a risk assessment that treated this probability difference as the fraction of AML cases in the population that is attributable to occupational exposure (as contrasted with nonoccupational exposure) would be misguided. An inspection of the bars in the nodes in makes clear that the evidence that someone has occupational exposure changes the posterior probability distributions for sex (via Bayes’ rule) and karyotype. These inferences contribute to the difference in the conditional probabilities of AML between occupational exposure and nonoccupational exposure groups.
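The conditioning step described above can be reproduced by brute-force enumeration rather than a BN package. The sketch below is illustrative only and does not use Netica. The CPTs for Sex, Exposure and Karyotype are those listed above; the men-specific AML probabilities are the values quoted later in this section, while the women-specific AML probabilities are assumed values chosen to be consistent with the reported results. All function names are hypothetical.

```python
from itertools import product

P_SEX = {"M": 0.5, "F": 0.5}
P_OCCUPATIONAL = {"M": 0.2, "F": 0.1}          # P(Exposure = Occupational | Sex)
P_SUSCEPTIBLE = {"Occupational": 0.30,          # P(Karyotype = Susceptible | Exposure)
                 "NonOccupational": 0.15}
KARYOTYPES = ("Susceptible", "NonSusceptible")

# P(AML = 1 | Sex, Exposure, Karyotype). Entries for "F" are ASSUMED.
P_AML = {
    ("M", "Occupational", "Susceptible"): 0.0085,
    ("M", "Occupational", "NonSusceptible"): 0.0005,
    ("M", "NonOccupational", "Susceptible"): 0.0055,
    ("M", "NonOccupational", "NonSusceptible"): 0.0005,
    ("F", "Occupational", "Susceptible"): 0.0095,
    ("F", "Occupational", "NonSusceptible"): 0.0015,
    ("F", "NonOccupational", "Susceptible"): 0.0065,
    ("F", "NonOccupational", "NonSusceptible"): 0.0015,
}


def p_parents(sex, exposure, karyotype):
    """Joint probability P(Sex, Exposure, Karyotype) from the chain of CPTs."""
    p_occ = P_OCCUPATIONAL[sex]
    p_exp = p_occ if exposure == "Occupational" else 1.0 - p_occ
    p_sus = P_SUSCEPTIBLE[exposure]
    p_kar = p_sus if karyotype == "Susceptible" else 1.0 - p_sus
    return P_SEX[sex] * p_exp * p_kar


def p_sex_given_exposure(sex, exposure):
    """Posterior P(Sex | Exposure) via Bayes' rule."""
    num = sum(p_parents(sex, exposure, k) for k in KARYOTYPES)
    den = sum(p_parents(s, exposure, k) for s, k in product(P_SEX, KARYOTYPES))
    return num / den


def p_aml_given_exposure(exposure):
    """Observational P(AML = 1 | Exposure): observing exposure also updates Sex."""
    num = den = 0.0
    for sex, karyotype in product(P_SEX, KARYOTYPES):
        weight = p_parents(sex, exposure, karyotype)
        den += weight
        num += weight * P_AML[(sex, exposure, karyotype)]
    return num / den


if __name__ == "__main__":
    for exposure in ("NonOccupational", "Occupational"):
        print(exposure,
              round(p_aml_given_exposure(exposure), 4),
              round(p_sex_given_exposure("M", exposure), 3))
```

Under these assumptions the conditional probabilities round to the 0.0018 and 0.0032 reported above, and the posterior probability of being male rises from about 0.47 to about 0.67 when occupational exposure is observed, which is one reason the observational difference is not a causal effect.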

Figure 4. Conditional probabilities of AML with (right) and without (left) occupational exposure.


For comparison, shows the results of the calculations needed to quantify the causal impact on AML risk, as measured by IPoC, of an intervention that changes Exposure from occupational to non-occupational in this hypothetical example. Since Exposure is the intervention variable, it is first disconnected from its parent, Sex. Next, the input scenarios for which AML risks are to be compared are constructed. There is one for each combination of M and F (since sex is an input) with occupational and non-occupational levels of Exposure (this is the intervention variable and hence becomes an input node).

Figure 5. Conditional probabilities of AML with (right) and without (left) occupational exposure, for men (top) and women (bottom).


For men, the IPoC of the intervention that reduces exposure from the occupational to the non-occupational level is 0.0029 − 0.00125 = 0.00165. For women, the corresponding IPoC is also 0.0039 − 0.00225 = 0.00165. Thus, we can assess IPoC = 0.00165 as the increase in risk (i.e. lifetime probability of AML) caused by occupational exposure, as contrasted with non-occupational exposure, in this model. This is the amount by which the lifetime probability of AML for an individual is predicted to decrease if exposure is reduced from occupational to non-occupational levels; in this example, although not in general, it is independent of sex.

The treatment of Karyotype in these calculations illustrates that variables that adjust or equilibrate in response to changes in inputs (such as exposure) must be left free to do so in order to calculate the total effect of an intervention on outcome probabilities (Weinberger 2023). shows that the effect on AML risk of reducing exposure from occupational to non-occupational levels is 0 (zero) for non-susceptible people (i.e. people with the non-susceptible karyotype). It is 0.0085 − 0.0055 = 0.003 for susceptible men. The models in assume that reducing exposure from occupational to non-occupational levels reduces the probability of the susceptible karyotype from 0.30 to 0.15. Thus, before the intervention, a man would have a 0.30 probability of the susceptible karyotype and a conditional probability of AML of 0.0085 if he had that karyotype, and otherwise a risk of 0.0005. After the intervention he would have a 0.15 probability of the susceptible karyotype and a conditional probability of AML of 0.0055 if he had that karyotype, and otherwise a risk of 0.0005. Thus, occupational exposure causes an incremental risk of 0.30 × (0.0085 − 0.0005) − 0.15 × (0.0055 − 0.0005) = 0.00165 above the non-occupational exposure level. This is the IPoC for occupational exposure compared to non-occupational exposure. However, the timing of such changes in risk within an individual’s lifetime, and implications for age-specific hazard rates for AML, are not addressed by causal models based on equilibration (Weinberger 2023). Thus, IPoC values are best regarded as quantifying the effects of changes from higher to lower levels of exposure on lifetime probability of AML, while remaining silent about more detailed dynamics of risk within a lifetime. Robins and Greenland (1989) discuss the timing of outcomes and note some of the challenges of estimating timing-sensitive measures of comparative risks from observational data.
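The interventional calculation can be sketched in a few lines. This is a hedged illustration of the arithmetic in the text, not the authors' implementation. After do(Exposure = e), the arrow from Sex into Exposure is cut, so the CPT for Exposure is no longer used; Karyotype is left free to respond to the new exposure level. Men's AML probabilities are those stated above; women's are assumed values consistent with the reported sex-specific risks. Function names are hypothetical.

```python
P_SUSCEPTIBLE = {"Occupational": 0.30,          # P(Karyotype = Susceptible | Exposure)
                 "NonOccupational": 0.15}

# P(AML = 1 | Sex, Exposure, Karyotype). Entries for "F" are ASSUMED.
P_AML = {
    ("M", "Occupational", "Susceptible"): 0.0085,
    ("M", "Occupational", "NonSusceptible"): 0.0005,
    ("M", "NonOccupational", "Susceptible"): 0.0055,
    ("M", "NonOccupational", "NonSusceptible"): 0.0005,
    ("F", "Occupational", "Susceptible"): 0.0095,
    ("F", "Occupational", "NonSusceptible"): 0.0015,
    ("F", "NonOccupational", "Susceptible"): 0.0065,
    ("F", "NonOccupational", "NonSusceptible"): 0.0015,
}


def p_aml_do_exposure(sex, exposure):
    """P(AML = 1 | do(Exposure = exposure), Sex = sex).

    Karyotype is not held fixed: it equilibrates to the set exposure level.
    """
    p_sus = P_SUSCEPTIBLE[exposure]
    return (p_sus * P_AML[(sex, exposure, "Susceptible")]
            + (1.0 - p_sus) * P_AML[(sex, exposure, "NonSusceptible")])


def ipoc(sex):
    """Risk difference between the occupational and non-occupational scenarios."""
    return (p_aml_do_exposure(sex, "Occupational")
            - p_aml_do_exposure(sex, "NonOccupational"))


def fixed_karyotype_effect(sex, karyotype):
    """Contrast with Karyotype held fixed (a controlled direct effect)."""
    return (P_AML[(sex, "Occupational", karyotype)]
            - P_AML[(sex, "NonOccupational", karyotype)])


if __name__ == "__main__":
    for sex in ("M", "F"):
        print(sex,
              round(p_aml_do_exposure(sex, "Occupational"), 5),
              round(p_aml_do_exposure(sex, "NonOccupational"), 5),
              round(ipoc(sex), 5))
    print(round(fixed_karyotype_effect("M", "Susceptible"), 4))
```

Holding Karyotype fixed gives 0.003 for susceptible men and 0 for non-susceptible men, whereas letting it equilibrate gives the total effect of 0.00165; the two answer different questions, so the input scenarios must state which variables are held fixed.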

III. Obtaining fully specified causal Bayesian network models

As shown using the simplified hypothetical example, IPoC can be calculated by applying BN inference algorithms to fully specified causal BN models (i.e. BN models with CPTs satisfying the ICP, or invariant causal prediction property) and interventions (i.e. pairs of input scenarios to be contrasted) if they are available. But how can fully specified causal BN models be developed for realistically complex biological systems? This section argues that a combination of causal artificial intelligence (CAI) methods and human expertise often suffices to create approximate causal BN models adequate for estimating or at least deriving useful bounds for IPoCs in practice.

Developing discrete CPTs for continuous exposure scenarios

Causal Bayesian networks decompose complex webs of causation into relatively simple components, namely, the causal CPTs showing how the conditional probability distribution for each variable depends on its parents. CPTs, in turn, encourage use of a few discrete levels to summarize the meaningfully different levels of each node’s parents. Although there is an infinite number of possible exposure histories, one way to represent complex, detailed exposure-response dynamics in living systems with tractable CPTs is to partition exposure histories into a small number of equivalence classes based on which distinct response profiles they elicit. For example, suppose that toxicological data support the exposure-response pattern in Table 4. The table shows five qualitatively different responses of target stem cells to different levels of exposure, ranging from no response for “Very low” exposures to saturated production of malignant cells at “Very high” exposures. Very low and Low exposures do not increase cancer risk and may even reduce it if there is a hormetic or biphasic response. Medium exposures increase cytotoxic damage (e.g. cell-killing) but do not increase cancer risk because the damage is contained and repaired. High exposures overwhelm these target cell defenses and increase the rate of production of malignant cells and cancer risk. Very high exposures increase cancer risk by producing cancerous cells at a maximum possible (saturated) rate. In addition, cytotoxicity-induced regenerative cell proliferation may increase the risk of cancer in combination with genotoxic effects. For many chemicals, this occurs only at very high exposure levels, in both human and animal models. The combination of increased cell proliferation with genotoxicity can create nonlinear or threshold-like dose-response curves. These are graded responses: at the lowest exposure levels, responses vary but do not increase cancer risk, whereas more severe exposures elicit quantitatively greater responses, so that overall there is an exposure-response gradient.

Table 4. Example of qualitative exposure levels defined by the qualitatively different responses elicited.

If these five qualitatively different response patterns are the only ones that can occur for a particular chemical, then exposure scenarios, no matter how complex, can be partitioned into these five qualitative categories based on the qualitative response patterns they elicit. The next step is to integrate quantitative information for the sizes of the response elicited by different exposure histories.

Quantitative IPoC calculations in a two-stage clonal expansion (TSCE) model of carcinogenesis

To move from qualitative descriptions of responses to categories of increasing doses of carcinogens to quantitative cancer risk implications, it is necessary to combine qualitative and quantitative understanding of how biological responses affect cancer risk. A fruitful framework for this purpose is the two-stage clonal expansion (TSCE) model of carcinogenesis (e.g. Richardson 2009; Zeka et al. 2011; Subramaniam et al. 2008). A simple version of the TSCE model is as follows (Cox 2006).

  • N(t) = number of normal stem cells available to be initiated at time t (which may require that they be undergoing active cell division rather than dormancy)

  • I(t) = number of initiated stem cells at time t

  • M(t) = expected number of in situ malignant cells formed from initiated cells (and not immediately detected and killed before they can divide) by time t

  • dI(t)/dt = µ1(t)N(t) + (b(t) − d(t) − µ2(t))I(t) where µ1(t) = rate of unrepaired and unremoved initiating transformations per normal stem cell per unit time at time t; b(t) = birth (proliferation) rate of initiated cells at time t; d(t) = death rate of initiated cells at time t; and µ2(t) = rate of unrepaired and unremoved malignant transformation (progression) per initiated stem cell per unit time at time t

  • dM(t)/dt = µ2(t)I(t)

  • P(t) = probability of at least one malignant cell being formed by time t = 1 − exp(−M(t))

The actual number of viable in situ malignant stem cells formed by time t is a random variable, but standard results for random (non-homogeneous Poisson process) models show that if its expected value is M(t), then the probability that at least one such malignant cell has been formed is P(t) = 1 − exp(−M(t)): this is an identity and not a parametric modeling assumption. We take P(t) as a useful metric of quantitative cancer risk. Subsequent events including acquisition of other hallmarks of cancer (Hanahan 2022; Hanahan and Weinberg 2011) such as metastatic potential, inflammation, escape from local control, and expression as a frank tumor or cancer, are not modeled in the TSCE framework, but are understood to contribute to the latency period between formation of a malignant cell in situ (i.e. induction) and the earliest possible diagnosis of cancer.

Exposure to a chemical carcinogen potentially can affect any of the quantities N(t), µ1(t), b(t), d(t), µ2(t), as well as influence the latency period. An initiator increases µ1(t); a promoter increases the net proliferation rate of initiated cells, b(t) − d(t); and a converter increases the malignant transformation rate, µ2(t). Inhibited detection and repair of stem cells having unrepaired initiating damage also increases µ1(t). Similarly, suppression of immune surveillance that might ordinarily detect and kill cells that have just undergone malignant transformation increases µ2(t). The possible increases in these parameters are limited, however. The size of the normal stem cell population N(t) susceptible to initiating mutations is typically regulated by homeostatic feedback control loops, and the net proliferation rate of initiated cells is likewise subject to local control. For smoking-induced lung cancer, for example, plausible sizes for these increases might be multiplicative factors of about 1–3 for the number of normal stem cells at risk of initiating transformations; 2–4 for the rate of unrepaired initiating transformations (initiation); 1.5 for expansion of the initiated population (promotion); and 1.5 for the rate of malignant transformation (progression) (Cox 2006). From this perspective, details of exposure histories and dynamics of biological responses are less important for quantitative risk assessment than the approximate amounts by which the key quantities are changed by exposure.
For example: if the rate of malignant transformation is small enough so that its effect on reducing the size of the initiated population can be ignored; and if initiated clones eventually regress (die out) in the absence of continued promotion (implying that b(t) < d(t)); and if all transients in N(t) and I(t) are ignored; then the differential equation $dI(t)/dt = \mu_1(t)N(t) + (b(t) - d(t) - \mu_2(t))I(t)$ can be simplified by dropping dependence on time and equating the rate of change dI(t)/dt to 0, yielding the following steady-state equation for the size of the initiated population: $I = \mu_1 N/(d - b)$

The expected rate of formation of malignant stem cells under these conditions is then $dM/dt = \mu_2 I = \mu_1 \mu_2 N/(d - b)$
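As a numerical check on the steady-state simplification, the following minimal Python sketch integrates the equation for the initiated population with constant parameters and compares the result with µ1N/(d − b). All parameter values are hypothetical, chosen only so that b < d and µ2 is small; forward Euler integration is used for simplicity and is not a method taken from the source.

```python
# Illustrative check of the steady-state approximation I = mu1*N/(d - b).
# All parameter values are hypothetical and chosen only so that b < d.

mu1 = 1e-7        # initiating transformations per normal stem cell per unit time
mu2 = 1e-6        # malignant transformations per initiated cell per unit time
N = 1e6           # normal stem cells at risk (held constant)
b, d = 0.9, 1.0   # birth and death rates of initiated cells


def simulate_initiated(t_end: float = 200.0, dt: float = 0.01) -> float:
    """Forward-Euler integration of dI/dt = mu1*N + (b - d - mu2)*I, I(0) = 0."""
    initiated = 0.0
    for _ in range(int(round(t_end / dt))):
        initiated += dt * (mu1 * N + (b - d - mu2) * initiated)
    return initiated


i_simulated = simulate_initiated()
i_steady = mu1 * N / (d - b)       # steady state, ignoring mu2
malignant_rate = mu2 * i_steady    # dM/dt = mu1*mu2*N/(d - b)
print(i_simulated, i_steady, malignant_rate)
```

With these values the simulated population settles very close to the closed-form value, because µ2 is negligible relative to d − b.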

If a specific occupational exposure history increases this product $\mu_1 \mu_2 N/(d - b)$ by a factor of R > 1 for a fraction of a lifetime, then the lifetime probability of lung cancer increases from a baseline level of $p_0 = 1 - \exp(-M_0)$, where M0 is the expected cumulative number of malignant cells formed in a lifetime in the absence of the occupational exposure, to an increased value of $p_1 = 1 - \exp(-M_1)$, where $M_1 = fRM_0 + (1 - f)M_0$ and f is the effective fraction of a lifetime (a number between 0 and 1) for which exposure increases the average formation rate of malignant cells R-fold (Cox 2006).

If a worker would normally accumulate a certain expected number of malignant cells, fM0, over a time interval in the absence of occupational exposure, but instead accumulates a greater expected number of malignant cells, fRM0, over that interval because of the effects of exposure on the parameters µ1, µ2, N and (d − b), and if exposure has no other effects on the worker’s lifetime probability of cancer, then the IPoC for cancer for that worker is given by $\mathrm{IPoC} = p_1 - p_0 = \exp(-M_0) - \exp(-M_1) = \exp(-M_0) - \exp(-fRM_0 - (1 - f)M_0)$.

Here, f reflects the duration of exposure and R reflects its effect on the rate of accumulation of expected number of malignant cells. Exposure histories with two or more periods of increased exposure can be handled by generalizing the preceding formula: $\mathrm{IPoC} = \exp(-M_0) - \exp(-\sum_i f_i R_i M_0) \approx \sum_i f_i R_i M_0 - M_0$ if all risks are small (e.g. < 0.1), where fi and Ri reflect the duration and multiplier for exposure history segment i. The background risk (i.e. lifetime probability of cancer), p0, determines M0 via the identity M0 = −ln(1 − p0). (If short-term exposures have long-lasting or irreversible effects, such as when b(t) > d(t), so that an initiated population, once formed, continues to expand, then Ri may reflect exposures in earlier segments.) The effective fractions of a lifetime fi for different exposure scenario segments come from exposure history information. The multiplier Ri is the product of the following four multipliers:

  • Stem cell population factor: $R_{i1} = N_i/N_0$

  • Initiation factor: $R_{i2} = \mu_{1i}/\mu_{10}$

  • Promotion factor: $R_{i3} = (d_0 - b_0)/(d_i - b_i)$

  • Progression factor: $R_{i4} = \mu_{2i}/\mu_{20}$

These four ratios can be estimated or bounded based on toxicological knowledge or data. Table 5 shows illustrative examples of ranges for their values that may be plausible for some carcinogens (Cox 2006).
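The multi-segment formula and the product of four factors can be sketched as follows. This is a minimal Python illustration, not the authors' implementation. It assumes, consistently with the single-segment expression M1 = fRM0 + (1 − f)M0, that the segments cover the whole lifetime, so that the fractions sum to 1 and unexposed segments carry a multiplier of 1. The function names and the example history are hypothetical.

```python
import math


def ipoc_segments(p0: float, segments: list[tuple[float, float]]) -> float:
    """IPoC = exp(-M0) - exp(-sum_i f_i * R_i * M0).

    p0       -- background lifetime risk in the low-exposure scenario
    segments -- (f_i, R_i) pairs covering the whole lifetime, so the f_i
                sum to 1 and unexposed segments carry R_i = 1 (assumption)
    """
    total_f = sum(f for f, _ in segments)
    if not math.isclose(total_f, 1.0):
        raise ValueError("segment fractions must sum to 1")
    m0 = -math.log(1.0 - p0)                    # M0 = -ln(1 - p0)
    m1 = sum(f * r for f, r in segments) * m0   # expected malignant cells with exposure
    return math.exp(-m0) - math.exp(-m1)


def multiplier(r_stem: float, r_init: float, r_prom: float, r_prog: float) -> float:
    """R_i as the product of the four factors R_i1 * R_i2 * R_i3 * R_i4."""
    return r_stem * r_init * r_prom * r_prog


# Hypothetical history: 25% of a lifetime with a doubled initiation rate,
# the remaining 75% unexposed.
p0 = 0.001
history = [(0.25, multiplier(1.0, 2.0, 1.0, 1.0)), (0.75, 1.0)]
exact = ipoc_segments(p0, history)
m0 = -math.log(1.0 - p0)
approx = sum(f * r for f, r in history) * m0 - m0   # small-risk approximation
print(exact, approx)
```

For small risks the exact and approximate values agree closely, which is why the linear approximation is adequate in the low-risk regime discussed in the text.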

Table 5. Examples of multipliers and multiplier ranges.

To illustrate such a bounding calculation, suppose that high exposure to a chemical carcinogen for a lifetime (f = 1) increases the number of initiation-susceptible (e.g. actively cycling) stem cells at any time by at most 10%, i.e. R1 = 1.1, where the subscript i for exposure segment has been dropped because there is only one segment for lifelong exposure; the initiation rate by not more than 30% (R2 = 1.3); and the malignant conversion rate by no more than 20% (R4 = 1.2), while also reducing the net death rate of initiated cells by a factor of 1/R3 = 0.9. Then the lifetime exposure would be estimated to increase M0 by at most a factor of R = R1R2R3R4 = 1.1*1.3*(1/0.9)*1.2 = 1.91. If the baseline level of exposure that could be achieved by an intervention is expected to create a lifetime cancer risk (more specifically, a lifetime probability of at least one malignant cell) of about p0 = 0.001, then the incremental increase in risk caused by exposure is at most $\mathrm{IPoC} = \exp(-M_0) - \exp(-RM_0) = 0.00091$.
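The bounding calculation can be checked directly. The short Python sketch below uses only the multiplier bounds and baseline risk stated in the text; the variable names are illustrative.

```python
import math

# Upper bounds on the four multipliers, as stated in the text.
r1, r2, r4 = 1.1, 1.3, 1.2
r3 = 1.0 / 0.9                   # net death rate of initiated cells reduced to 0.9x
r_total = r1 * r2 * r3 * r4      # about 1.91

p0 = 0.001                       # baseline lifetime risk
m0 = -math.log(1.0 - p0)         # M0 = -ln(1 - p0)
ipoc_bound = math.exp(-m0) - math.exp(-r_total * m0)
print(f"R = {r_total:.2f}, IPoC <= {ipoc_bound:.5f}")
```

Because each multiplier is an upper bound and the IPoC is increasing in R, the resulting value is an upper bound on the IPoC under the stated assumptions.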

It may be worth emphasizing that this is not a statistical parametric risk modeling assumption, but a mathematical implication of a conceptual model in which malignant cells are formed by a (nonhomogeneous Poisson) stochastic arrival process with an intensity that is modulated by biological processes.

IV. Causal assigned share (CAS) calculations

The fraction of total lifetime risk of a cancer that is caused by the exposure in this case can be defined as the ratio of the incremental risk caused by exposure, i.e. IPoC, to the total risk with exposure: 0.000908/(1-exp(-0.00191)) = 0.476. We will refer to this ratio as the causal assigned share (CAS) of exposure in causing the lifetime probability of cancer. More generally, we define the causal assigned share (CAS) of an intervention in causing or failing to prevent (or reduce the risk of) the harm of an individual as follows:

  • Let scenario 0 be a low-exposure input scenario that would be achieved by an intervention.

  • Let scenario 1 be a higher-exposure (i.e. higher-risk) input scenario that would occur without the intervention.

  • Assume that all variables on causal paths leading from exposure histories to the harm to the individual of interest have conditional probability distributions determined by their CPTs and by the exposure histories in the input scenarios, holding all other inputs fixed at levels specified in the input scenarios.

  • Assume that all other variables not on causal paths leading from exposure histories to the harm to the individual of interest are held fixed at levels specified in the input scenarios.

  • Let p1 and p0 denote the probabilities of harm to the individual for input scenarios 1 and 0, respectively, as predicted by a fully specified causal model.

  • Then the CAS for the intervention is defined by the formula: $\mathrm{CAS} = (p_1 - p_0)/p_1 = \mathrm{IPoC}/p_1$
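The definition above reduces to a one-line function once p1 and p0 have been predicted by a causal model. The sketch below is a minimal Python illustration that reuses the numbers from the bounding example (p0 = 0.001, R = 1.91); the function name is hypothetical.

```python
import math


def cas(p1: float, p0: float) -> float:
    """Causal assigned share: (p1 - p0) / p1 = IPoC / p1."""
    if p1 <= 0.0:
        raise ValueError("p1 must be positive for CAS to be defined")
    return (p1 - p0) / p1


# Values from the bounding example: p0 = 0.001 and R = 1.91.
p0 = 0.001
m0 = -math.log(1.0 - p0)
p1 = 1.0 - math.exp(-1.91 * m0)
print(round(cas(p1, p0), 3))
```

The inputs must be model-predicted probabilities for the two input scenarios, not observed incidence rates, for the result to carry the interventional interpretation described in the text.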

Interpretively, an intervention that changed exposure (and perhaps other inputs) from scenario 0 to scenario 1 thereby increases the individual’s risk of a specified harm by IPoC. The CAS expresses this increase as a fraction of the individual’s total risk of harm under the increased exposure scenario. Likewise, an intervention that could have changed matters from scenario 1 to scenario 0 but that was not implemented could have reduced the individual’s risk by the fraction CAS.

Ratios of the form $(p_1 - p_0)/p_1$ have a long history in epidemiology if p1 and p0 are interpreted instead as incidence rates of harm observed in exposed and unexposed populations, respectively (or, in some variations, in more-exposed and less-exposed populations, or in total and unexposed populations, and so forth) (Rockhill et al. 1998). In the epidemiological context, the ratio is sometimes called the probability of causation (PC) (or etiologic fraction, or one of several variations of attributable risk), although it is not a probability (for example, it could be negative if p1 < p0) and although, because it generally is derived from observed associations, it does not necessarily or usually address interventional causation. Limitations of attributable fraction, preventable fraction, etiologic fraction, and probability of causation interpretations of such ratios when $p_1 - p_0$ refers to differences of observed incidence rates rather than IPoC are well documented (Greenland 2015; Greenland and Robins 1988; Suzuki et al. 2012). For example, Robins and Greenland (1989) describe

“…conditions under which epidemiologic data can provide estimates of the excess fraction (proportionate increase in caseload due to an exposure) and the etiologic fraction (fraction of cases caused by exposure). The excess fraction can be estimated under essentially the same conditions often cited for general study validity. In contrast, estimation of the etiologic fraction will usually require very specific non-identifiable assumptions about exposure action and interactions, although one can derive simple lower and upper bounds for the fraction from survival comparisons. Since the etiologic fraction is equivalent to the probability of causation, our results have implications for injury compensation in lawsuits involving the probability of causation.”

As noted by Suzuki et al. (2012), excess fractions and etiologic fractions do not necessarily approximate one another, and the etiologic fraction cannot generally be estimated from observational data without making strong biologic (causal) assumptions. The TSCE model supplies those assumptions, but even with the aid of such models, estimating etiologic fractions is challenging when model parameters are uncertain. In the IPoC framework, an input scenario to a fully specified causal model supplies additional information about the levels of other variables needed to quantify the causal assigned share (CAS) of exposure in an observed case of harm.

The ratio $\mathrm{CAS} = (p_1 - p_0)/p_1 = \mathrm{IPoC}/p_1$ with p1 and p0 defined as the probabilities of harm to an individual predicted by a fully specified causal model for input scenarios with higher (i.e. more risky) and lower (less risky) exposures, respectively, can differ sharply from the traditional probability of causation or etiologic fraction $\mathrm{PC} = (p_1 - p_0)/p_1$ with p1 and p0 defined as the incidence rates of harm observed among populations with higher and lower average exposures, respectively. For example, levels of confounders would typically be specified and held fixed in the input scenarios used in defining the intervention for which CAS is calculated, but not necessarily in the observations used to calculate PC. Less obviously, if (a) p1 and p0 refer to the incidence rates of harm among people with and without a certain biomarker; (b) the biomarker is caused only by exposure; (c) disease occurs only among people with the biomarker; and (d) the biomarker occurs at exposure levels lower than those required to cause the disease, as well as at all higher exposure levels, then presence of the biomarker is not a cause of the disease, even though it occurs in everyone with disease. Indeed, if the only exposures considered are either high enough to cause the disease (and, a fortiori, the biomarker) or too low to cause either the disease or the biomarker, then PC would be 1 but CAS could be 0. This can be seen in the DAG model biomarker ← exposure → disease, where it is clear that biomarker is a side effect of exposure, but not a cause of disease. The CPTs are

  • P(biomarker = yes | exposure = low) = 0;

  • P(biomarker = yes | exposure = high) = 1;

  • P(disease = yes | exposure = low) = 0; and

  • P(disease = yes | exposure = high) = 1.

An intervention on exposure that changes its level changes the probability of disease (i.e. harm) correspondingly. But an intervention that changed the level of biomarker without changing the level of exposure would have no effect on disease probability. The CAS for such a biomarker is 0 even though the PC is 1 based on observations that the disease occurs if and only if the biomarker is present.
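The contrast between PC and CAS in this example can be made concrete by enumerating the small model. The following Python sketch uses the four CPT entries given in the text; the marginal probability of high exposure (0.5) is an arbitrary assumption needed only to compute observational conditionals, and the function names are hypothetical.

```python
# Illustrative sketch of the fork biomarker <- exposure -> disease.
# P(exposure = high) = 0.5 is an arbitrary assumption; the CPTs are from the text.

P_HIGH = 0.5
P_BIOMARKER = {"low": 0.0, "high": 1.0}   # P(biomarker = yes | exposure)
P_DISEASE = {"low": 0.0, "high": 1.0}     # P(disease = yes | exposure)
P_EXPOSURE = {"low": 1.0 - P_HIGH, "high": P_HIGH}


def p_disease_given_biomarker(present: bool) -> float:
    """Observational P(disease = yes | biomarker), by enumeration over exposure."""
    joint = total = 0.0
    for level, p_exp in P_EXPOSURE.items():
        p_bio = P_BIOMARKER[level] if present else 1.0 - P_BIOMARKER[level]
        total += p_exp * p_bio
        joint += p_exp * p_bio * P_DISEASE[level]
    return joint / total


def p_disease_do_biomarker(present: bool, exposure: str) -> float:
    """Interventional P(disease = yes | do(biomarker), exposure).

    Disease has exposure as its only parent, so the value forced on the
    biomarker (the `present` argument) does not enter the calculation.
    """
    return P_DISEASE[exposure]


# Observational "probability of causation" for the biomarker.
p1_obs = p_disease_given_biomarker(True)
p0_obs = p_disease_given_biomarker(False)
pc = (p1_obs - p0_obs) / p1_obs

# Interventional CAS for the biomarker, holding exposure fixed at "high".
p1_do = p_disease_do_biomarker(True, "high")
p0_do = p_disease_do_biomarker(False, "high")
cas_biomarker = (p1_do - p0_do) / p1_do
print(pc, cas_biomarker)
```

The observational ratio conditions on the biomarker and so inherits the effect of exposure, whereas the interventional ratio sets the biomarker while leaving exposure where it is.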

Dealing with input uncertainties: Integrating qualitative and quantitative information

The two input scenarios that define an intervention for an individual will often be uncertain. The exposure histories (typically, one real and the other counterfactual) may be known, but other causally relevant inputs such as confounders, co-exposures, phenotypic variations, and co-morbidities, may be unknown. This section discusses IPoC and Assigned Share (AS) calculations with uncertain inputs (AS is the same as CAS if the only causal parents of risk are M0 and R).

In the example calculations just given, the values of the four multipliers were assumed to be R1 = 1.1, R2 = 1.3, R3 = 1/0.9 ≈ 1.1, and R4 = 1.2, yielding a total multiplier of R = R1R2R3R4 ≈ 1.1*1.3*1.1*1.2 ≈ 1.9. In practice, even if the simplified TSCE model is accepted as a reasonable approximation for estimating quantitative causal exposure-response relationships, these multipliers will be uncertain for any individual. Substantial inter-individual heterogeneity in these multipliers arises from sources such as the following:

  • Genetic polymorphisms and differences in individual hormone levels and biochemistry (e.g. specific activity levels of enzymes involved in Phase 1 and Phase 2 metabolism) can affect causally relevant factors such as the production and detoxification of reactive metabolites from exposure-related and other (e.g. endogenous and diet-related) sources;

  • Differences in susceptibility of target cells to cytotoxic, cytogenetic, and epigenetic damage, e.g. reflecting differences in production of reactive oxygen species (ROS) in target cells exposed to toxic metabolites and differences in resulting upregulation of antioxidant defenses that scavenge and neutralize ROS;

  • Differences in capacity of target cells and other (e.g. immune) systems to detect and repair or safely remove (e.g. via apoptosis) stem cells with unrepaired cytogenetic damage; and

  • Substantial variation in confounders or effect modifiers of exposure-health effect associations (Pullen et al. 2021; Brown et al. 2015) including co-exposures and co-morbidities that might contribute to increased risk.

Table 6 gives examples of some specific potential sources of inter-individual differences in susceptibility to chemical carcinogens suggested in the literature. However, the qualitative relevance and quantitative effects on risk of many gene polymorphisms and other potential sources of inter-individual variation in causal exposure-response relationships remain unclear (e.g. Kang 2015). Moreover, many sources of inter-individual variability are unmeasured for any individual, leaving the magnitude of the overall multiplier R uncertain.

Table 6. Examples of sources of inter-individual variability in epidemiological studies.

One way to model such uncertainty in the biological processes mediating causal exposure-response relationships is to treat their combined effect as being drawn from a probability distribution. The lognormal distribution often is used for this purpose, especially if R can be viewed as a product of many relative risk ratios arising from many independent (or approximately independent) risk factors (Lutz et al. 2014; Andersen et al. 2006; Lutz et al. 2006). More generally, given a causal model, Monte Carlo uncertainty analysis software can be used to (a) sample input scenarios for defining the IPoC from any user-specified probability distributions (not necessarily lognormal); and (b) derive corresponding probability distributions for the IPoC and CAS values. To illustrate this approach to uncertainty analysis, consider the following simple model:

  • M0 and R are the only direct causes (i.e. DAG parents) of risk.

  • The interventional probability of causation and assigned share are given by $\mathrm{IPoC} = \exp(-M_0) - \exp(-RM_0)$ and $\mathrm{AS} = \mathrm{IPoC}/(1 - \exp(-RM_0))$, respectively, with both M0 and R modeled as random variables. (As noted above, AS is the same as CAS if the only causal parents of risk are M0 and R, which we assume in this example.)

  • The probability distribution of M0 reflects interindividual variability in background risk, where M0 = expected number of malignant cells formed in a lifetime in a particular target organ or tissue in the low-exposure input scenario used in defining the intervention for which IPoC and CAS are to be assessed. To illustrate uncertainty analysis, assume that M0 is estimated to be equally likely to be anywhere between 0.0005 and 0.0015.

  • The probability distribution of R reflects interindividual variability (and hence uncertainty for an individual) in the exposure-related risk multiplier in the higher-exposure input scenario used in defining the intervention for which IPoC and CAS are to be assessed. Assume that R is estimated to have a 90% probability of being equal to 1 (no effect) because the exposure history in the higher-exposure input scenario is thought to have a 90% probability of being too small to affect formation of malignant cells. If it does affect them, however, then the conditional probability distribution of R is estimated to be equally likely to be anywhere between 1 and 3. This type of specification for input uncertainty combines a qualitative component (the toxicological knowledge, or assumption, that some levels of exposure may be too small to affect formation of malignant cells) with a quantitative estimate of uncertainty conditioned on exposure being large enough to affect formation of malignant cells.

With this specification of uncertainties about the input scenarios, Monte Carlo simulation straightforwardly yields the cumulative probability distribution functions (CDFs) for IPoC and AS shown in Figure 6. (The required simulation is simple enough for ChatGPT4 to generate the R code for Figure 6: https://chat.openai.com/share/875fce19-ef04-4e8e-83b9-124ace4a4b5f.)
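A simulation of this kind can be sketched in a few lines. The following Python version is an independent illustration written for this purpose, not the R code linked above; it assumes that M0 and R are sampled independently, uses the uniform ranges and the 90% no-effect probability stated in the text, and relies only on the standard library. The seed, sample size, and helper name are arbitrary choices.

```python
import math
import random
from bisect import bisect_right

rng = random.Random(12345)   # fixed seed so the sketch is reproducible
N_DRAWS = 200_000
P_NO_EFFECT = 0.9            # probability that exposure is too small to matter

ipoc_draws, as_draws = [], []
for _ in range(N_DRAWS):
    m0 = rng.uniform(0.0005, 0.0015)   # background expected malignant cells, M0
    r = 1.0 if rng.random() < P_NO_EFFECT else rng.uniform(1.0, 3.0)
    ipoc = math.exp(-m0) - math.exp(-r * m0)
    ipoc_draws.append(ipoc)
    as_draws.append(ipoc / (1.0 - math.exp(-r * m0)))

ipoc_draws.sort()
as_draws.sort()


def empirical_cdf(sorted_draws: list[float], x: float) -> float:
    """Fraction of draws less than or equal to x."""
    return bisect_right(sorted_draws, x) / len(sorted_draws)


mean_ipoc = sum(ipoc_draws) / N_DRAWS
mean_as = sum(as_draws) / N_DRAWS
print("mean IPoC:", mean_ipoc, "max IPoC:", ipoc_draws[-1])
print("mean AS:", mean_as, "max AS:", as_draws[-1])
print("CDF of AS at 0:", empirical_cdf(as_draws, 0.0))
```

Evaluating `empirical_cdf` on a grid of values gives the points needed to plot the two CDFs; the jump of about 0.9 at zero corresponds to the draws in which exposure has no effect.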

Figure 6. Cumulative probability distribution functions (CDFs) of IPoC (left) and AS (right).


These curves show that, even though there is a 90% probability that the IPoC and CAS (or AS, as in this example) are both 0 under the specified assumptions about uncertainty, their maximum possible values are 0.0030 for IPoC and 2/3 = 0.67 for CAS. If CAS is interpreted as a probability of causation, then its mean value, 0.045 in this example, is the unconditional probability of causation taking into account the uncertainties about the input scenarios. (To check the simulation results by hand, note that this maximum value for CAS occurs when exposure increases formation of malignant cells maximally, with R = 3. In that case, the CAS value is approximately (RM0-M0)/RM0 = 1-(1/R) = 1-(1/3) = 2/3. Their mean values are mean(IPoC) = 0.0001 for IPoC and mean(AS) = 0.045 for CAS. This mean value of 0.045 for CAS can be derived as 0.9*0 + 0.1*mean(1-(1/R)) where R is uniformly distributed between 1 and 3.)
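The hand check of the two mean values can be written out explicitly. The derivation below is a sketch that assumes M0 and R are independent and uses the small-risk approximations IPoC ≈ (R − 1)M0 and CAS ≈ 1 − 1/R; the symbol E[·] for the expected value is notation introduced here.

```latex
\begin{align*}
E[\mathrm{CAS}] &\approx 0.9 \cdot 0 + 0.1 \cdot E\!\left[1 - \tfrac{1}{R}\right],
  \qquad R \sim \mathrm{Uniform}(1, 3), \\
E\!\left[1 - \tfrac{1}{R}\right] &= \frac{1}{2}\int_{1}^{3}\left(1 - \frac{1}{r}\right)dr
  = 1 - \frac{\ln 3}{2} \approx 0.451, \\
E[\mathrm{CAS}] &\approx 0.1 \times 0.451 \approx 0.045, \\
E[\mathrm{IPoC}] &\approx 0.1 \cdot E[M_0] \cdot E[R - 1]
  = 0.1 \times 0.001 \times 1 = 0.0001 .
\end{align*}
```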

Monte Carlo uncertainty analysis deals similarly with many other probabilistic uncertainties, including uncertainties about exposure histories, competing risks that might also cause harm, phenotypic variations that might amplify or reduce risk, co-exposures and co-morbidities that might contribute to or modify risk, and other sources of uncertainty and inter-individual variability. The key idea is that, conditioned on an input scenario to a fully specified causal model, all such uncertainties are resolved since, by definition, an input scenario in a fully specified causal model specifies all of the details needed to calculate outcome probabilities via CPTs. Unconditional uncertainty about input scenarios for an individual is then modeled by having multiple possible input scenarios with different probabilities.

If the probabilities for different input scenarios are unknown or cannot be agreed on, then it may be easier to reach agreement on bounds for them. For example, if the uncertainty specification in the preceding example that “the exposure history in the higher-exposure input scenario is thought to have a 90% probability of being too small to affect formation of malignant cells” is replaced with “the exposure history in the higher-exposure input scenario is thought to have at least an 80% probability of being too small to affect formation of malignant cells,” then rerunning the Monte Carlo uncertainty analysis with this new proposed lower bound shows that the mean CAS is at most 0.09 (i.e. the value corresponding to an 80% probability of being too small to affect formation of malignant cells). Sensitivity analysis shows how the upper bound on mean CAS varies with this assumed lower bound on the probability that exposure is too small to affect formation of malignant cells, across a wide range of values. For example, Figure 7 shows that as that lower bound varies from 0 to 1, the corresponding upper bound on the probability of causation, meaning the mean CAS value, varies from 0.45 to 0.
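Under the small-risk approximation AS ≈ 1 − 1/R, the sensitivity curve has a simple closed form, which the following Python sketch evaluates. This is an analytic shortcut offered for illustration rather than the Monte Carlo code used for the figure; the function name is hypothetical.

```python
import math

# Mean of 1 - 1/R for R uniform on [1, 3]: 1 - ln(3)/2, about 0.45.
MEAN_AS_IF_EFFECT = 1.0 - math.log(3.0) / 2.0


def mean_as_upper_bound(p_no_effect_lower_bound: float) -> float:
    """Upper bound on mean AS given a lower bound on P(exposure has no effect).

    Uses the small-risk approximation AS ~ 1 - 1/R when exposure has an effect.
    """
    if not 0.0 <= p_no_effect_lower_bound <= 1.0:
        raise ValueError("probability bound must lie in [0, 1]")
    return (1.0 - p_no_effect_lower_bound) * MEAN_AS_IF_EFFECT


for q in (0.0, 0.2, 0.5, 0.8, 0.9, 1.0):
    print(f"P(no effect) >= {q:.1f}: mean AS <= {mean_as_upper_bound(q):.3f}")
```

Because mean AS decreases linearly in the no-effect probability, a lower bound on that probability translates directly into an upper bound on mean AS.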

Figure 7. Example of a sensitivity analysis showing how the unconditional probability of causation, mean(AS) (which is the same as mean(CAS) if M0 and R are the only direct causes of risk) depends on the assumed probability that exposure is too low to affect the number of malignant cells formed. Source: Code at https://chat.openai.com/share/04aa37d4-ee74-4c8a-ae64-c32e70d7fbeb.


In summary, uncertainty analysis methods for mathematical models, including Bayesian networks, are well developed (e.g. Rohmer 2020). Common practice includes combining bounding, sensitivity analyses, and Monte Carlo uncertainty analysis, as illustrated in this section. In addition, Bayesian techniques and generalizations of probabilities such as interval-valued and imprecise probabilities can be used to quantify or bound uncertainties about probabilities, including the conditional probabilities in CPTs (Rohmer 2020). These techniques are not specific to causal models but can be applied to causal network models as well as to other quantitative models (e.g. simulation models). Thus, if a fully specified causal model is available, then uncertainties about its inputs and CPTs can be addressed by the same types of uncertainty analysis methods used for other quantitative models.

V. Dealing with causal model uncertainty and validation

If a fully specified causal model linking exposure and other inputs to risk is not known, then available incomplete knowledge and data can be used to constrain plausible IPoC and CAS values. Even small portions of causal DAG models typically make testable predictions. Rejecting causal models and fragments of causal models that are clearly inconsistent with reliable observations, while keeping causal models and sub-models that are consistently supported by (or at least not inconsistent with) reliable observations, narrows the set of plausible fully specified causal models. Computing IPoC and CAS values for each of the fully specified causal models in a set (or ensemble) of such models that are deemed “plausible,” in the sense of being consistent with available observations and knowledge, yields corresponding sets of “plausible” IPoC and CAS values. If all of these are smaller than some de minimis or otherwise acceptable level identified as useful for decision-making or screening of potential interventions, then no further analysis may be necessary. If some of the plausible values are large enough to make proposed interventions worthwhile, however, then further analysis (and perhaps further data collection, if its expected value of information (VOI) exceeds the costs of obtaining it) may be justified. In either case, the set of plausible fully specified causal models determines the set of plausible IPoC and CAS values for a proposed intervention. The rest of this section therefore focuses on how to identify such models. This is closely related to the challenge of validating a causal model by determining whether its testable predictions are consistent with observations.
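The screening logic described above can be sketched as a loop over an ensemble of plausible models. In the Python illustration below, each plausible model is reduced to a pair (M0, R) in the simplified TSCE form used earlier; this reduction, the three example models, and the de minimis threshold are all hypothetical, and real ensembles may also differ in causal structure.

```python
import math

# Hypothetical ensemble: each plausible model is summarized by (M0, R).
plausible_models = {
    "model A": (0.0010, 1.0),
    "model B": (0.0010, 1.5),
    "model C": (0.0012, 2.0),
}
DE_MINIMIS_IPOC = 1e-4   # hypothetical screening threshold


def ipoc_and_cas(m0: float, r: float) -> tuple[float, float]:
    """IPoC = exp(-M0) - exp(-R*M0) and CAS = IPoC / (1 - exp(-R*M0))."""
    ipoc = math.exp(-m0) - math.exp(-r * m0)
    return ipoc, ipoc / (1.0 - math.exp(-r * m0))


results = {name: ipoc_and_cas(*params) for name, params in plausible_models.items()}
ipoc_range = (min(v[0] for v in results.values()),
              max(v[0] for v in results.values()))
needs_further_analysis = ipoc_range[1] >= DE_MINIMIS_IPOC
print(ipoc_range, needs_further_analysis)
```

If the largest plausible IPoC fell below the threshold, the screening step would stop there; otherwise the range indicates how much further model refinement or data collection could matter.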

A useful principle for causal model validation is that valid causal models must successfully describe and explain the data (i.e. sets of observed values of variables) that they are proposed to generate. For example, if a deterministic causal model contains the two structural equations $y = f(x)$ and $\mathrm{Risk} = g(y)$, with corresponding causal DAG x → y → Risk, then a testable implication of this model fragment is that the curve plotting Risk values against x values should be the composition of the two structural equations: Risk = g(f(x)). (A structural equation of the form y = f(x) has the interventional causal interpretation that if x is set to a new level, then the value of y will adjust until the equality holds again. Thus, causality flows from right to left across the “=” sign.) For an interpretation, x could be an exposure metric, y a metric of internal dose, and Risk the lifetime probability of cancer. For example, if $y = 76.4x/(80.75 + x) = f(x)$ and $\mathrm{Risk} = 1 - \exp(-0.000019\,y^3) = g(y)$, where y is the relevant internal dose of benzene metabolites (in mg/kg/day) for a mouse exposed to x mg/kg/day of benzene by oral gavage for a lifetime, and Risk is the lifetime probability of cancer (Cox 1996); and if the DAG model x → y → Risk is correct, then the causal relationship between x and Risk is predicted to be $\mathrm{Risk} = 1 - \exp(-0.000019\,(76.4x/(80.75 + x))^3) = g(f(x))$

If plotting empirically observed (or estimated) values of Risk against corresponding values of x gives a significantly different curve, then this would be evidence that the model x → y → Risk is not correct or that the structural equations (which play the roles of CPTs) are not correct, or both. Conversely, if the observed curve of Risk vs. x values matches the predicted relationship g(f(x)), then this would support the adequacy of the model x → y → Risk as a description of the data-generating process.

The three panels of Figure 8 show the curves for the exposure → internal dose (in other words, y = f(x) or x → y) component and the internal dose → Risk (i.e. the Risk = g(y) or y → Risk) component, as well as for their composition, the predicted overall causal exposure-response function (corresponding to Risk = g(f(x)), or x → Risk) implied by the DAG model x → y → Risk. If this causal model is correct, then observed (x, Risk) data should not differ significantly from the predicted curve. In practice, Risk is not directly observed: only tumor data are observed. But the observed number of mice responding at each exposure level should be consistent with (i.e. in this case, binomially distributed around) the predictions of the overall exposure-response function if the causal model is correct for the mice being tested.
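
The composition check can be sketched in Python as follows. The two structural equations use the constants quoted above for the mouse gavage example, with the negative exponent that a probability-valued Weibull-type function requires; the function names, the bioassay group sizes, and the tumor counts are hypothetical and chosen only for illustration.

```python
import math

def internal_dose(x):
    """y = f(x): saturable internal dose (mg/kg/day) for administered dose x."""
    return 76.4 * x / (80.75 + x)

def risk_from_dose(y):
    """Risk = g(y): lifetime tumor probability for internal dose y."""
    return 1.0 - math.exp(-0.000019 * y ** 3)

def predicted_risk(x):
    """Composition g(f(x)) implied by the chain x -> y -> Risk."""
    return risk_from_dose(internal_dose(x))

def binomial_z(responders, n, p):
    """Standardized deviation of an observed count from its binomial mean n*p."""
    return (responders - n * p) / math.sqrt(n * p * (1.0 - p))

# Hypothetical bioassay: (dose in mg/kg/day, animals tested, animals with tumors).
hypothetical_groups = [(25, 50, 5), (50, 50, 19), (100, 50, 38)]

for dose, n, responders in hypothetical_groups:
    p = predicted_risk(dose)
    # Large |z| at several dose levels would count against the chain model.
    print(dose, round(p, 3), round(binomial_z(responders, n, p), 2))
```

A formal assessment would use an exact binomial or likelihood-ratio test rather than this normal approximation, which is used here only to keep the sketch short.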

Figure 8. Decomposition of a causal exposure-response function, exposure, x → risk (bottom) into two components: exposure, x → internal dose, y (upper left); and internal dose, y → risk (upper right).

If the data are clearly inconsistent with the model predictions, then that is evidence that a different causal model is needed, such as a refined model that adds phenotype as a further causal input to the chain exposure → internal dose → Risk.

Conversely, if the predictions from the exposure-response model do match observations for a range of exposures (and other inputs, if any, on which Risk depends), then this is powerful evidence that the model (approximately) correctly describes the data-generating process, insofar as it is unlikely that the composition relationship would hold simply by chance.

These principles generalize to probabilistic causal networks with CPTs instead of deterministic functions. A well-known technique for “factoring” joint probability distributions of the variables in a Bayesian network (Pearl Citation2009) expresses them as products of marginal and conditional probabilities. For example, the joint probability distribution of the variables in the simple causal chain DAG model exposure → internal dose → Harm can be factored as follows:

$$P(\text{exposure} = x,\ \text{internal dose} = y,\ \text{Harm} = 1) = P(\text{exposure} = x)\,P(\text{internal dose} = y \mid \text{exposure} = x)\,P(\text{Harm} = 1 \mid \text{internal dose} = y)$$

Here, Harm is an indicator random variable with value 1 if the harm (e.g. tumor) occurs and 0 otherwise; P(Harm = 1 | internal dose = y) is the conditional probability of harm given that internal dose = y; and P(internal dose = y | exposure = x) is the conditional probability that internal dose = y given that exposure = x. This identity implies the following probabilistic composition relationship for P(Harm = 1 | exposure = x), i.e. the conditional probability of harm, given exposure x:

$$P(\text{Harm} = 1 \mid \text{exposure} = x) = \sum_{y} P(\text{Harm} = 1 \mid \text{internal dose} = y)\,P(\text{internal dose} = y \mid \text{exposure} = x),$$

or, more briefly,

$$P(\text{Harm} \mid x) = \sum_{y} P(\text{Harm} \mid y)\,P(y \mid x).$$

This is the probabilistic analog of the deterministic composition formula Risk = g(f(x)) illustrated in Figure 8. It must hold if the DAG model exposure → internal dose → Harm generates the data, since it expresses a quantitative implication of the fact that, in this DAG, Harm is conditionally independent of exposure, given internal dose. (If internal dose is continuous, then the sum over discrete levels of y is replaced by an integral over the continuous range of y values and the probabilities are replaced by probability densities.) Interpretively, the equation says that the probability of harm given exposure is calculated by multiplying the conditional probability of harm for each internal dose level (y) by the probability that the administered exposure concentration (x) causes that internal dose level, and then summing (or integrating) these products over all possible levels of internal dose. This sum-of-products form for composing probabilistic relationships in chain graphs leads to polynomial equations and inequalities (e.g. expressing non-negativity of probabilities) as constraints on the joint probability distributions that are consistent with the DAG structure. Imposing these constraints, in turn, allows mathematical methods developed for such systems of constraints to be applied to determine which causal effects are identifiable from available (typically incomplete) data and to provide formulas for estimating them, even in situations that cannot be adequately represented by Bayesian networks, but where causal knowledge is represented by partial orderings of variables (reflecting that effects are determined by their causes) (Riccomagno et al. Citation2010).
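
The sum-of-products composition can be sketched with two small conditional probability tables. The exposure levels, internal dose levels, and all probabilities below are hypothetical values chosen only to show the calculation.

```python
# Hypothetical CPTs for the chain exposure -> internal dose -> Harm.
# p_dose_given_exposure[x][y] = P(internal dose = y | exposure = x)
p_dose_given_exposure = {
    "low":  {"y0": 0.8, "y1": 0.2},
    "high": {"y0": 0.3, "y1": 0.7},
}
# p_harm_given_dose[y] = P(Harm = 1 | internal dose = y)
p_harm_given_dose = {"y0": 0.01, "y1": 0.10}

def p_harm_given_exposure(x):
    """P(Harm = 1 | x) as the sum over y of P(Harm = 1 | y) * P(y | x)."""
    return sum(
        p_harm_given_dose[y] * p_y
        for y, p_y in p_dose_given_exposure[x].items()
    )

for level in ("low", "high"):
    print(level, p_harm_given_exposure(level))
```

If the harm frequencies observed at each exposure level differed significantly from these composed values, that would be evidence against the chain structure or against the CPTs.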

Testing the consistency of a causal model with observed data using composition relationships requires detailed quantitative knowledge of the functions or CPTs being composed and of the overall exposure-response function that they seek to explain. For situations where this detailed information is not available, various causal discovery and causal model validation algorithms have been developed and applied in epidemiology that require much less information, such as whether the conditional independence relationships implied by a DAG model are consistent with those found in the observed data that they are hypothesized to explain (Runge et al. Citation2019). For example, a causal model that posits that X → Y → Z, where these three variables might represent exposure, internal dose, and cancer risk, respectively, implies that if X is changed but Y is prevented from changing (e.g. if an exposure is not systemically distributed or otherwise cannot reach the hypothesized target tissue, as with exogenous formaldehyde, which is not systemically distributed and does not reach the bone marrow; or if an intervention or naturally occurring polymorphism blocks metabolism of X to form Y), then Z should not respond to the changes in X: the DAG implies that Z is conditionally independent of X, given Y. Conversely, if the same level of X creates different levels of Y in different people or test animals, perhaps due to phenotypic variations; or if Y is manipulated without changing X, then the DAG model predicts that Z should have different probability distributions based on the level of Y, even for the same value of X. These predictions are qualitative rather than quantitative. They can be compared to data using statistical tests for independence of variables (Li and Fan Citation2020; Runge et al. Citation2019). Such conditional independence tests provide a starting point for many causal discovery and model validation algorithms (Huntington-Klein Citation2022; Runge et al. Citation2019).
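
The conditional independence implication can be illustrated at the level of exact probabilities. The sketch below builds the joint distribution of three binary variables from hypothetical CPTs, once for a pure chain and once for an alternative model with an added direct effect of X on Z, and checks whether P(Z = 1 | X, Y) varies with X. All names and numbers are assumptions for illustration; with real data, sampling error means a statistical test of conditional independence would replace the exact comparison.

```python
from itertools import product

def joint(p_x, p_y_given_x, p_z1_given_xy):
    """Joint P(x, y, z) for binary variables built from the supplied CPTs."""
    table = {}
    for x, y, z in product((0, 1), repeat=3):
        pz1 = p_z1_given_xy[(x, y)]
        table[(x, y, z)] = p_x[x] * p_y_given_x[x][y] * (pz1 if z == 1 else 1 - pz1)
    return table

def p_z1_given_xy(table, x, y):
    """P(Z = 1 | X = x, Y = y) recovered from the joint table."""
    return table[(x, y, 1)] / (table[(x, y, 0)] + table[(x, y, 1)])

def z_independent_of_x_given_y(table, tol=1e-9):
    """True if P(Z = 1 | X = x, Y = y) does not vary with x for any y."""
    return all(
        abs(p_z1_given_xy(table, 0, y) - p_z1_given_xy(table, 1, y)) < tol
        for y in (0, 1)
    )

p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}

# Chain X -> Y -> Z: P(Z = 1) depends on Y only.
chain = joint(p_x, p_y_given_x,
              {(0, 0): 0.05, (1, 0): 0.05, (0, 1): 0.30, (1, 1): 0.30})
# Alternative with an added direct X -> Z arrow: P(Z = 1) depends on X and Y.
direct = joint(p_x, p_y_given_x,
               {(0, 0): 0.05, (1, 0): 0.15, (0, 1): 0.30, (1, 1): 0.45})

print(z_independent_of_x_given_y(chain), z_independent_of_x_given_y(direct))
```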

Table 7 outlines several additional principles and methods for both (a) testing consistency of causal models with data; and (b) using such consistency tests to “learn,” or “discover” potential causal models describing the data-generating process from data, which may be observational, experimental (interventional), or a mix of observational and experimental data.

Table 7. Selected causal discovery algorithms for identifying plausible causal models from data.

Software packages implementing these methods are now widely available (Heinze-Deml et al. Citation2018; Heinze-Deml and Meinshausen Citation2020). Other important packages for estimating and validating causal network models from data automatically identify sets (ensembles) of DAG models with the same statistical properties (hence, that are observationally equivalent) but with different interventional causal implications (hence, that are nonequivalent for interventions) (Textor et al. Citation2016). These packages also identify which causal effects can be estimated from a subset of measured variables in a DAG. They identify alternative adjustment sets (if they exist) for estimating both direct and total causal effects, e.g. of exposure on outcomes. An adjustment set is a set of variables that, when conditioned on, control for confounding in estimating the causal effect. If there are multiple adjustment sets for a given causal effect, then testing whether they lead to consistent estimates of the effect can be a powerful approach for either confirming (if the estimates are not significantly different) or refuting (if they are significantly different) the internal validity of a proposed causal model, i.e. its ability to describe the data-generating process for the data used in creating it (Textor et al. Citation2016).

These technical tools provide computationally practical methods to help risk analysts identify, estimate, and validate plausible causal models from data. However, they are not a panacea. Without sufficient substantive knowledge of causal relationships from toxicology, epidemiology, and related fields, it may be impossible to identify unique causal models from observational data alone, especially when some causally relevant variables are not observed. For example, if causal relationships are nonlinear, as in Figure 8, and there are unobserved variables, then in general it is not possible to identify all the causal relationships among observed variables via independence tests and regression, although it is possible to identify all causal relationships that can be estimated without being biased by the unobserved variables, and to avoid incorrect causal inferences (Maeda and Shimizu Citation2021). Thus, methods such as those in Table 7 can assist a human expert by identifying causal models that are consistent with observations, at least if enough variables are observed under sufficiently diverse conditions, as detailed in the foregoing references. They can help to draw sound causal inferences and obtain unbiased estimates of causal effects even when some variables are unobserved (Maeda and Shimizu Citation2021). However, they cannot overcome gaps in knowledge and data that are extensive enough to make causal effects impossible to estimate. Even the best available causal discovery algorithms can then only identify which effects can be estimated (if any) and which cannot be estimated. A promising direction for further refinements is to look across species to understand the causal relevance of animal models to humans, as suggested by work on the mode of action/human relevance framework (Sonich-Mullin et al. Citation2001; Meek et al. Citation2003; Boobis et al. Citation2006).

This raises the question of whether, in practice, enough scientific evidence can be observed about chemicals and their health effects under various conditions and in various species to enable valid estimation of their causal effects on diseases or other health outcomes of interest in humans. This question is perhaps best addressed on a case-by-case basis. The following sections present case studies for the possible leukemogenic effects of benzene and formaldehyde, respectively.

The main general lesson from this section is that, although different experts might disagree about IPoC and CAS values for a given intervention if they do not agree on a unique fully specified causal model, the extent of disagreement should be limited by the extent to which plausible fully specified causal models predict different values. This elevates debate over plausible values to debate over plausible causal models. The techniques in Table 7 can constrain the set of plausible models even when not enough is known to single out a unique causal model.

PART TWO: case studies

I. Quantitative IPoC and CAS calculations for benzene exposure-related AML

Benzene exposure at sufficiently high concentrations for prolonged duration is a recognized human hematotoxin (blood-poisoning agent) and leukemogen. The International Agency for Research on Cancer (IARC) and other organizations consider benzene a known cause of acute myeloid leukemia (AML) (IARC Citation2018).

One of the earliest reports examining occupational exposure to benzene among individuals diagnosed with acute leukemia or preleukemia identified 26 individuals employed in shoemaking in Istanbul, Turkey (Aksoy et al. Citation1974). Exposure measurements revealed typical concentrations of 150-210 ppm and peak exposures up to 650 ppm in areas where adhesives containing benzene were being used (Yaris et al. Citation2004). Aksoy et al. (Citation1974) reported that the incidence of leukemia (mostly AML) was 13 per 100,000 among 28,500 shoe, slipper and handbag workers in Istanbul who were exposed chronically to benzene in solvents and adhesives for an average of a decade. This is more than double the estimated background incidence rate of 6 per 100,000 in the general population. If it is assumed that the difference is entirely caused by the benzene exposure (and hence not contributed to by confounders) and that all individuals are exchangeable, then the estimated causal assigned share (CAS) for a counterfactual intervention that reduced benzene exposure to zero without changing any other inputs (e.g. smoking prevalence or co-exposures) would be (p1 - p0)/p1 = IPoC/p1 = (13-6)/13 = 0.54.

Similarly, Yin et al. (Citation1987) identified 30 cases of leukemia (77% acute), mostly employed in organic chemical synthesis and paint and rubber manufacturing plants, among Chinese workers exposed to high concentrations (mainly between 160 and 1600 ppm) of benzene in 233 benzene factories and 83 control factories in 12 cities in China. The leukemia mortality rate of 14/100,000 person-years in the benzene cohort was nearly seven times that of the control cohort, and was statistically significant for exposed men and women, corresponding to a CAS of (14-2)/14 = 0.86 if a purely causal interpretation is assumed. Many other epidemiological studies have associated estimated benzene exposures with increased risk of acute leukemia, but the associations are clearest and strongest in these occupational studies where very high concentrations of benzene were demonstrated.
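
The two CAS calculations above can be reproduced with a short Python sketch. The `causal_fraction` parameter is a hypothetical device added here, not part of the cited studies: it stands for the share of the observed excess risk that is actually caused by benzene, so that a value of 1.0 corresponds to the purely causal interpretation assumed in the text and smaller values show how the CAS would shrink if confounders accounted for part of the excess.

```python
def cas(rate_exposed, rate_unexposed, causal_fraction=1.0):
    """Causal assigned share: fraction of the exposed group's risk removed
    by eliminating exposure. causal_fraction (hypothetical) is the share of
    the excess risk that is caused by exposure rather than by confounders."""
    ipoc = causal_fraction * (rate_exposed - rate_unexposed)
    return ipoc / rate_exposed

PER = 100_000
cas_istanbul = cas(13 / PER, 6 / PER)  # shoe workers vs. background incidence
cas_china = cas(14 / PER, 2 / PER)     # benzene cohort vs. control cohort
print(round(cas_istanbul, 2), round(cas_china, 2))  # prints 0.54 0.86

# If only half of the excess were causal, the Istanbul CAS would be halved.
print(round(cas(13 / PER, 6 / PER, causal_fraction=0.5), 2))  # prints 0.27
```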

Qualitative causal dependencies: Causal hypotheses and networks for benzene-induced AML

Conjectures about how even much lower occupational or environmental exposures might also increase risk of AML, and possibly other lymphohematopoietic malignancies, have proliferated in the decades since Aksoy’s studies of Turkish shoe-workers, in part under the stimulus of regulatory and litigation interest (Natelson Citation2007). However, increased risks of leukemia at low exposure concentrations of benzene have not been established (Medinsky et al. Citation1996); the empirical evidence is, at best, mixed. For example, in a study of 14 leukemia cases and 55 matched controls among petroleum distribution workers exposed to average benzene concentrations up to 6.2 ppm, no increased risks were observed with increasing cumulative benzene exposure (Schnatter et al. Citation1996).

Figure 9 shows a proposed conceptual model that integrates findings and speculations based on a combination of in vitro and in vivo experimental data in multiple cell lines and other test systems and species (mainly rats and mice) as well as clinical and industrial hygiene observations in various occupational populations in China and elsewhere (McHale et al. Citation2012). In this conceptualization, hematopoietic stem cells (HSCs) in their bone marrow niches undergo oxidative stress from toxic metabolites of benzene, leading to increased rates of unrepaired mutations and possibly other (e.g. epigenetic) changes in an expanded pool of actively cycling HSCs susceptible to leukemic damage (corresponding to R1 > 0 in our TSCE model); clonal expansion of the altered (“initiated”) HSCs (corresponding to R2 > 0 and R3 > 0); and increased production of hypothesized leukemic stem cells (LSCs) that escape immune surveillance (corresponding to R4 > 0) (McHale et al. Citation2012; Wang et al. Citation2012).

Figure 9. A “no side effects” conceptualization of proposed causal pathways linking benzene exposure to AML risk. Source: McHale et al. (Citation2012)

We call Figure 9 a “no side effects” causal model of benzene-induced leukemia because it depicts every change (key event, modifying factor, or toxicological effect), without exception, as contributing to one or more causal pathways that lead to leukemia. But we think it is more plausible that many (possibly all, under currently realistic exposure conditions) of the changes that Figure 9 proposes as mediating the effects of benzene exposure on leukemic stem cell (LSC) formation and leukemia risk are instead side effects or biomarkers of exposure (i.e. satisfying change ← exposure → leukemia rather than exposure → change → leukemia). The specific hematotoxic, cytogenetic, genetic, epigenetic, and immune system changes on which Figure 9 is based come from associations derived from in vitro data, animal (usually mouse and rat) models, and human data (usually from peripheral white blood cells in workers) that do not have AML as an endpoint. Instead, the animal models typically lead to lymphomas, which originate from lymphocytes rather than from myeloid stem or progenitor cells (Zhao et al. Citation2021; Farris et al. Citation1997) and which are not necessarily relevant to humans (Tillman et al. Citation2020; Ward Citation2006). It has been estimated that some 10%-50% of CD-1, C57BL/6, B6C3F1 and B6;129 mice develop lymphomas as they age (Ward Citation2006). Similarly, changes observed in peripheral white blood cells of benzene workers (e.g. Vermeulen et al. Citation2023) that Figure 9 shows as key events leading to leukemia have no known causal connection to leukemia. Speculations that observed reductions in immunosurveillance at low concentrations of benzene may contribute to AML risks are undermined by the fact that the specific observed changes, such as reductions in peripheral white blood cells (Guo et al. Citation2020), have not been shown to cause AML or to increase evasion of immunosurveillance by the LSC. The LSC is presumably usually located in a bone marrow niche (albeit with changes in the stromal microenvironment) and is not a white blood cell (Marchand and Pinho Citation2021).

Thus, the key qualitative causal hypotheses for leukemogenesis embedded in Figure 9 and similar diagrams are probably best regarded as exploratory causal hypotheses. They are suggested by statistical associations observed in datasets, but their causal relevance to the pathogenesis of exposure-related leukemia, including benzene-induced AML, remains unknown. It is likely that they conflate side effects of exposure with causal mediators and moderators of leukemogenesis. They therefore represent unproved hypotheses, and not causal findings or theories that have been confirmed by observations. This distinction between conjecture and fact is not always carefully preserved in discussions of benzene (or formaldehyde, discussed below) exposure and leukemia risk. For example, researchers from the University of California, Berkeley School of Public Health, where the model in Figure 9 was developed, have referred to “the AML pathway” in studying clusters of gene expression in peripheral blood mononuclear cells (PBMCs) from benzene-exposed workers (Thomas et al. Citation2014), even though no causal pathway connecting such changes to AML is established and PBMCs cannot give rise to AML.

By contrast, several mechanistic studies support the TSCE paradigm for exposure-induced AML. They provide evidence that early initiating mutations in HSCs, especially in the DNA methyltransferase 3A gene, DNMT3A (Yang et al. Citation2015), lead to a clonally expanded pool of pre-leukemic (initiated) HSCs, or LSCs, from which AML arises (i.e. progression) (Shlush et al. Citation2014). However, the role of clonal hematopoiesis and clonal expansion of DNMT3A-mutated cells specifically in AML, as opposed to in aging, is still being clarified (Young et al. Citation2019). The hematopoietic system in general, and early myeloid stem cell and progenitor populations in particular, respond to hematotoxic (blood cell poisoning) exposures in a variety of ways and on multiple time scales (Hirabayashi et al. Citation2004; Cox Citation1999; Farris et al. Citation1997). The same exposure history can elicit both suppression and expansion of myeloid progenitor cell populations, such as relatively immature granulocyte-macrophage colony-forming units (CFU-GM), at different times (Cox Citation2000).

Moreover, the same cumulative exposure can elicit very different responses depending on how it is distributed over time. Specifically, smaller total benzene exposures administered at higher concentrations over shorter durations will produce much more damage (e.g. as measured by peaks and nadirs of CFU-GM levels, or by solid tumor yields in animals) than much larger cumulative exposures administered at lower concentrations over longer durations (e.g. Cox Citation1996; Hirabayashi et al. Citation2004). This renders the common practice of reporting estimated relative risk (RR) ratios for different cumulative exposures to chemicals (e.g. Zhang et al. Citation2009; Beane Freeman et al. Citation2009) inadequate, insofar as the time pattern of exposure is far more important than the cumulative amount administered in determining how target cell populations will respond (Cox et al. Citation2021; Hayes et al. Citation1997). This is consistent with epidemiological studies in which leukemia risk does not show an increasing trend with cumulative exposure to benzene, but exposure histories with peak exposures greater than about 100 ppm for more than about 40 days may have increased risk of all leukemias and acute non-lymphocytic leukemia (ANLL, most commonly AML) (Collins et al. Citation2003).
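
A minimal sketch shows why a cumulative exposure metric cannot distinguish exposure histories that differ in their time pattern. The two worker histories below are hypothetical; the 100 ppm cut-off is taken from the peak-exposure finding cited above, and the `summarize` helper is an illustrative name, not an established method.

```python
def summarize(history):
    """Summarize an exposure history given as (concentration_ppm, days) segments."""
    return {
        "cumulative_ppm_days": sum(c * days for c, days in history),
        "peak_ppm": max(c for c, _ in history),
        "days_above_100_ppm": sum(days for c, days in history if c > 100),
    }

# Two hypothetical workers with identical cumulative exposure (10,000 ppm-days).
steady = [(5, 2000)]               # 5 ppm for 2000 days
peaky = [(150, 60), (0.5, 2000)]   # 150 ppm for 60 days, then 0.5 ppm for 2000 days

for name, history in (("steady", steady), ("peaky", peaky)):
    print(name, summarize(history))
```

Both histories would be assigned to the same cumulative-exposure category, yet only the second includes more than 40 days above 100 ppm.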

Figure 10 offers a proposed causal DAG network that seeks to interconnect the best-established biological insights about conditions that help to both predict and cause increased risk of AML in benzene-exposed humans. That the arrows in this network represent real-world interventional causal relationships, so that X → Y implies that changing the level of X changes the value or probability distribution of Y, is established via a combination of experimental and observational data, including repeatable observations of the levels of Y when various manipulations are used to alter the level of X. Likewise, the validity of a proposed causal chain such as X → Y → Z is established not only by testing whether differences and changes in the levels of X predict and explain corresponding observed differences and changes in the levels of Z (and Y), but also by testing whether (a) the differences and changes in Z disappear when Y is held fixed but X is changed; and (b) the differences and changes in Z are again observed when Y is changed but X is held fixed. These strategies for elucidating interventional causation via empirical observations extend beyond causal chains. For example, Figure 10 shows the cytochrome P450 2E1 (CYP2E1) isoform as a moderator of benzene metabolism and as a causal ancestor in the DAG of cytotoxicity and oxidative stress in bone marrow cells. More specifically, CYP2E1 catalyzes the oxidation of benzene to benzene oxide (BO), which rearranges to produce phenol (PH) and which is detoxified by glutathione (GSH), producing S-phenylmercapturic acid (SPMA) that is cleared in urine (e.g. Zarth et al. Citation2015).
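
Tests (a) and (b) for a causal chain can be sketched with a toy structural model. The structural equations, the `simulate` function, and its `do_y` argument are hypothetical and deliberately simple; they stand in for whatever experimental manipulation (for example, a knockout or direct administration of a metabolite) fixes the intermediate variable.

```python
def simulate(x, do_y=None):
    """Hypothetical structural equations for the chain X -> Y -> Z.
    If do_y is given, Y is set by intervention and no longer responds to X."""
    y = 2.0 * x if do_y is None else do_y
    z = y ** 2
    return y, z

# (a) Hold Y fixed by intervention and change X: Z should not change.
z_when_y_fixed = {simulate(x, do_y=1.0)[1] for x in (0.0, 1.0, 2.0)}

# (b) Hold X fixed and change Y by intervention: Z should change.
z_when_x_fixed = {simulate(1.0, do_y=y)[1] for y in (0.0, 1.0, 2.0)}

print(len(z_when_y_fixed), len(z_when_x_fixed))
```

A single value of Z in case (a) and several distinct values in case (b) are what the chain predicts; a direct X → Z arrow would instead produce variation in Z in case (a).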

Figure 10. A Proposed causal DAG model for benzene-induced AML risk and side effects.

Evidence establishing the moderating effect of CYP2E1 on benzene metabolism and subsequent bone marrow (BM) toxicity includes the following:

  • A technologically possible intervention involves CYP2E1 knockout (cyp2e1−/−) transgenic mice that do not form CYP2E1 protein. Comparing responses to inhalation of 200 ppm benzene for 6 hr/day for 5 days in such cyp2e1−/− mice, as well as normal (wild type) mice, and B6C3F1 mice shows there is no observed benzene-induced cytotoxicity or genotoxicity in the bone marrow of cyp2e1−/− mice, but severe genotoxicity and cytotoxicity in both wild-type and B6C3F1 mice. This shows that CYP2E1 drives benzene metabolism and benzene-induced myelotoxicity in mice (Valentine et al. Citation1996).

  • A different intervention is to hold benzene exposure fixed at zero but administer phenolic metabolites that are ordinarily formed by CYP2E1 during Phase 1 metabolism. Figure 10 suggests that this intervention should reproduce the toxic effects of benzene. This prediction is tested and confirmed in mice by the finding that coadministration of phenol and hydroquinone reproduces myelotoxicity similar to that following benzene exposure (Subrahmanyam et al. Citation1990; McDonald et al. Citation2001).

  • Since ethanol induces CYP2E1, the causal network in Figure 10 suggests that chronic alcohol consumption should increase the myelotoxic effects of benzene. This prediction has been confirmed in mice (Marrubini et al. Citation2003).

  • Generalization of these findings on the crucial role of CYP2E1 from mice to humans is supported by the observation that its role in Phase 1 metabolism is highly conserved across mammalian (and some other) species (Harlow et al. Citation2018).

  • An indirect way to test whether CYP2E1 modulates benzene metabolism in humans without performing unethical experiments is to examine the consequences of gene polymorphisms (which may be viewed as naturally occurring interventions) in genes encoding glutathione-related enzymes that detoxify the toxic metabolites of benzene formed by CYP2E1. Nourozi et al. (Citation2018) performed such a study among employees at a petrochemical plant and concluded that “subjects with both null GSTT1 and GSTM1 genotypes had a significantly higher risk for hematological disorders as compared to subjects with positive GSTT1 and GSTM1 genotypes (OR = 2.35, 95% CI 1.14–4.8). The results of this study showed that individuals carrying null GSTT1 or both null GSTT1 and GSTM1 genotypes had a higher risk and were more susceptible to benzene-induced hematological disorders.” This is consistent with the hypothesis that reduced capacity to detoxify benzene metabolites formed by CYP2E1 increases the hematotoxicity of benzene in humans.

  • Other studies of genetic polymorphisms as risk factors for metabolic susceptibility of benzene-exposed workers to benzene hematotoxicity reported that a combination of a rapid metabolizer CYP2E1 phenotype and an NQO1*2 polymorphism (that, in effect, inactivates the NQO1 enzyme normally responsible for detoxifying 1,4-BQ) was associated with a 7.8-fold increased risk of decreased white blood cell and platelet counts in exposed workers (Ross and Zhou Citation2010; Rothman et al. Citation1997). Even if these outcomes are only side effects of benzene exposure with no implications for AML risk (Natelson Citation2007), they help to confirm these important aspects of metabolism in driving benzene hematotoxicity, as shown in Figure 10.

Other aspects of Figure 10 are supported by animal and human evidence, as well as in vitro studies. For example:

  • Studies of markers of benzene exposure in Chinese workers exposed to a wide range of benzene concentrations (between 0.06 and 122 ppm on the day of sampling, with a median exposure of 3.2 ppm) confirmed significant increases in urinary metabolites of benzene (such as SPMA) and chromatid breaks and total chromosomal aberrations in exposed subjects compared with unexposed subjects (Qu et al. Citation2003). Urinary SPMA showed exposure-response trends even for exposure concentrations less than 1 ppm, but HQ, CAT, and phenol were significantly increased only for benzene exposure levels above 5 ppm. Albumin adducts of benzene oxide and 1,4-BQ in blood were strongly correlated with each other and with SPMA, as expected from Figure 10.

  • Chinese benzene worker bone marrow data are also consistent with chronic inflammation followed by an immune-mediated bone marrow response, with chromosomal abnormalities as a side effect, as would be expected from a high-ROS and oxidative stress environment typical of inflammation (Gross and Paustenbach Citation2018). At sufficiently high exposures, benzene clearly causes inflammation and suppresses the adaptive immune system (Guo et al. Citation2020; Guo et al. Citation2019). That chromosomal damage is a side effect and not a key event in the causation of AML is supported by earlier findings that, although benzene exposure causes chromosomal changes such as aneuploidy and structural chromosomal aberrations, particularly in peripheral blood lymphocytes, “Firm conclusions cannot be made about the involvement of specific chromosomes or chromosome regions. Further, in leukemia cases associated with benzene exposure, there is no evidence of a unique pattern of benzene-induced chromosomal aberrations in humans” (Zhang et al. Citation2002).

  • Gasoline station workers in India with sufficiently high exposure to benzene to significantly decrease GSH were found to have elevated levels of benzene and its metabolites in blood and urine as well as indicators for oxidative stress in blood (Uzma et al. Citation2010).

  • Roles of increased ROS and oxidative stress (OS) in bone marrow in the initiation, promotion, and progression of AML are suggested by numerous in vitro experiments and observations of human bone marrow cells from AML patients and controls (Trombetti et al. Citation2021).

These observations support the existence of causal pathways leading from benzene exposure to increased risk of AML and to various observable side effects of exposure including chromosomal changes that are not on the causal pathway, consistent with the findings of the epidemiological studies of highly exposed workers.

The qualitative pathways in Figure 10 correspond to hazard identification in health risk assessment. They establish plausible causal mechanisms by which exposure can cause AML and perhaps causally related myeloproliferative disorders such as myelodysplastic syndrome (MDS). These qualitative causal dependencies are supported by observations confirming testable predictions of the conceptual model. The next challenge is to quantify these dependencies.

Is there a threshold?

A first step toward quantification of causal dependencies in the exposure-response relationship for benzene and AML is to determine whether the incremental risk of AML (and MDS) caused by benzene exposure (i.e. the IPoC for preventing such exposure) is zero for sufficiently small exposures. Opinions in the literature are divided. Over the decades since Aksoy and others reported increased risk of AML in worker populations occupationally exposed to high concentrations of benzene (e.g. >100 ppm), many investigators have speculated that relatively low concentrations of benzene, e.g. even less than 1 ppm, also cause AML and perhaps MDS, lymphomas, and related malignancies (McHale et al. Citation2012; Shallis et al. Citation2021). Although we focus on AML, very similar comments apply to MDS and possibly other myeloproliferative conditions that, based on the ambiguous associations reported in the epidemiological literature, may or may not be associated with occupationally relevant levels of exposure to benzene. Some have hypothesized that unknown enzymes might interact with such low concentrations of benzene and significantly increase AML risks (Rappaport et al. Citation2010). Others have responded that, although theories and claims of low-exposure adverse health effects may be encouraged by regulators (and general adherence to LNT risk models) as well as plaintiffs’ attorneys, there is little scientific or medical reason for health concerns about levels of benzene exposure encountered by workers in the petrochemical industry or other industries, at least in developed countries (Natelson Citation2007). According to IARC (Citation2018), AML (in adults) remains the only cancer that is known to be caused by benzene. The higher quality epidemiological literature further indicates that only high concentrations of benzene (e.g. greater than 100 ppm) increase the risk of AML, consistent with the animal (rat and mouse) and mechanistic evidence of a substantial exposure threshold for risk.

On the technical side, common epidemiological practices reported in the published literature conflate the task of determining the IPoC for reducing or eliminating adverse health effects with the assumption that observed associations with exposure reflect the amount of disease that is caused by the exposure. These practices include the following:

  • Use of cumulative (e.g. ppm-years) and average intensity exposure metrics (e.g. cumulative exposure divided by duration). These, though common, potentially lose crucial information about peak exposures and time patterns of exposure;

  • Failure to adjust adequately for confounding by time-varying factors (e.g. individuals’ smoking histories, or other occupational exposures);

  • Use of epidemiological effects measures such as estimated “burden of disease,” “population attributable fraction,” and other quantities derived from relative risk ratios. Papers using these ratios often improperly present them as if they were measures of interventional causal effects, also often without fully controlling for confounding (e.g. Yi et al. 2020);

  • Presentation of model-based results (e.g. from statistical regression models) as if they were empirical findings, without taking into account model uncertainty, model specification errors, or assumption-dependence of conclusions; and

  • Treating estimated exposures as if they were true exposures, without accounting for exposure estimation errors.
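The first of these practices can be illustrated with a minimal sketch in R. The two exposure histories and the 10 ppm intensity threshold below are hypothetical values chosen only for illustration; they are not taken from any study cited here. The sketch shows that two workers can have identical cumulative exposure (ppm-years) while only one of them ever exceeds an assumed intensity threshold.

```r
# Two hypothetical constant-exposure histories (illustrative values only).
histories <- data.frame(
  worker = c("A", "B"),
  ppm    = c(5, 50),   # exposure intensity while exposed
  years  = c(10, 1)    # duration of exposure
)

# Cumulative exposure (ppm-years) is the same for both workers.
histories$ppm_years <- histories$ppm * histories$years

# Assumed intensity threshold for toxicity (hypothetical, for illustration).
threshold_ppm <- 10
histories$exceeds_threshold <- histories$ppm > threshold_ppm

print(histories)
```

If risk depends on whether intensity exceeds a threshold, a cumulative metric alone cannot distinguish these two histories.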

These practices have generated many seriously misleading claims, such as that “Responses at or below 0.1 ppm benzene were observed for altered expression of AML pathway genes and CYP2E1. Together, these data show that benzene alters disease-relevant pathways and genes in a dose-dependent manner, with effects apparent at doses as low as 100 ppb in air” (Thomas et al. 2014). Crucial caveats omitted from such claims include the following:

  • Due to a 0.2 ppm limit of detection of benzene in air, the reported concentrations “at or below 0.1 ppm” of benzene in air are not measured exposure concentrations. Rather, they are estimates from statistical models that correlate benzene in urine with benzene in air. These model-based estimates have unknown validity and unknown error distributions. They were not adjusted for inter-individual differences in CYP2E1, GSH, and other enzymes. Yet, such differences can easily generate a 5-fold range of levels for each benzene metabolite at a given air concentration of benzene (Rappaport et al. 2010). Thus, the estimates of air benzene levels are likely to contain substantial error variance due to omission of adjustments for differences in individual metabolism.

  • Errors in estimates of exposure were ignored in the analysis (i.e. errors-in-variables methods were not used), with estimated values treated in regression models as if they were known true values. This has the well-known statistical effect of concealing thresholds in the exposure-response relationship, if there are any (e.g. Glasgow et al. 2022). As noted by McNally et al. (2017), “Ultimately it is not possible to study genuinely low-level benzene exposure in a human population through the use of urinary metabolite production when other sources of these metabolites cannot be excluded.”

  • As previously noted, the “altered expression of AML pathway genes” refers to a cluster of observed changes in gene expression in peripheral white blood cells. They have no known causal relevance to AML risk. The term “AML pathway” might suggest that such a pathway has been identified and that the reported observed changes are known to be involved in the pathogenesis of AML; however, neither is true.
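The way exposure estimation error can conceal a threshold may be illustrated by a small simulation. This is a sketch under stated assumptions, not a reanalysis of any cited data: the true threshold (5 ppm), the lognormal distribution of true exposures, the hockey-stick response, and the multiplicative lognormal error with log-scale standard deviation 0.4 are all hypothetical choices.

```r
set.seed(1)
n         <- 100000
threshold <- 5                                     # assumed true threshold (ppm)
x_true    <- rlnorm(n, meanlog = log(3), sdlog = 1)  # hypothetical true exposures (ppm)
effect    <- pmax(x_true - threshold, 0)           # no effect at or below the threshold

# Estimated exposure = true exposure times a multiplicative lognormal error.
x_est <- x_true * rlnorm(n, meanlog = 0, sdlog = 0.4)

# Mean effect among people whose TRUE exposure is below the threshold:
mean_true_low <- mean(effect[x_true < threshold])
# Mean effect among people whose ESTIMATED exposure is below the threshold:
mean_est_low  <- mean(effect[x_est < threshold])

cat("Mean effect, true exposure below threshold:     ", mean_true_low, "\n")
cat("Mean effect, estimated exposure below threshold:", mean_est_low, "\n")
```

In this simulated setting, the group with estimated exposure below the threshold shows a positive mean effect only because it contains individuals whose true exposure was above it. An analysis that treats estimated exposures as true values would therefore report apparent low-exposure effects even though none exist by construction.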

No excess risks of AML (or other cancers) caused by low exposure concentrations of benzene (e.g. less than 1 ppm) have been established in animals or in humans (Medinsky et al. 1996).

Critics of the hypothesis that AML risks are induced by low-level exposures to benzene have also noted that, in mice, high doses of benzene (e.g. 100 and 200 ppm for 6 h/day, 5 days/week for 1, 2, 4, or 8 weeks) create profound hematotoxic and genotoxic effects, reducing the numbers of total bone marrow cells, progenitor cells, and differentiating hematopoietic cells, and increasing replication of primitive progenitor cells in the bone marrow. By contrast, none of these effects in bone marrow occurs at 10 ppm (Farris et al. 1997). Relatively low doses of benzene (1 and 10 mg/kg 6 times per week for 4 weeks) administered to mice by oral gavage do cause a decline in peripheral white blood cell counts, but even lower doses (e.g. 0.1 mg/kg) appear to induce hormetic (i.e. stimulatory) responses (Li et al. 2018).

Some epidemiological studies have reported that risks of AML, leukemia, non-Hodgkin’s lymphoma, and multiple myeloma are not associated with cumulative long-term, low-level occupational exposure to benzene (Schnatter et al. 1996; Talibov et al. 2014). Conversely, other epidemiological studies have reported associations between estimated cumulative occupational exposures to benzene and risks of leukemia (e.g. Glass et al. 2003; Khalade et al. 2010; Schnatter et al. 2005) and of AML specifically (Saberi Hosnijeh et al. 2013). Use of cumulative exposure, failure to model exposure estimation errors (e.g. in reconstructing estimated exposure histories from occupational histories), unmeasured and incompletely controlled potential confounders, uncertain modeling assumptions, and limited power to detect small effects among relatively rare malignancies (leading to small numbers observed) limit the causal conclusions that can be drawn from such epidemiological studies, whether or not they find exposure-response associations. Studies that use statistical (including LNT) models to extrapolate risks from highly exposed groups to lower exposure levels have claimed that risks persist at low concentrations without empirically demonstrating such exposure-risk relationships. However, a publication based on the NCI-Chinese cohort reported that AML/MDS risk was not statistically significantly associated with benzene exposure overall, but risk was strongly elevated (i.e. five-fold) and statistically significant among the subsets of workers younger than age 30 who were exposed at concentrations greater than 5 ppm (with some greater than 25 ppm) and/or had cumulative exposure greater than 40 ppm-years (with half greater than 100 ppm-years, indicating relatively high exposure intensity). Risks were most greatly increased specifically in the risk period of 2–10 years prior to AML/MDS diagnosis (Linet et al. 2019).

Mechanistic considerations as well as epidemiological observations suggest that exposure concentration or concentration-duration thresholds should be expected for causation of AML by benzene. For example,

  • If enzymes that detoxify hepatic metabolites of benzene (e.g. GSH-related antioxidants) are not depleted or saturated at low exposures, then toxic metabolites (e.g. benzene oxide and phenolic metabolites) formed in the liver may be promptly detoxified (e.g. to SPMA) and excreted into the urine without entering the bone marrow at all, or may reach it only in quantities too small to cause bone marrow toxicity. This is consistent with findings in the mouse experiments noted above, that reducing benzene concentration in air by 90% from 100 ppm to 10 ppm does not simply reduce indicators of bone marrow toxicity by 90%, but eliminates them entirely (Farris et al. 1997). It is also consistent with the finding that, in Chinese workers exposed to benzene concentrations between 0.06 and 122 ppm on the day of sampling, HQ, CAT, and phenol in urine were significantly increased only at benzene exposure concentrations above 5 ppm (Qu et al. 2003).

  • Similarly, if protective resources in the bone marrow, such as the antioxidant NQO1, are not depleted or saturated, they may detoxify metabolites reaching the bone marrow, preventing the onset of oxidative stress, chronic inflammation, and a pro-leukemic bone marrow environment.

  • Even if damage to HSCs does occur, it subsequently may be eliminated (e.g. via DNA repair or apoptosis of damaged cells) provided that repair capacity exceeds the rate of damage (Jamall and Willhite 2008) and that hematopoiesis is not stressed, so that HSCs are not abnormally rushed into active cycling with error-prone repair.

These conjectures may help to explain some otherwise puzzling epidemiological observations. For example, Jamall and Willhite (2008) concluded:

Given the epidemiological evidence for workers exposed to gasoline and the virtual absence of evidence of increased cancer (notably AML and ANLL) after benzene exposures at concentrations less than 0.5 to 1 ppm… for decades, it appears that such exposures either do not exceed a threshold dose or that benzene exposure from gasoline is handled in a quantitatively different manner by the body such that delivered dose of the carcinogenic metabolite(s) in human bone marrow is insufficient to induce leukemia.

Similarly, for much lower potential community exposures, Johnson et al. (2009) reported:

Between the years 2003 and 2006, a total of 3794 air samples were collected from 23 monitoring stations [in Florida] … The mean benzene concentrations by site ranged from 0.18 to 3.58 ppb. Extrapolated cumulative lifetime exposures ranged from 0.036 to 0.702 ppm-years. Regulatory risk analysis resulted in cancer risk estimates ranging from 4.37 × 10⁻⁶ to 8.56 × 10⁻⁶, all of which exceed the Florida Department of Environmental Protection acceptable risk of 1 × 10⁻⁶. Comparative analysis with the epidemiologic literature indicates the association between benzene exposure and AML is related to cumulative exposures far in excess of 1 ppm-years, with the likely threshold for benzene-induced leukemogenesis of 50 ppm-years cumulative exposure. Based upon the results of this investigation, it is unreasonable to anticipate AML cases in Florida residents as a result of ambient airborne benzene concentrations.

Although, as noted, the use of cumulative exposure metrics and other practices make it difficult to interpret epidemiological findings in terms of their implications, if any, for causal exposure-response curves and the existence of exposure intensity thresholds, we find the toxicological arguments for a threshold more convincing. Neither phenol (PH) nor hydroquinone (HQ) by itself reproduces benzene myelotoxicity, but the two together do so (McDonald et al. 2001; Legathe et al. 1994; Subrahmanyam et al. 1990). If it is true that PH and HQ in urine are significantly increased only at benzene exposure concentrations above 5 ppm, while SPMA is found at lower concentrations (Qu et al. 2003), then 5 ppm may be the right order of magnitude for a plausible no-AML-effect level, or lower bound on the exposure concentration needed to produce bone marrow damage and increased AML risk in humans. This is also roughly consistent with the finding that 10 ppm in mice does not cause myelotoxicity (Farris et al. 1997), bearing in mind that mice are more sensitive to benzene than humans in that they have a relatively greater capacity than primates to metabolize benzene to HQ and other toxic metabolites (Henderson 1996). It is also roughly consistent with the estimated likely threshold for benzene-induced leukemogenesis of 50 ppm-years of cumulative exposure (Johnson et al. 2009) if it is assumed that exposures on the order of at least 5 ppm for at least 10 years are required to induce a high-ROS bone marrow environment capable of increasing AML risk.

Estimating and bounding quantitative causal dependencies

Even if there is a benzene exposure concentration threshold for myelotoxicity and consequent increased risk of AML, it is likely to have different values in different individuals. As an illustration of how to obtain a plausible upper bound on the IPoC for increased risk of AML in an individual exposed to a constant relatively low concentration of benzene, and hence potentially preventable by an intervention that sets such exposure to zero, consider the following fully specified causal model:

  1. Exposure to benzene is assumed to be constant for the input scenarios considered here.

  2. Production of phenol (PH) and hydroquinone (HQ) from benzene is assumed to be governed by the following Michaelis-Menten type structural equations (Rappaport et al. 2010, equation 1):

     $$Y_{PH} = Y_{0PH} + \frac{Y_{maxPH}\,X}{X_{50PH} + X}, \qquad Y_{HQ} = Y_{0HQ} + \frac{Y_{maxHQ}\,X}{X_{50HQ} + X}$$

     Here, X is the concentration of benzene in air in ppm; YPH is the resulting concentration of phenol and YHQ is the resulting concentration of hydroquinone in urine in µM; and the constants in these equations are estimated by nonlinear regression modeling to be as follows for nonsmoking women (who may be more sensitive than men): Y0PH = 69.5; YmaxPH = 5293; X50PH = 107.1 for phenol; and Y0HQ = 6.36; YmaxHQ = 376; X50HQ = 50.9 for hydroquinone.

  3. There is roughly a 5-fold range in the levels of each metabolite in urine across individuals for any given concentration of benzene in air (Rappaport et al. 2010). This means that any individual's concentration of each metabolite may lie anywhere between about 1/2.24 and about 2.24 times the concentration predicted by the above equations (2.24 is approximately the square root of 5). We treat these values as the estimated lower and upper ends of a 95% prediction interval for an individual’s levels of metabolites. For an individual exposed to environmentally relevant levels of benzene in air, the concentrations of HQ and PH in bone marrow are expected to be approximately proportional to their concentrations in urine (Henderson 1996). The inter-individual variability presumably reflects phenotypic variations in benzene metabolism, e.g. due to genetic polymorphisms in enzymes catalyzing the production and removal of toxic metabolites. The extent to which such variations are correlated for PH and HQ is unknown. We will consider the extreme cases of independence (implying zero correlation) and perfect correlation.

  4. Bone marrow toxicity occurs only if both YPH and YHQ exceed the levels corresponding to an air benzene concentration of 10 ppm, i.e. the no observed adverse effects level (NOAEL) for myelotoxicity in the bone marrow of mice, the most sensitive species (Farris et al. 1997; Henderson 1996). Applying the above equations to X = 10 ppm gives YHQ = 68.1 and YPH = 521.5 as the corresponding urinary levels of HQ and PH (in µM), which serve here as proxies for the levels in human bone marrow that must be exceeded to cause myelotoxicity.

  5. If an exposure concentration of X ppm of benzene predicts an estimated concentration of a metabolite in urine of Y* based on the above regression equations, with a 95% prediction interval of Y*/2.24 to 2.24Y*, then we will assume that the true (yet unknown) concentration Y is approximately lognormally distributed, based on analysis of Chinese shoe worker data (McNally et al. 2017). For purposes of a simple illustration, suppose that Y* is a maximum likelihood estimate (MLE) of Y and that log(Y) is approximately normally distributed around its MLE of log(Y*) with a 95% prediction interval of [log(Y*) - log(2.24), log(Y*) + log(2.24)], where log is the natural logarithm. Then the probability that the true metabolite concentration Y exceeds any specified threshold value, T, is given by P(Y > T) = P(log(Y) > log(T)) = 1 - pnorm(log(T), log(Y*), log(2.24)/1.96), where pnorm(x, µ, σ) denotes the cumulative distribution function (CDF) evaluated at x for a normal distribution with mean µ and standard deviation σ. (Here, half the width of the 95% prediction interval on the log scale is log(2.24) = 1.96σ based on the properties of the normal distribution, and σ is accordingly estimated as log(2.24)/1.96.) For example, if the actual exposure concentration is 5 ppm (for which the predicted value is YHQ = 40.0), then the probability that the level of HQ exceeds the level predicted for an exposure concentration of X = 10 ppm, namely YHQ = 68.1 µM, would be 1 - pnorm(log(T), log(Y*), log(2.24)/1.96) = 1 - pnorm(log(68.1), log(40), log(2.24)/1.96) = 0.10. If X = 3 ppm (for which the predicted value is YHQ = 27.29), then the corresponding exceedance probability is 1 - pnorm(log(68.1), log(27.29), log(2.24)/1.96) = 0.013. If X = 1 ppm, then it is 0.000045, also denoted 4.5e-5, i.e. 4.5 chances in 100,000. Thus, the estimated probability that HQ concentrations in bone marrow would exceed the levels corresponding to 10 ppm, the NOAEL for myelotoxicity in mice, declines from about 10% to about 0.0045% as air benzene levels decline from 5 ppm to 1 ppm. Analogous calculations for phenol (PH) show that the estimated probability of exceeding the mouse NOAEL concentration in bone marrow (i.e. the level for 10 ppm of benzene in air) declines from 1 - pnorm(log(521.5), log(305.6), log(2.24)/1.96) = 0.10 at X = 5 ppm to 1 - pnorm(log(521.5), log(118.4639), log(2.24)/1.96) = 0.00016 at X = 1 ppm. If the variations in PH and HQ are independent of each other, then the estimated probability that both are high enough to exceed their NOAEL levels is 0.10*0.10 = 0.01 or 1e-2 at X = 5 ppm and is 0.000045*0.00016 = 7.2e-9 at X = 1 ppm.

  6. If steady-state HQ and PH levels are both above the estimated NOAEL levels corresponding to a steady benzene exposure concentration of 10 ppm in inhaled air, then we will assume that unrepaired damage to HSCs may occur that hastens the rate of formation of LSCs and increases AML risk. Rather than trying to quantify the CPT describing exactly how AML risk depends on benzene concentrations in air or metabolite concentrations in bone marrow, we will use a plausible upper-bound estimate based on the assumption that the effect of exposure concentrations less than 10 ppm is no greater than the effect estimated in the Aksoy et al. (1974) study, which was an increase in leukemia (especially AML) risk from 0.00006 to 0.00013 among workers exposed to relatively high concentrations of benzene in air (150-210 ppm with peak exposures up to 650 ppm) (Yaris et al. 2004).
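Steps 2 through 5 of this model can be sketched in R as follows. The constants are those given in step 2; the function names `pred_PH`, `pred_HQ`, and `exceed` are introduced here only for illustration.

```r
# Michaelis-Menten type predictions of urinary metabolite levels (µM)
# from air benzene concentration X (ppm), using the constants in step 2.
pred_PH <- function(X) 69.5 + 5293 * X / (107.1 + X)
pred_HQ <- function(X) 6.36 + 376  * X / (50.9  + X)

# Standard deviation of log(Y) implied by a 95% prediction interval
# of [Y*/2.24, 2.24 * Y*] around the predicted level Y*.
sigma <- log(2.24) / 1.96

# Probability that an individual's true metabolite level at exposure X
# exceeds the level predicted at the NOAEL concentration (10 ppm).
exceed <- function(pred, X, noael = 10) {
  1 - pnorm(log(pred(noael)), mean = log(pred(X)), sd = sigma)
}

X <- c(1, 5, 10)
result <- data.frame(
  X_ppm = X,
  Y_HQ  = pred_HQ(X),
  Y_PH  = pred_PH(X),
  P_HQ  = exceed(pred_HQ, X),
  P_PH  = exceed(pred_PH, X)
)
print(signif(result, 3))
```

At X = 10 ppm the exceedance probability is 0.5 by construction, since the predicted level then equals the NOAEL level and the assumed distribution of log(Y) is symmetric about its predicted value.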

This model implies that AML probability increases from about 0.00006 (or 6e-5) in the absence of occupational exposure to at most 0.00013 for benzene-exposed people with bone marrow concentration of PH and HQ greater than the estimated levels of these metabolites for an exposure concentration of X = 10 ppm.

The IPoC formula for this model is as follows:

$$\begin{aligned}
\text{IPoC} &= \text{increment in risk for an intervention that would reduce benzene exposure concentration from } X \text{ ppm to 0 ppm} \\
&= P(\text{PH and HQ in bone marrow exceed NOAEL levels}) \times (\text{increase in risk if they do exceed them}) \\
&< P(\text{PH and HQ in bone marrow exceed NOAEL levels}) \times (0.00013 - 0.00006) \\
&= P(\text{PH and HQ in bone marrow exceed NOAEL levels}) \times 7 \times 10^{-5}.
\end{aligned}$$

The last formula is a plausible upper bound on the IPoC because it assumes, conservatively, that any exceedance of the metabolite levels expected for the estimated NOAEL of 10 ppm for myelotoxicity affecting HSCs increases the risk of AML by the full amount estimated in the Aksoy study for much higher exposure concentrations. These exceedance probabilities are estimated via the formula:

P(metabolite Y in bone marrow exceeds NOAEL level for 10 ppm) = 1 - pnorm(log(predicted Y at 10 ppm), log(predicted Y at X ppm), log(2.24)/1.96)

The predicted Y values for different X values are given by the equations

$$Y_{PH} = Y_{0PH} + \frac{Y_{maxPH}\,X}{X_{50PH} + X}, \qquad Y_{HQ} = Y_{0HQ} + \frac{Y_{maxHQ}\,X}{X_{50HQ} + X}$$

The corresponding causal assigned share (CAS) formula is

$$\begin{aligned}
\text{CAS} &< P(\text{PH and HQ in bone marrow exceed NOAEL levels}) \times (0.00007/0.00013) \\
&= P(\text{PH and HQ in bone marrow exceed NOAEL levels}) \times 0.54.
\end{aligned}$$

The probability that both HQ and PH exceed their NOAEL levels is the product of their separate exceedance probabilities if the error (or inter-individual variability) distributions around their MLE estimated levels are independent. Alternatively, if variations in PH and HQ are correlated, then an upper bound on the probability that both are above their NOAEL levels is the probability that at least one of them is. As it turns out, the exceedance probability curves for PH and HQ as a function of air benzene level X almost coincide over most of the interval from 0 to 100 ppm, with values for HQ being smaller at low concentrations, so we can use the exceedance curve for PH as an upper bound on the probability that both HQ and PH exceed their NOAEL levels.
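These two cases can be written compactly. Let A denote the event that PH exceeds its NOAEL level and B the event that HQ exceeds its NOAEL level (symbols introduced here only for illustration). Under independence the joint probability is the product of the marginal probabilities, and under any dependence structure it cannot exceed the smaller marginal probability, which in turn cannot exceed the probability that at least one of the events occurs:

$$P(A \cap B) = P(A)\,P(B) \ \text{(independence)}, \qquad P(A \cap B) \le \min\{P(A), P(B)\} \le \max\{P(A), P(B)\} \le P(A \cup B).$$

Wherever the HQ curve lies at or below the PH curve, P(B) ≤ P(A), so P(A) is a valid upper bound on the joint probability regardless of how the two variations are correlated.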

Figure 11 shows the resulting upper-bound estimates for IPoC. The right panel is a 10-fold expansion of the low-concentration region in the left panel. Figure 12 shows the corresponding CAS curve. (The CAS curve is just the IPoC curve with the vertical axis rescaled to have a maximum value of 0.54, the CAS estimated from the Aksoy data, instead of 7e-5, the IPoC estimated from the Aksoy data.) This curve indicates that, for a worker who was steadily exposed to less than about 5 ppm of benzene and developed AML, a plausible lower bound for the probability that the AML would have developed anyway, even if an intervention had reduced occupational exposure to zero, is between about 95% and 100%.
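The upper bounds can also be evaluated at individual concentrations. The following is a minimal sketch, assuming the PH exceedance probability is used as the bound on the joint exceedance probability and taking the risk difference 0.00013 - 0.00006 from the IPoC formula; the function and variable names are illustrative.

```r
pred_PH <- function(X) 69.5 + 5293 * X / (107.1 + X)   # predicted PH in urine (µM)
sigma   <- log(2.24) / 1.96                            # SD of log(Y)

# Exceedance probability for PH relative to the level predicted at 10 ppm.
P_PH <- function(X) 1 - pnorm(log(pred_PH(10)), log(pred_PH(X)), sigma)

risk_unexposed <- 0.00006   # background risk used in the text
risk_exposed   <- 0.00013   # risk in the highly exposed Aksoy population

ipoc_bound <- function(X) P_PH(X) * (risk_exposed - risk_unexposed)
cas_bound  <- function(X) P_PH(X) * (risk_exposed - risk_unexposed) / risk_exposed

X <- 1:5
bounds <- data.frame(
  X_ppm        = X,
  IPoC_upper   = ipoc_bound(X),
  CAS_upper    = cas_bound(X),
  anyway_lower = 1 - cas_bound(X)   # lower bound on P(AML would have occurred anyway)
)
print(signif(bounds, 3))
```

The last column is the complement of the CAS upper bound, which is how the lower bound of roughly 95% at 5 ppm arises.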

Figure 11. Plausible upper-bound estimates of IPoC as a function of benzene concentration in air.

Source: R code for the left panel of Figure 11.

    # PPH = exceedance probability for PH; PHQ = exceedance probability for HQ
    Y0PH <- 69.5; YmaxPH <- 5293; X50PH <- 107.1
    Y0HQ <- 6.36; YmaxHQ <- 376; X50HQ <- 50.9
    X <- (0:1000)/20
    YHQ <- Y0HQ + YmaxHQ*X/(X50HQ + X)
    YPH <- Y0PH + YmaxPH*X/(X50PH + X)
    PHQ <- 1 - pnorm(log(68.1), log(YHQ), log(2.24)/1.96)
    PPH <- 1 - pnorm(log(521.5), log(YPH), log(2.24)/1.96)
    plot(X, PPH*7e-5, ylab = "IPoC", xlab = "benzene ppm")
    # plot(X, PPH*0.54, ylab = "CAS", xlab = "benzene ppm")

Figure 12. Plausible upper-bound estimates of causal assigned share (CAS) for AML as a function of benzene concentration in air.


Discussion of results for benzene

This example has illustrated how useful quantitative bounds can be developed from a partially elucidated causal model. In contrast to previous negative theoretical results demonstrating that useful bounds on various concepts of “probability of causation” may be difficult or impossible to obtain from observational data in general (Dawid et al. 2024), the analysis leading to Figures 11 and 12 shows that combining causal knowledge and reasonable assumptions based on multiple types and sources of data gives plausible upper bounds on IPoC and CAS values for realistic exposure scenarios (e.g. concentrations below 5 ppm). These calculations suggest that benzene is unlikely to be a preventable cause of most (e.g. 95%–100%) of the AML cases in workers with such exposures. This conclusion depends on the following key elements, in addition to the conceptual framework for defining IPoC and CAS: measurements of benzene exposure concentrations and metabolites in urine of exposed workers; Michaelis-Menten type nonlinear regression models for metabolite concentrations; an estimated 5-fold variability in metabolite levels around their predicted levels (Rappaport et al. 2010); assumption of an approximate lognormal distribution of this variability (McNally et al. 2017); findings that relevant metabolite concentrations in bone marrow are approximately proportional to their levels in urine for relevant levels of exposure (Henderson 1996); use of a 10 ppm exposure concentration as a NOAEL for bone marrow myelotoxicity in the most sensitive species (mice) (Farris et al. 1997), and hence also in humans as a plausible lower bound for the human NOAEL for bone marrow toxicity and consequent increases in AML risk; upper-bound approximation of the probability that both HQ and PH concentrations in the bone marrow are high enough to cause toxic damage by the probability that this is true for just one of them (PH); and upper-bound approximation of the relatively poorly understood part of the causal network for benzene-induced AML (the part that follows entry of PH and HQ into the bone marrow) by a single large step function in which risk jumps from estimated background levels to the levels estimated in the highly exposed Aksoy population. This last step corresponds to replacing an unknown conditional probability table or model with an extreme upper bound, by assuming that any exposure capable of causing an adverse effect in the bone marrow produces, with probability 1, the full increase in AML risk estimated for the Aksoy population.

A perspective from the IPoC conceptual framework is that calculated IPoC and CAS curves such as those in Figures 11 and 12 do not attempt to present ground truth about the risk preventable by interventions; rather, they only attempt to provide useful bounds on how large the risks preventable by interventions might be based on current knowledge and data. For this purpose, they use fully specified causal models that incorporate conservative assumptions and approximations. This invites those who wish to improve upon the calculated values to introduce and justify more accurate modeling assumptions and approximations and new data that can help to reduce uncertainties and refine the initial bounds based on current knowledge. For example, if it were discovered that inter-individual variations in levels of PH and HQ around their model-predicted values were statistically independent of each other, then the values on the y axes for IPoC and CAS in Figures 11 and 12 would become dramatically smaller, because the single exceedance probability would be replaced by the product of two nearly equal exceedance probabilities. At 5 ppm, for instance, the exceedance probability of about 0.1 would become about 0.01, so that the CAS bound of about 0.05 in Figure 12 would become about 0.005. These results are for an individual worker whose physiological parameters are uncertain, so that her individual levels of PH and HQ are drawn from lognormal uncertainty distributions reflecting plausible inter-individual variability. The IPoC curve can also be reinterpreted as showing the conditional mean value of the population distribution of increased AML risks for each level of benzene concentration in air.

The IPoC and CAS calculations are limited in that they hold for specific (and specified) causal models and input scenarios. The curves in Figures 11 and 12 hold for constant exposure scenarios and for the assumed causal model (in essence, that there is no excess risk of AML without bone marrow toxicity and that there is no bone marrow toxicity without sufficiently high concentrations of PH and HQ), with the only uncertainties being about how much PH and HQ an individual will produce for a given exposure concentration of benzene. If exposures change over time and/or are uncertain, then the IPoC calculations would need to be modified to describe these input scenarios, and additional details and uncertainties (e.g. in the parameters of a PBPK model and the response characteristics of the hematopoietic system) would have to be modeled (e.g. Cox 1996) to obtain fully specified input scenarios and causal models from which to calculate IPoC and CAS values. Similarly, even for constant-exposure scenarios and the causal model we have used, learning specific facts about an individual, such as levels of benzene metabolites in urine following exposure to a known concentration of benzene, could change and improve the IPoC and CAS estimates. Such observations could change the mean and the standard deviation of the uncertainty distribution for PH (and HQ) in response to a given level of benzene exposure, leading to updated IPoC and CAS curves. Finally, the realistic possibility that the estimated parameter values used in the regression models for predicting HQ and PH levels for each level of benzene in air might themselves fluctuate over time has not been captured in the model. However, each of these complications can be addressed by the tools illustrated in the relatively simple case we have considered. The key tools are Monte Carlo uncertainty analysis and use of bounding assumptions to generate conservative distributions for uncertain quantities. Combined with fully specified causal models and input scenarios, these methods can provide plausible upper bounds for IPoC and CAS estimates even for more complicated models.
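As a minimal sketch of such a Monte Carlo uncertainty analysis, the following R code estimates the probability that both PH and HQ exceed their NOAEL levels at a constant exposure of 5 ppm, for the two extreme cases of independent and perfectly correlated inter-individual variation. The bivariate normal form for the log-scale deviations, the sample size, and the function name `joint_exceed` are assumptions made here for illustration.

```r
set.seed(42)
n     <- 200000
X     <- 5                      # air benzene concentration (ppm)
sigma <- log(2.24) / 1.96       # SD of log metabolite level

pred_PH <- function(X) 69.5 + 5293 * X / (107.1 + X)
pred_HQ <- function(X) 6.36 + 376  * X / (50.9  + X)

# Monte Carlo estimate of the joint exceedance probability when the
# log-scale deviations of PH and HQ have correlation rho
# (bivariate normal deviations are an assumed form).
joint_exceed <- function(rho) {
  z1 <- rnorm(n)
  z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)
  ph <- exp(log(pred_PH(X)) + sigma * z1)
  hq <- exp(log(pred_HQ(X)) + sigma * z2)
  mean(ph > pred_PH(10) & hq > pred_HQ(10))
}

p_indep <- joint_exceed(0)   # independence: near the product of the marginals
p_corr  <- joint_exceed(1)   # perfect correlation: near the smaller marginal

cat("Independent:         ", p_indep, "\n")
cat("Perfectly correlated:", p_corr,  "\n")
```

The same sampling scheme could be extended to intermediate correlations or to uncertain exposure inputs by drawing those quantities inside the function.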

II. Does formaldehyde exposure cause AML?

Whether inhalation of formaldehyde (abbreviated FA, also called methanal, HCHO) can increase risk of AML has been debated over the past few decades (Mundt et al. 2018). FA is produced naturally within the body and is normally detoxified by the enzyme alcohol dehydrogenase 5 (ADH5) and protected against by the Fanconi Anemia DNA repair pathway (Reingruber and Pontel 2018). It has an odor threshold of around 0.5 to 1 ppm, typically less than its irritant threshold (Golden 2011).

Qualitative causal dependencies: Causal hypotheses for formaldehyde-induced AML

The biological plausibility of FA as a cause of leukemia was questioned by Heck and Casanova (2004), who noted that “The normal endogenous concentration of formaldehyde in the blood is approximately 0.1 mM in rats, monkeys, and humans” and drew attention to

(1) the failure of inhaled formaldehyde to increase the formaldehyde concentration in the blood of rats, monkeys, or humans exposed to concentrations of 14.4, 6, or 1.9 ppm, respectively; (2) the lack of detectable protein adducts or DNA-protein cross-links (DPX) in the bone marrow of normal rats exposed to [3H]- and [14C]formaldehyde at concentrations as high as 15 ppm; (3) the lack of detectable protein adducts or DPX in the bone marrow of glutathione-depleted (metabolically inhibited) rats exposed to [3H]- and [14C]formaldehyde at concentrations as high as 10 ppm; (4) the lack of detectable DPX in the bone marrow of Rhesus monkeys exposed to [14C]formaldehyde at concentrations as high as 6 ppm; (5) the failure of formaldehyde to induce leukemia in any of seven long-term inhalation bioassays in rats, mice, or hamsters; and (6) the failure of formaldehyde to induce chromosomal aberrations in the bone marrow of rats exposed to airborne concentrations as high as 15 ppm or of mice injected intraperitoneally with formaldehyde at doses as high as 25 mg/kg.

They concluded that “The abundance of negative evidence mentioned above is undisputed and strongly suggests that there is no delivery of inhaled formaldehyde to distant sites. Combined with the fact that formaldehyde naturally occurs throughout the body, and that multiple inhalation bioassays have not induced leukemia in animals, the negative findings provide convincing evidence that formaldehyde is not leukemogenic.” Similarly, Golden et al. (2006) assessed the plausibility of FA as a leukemogen, stating that

Data on benzene and selected chemotherapeutic cancer drugs are… compared and contrasted with the available data on formaldehyde in order to judge whether they fulfill the criteria of biological plausibility that formaldehyde would be capable of inducing leukemia as suggested by the epidemiological data. … There is (1) no evidence to suggest that formaldehyde reaches any target organ beyond the site of administration including the bone marrow, (2) no indication that formaldehyde is toxic to the bone marrow/hematopoietic system in in vivo or in vitro studies, and (3) no credible evidence that formaldehyde induces leukemia in experimental animals.

Some subsequent experimental evidence has confirmed and strengthened these negative findings. For example, Kleinnijenhuis et al. (Citation2013) developed a sensitive test to detect radiolabeled exogenous FA at levels in blood at concentrations down to approximately 1.5% of the endogenous FA concentration and confirmed that inhalation of radiolabeled FA at 10 ppm in air for 6 h did not increase total measured FA concentration in blood. Leng et al. (Citation2019) found that exposures of rats to 1, 30 and 300 ppb of FA for 28 days (6 h/day) by nose-only inhalation, did not change the levels of endogenous formaldehyde-induced DNA adducts or DNA-protein crosslinks.

Unlike the Aksoy study and similar epidemiological studies of occupational exposure to benzene and risk of AML, the most informative occupational cohort studies and a large cancer registry-based occupational case-control study on formaldehyde and AML do not demonstrate excess AML risk (Mundt et al. Citation2021). In a re-analysis of the NCI cohort study of workers employed in six US plants producing formaldehyde and with an adequate number of AML cases, absolute peak exposure (e.g. >4 ppm), cumulative exposure (>2.5 ppm-years), duration of time worked at the highest peak, and time since highest peak exposure all generated no clear or consistent associations with AML (Checkoway et al. Citation2015). Other occupational cohort studies of workers exposed to formaldehyde similarly provide no clear or consistent evidence of increased risks of AML (Hayes et al. Citation1990; Blair et al. Citation2001; Meyers et al. Citation2013; and Saberi Hosnijeh et al. Citation2013). Even in the largest study—a case-control study of 1,201 incident AML cases exposed to formaldehyde (136 above 1.6 ppm)—no statistically significant increased risk was noted (Talibov et al. Citation2014).

In contrast, a much-cited case-control analysis of ML and AML deaths among embalmers and funeral directors reported unstable (due to small numbers) associations with peak exposure categories of <7.0 ppm (for AML OR = 1.8; 95% CI 0.4 to 9.3 (n = 4)) and >9.3 ppm (for AML OR = 2.9; 95% CI 0.7 to 12.5 (n = 7)), respectively. None was statistically significant, and the OR for the middle exposure category was even more unstable (Hauptmann et al. Citation2009). However, shortly after this study was reported, a critique was published demonstrating that the cases that were identified from a larger cohort did not represent a significant excess of ML (and presumably AML) cases (Cole et al. Citation2010). Specifically, the PMR for all myeloid leukemias (n = 29) was 108 (95% CI, 70–156) and for acute myeloid leukemia (n = 20) the PMR was 116 (71–179), neither statistically significant. Given this indication that there was no significant excess occurrence of AML among the group that gave rise to the AML cases, these unstable ORs must be interpreted with extreme caution.
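The PMR arithmetic above can be checked approximately with a short sketch. It assumes the observed deaths follow a Poisson distribution and uses Byar's approximation for the interval; this is one common choice and not necessarily the method used by Cole et al. Expected deaths are back-calculated from the rounded PMR, so the endpoints may differ slightly from the published ones. The function name and arguments are illustrative only.

```python
import math

def pmr_confidence_interval(observed, pmr, z=1.96):
    """Approximate 95% CI for a PMR (in percent) via Byar's approximation.

    Assumption: observed deaths are Poisson; expected deaths are recovered
    from the reported (rounded) PMR as observed / (pmr / 100).
    """
    expected = observed / (pmr / 100.0)
    lower_obs = observed * (
        1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
    ) ** 3
    upper_obs = (observed + 1) * (
        1 - 1 / (9 * (observed + 1)) + z / (3 * math.sqrt(observed + 1))
    ) ** 3
    return 100 * lower_obs / expected, 100 * upper_obs / expected

# Reported values: all myeloid leukemias (n = 29, PMR = 108), AML (n = 20, PMR = 116)
for label, n, pmr in (("ML", 29, 108), ("AML", 20, 116)):
    low, high = pmr_confidence_interval(n, pmr)
    verdict = "includes" if low < 100 < high else "excludes"
    print(f"{label}: PMR {pmr}, approx. 95% CI {low:.0f} to {high:.0f} ({verdict} 100)")
```

An interval that includes 100 corresponds to no statistically significant excess, which is the sense in which neither PMR is significant.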

Overall, the epidemiological literature on occupational exposure to formaldehyde and risk of AML does not demonstrate clear or consistent increased risk of AML (Allegra et al. Citation2019; Mundt et al. Citation2018; Mundt et al. Citation2021), and thus does not contradict what the animal experimental and mechanistic evidence show.

Reviews that critically evaluated epidemiological, toxicological and mechanistic evidence indicate that it is unlikely that inhaled formaldehyde increases the risk of AML (Allegra et al. Citation2019; Mundt et al. Citation2018). The lack of increased formaldehyde or FA metabolites in blood or bone marrow; the absence of observed increased AML risks in animals (mainly mice and rats); and the predominantly negative epidemiological evidence presented in these and other critical reviews and studies provide compelling reasons to conclude that the hypothesis that FA causes AML is not supported. However, regulatory interest in FA has not diminished, and some researchers have responded with theories and conjectures about how FA might cause AML, despite the absence of clear evidence that it does cause AML. Some of the same researchers who proposed, for benzene, the “no side effects” model (in which all observed changes following benzene exposure are interpreted as contributing to causation of leukemia, even if they occur in biologically irrelevant cells such as terminally differentiated peripheral white blood cells) also conjectured that FA might be a human leukemogen at occupational exposure levels. For example, Zhang et al. (Citation2009) stated that “We hypothesize that formaldehyde may act on bone marrow directly or, alternatively, may cause leukemia by damaging the hematopoietic stem or early progenitor cells that are located in the circulating blood or nasal passages, which then travel to the bone marrow and become leukemic stem cells.”

How this transformation would occur and how to reconcile this hypothesis with the negative evidence of AML previously described were not explained, given the fact that formaldehyde is not systemically distributed and that leukemia is not observed in FA-exposed animals at any concentration. However, claims that FA causes hematotoxicity in animals and humans—including in bone marrow—soon appeared, with reactive oxygen species (ROS) and oxidative stress (OS) commonly proposed as possible mechanisms for the hypothesized leukemic effect of FA. For example, a paper by Zhang et al. (Citation2010) reported:

“We examined the ability of formaldehyde to disrupt hematopoiesis in a study of 94 workers in China (43 exposed to formaldehyde and 51 frequency-matched controls) by measuring complete blood counts and peripheral stem/progenitor cell colony formation. Further, myeloid progenitor cells, the target for leukemogenesis, were cultured from the workers to quantify the level of leukemia-specific chromosome changes, including monosomy 7 and trisomy 8, in metaphase spreads of these cells. Among exposed workers, peripheral blood cell counts were significantly lowered in a manner consistent with toxic effects on the bone marrow and leukemia-specific chromosome changes were significantly elevated in myeloid blood progenitor cells. These findings suggest that formaldehyde exposure can have an adverse effect on the hematopoietic system and that leukemia induction by formaldehyde is biologically plausible, which heightens concerns about its leukemogenic potential from occupational and environmental exposures.”

Mundt et al. (Citation2017) obtained most of the original data from this study via a National Cancer Institute Data Transfer Agreement and reanalyzed the reported findings based on individual workers’ measured average formaldehyde levels. Although the actual three measured formaldehyde exposures obtained on each worker in the study were not provided, their average was. This provided an opportunity to statistically evaluate and report possible relationships between the blood and chromosomal measures and level of exposure among the exposed part of the cohort, something the original Zhang et al. (Citation2010) publication failed to do. Mundt et al. highlighted the fact that what Zhang et al. described as “changes” in the cells of workers exposed to FA were in fact not changes, as the study was a simple cross-sectional analysis, but rather differences from the average values observed in a non-exposed group of workers. Such differences in average values of hematopoietic parameters across groups can and do occur for many reasons, so describing them as changes caused by exposure goes beyond what the data show. Doing so fails to preserve the “basic distinction” between differences and changes that is essential for valid causal analysis and interpretation of data (Pearl Citation2009). Mundt et al. also demonstrated that none of the reported hematological parameters was correlated with individual-level FA exposure estimates. They also noted that there was a clear “lack of evidence that group differences in aneuploidy are significant to leukemogenesis” even if such genetic markers were on the causal pathway between formaldehyde exposure and AML and not side effects.

In response, Rothman, Zhang et al. (Citation2018) stated, based on observational and not interventional evidence, that “Aneuploidy of specific chromosomes is clearly an important mechanism of leukemia induction based on its presence in many cases of myeloid neoplasms, including acute myeloid leukemia (AML) and myelodysplastic syndromes.” This confuses association (“presence in many cases”) with causation (“an important mechanism of leukemia induction”) in a way that is consistent with the “no side-effects” conceptualization of leukemogenesis, i.e. the implicit assumption that the relevant causal DAG model is exposure → aneuploidy → leukemia rather than leukemia ← exposure → aneuploidy. However, Rothman et al. go on to agree with Mundt et al. that “There is, however, no direct evidence that higher aneuploidy rates in cultured myeloid progenitor cells are related to future risk of leukemia.” They then clarify their claim of biological plausibility for an FA-AML exposure-response relationship as follows: “As benzene is an established leukemogen, a known inducer of aneuploidy, and was associated with higher rates of monosomy 7 in the cultured myeloid progenitor cells of exposed workers, we reasoned that showing a similar association for workers exposed to formaldehyde supports the biological plausibility that formaldehyde causes myeloid leukemia.” This makes explicit the conflation of association with causation in claiming biological plausibility for an FA-AML causal relationship. The claim is based on an association for benzene, rather than on a demonstration that changes in FA cause, or plausibly could cause, changes in AML risk, and implies that benzene and FA would have similar ranges of IPoC estimates for occupational exposure to each.
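The difference between the two DAG structures can be illustrated with a small exact calculation. The sketch below uses binary variables and purely hypothetical probabilities (none are estimates from any study): in both structures aneuploidy is associated with leukemia in observational data, but only in the chain does intervening on aneuploidy change leukemia risk.

```python
from itertools import product

# Hypothetical, illustrative probabilities (not estimates from any data set)
P_EXPOSED = {1: 0.5, 0: 0.5}
P_ANEUPLOIDY_GIVEN_EXPOSURE = {1: 0.6, 0: 0.1}
P_LEUKEMIA_GIVEN_PARENT = {1: 0.30, 0: 0.05}

def joint(structure, do_a=None):
    """Enumerate (exposure, aneuploidy, leukemia, probability).

    structure = "chain": exposure -> aneuploidy -> leukemia
    structure = "fork":  leukemia <- exposure -> aneuploidy
    do_a, if given, fixes aneuploidy by intervention (cutting its incoming arrow).
    """
    rows = []
    for e, a, l in product((0, 1), repeat=3):
        p = P_EXPOSED[e]
        if do_a is None:
            p_a1 = P_ANEUPLOIDY_GIVEN_EXPOSURE[e]
            p *= p_a1 if a else 1 - p_a1
        else:
            p *= 1.0 if a == do_a else 0.0
        parent = a if structure == "chain" else e
        p_l1 = P_LEUKEMIA_GIVEN_PARENT[parent]
        p *= p_l1 if l else 1 - p_l1
        rows.append((e, a, l, p))
    return rows

def risk_given_a(structure, a_value, do=False):
    """P(leukemia | aneuploidy = a_value), observed or under do(aneuploidy)."""
    rows = joint(structure, do_a=a_value if do else None)
    numerator = sum(p for e, a, l, p in rows if a == a_value and l == 1)
    denominator = sum(p for e, a, l, p in rows if a == a_value)
    return numerator / denominator

for structure in ("chain", "fork"):
    observed = risk_given_a(structure, 1) - risk_given_a(structure, 0)
    caused = risk_given_a(structure, 1, do=True) - risk_given_a(structure, 0, do=True)
    print(f"{structure}: observed difference {observed:.3f}, "
          f"interventional difference {caused:.3f}")
```

Under these assumed numbers, the observed risk difference is positive in both structures, so an association alone cannot distinguish a mechanism from a side effect.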

Even for benzene, the specific changes that Zhang et al. refer to as “leukemia-related aneuploidies” (specifically, monosomy 7 and trisomy 8) are not leukemia-related, insofar as they occur in terminally differentiated peripheral blood cells of exposed workers that are incapable of becoming leukemic. Moreover, as pointed out by Kerzic and Irons (Citation2017), these specific chromosomal changes are commonly found in the peripheral blood cells of healthy workers exposed to benzene, but not in patients with AML following benzene exposure; thus, they may serve as markers of exposure to benzene rather than as indicators of AML risk. That is, they may be what we have termed side effects of benzene exposure rather than causes of benzene-associated AML. They may be merely associative effects, but not key events in a mode of action analysis. Gentry et al. (Citation2013) discussed related points, including that

“[T]he assays used (CFU-GM) [by Zhang et al.] do not actually measure the proposed events in primitive cells involved in the development of acute myeloid leukemia. Evaluation of these data indicates that the aneuploidy measured could not have arisen in vivo, but rather arose during in vitro culture. The results of our critical review and reanalysis of the data, in combination with recent toxicological and mechanistic studies, do not support a mechanism for a causal association between formaldehyde exposure and myeloid or lymphoid malignancies.”

Despite these methodological and interpretational flaws, the Zhang et al. study proved influential (e.g. in the deliberations and conclusions of IARC’s Monograph 100 F committee, which admittedly did not have access to the updated analysis of the blood measures and aneuploidies by formaldehyde exposure measurements) and the notion that there might be a biological basis for believing that FA causes AML (or perhaps other leukemias) propagated, notwithstanding the lack of any empirically demonstrated causal link with AML from any line of scientific inquiry. For example, in reviewing the literature in 2021, Kang et al. stated:

“Through the literature-based network approach, we summarized qualitative associations between formaldehyde exposure and leukemia. Our results indicate that oxidative stress-mediated genetic changes induced by formaldehyde could disturb the hematopoietic system, possibly leading to leukemia. Furthermore, we suggested major genes that are thought to be affected by formaldehyde exposure and associated with leukemia development. Our suggestions can be used to complement experimental data for understanding and identifying the leukemogenic mechanism of formaldehyde.”

As a caveat for this association-based approach, they added that “To better understand the leukemogenicity of formaldehyde, reproducible experiments that determine the causality are needed” (Kang et al. Citation2021). This is an important caveat, insofar as oxidative damage is often a secondary effect of the actual toxic effect (e.g. in inflammation-mediated diseases) and because multiple double-blind clinical trials that have evaluated antioxidants as chemopreventive agents for various types of cancer have found instead that cancer risk increases with antioxidant dose (e.g. Goodman et al. Citation2004).

The suggestion that oxidative stress might provide the missing link between FA exposure and hypothesized increases in AML risk was advanced by Zhang and coworkers (again despite the fact that their cross-sectional study was incapable of demonstrating any changes in any parameter) in the following terms (Wei et al. Citation2017):

“Although FA did not induce leukemia in mice or rats exposed to life-long high levels of FA in a much earlier study (Kerns et al. Citation1983), we have recently shown that relatively low levels of FA induces toxic effects in BM [bone marrow] of mice exposed by short-term nose-only inhalation… Oxidative stress has been shown to occur in multiple tissues in FA-exposed rats and mice… and it is a proposed mechanism of leukemogenesis induced by the leukemogen benzene (McHale et al. Citation2012). … Our findings suggest that FA may induce BM toxicity by affecting myeloid progenitor growth and survival through oxidative damage. …Oxidative stress, apoptosis and dysregulation of CSF receptors are potential underlying mechanisms of FA-induced hematopoietic stem/progenitor toxicity.”

Likewise, Bernardini et al. (Citation2020) presented evidence of increased oxidative stress in the lung and inflammatory cells in bronchoalveolar lavage and liver and increased micronucleus (MN) frequency in bone marrow cells.

Accepting at face value reports that FA-exposed mice show bone marrow (BM) toxicity—specifically, increased MN frequency (Allegra et al. Citation2019, ) and decreased nucleated cell counts at 80 mg/m3 of FA in air (Yu et al. Citation2014)—highlights the puzzle of how even very high concentrations of FA could cause these effects in the bone marrow if they do not affect FA concentration in blood (Heck and Casanova Citation2004) and do not cause FA-derived DNA adducts in bone marrow (Allegra et al. Citation2019) or increase the rates of FA-induced DNA adducts or DNA-protein crosslinks above those explained by endogenous FA alone (Swenberg et al. Citation2011; Leng et al. Citation2019). Further consideration of endogenously produced formaldehyde (which in humans and many other mammals, despite a 1–1.5 min half-life, results in steady state blood concentrations exceeding 2 mg/L) (European Food Safety Authority Citation2014) as a cause of AML is beyond the scope of our illustration.

We propose that an overlooked plausible explanation for observed changes in animals experimentally highly exposed to formaldehyde is psychological stress from being experimented on. It is well established that psychological stress (e.g. from having freedom of movement restricted and being forced to breathe highly irritating concentrations of formaldehyde for long periods) induces the types of changes being attributed to FA exposure, including increased MN, chromosomal aberrations, and genotoxic damage in bone marrow cells of rodents; increased ROS production and oxidative stress in many tissues; and hematological changes, including changes in both peripheral blood cell counts and bone marrow cell count (e.g. Oishi et al. Citation1999; Chung et al. Citation2010; Samarghandian et al. Citation2017; Shcherbinina et al. Citation2021). Forcing rodents for weeks to inhale concentrations of FA orders of magnitude higher than the odor and irritation thresholds (especially via nose-only systems that restrain freedom of motion) is a plausible source of psychological stress (Wong Citation2007), and such stress has demonstrable physiological consequences. For example, nose-only exposures even to clean air (zero concentration of FA) induce significant changes in gene expression for immune response, apoptosis, and signal transduction (Li et al. Citation2013). Recent papers attributing oxidative stress and other effects in bone marrow to FA exposure (e.g. Wei et al. Citation2017; Bernardini et al. Citation2020) have not controlled for confounding by experiment-induced stress. The possibility that FA-associated changes are actually caused by experiment-induced stress could also explain the lack of observed dependence between these responses and FA exposure levels among workers exposed to formaldehyde (Mundt et al. Citation2017).

This qualitative causal hypothesis can be expressed via the following causal DAG: FA exposure ← experimental conditions → psychological stress → responses (oxidative stress, MN, etc.)

No arrows to AML are shown because no observations have demonstrated that AML probability is changed by exposure to FA. This model implies that responses such as oxidative stress, MN in bone marrow, etc. are conditionally independent of the quantitative amount of FA exposure, given the experimental conditions (e.g. forced inhalation of high concentration of irritant while being held immobile). It is perhaps worth noting that the traditional Bradford Hill considerations (of strong, consistent, biologically plausible-seeming, temporal, etc. associations), as well as more recent weight-of-evidence (WOE) systems based on similar heuristic considerations, do not address or refute the possibility of alternative mechanistic explanations for observed associations. Nor do they make use of conditional independence concepts or tests to ascertain whether data are consistent with hypothesized interventional causal interpretations of exposure-response associations. Modern causal methods that do consider alternative possible explanations (i.e. causal graph structures consistent with observations) and conditional independence tests may therefore contribute additional insights and clearer determinations about whether data provide evidence for possible interventional causal relationships than the Bradford Hill or WOE considerations alone (Cox Citation2018).
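The conditional independence implied by this hypothesized model can be illustrated with a small simulation. The sketch below is built on assumptions, not data: experimental conditions (restraint in a nose-only system) determine both whether an animal receives FA and how stressed it is, and the response depends on stress alone. All variable names, distributions, and effect sizes are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_animals(n=20_000, seed=42):
    """Simulate the assumed DAG: FA <- conditions -> stress -> response."""
    rng = random.Random(seed)
    animals = []
    for _ in range(n):
        restrained = rng.random() < 0.5
        # Only restrained animals receive FA (0 ppm = clean-air nose-only control)
        fa_ppm = rng.uniform(0.0, 15.0) if restrained else 0.0
        # Stress depends on the experimental conditions, not on the FA level
        stress = (2.0 if restrained else 0.0) + rng.gauss(0.0, 1.0)
        # Response (e.g. an oxidative stress marker) depends on stress only
        response = stress + rng.gauss(0.0, 1.0)
        animals.append((restrained, fa_ppm, response))
    return animals

animals = simulate_animals()
pooled = pearson([a[1] for a in animals], [a[2] for a in animals])
restrained_only = [a for a in animals if a[0]]
within = pearson([a[1] for a in restrained_only], [a[2] for a in restrained_only])
print(f"pooled correlation of FA with response: {pooled:.2f}")
print(f"correlation among restrained animals only: {within:.2f}")
```

In this sketch, FA level and response are clearly correlated when all animals are pooled, yet nearly uncorrelated once the experimental conditions are held fixed, which is the pattern a conditional independence test would look for.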

III. Discussion of results for formaldehyde compared to benzene

It has been recognized for many years that causal analysis of potential exposure-response relationships is very different for FA and benzene (Golden et al. Citation2006). For benzene, the hypothesis that benzene causes AML was anchored by the pioneering work of Aksoy et al. (Citation1974), demonstrating that sufficiently high and prolonged exposures to benzene, on the order of several hundred ppm for decades, clearly increase the risk of AML. Subsequent experimental research into the underlying mechanisms now helps explain the observed epidemiological association. First, by contrast, no similar clear increase in AML risk has been found for any level of FA exposure; indeed, most of the epidemiological evidence of an FA-AML association is negative (Allegra et al. Citation2019; Mundt et al. Citation2018; Collins and Lineker Citation2004). Second, it is clear that whereas benzene is systemically distributed, formaldehyde does not pass beyond the portal of entry (typically the nose) and cannot reach the bone marrow. Third, it is well established that benzene is metabolized in the liver (Phase 1) and bone marrow (Phase 2) and that levels of benzene and its metabolites are elevated in blood, urine, and bone marrow during and following benzene exposure. By contrast, no elevation of FA or its metabolites is found in blood or bone marrow (or anywhere distal to the portal of entry) during and following FA exposures in animals or people (e.g. Kleinnijenhuis et al. Citation2013). 
Finally, although oxidative stress, increased ROS production, immune effects, changes in apoptosis, chromosomal aberrations, and other hypothesized mechanisms have been propounded (largely by a single research group) for both FA and low levels of benzene to explain how they might cause AML, no empirical evidence of these being true for formaldehyde has been advanced, and no actual causation or clearly increased probability of AML has been observed in animals or humans for either FA or for low levels of benzene exposure. These hypothesized mechanisms are better interpreted from the perspective of the proposed causal models as showing side effects or markers of benzene exposure (e.g. Kerzic and Irons Citation2017) and FA exposure than as showing critical events along causal paths to AML. The extent to which FA-associated changes are caused by the stress elicited by experimental conditions, rather than by FA itself, remains to be determined, but we believe that the spectrum of changes associated with FA exposure in recent literature (e.g. Kang et al. Citation2021) is precisely what one might expect to see if FA has no causal relationship to AML risk. The epidemiological evidence furthermore provides no clear support for a conclusion that FA causes AML (Allegra et al. Citation2019; Mundt et al. Citation2018; Collins and Lineker Citation2004), or for that matter, any lymphohematopoietic malignancy.

Part of the causal network leading from benzene exposure to increased AML risk has been elucidated and is widely accepted. This part corresponds to the following causal chain: benzene exposure → metabolism → toxic metabolites in bone marrow → increased risk of AML

Previous sections argued that this fragment of an (as-yet unknown) completely elucidated causal network, perhaps similar to , together with quantitative estimates of the levels of PH and HQ required to cause bone marrow toxicity and the population distribution of levels of these metabolites produced by metabolism of different levels of benzene exposure concentration, suffices to put plausible upper bounds on increases in AML risk caused by benzene exposures and preventable by interventions that reduce them ( and ).

By contrast, no similar causal chain has been established for FA. Indeed, the body of toxicological, mechanistic and epidemiological evidence in many respects argues against some of the necessary links in the hypothesized causal chains (Allegra et al. Citation2019). It seems consistent with what is currently known that a DAG model such as the following might explain the observed animal experimental data for FA: AML ← co-exposures → FA exposure ← experimental conditions → psychological stress → responses (oxidative stress, MN, etc.)

Here, the co-exposures refer to epidemiological data; experimental conditions refer to animal data; and responses such as oxidative stress, MN, and chromosomal aberrations occur in both people and animals under a wide range of conditions, including as consequences rather than only as causes of AML.

As noted above, many causal discovery and causal model validation algorithms developed and applied in epidemiology test whether the conditional independence relationships implied by a DAG model are consistent with those found in the observed data that they are hypothesized to explain (Runge et al. Citation2019). For example, the simple causal model X → Y → Z, where X = formaldehyde exposure, Y = internal dose, and Z = AML risk, implies that if X is changed but Y does not change (e.g. if an exposure such as FA is not systemically distributed and cannot reach the BM), then Z should not respond to the changes in X: the DAG implies that Z is conditionally independent of X, given Y. This appears to be the case for FA. If it is, then the IPoC for interventions that reduce FA exposures in an effort to reduce AML risk is zero: there is no known causal path leading from changes in exogenous FA to changes in AML risk.
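A minimal numerical sketch of this argument follows, assuming binary internal dose Y and binary disease outcome Z and using hypothetical conditional probabilities (none taken from the studies cited). For the chain X → Y → Z with no confounding, the disease probability under an intervention on X is obtained by summing over Y, and the IPoC is the difference between two such probabilities.

```python
def disease_risk(x, p_y1_given_x, p_z1_given_y):
    """P(Z = 1 | do(X = x)) for the chain X -> Y -> Z with binary Y and Z."""
    p_y1 = p_y1_given_x[x]
    return p_y1 * p_z1_given_y[1] + (1 - p_y1) * p_z1_given_y[0]

def ipoc(x_before, x_after, p_y1_given_x, p_z1_given_y):
    """Change in disease probability caused by moving X from x_before to x_after."""
    return (disease_risk(x_after, p_y1_given_x, p_z1_given_y)
            - disease_risk(x_before, p_y1_given_x, p_z1_given_y))

# Hypothetical conditional probability tables, for illustration only
P_Z1_GIVEN_Y = {0: 0.004, 1: 0.010}  # disease risk by internal dose

# Agent that is systemically distributed: internal dose responds to exposure
P_Y1_DISTRIBUTED = {"high": 0.30, "low": 0.05}
# Agent that does not reach the target tissue: internal dose is unaffected
P_Y1_NOT_DISTRIBUTED = {"high": 0.05, "low": 0.05}

print(ipoc("high", "low", P_Y1_DISTRIBUTED, P_Z1_GIVEN_Y))      # negative: risk falls
print(ipoc("high", "low", P_Y1_NOT_DISTRIBUTED, P_Z1_GIVEN_Y))  # zero: no change
```

When the distribution of Y does not depend on X, the two risks are identical and the IPoC is exactly zero regardless of how strongly Z depends on Y.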

Conclusions

The previous sections have shown that combining causal knowledge and conservative assumptions based on multiple types and sources of data gives plausible upper bounds on IPoC and CAS values for realistic benzene exposure scenarios (e.g. concentrations below 5 ppm) and AML, whereas realistic formaldehyde exposure scenarios have no causal connection to AML. This provides information useful for differentiating between relationships that are improbably causal (e.g. formaldehyde and AML) and those that likely are (e.g. high, prolonged benzene exposures and AML), as well as for understanding the distribution of impacts across individuals in a heterogeneous population.

For benzene and AML, contrary to previous negative theoretical general results for “probability of causation” from observational data, we have illustrated how usefully small quantitative plausible upper bounds for IPoC can be developed from a partially elucidated, but fully specified, causal model. These plausible upper bounds on IPoC and CAS values for realistic exposure scenarios (e.g. concentrations below 5 ppm) indicate that benzene is unlikely to be a preventable cause of most (e.g. 95% or more) of the AML cases in workers with these levels of exposures. This conclusion follows from the several elements used in the calculation around measurement concentrations and metabolite variability. The calculated IPoC and CAS curves such as those in and only attempt to provide useful bounds on how large the risks preventable by interventions might be based on current knowledge. Improving these bounds requires introducing and justifying more accurate modeling assumptions and approximations, or new data. The curves for plausible upper-bound estimates of CAS for AML as a function of benzene concentration in air are limited in that they hold only for constant exposure scenarios and for the assumed causal model (i.e. no excess risk of AML without bone marrow toxicity and no bone marrow toxicity without sufficiently high concentrations of PH and HQ), with the only uncertainties being about how much PH and HQ an individual will produce for a given exposure concentration. For formaldehyde and AML, by contrast, the situation is qualitatively different, with no evidence identified for a confirmed causal pathway leading from exposure to an increased risk of AML.

The interventional probability of causation (IPoC) and causal assigned share (CAS) methods developed and illustrated here are intended to help risk analysis practitioners base causal evaluations on interventional causation concepts and principles (that is, on data-driven estimates of how much change in risk would be caused by changing exposure) rather than on interpreting associations or making hard-to-verify judgments about risk attribution. We have proposed an approach to quantifying the IPoC and CAS that makes use of available, realistically limited, mechanistic knowledge and evidence, and that allows and invites improvement with new information as it becomes available. We hope that other practitioners will find these ideas and methods useful, improve them, and apply them to other chemicals and diseases where estimating interventional causal impacts can help to guide improved risk management decisions.

Abbreviations
AML = acute myeloid leukemia
ANNL = acute non-lymphoblastic leukemia
AS = assigned share
BM = bone marrow
BN = Bayesian network
BO = benzene oxide
BQ = benzoquinone
CAI = causal artificial intelligence
CAM = Causal Additive Model
CART = classification and regression tree
CAS = causal assigned share
CAT = catechol
CDF = cumulative distribution function
CFU-GM = granulocyte-macrophage colony-forming units
CPT = conditional probability table
CYP2E1 = cytochrome P450 2E1
DAG = directed acyclic graph
DPX = DNA-protein cross-links
FA = formaldehyde
FCI = Fast Causal Inference
GES = Greedy Equivalence Search
GSH = glutathione
HQ = hydroquinone
HSC = hematopoietic stem cell
IARC = International Agency for Research on Cancer
ICP = invariant causal prediction
IPoC = interventional probability of causation
LiNGAM = Linear Non-Gaussian Acyclic Model
LSC = leukemia stem cell
MDS = myelodysplastic syndrome
ML = myeloid leukemias
MN = micronucleus
NOAEL = no observed adverse effect level
OS = oxidative stress
PC = probability of causation
PC Algorithm = Peter-Clark Algorithm
PH = phenol
ROS = reactive oxygen species
SPMA = S-phenylmercapturic acid
TSCE = two-stage clonal expansion

Acknowledgments

The authors gratefully acknowledge the close reading and helpful comments of two anonymous reviewers. One reviewer, in particular, suggested adding comments and references on causal relevance of animal models and of various proposed modes of actions and biological mechanisms. We are grateful for these very substantive comments, which improved our final exposition.

Declaration of interest

This research was sponsored in part by a grant to Cox Associates, LLC from the Center for Truth in Science (https://truthinscience.org/about/mission/), a 501(c)3 nonprofit organization. The authors were paid by Cox Associates, LLC. The sponsors had no role in the design, conduct, analysis, interpretation or reporting, nor were reviewer comments or legal reviews invited or received. The methods and conclusions are solely those of the authors. The Center for Truth in Science states that “The Center commissions research projects conducted by independent scientists without political, cultural, technical, or ideological bias. The Center contributes to a healthy and balanced system in which judicial and regulatory decisions are based on objective, unbiased, sound, and comprehensive analyses of scientific evidence.” As part of this commitment, the authors do not know who funded the Center for this work or what they expected from our research. KAM and WJT have previously been retained as expert witnesses on behalf of defendants in litigation matters in which it has been alleged that benzene or formaldehyde caused various cancers. The authors have also served as consultants providing scientific advice to various corporations, law firms, or scientific/professional organizations. LAC offered oral comments to the National Academies in December, 2022 calling for better use of formal causal analysis methods in risk assessment of formaldehyde (https://downloads.regulations.gov/EPA-HQ-OPPT-2018-0438-0108/attachment_19.pdf). WJT also offered oral comments to the National Academies in December, 2022 calling for a closer evaluation of the epidemiological evidence, and potential biases, on formaldehyde and myeloid leukemias (https://www.youtube.com/watch?v=w8vXCJi02FY). 
KAM on several occasions has presented oral and written public comments to EPA IRIS throughout the evolution of the "IRIS Toxicological Review of Formaldehyde-Inhalation (USEPA Citation2022)" including to the associated NAS peer-review committee, as well as written comments to the French Agency for Food, Environmental and Occupational Health & Safety (ANSES) regarding ANSES Opinion under "Request No. 2021-SA-0031 'Occupational Disease -Formaldehyde and myeloid leukaemias.'” He was nominated to serve and at the time of this submission remained on the candidate list of "Experts Being Considered for Participation as Ad Hoc Peer Reviewers; Docket ID EPA-HQOPPT-2023-0613" specifically to serve as an ad hoc peer reviewer assisting EPA’s Science Advisory Committee on Chemicals (SACC) with their evaluation of the latest draft IRIS formaldehyde review. LAC, KAM, and WJT have not provided public comments regarding benzene regulation.

References

  • Agler R, De Boeck P. 2017. On the interpretation and use of mediation: multiple perspectives on mediation analysis. Front Psychol. 8:1984. doi: 10.3389/fpsyg.2017.01984.
  • Aksoy M, Erdem Ş, Dinçol G. 1974. Leukemia in shoe-workers exposed chronically to benzene. Blood. 44(6):837–841. doi: 10.1182/blood.V44.6.837.837.
  • Allegra A, Spatari G, Mattioli S, Curti S, Innao V, Ettari R, Allegra AG, Giorgianni C, Gangemi S, Musolino C. 2019. Formaldehyde exposure and acute myeloid leukemia: a review of the literature. Medicina (Kaunas). 55(10):638. doi: 10.3390/medicina55100638.
  • Álvarez-González B, Porras-Quesada P, Arenas-Rodríguez V, Tamayo-Gómez A, Vázquez-Alonso F, Martínez-González LJ, Hernández AF, Álvarez-Cubero MJ. 2023. Genetic variants of antioxidant and xenobiotic metabolizing enzymes and their association with prostate cancer: a meta-analysis and functional in silico analysis. Sci Total Environ. 898:165530. doi: 10.1016/j.scitotenv.2023.165530.
  • Andersen ME, Lutz RW, Liao KH, Lutz WK. 2006. Dose-incidence modeling: consequences of linking quantal measures of response to depletion of critical tissue targets. Toxicol Sci. 89(1):331–337. doi: 10.1093/toxsci/kfj024.
  • Andersen SK, Olesen KG, Jensen FV, Jensen F. 1989. HUGIN—A shell for building Bayesian belief universes for expert systems. Proceedings of the 11th International Joint Conference on Artificial Intelligence. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. Vol. 2, pp. 1080–1085.
  • Beane Freeman LE, Blair A, Lubin JH, Stewart PA, Hayes RB, Hoover RN, Hauptmann M. 2009. Mortality from lymphohematopoietic malignancies among workers in formaldehyde industries: the National Cancer Institute Cohort. J Natl Cancer Inst. 101(10):751–761. doi: 10.1093/jnci/djp096.
  • Bernardini L, Barbosa E, Charão MF, Goethel G, Muller D, Bau C, Steffens NA, Santos Stein C, Moresco RN, Garcia SC, et al. 2020. Oxidative damage, inflammation, genotoxic effect, and global DNA methylation caused by inhalation of formaldehyde and the purpose of melatonin. Toxicol Res (Camb). 9(6):778–789. doi: 10.1093/toxres/tfaa079.
  • Bielza C, Larrañaga P. 2014. Bayesian networks in neuroscience: a survey. Front Comput Neurosci. 8:131. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199264/. doi: 10.3389/fncom.2014.00131.
  • Blair A. 2023. Mediation and moderation. https://ademos.people.uic.edu/Chapter14.html.
  • Blair A, Zheng T, Linos A, Stewart PA, Zhang YW, Cantor KP. 2001. Occupation and leukemia: a population-based case-control study in Iowa and Minnesota. Am J Ind Med. 40(1):3–14. doi: 10.1002/ajim.1066.
  • Boobis AR, Cohen SM, Dellarco V, McGregor D, Meek ME, Vickers C, Willcocks D, Farland W. 2006. IPCS framework for analyzing the relevance of a cancer mode of action for humans. Crit Rev Toxicol. 36(10):781–792. doi: 10.1080/10408440600977677.
  • Brown T, Dassonville C, Derbez M, Ramalho O, Kirchner S, Crump D, Mandin C. 2015. Relationships between socioeconomic and lifestyle factors and indoor air quality in French dwellings. Environ Res. 140:385–396. doi: 10.1016/j.envres.2015.04.012.
  • Checkoway H, Dell LD, Boffetta P, Gallagher AE, Crawford L, Lees PS, Mundt KA. 2015. Formaldehyde exposure and mortality risks from acute myeloid leukemia and other lymphohematopoietic malignancies in the US National Cancer Institute cohort study of workers in formaldehyde industries. J Occup Environ Med. 57(7):785–794. doi: 10.1097/JOM.0000000000000466.
  • Chen J, Xiao Q, Li X, Liu R, Long X, Liu Z, Xiong H, Li Y. 2022. The correlation of leukocyte-specific protein 1 (LSP1) rs3817198(T > C) polymorphism with breast cancer: a meta-analysis. Medicine (Baltimore). 101(45):e31548. doi: 10.1097/MD.0000000000031548.
  • Chung IM, Kim YM, Yoo MH, Shin MK, Kim CK, Suh SH. 2010. Immobilization stress induces endothelial dysfunction by oxidative stress via the activation of the angiotensin II/its type I receptor pathway. Atherosclerosis. 213(1):109–114. doi: 10.1016/j.atherosclerosis.2010.08.052.
  • Cole P, Adami HO, Trichopoulos D, Mandel J. 2010. Formaldehyde and lymphohematopoietic cancers: a review of two recent studies. Regul Toxicol Pharmacol. 58(2):161–166. doi: 10.1016/j.yrtph.2010.08.013.
  • Collins JJ, Ireland B, Buckley CF, Shepperly D. 2003. Lymphohaematopoeitic cancer mortality among workers with benzene exposure. Occup Environ Med. 60(9):676–679. doi: 10.1136/oem.60.9.676.
  • Collins JJ, Lineker GA. 2004. A review and meta-analysis of formaldehyde exposure and leukemia. Regul Toxicol Pharmacol. 40(2):81–91. doi: 10.1016/j.yrtph.2004.04.006.
  • Cox LA. Jr. 1996. Reassessing benzene risks using internal doses and Monte-Carlo uncertainty analysis. Environ Health Perspect. 104 Suppl 6(Suppl 6):1413–1429. doi: 10.1289/ehp.961041413.
  • Cox LA. Jr. 1999. A biomathematical model of hematotoxicity. Environ Int. 25(6–7):805–817. doi: 10.1016/S0160-4120(99)00055-0.
  • Cox LA. Jr. 2000. A biomathematical model of cyclophosphamide hematotoxicity. J Toxicol Environ Health A. 61(5-6):501–510. doi: 10.1080/00984100050166550.
  • Cox LA. Jr. 2006. Quantifying potential health impacts of cadmium in cigarettes on smoker risk of lung cancer: a portfolio-of-mechanisms approach. Risk Anal. 26(6):1581–1599. doi: 10.1111/j.1539-6924.2006.00848.x.
  • Cox LA. Jr. 2018. Modernizing the Bradford Hill criteria for assessing causal relationships in observational data. Crit Rev Toxicol. 48(8):682–712. doi: 10.1080/10408444.2018.1518404.
  • Cox LA, Jr, Ketelslegers HB, Lewis RJ. 2021. The shape of low-concentration dose-response functions for benzene: implications for human health risk assessment. Crit Rev Toxicol. 51(2):95–116. doi: 10.1080/10408444.2020.1860903.
  • Daniel R, Zhang J, Farewell D. 2021. Making apples from oranges: comparing noncollapsible effect estimators and their standard errors after adjustment for different covariate sets. Biom J. 63(3):528–557. doi: 10.1002/bimj.201900297.
  • Darwiche A. 2001. Recursive conditioning. Artif Intell. 126(1–2):5–41. https://www.sciencedirect.com/science/article/pii/S0004370200000692. doi: 10.1016/S0004-3702(00)00069-2.
  • Dawid P, Humphreys M, Musio M. 2024. Bounding causes of effects with mediators. Sociol Methods Res. 53(1):28–56. doi: 10.1177/00491241211036161.
  • Dechter R. 1999. Bucket elimination: a unifying framework for reasoning. Artif Intell. 113(1–2):41–85. https://www.sciencedirect.com/science/article/pii/S0004370299000594. doi: 10.1016/S0004-3702(99)00059-4.
  • Degtiar I, Rose S. 2023. A review of generalizability and transportability. Annu Rev Stat Appl. 10(1):501–524. doi: 10.1146/annurev-statistics-042522-103837.
  • Dewi R, Hamid ZA, Rajab N, Shuib S, Razak SA. 2020. Genetic, epigenetic, and lineage-directed mechanisms in benzene-induced malignancies and hematotoxicity targeting hematopoietic stem cells niche. Hum Exp Toxicol. 39(5):577–595. doi: 10.1177/0960327119895570.
  • Didelez V, Stensrud MJ. 2022. On the logic of collapsibility for causal effect measures. Biom J. 64(2):235–242. doi: 10.1002/bimj.202000305.
  • European Food Safety Authority. 2014. Endogenous formaldehyde turnover in humans compared with exogenous contribution from food sources. EFSA J. 12(2):3550. doi: 10.2903/j.efsa.2014.3550.
  • Farris GM, Robinson SN, Gaido KW, Wong BA, Wong VA, Hahn WP, Shah RS. 1997. Benzene-induced hematotoxicity and bone marrow compensation in B6C3F1 mice. Fundam Appl Toxicol. 36(2):119–129. doi: 10.1006/faat.1997.2293.
  • Gallegos-Arreola MP, Zúñiga-González GM, Figuera LE, Puebla-Pérez AM, Márquez-Rosales MG, Gómez-Meda BC, Rosales-Reynoso MA. 2022. ESR2 gene variants (rs1256049, rs4986938, and rs1256030) and their association with breast cancer risk. PeerJ. 10:e13379. doi: 10.7717/peerj.13379.
  • Gamella JL, Heinze-Deml C. 2020. Active Invariant Causal Prediction: Experiment Selection through Stability. 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
  • Gentry PR, Rodricks JV, Turnbull D, Bachand A, Van Landingham C, Shipp AM, Albertini RJ, Irons R. 2013. Formaldehyde exposure and leukemia: critical review and reevaluation of the results from a study that is the focus for evidence of biological plausibility. Crit Rev Toxicol. 43(8):661–670. doi: 10.3109/10408444.2013.818618.
  • Glasgow G, Ramkrishnan B, Smith AE. 2022. A simulation-based assessment of the ability to detect thresholds in chronic risk concentration-response functions in the presence of exposure measurement error. PLoS One. 17(3):e0264833. doi: 10.1371/journal.pone.0264833.
  • Glass DC, Gray CN, Jolley DJ, Gibbons C, Sim MR, Fritschi L, Adams GG, Bisby JA, Manuell R. 2003. Leukemia risk associated with low-level benzene exposure. Epidemiology. 14(5):569–577. doi: 10.1097/01.ede.0000082001.05563.e0.
  • Golden R. 2011. Identifying an indoor air exposure limit for formaldehyde considering both irritation and cancer hazards. Crit Rev Toxicol. 41(8):672–721. doi: 10.3109/10408444.2011.573467.
  • Golden R, Pyatt D, Shields PG. 2006. Formaldehyde as a potential human leukemogen: an assessment of biological plausibility. Crit Rev Toxicol. 36(2):135–153. doi: 10.1080/10408440500533208.
  • Goodman GE, Thornquist MD, Balmes J, Cullen MR, Meyskens FL, Jr, Omenn GS, Valanis B, Williams JH. Jr. 2004. The Beta-Carotene and Retinol Efficacy Trial: incidence of lung cancer and cardiovascular disease mortality during 6-year follow-up after stopping beta-carotene and retinol supplements. J Natl Cancer Inst. 96(23):1743–1750. doi: 10.1093/jnci/djh320.
  • Greenland S. 2015. Concepts and pitfalls in measuring and interpreting attributable fractions, prevented fractions, and causation probabilities. Ann Epidemiol. 25(3):155–161. doi: 10.1016/j.annepidem.2014.11.005.
  • Greenland S, Robins JM. 1988. Conceptual problems in the definition and interpretation of attributable fractions. Am J Epidemiol. 128(6):1185–1197. doi: 10.1093/oxfordjournals.aje.a115073.
  • Greenland S, Robins JM, Pearl J. 1999. Confounding and collapsibility in causal inference. Statist Sci. 14(1):29–46. doi: 10.1214/ss/1009211805.
  • Gross SA, Paustenbach DJ. 2018. Shanghai Health Study (2001–2009): what was learned about benzene health effects? Crit Rev Toxicol. 48(3):217–251. doi: 10.1080/10408444.2017.1401581.
  • Guo H, Ahn S, Zhang L. 2020. Benzene-associated immunosuppression and chronic inflammation in humans: a systematic review. Occup Environ Med. 16:oemed-2020-106517. doi: 10.1136/oemed-2020-106517.
  • Guo X, Zhong W, Chen Y, Zhang W, Ren J, Gao A. 2019. Benzene metabolites trigger pyroptosis and contribute to haematotoxicity via TET2 directly regulating the Aim2/Casp1 pathway. EBioMedicine. 47:578–589. doi: 10.1016/j.ebiom.2019.08.056.
  • Haiman CA, Patel YM, Stram DO, Carmella SG, Chen M, Wilkens LR, Le Marchand L, Hecht SS. 2016. Benzene uptake and glutathione S-transferase T1 status as determinants of S-phenylmercapturic acid in cigarette smokers in the multiethnic cohort. PLoS One. 11(3):e0150641. doi: 10.1371/journal.pone.0150641.
  • Hanahan D. 2022. Hallmarks of cancer: new dimensions. Cancer Discov. 12(1):31–46. doi: 10.1158/2159-8290.CD-21-1059.
  • Hanahan D, Weinberg RA. 2011. Hallmarks of cancer: the next generation. Cell. 144(5):646–674. PMID: 21376230. doi: 10.1016/j.cell.2011.02.013.
  • Harlow PH, Perry SJ, Stevens AJ, Flemming AJ. 2018. Comparative metabolism of xenobiotic chemicals by cytochrome P450s in the nematode Caenorhabditis elegans. Sci Rep. 8(1):13333. doi: 10.1038/s41598-018-31215-w.
  • Hauptmann M, Stewart PA, Lubin JH, Beane Freeman LE, Hornung RW, Herrick RF, Hoover RN, Fraumeni JF, Jr, Blair A, Hayes RB. 2009. Mortality from lymphohematopoietic malignancies and brain cancer among embalmers exposed to formaldehyde. J Natl Cancer Inst. 101(24):1696–1708. doi: 10.1093/jnci/djp416.
  • Hayes RB, Blair A, Stewart PA, Herrick RF, Mahar H. 1990. Mortality of U.S. embalmers and funeral directors. Am J Ind Med. 18(6):641–652. doi: 10.1002/ajim.4700180603.
  • Hayes RB, Yin SN, Dosemeci M, Li GL, Wacholder S, Travis LB, Li CY, Rothman N, Hoover RN, Linet MS. 1997. Benzene and the dose-related incidence of hematologic neoplasms in China. Chinese Academy of Preventive Medicine–National Cancer Institute Benzene Study Group. J Natl Cancer Inst. 89(14):1065–1071. doi: 10.1093/jnci/89.14.1065.
  • Heck HD, Casanova M. 2004. The implausibility of leukemia induction by formaldehyde: a critical review of the biological evidence on distant-site toxicity. Regul Toxicol Pharmacol. 40(2):92–106. doi: 10.1016/j.yrtph.2004.05.001.
  • Heinze-Deml C, Maathuis MH, Meinshausen N. 2018. Causal structure learning. Annu Rev Stat Appl. 5(1):371–391. Full text is at https://arxiv.org/pdf/1706.09141.pdf. doi: 10.1146/annurev-statistics-031017-100630.
  • Heinze-Deml C, Meinshausen N. 2020. Package ‘CompareCausalNetworks’. https://cran.r-project.org/web/packages/CompareCausalNetworks/CompareCausalNetworks.pdf.
  • Henderson RF. 1996. Species differences in the metabolism of benzene. Environ Health Perspect. 104 Suppl 6(Suppl 6):1173–1175. doi: 10.1289/ehp.961041173.
  • Hirabayashi Y, Yoon BI, Li GX, Kanno J, Inoue T. 2004. Mechanism of benzene-induced hematotoxicity and leukemogenicity: current review with implication of microarray analyses. Toxicol Pathol. 32 Suppl 2:12–16. doi: 10.1080/01926230490451725.
  • Hole PS, Darley RL, Tonks A. 2011. Do reactive oxygen species play a role in myeloid leukemias? Blood. 117(22):5816–5826. Erratum in: Blood. 2014 Jan 30;123(5):798. doi: 10.1182/blood-2011-01-326025.
  • Hu L, Zhang Y, Miao W, Cheng T. 2019. Reactive oxygen species and Nrf2: functional and transcriptional regulators of hematopoiesis. Oxid Med Cell Longev. 2019:5153268. doi: 10.1155/2019/5153268.
  • Huntington-Klein N. 2022. The effect: an introduction to research design and causality. Boca Raton, FL: CRC Press.
  • IARC. 2012. IARC monographs on the evaluation of carcinogenic risks to humans. Formaldehyde. Volume 100F. Lyon, France: International Agency for Research on Cancer/WHO.
  • IARC. 2018. IARC monographs on the evaluation of carcinogenic risks to humans. Benzene. Volume 120. Lyon, France: International Agency for Research on Cancer/WHO.
  • Imbens GW, Rubin DB. 2015. Causal Inference for Statistics, Social, and Biomedical Sciences: an Introduction. New York, NY: Cambridge University Press.
  • Jacobs B, Kissinger A, Zanasi F. 2019. Causal Inference by String Diagram Surgery. In: Bojańczyk, M., Simpson, A. (eds) Foundations of Software Science and Computation Structures. FoSSaCS 2019. Lecture Notes in Computer Science, vol 11425. Cham: Springer. doi: 10.1007/978-3-030-17127-8_18.
  • Jamall IS, Willhite CC. 2008. Is benzene exposure from gasoline carcinogenic? J Environ Monit. 10(2):176–187. doi: 10.1039/b712987d.
  • Ji Y, Nan Wang Q, Lin X, Qing Suo LJ. 2012. CYP1A1 MspI polymorphisms and lung cancer risk: an updated meta-analysis involving 20,209 subjects. Cytokine. 59(2):324–334. doi: 10.1016/j.cyto.2012.04.027.
  • Johnson GT, Harbison SC, McCluskey JD, Harbison RD. 2009. Characterization of cancer risk from airborne benzene exposure. Regul Toxicol Pharmacol. 55(3):361–366. doi: 10.1016/j.yrtph.2009.08.008.
  • Kanagal-Shamanna R, Zhao W, Vadhan-Raj S, Nguyen MH, Fernandez MH, Medeiros LJ, Bueso-Ramos CE. 2012. Over-expression of CYP2E1 mRNA and protein: implications of xenobiotic induced damage in patients with de novo acute myeloid leukemia with inv(16)(p13.1q22); CBFβ-MYH11. Int J Environ Res Public Health. 9(8):2788–2800. doi: 10.3390/ijerph9082788.
  • Kang SW. 2015. Superoxide dismutase 2 gene and cancer risk: evidence from an updated meta-analysis. Int J Clin Exp Med. 8(9):14647–14655.
  • Kang DS, Kim HS, Jung JH, Lee CM, Ahn YS, Seo YR. 2021. Formaldehyde exposure and leukemia risk: a comprehensive review and network-based toxicogenomic approach. Genes Environ. 43(1):13. doi: 10.1186/s41021-021-00183-5.
  • Kerns WD, Pavkov KL, Donofrio DJ, Gralla EJ, Swenberg JA. 1983. Carcinogenicity of formaldehyde in rats and mice after long-term inhalation exposure. Cancer Res. 43(9):4382–4392.
  • Kerzic PJ, Irons RD. 2017. Distribution of chromosome breakpoints in benzene-exposed and unexposed AML patients. Environ Toxicol Pharmacol. 55:212–216. doi: 10.1016/j.etap.2017.08.033.
  • Khalade A, Jaakkola MS, Pukkala E, Jaakkola JJ. 2010. Exposure to benzene at work and the risk of leukemia: a systematic review and meta-analysis. Environ Health. 9(1):31. doi: 10.1186/1476-069X-9-31.
  • Kleinnijenhuis AJ, Staal YC, Duistermaat E, Engel R, Woutersen RA. 2013. The determination of exogenous formaldehyde in blood of rats during and after inhalation exposure. Food Chem Toxicol. 52:105–112. doi: 10.1016/j.fct.2012.11.008.
  • Kolachana P, Subrahmanyam VV, Meyer KB, Zhang L, Smith MT. 1993. Benzene and its phenolic metabolites produce oxidative DNA damage in HL60 cells in vitro and in the bone marrow in vivo. Cancer Res. 53(5):1023–1026.
  • Koller D, Friedman N. 2009. Probabilistic graphical models: principles and techniques. Cambridge, MA: MIT Press.
  • Kossakowski JJ, Waldorp LJ, van der Maas HLJ. 2021. The search for causality: a comparison of different techniques for causal inference graphs. Psychol Methods. 26(6):719–742. PMID: 34323582. doi: 10.1037/met0000390.
  • Kuchenbaecker KB, Hopper JL, Barnes DR, Phillips K-A, Mooij TM, Roos-Blom M-J, Jervis S, van Leeuwen FE, Milne RL, Andrieu N, BRCA1 and BRCA2 Cohort Consortium., et al. 2017. Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and BRCA2 mutation carriers. JAMA. 317(23):2402–2416. doi: 10.1001/jama.2017.7112.
  • Kumar KV, Goturi A, Nagaraj M, Goud EVSS. 2022. Null genotypes of Glutathione S-transferase M1 and T1 and risk of oral cancer: a meta-analysis. J Oral Maxillofac Pathol. 26(4):592. doi: 10.4103/jomfp.jomfp_435_21.
  • Land M, Gefeller O. 2000. A multiplicative variant of the Shapley value for factorizing the risk of disease. In: Patrone, F., García-Jurado, I., Tijs, S. (eds) Game Practice: Contributions from Applied Game Theory. Theory and Decision Library, vol 23. Springer, Boston, MA. doi: 10.1007/978-1-4615-4627-6_11.
  • Lauritzen SL, Spiegelhalter DJ. 1988. Local computations with probabilities on graphical structures and their application to expert systems. J R Stat Soc Series B Methodological. 50(2):157–194. doi: 10.1111/j.2517-6161.1988.tb01721.x.
  • Legathe A, Hoener BA, Tozer TN. 1994. Pharmacokinetic interaction between benzene metabolites, phenol and hydroquinone, in B6C3F1 mice. Toxicol Appl Pharmacol. 124(1):131–138. doi: 10.1006/taap.1994.1016.
  • Leng J, Liu C-W, Hartwell HJ, Yu R, Lai Y, Bodnar WM, Lu K, Swenberg JA. 2019. Evaluation of inhaled low-dose formaldehyde-induced DNA adducts and DNA-protein cross-links by liquid chromatography-tandem mass spectrometry. Arch Toxicol. 93(3):763–773. doi: 10.1007/s00204-019-02393-x.
  • Li C, Fan X. 2020. On nonparametric conditional independence tests for continuous variables. WIREs Computational Stats. 12(3):e1489. doi: 10.1002/wics.1489.
  • Li M, Li A, He R, Dang W, Liu X, Yang T, Shi P, Bu X, Gao D, Zhang N, et al. 2019. Gene polymorphism of cytochrome P450 significantly affects lung cancer susceptibility. Cancer Med. 8(10):4892–4905. doi: 10.1002/cam4.2367.
  • Li H, Li D, He Z, Fan J, Li Q, Liu X, Guo P, Zhang H, Chen S, Li Q, et al. 2018. The effects of Nrf2 knockout on regulation of benzene-induced mouse hematotoxicity. Toxicol Appl Pharmacol. 358:56–67. doi: 10.1016/j.taap.2018.09.002.
  • Linet MS, Gilbert ES, Vermeulen R, Dores GM, Yin SN, Portengen L, Hayes RB, Ji BT, Lan Q, Li GL, Chinese Center for Disease Control and Prevention–US National Cancer Institute Benzene Study Group., et al. 2019. Benzene exposure response and risk of myeloid neoplasms in Chinese workers: a multicenter case-cohort study. J Natl Cancer Inst. 111(5):465–474. doi: 10.1093/jnci/djy143.
  • Lin W-Y, Fordham SE, Hungate E, Sunter NJ, Elstob C, Xu Y, Park C, Quante A, Strauch K, Gieger C, et al. 2021. Genome-wide association study identifies susceptibility loci for acute myeloid leukemia. Nat Commun. 12(1):6233. doi: 10.1038/s41467-021-26551-x.
  • Li XH, Chen JX, Yue GX, Liu YY, Zhao X, Guo XL, Liu Q, Jiang YM, Bai MH. 2013. Gene expression profile of the hippocampus of rats subjected to chronic immobilization stress. PLoS One. 8(3):e57621. doi: 10.1371/journal.pone.0057621.
  • Lutz WK, Lutz RW, Andersen ME. 2006. Dose-incidence relationships derived from superposition of distributions of individual susceptibility on mechanism-based dose responses for biological effects. Toxicol Sci. 90(1):33–38. doi: 10.1093/toxsci/kfj026.
  • Lutz WK, Lutz RW, Gaylor DW, Conolly RB. 2014. Dose–response relationship and extrapolation in toxicology. Mechanistic and statistical considerations. In: Reichl, FX., Schwenk, M. (eds) Regulatory Toxicology. Berlin/Heidelberg, Germany: Springer. doi: 10.1007/978-3-642-35374-1_72.
  • Maeda TN, Shimizu S. 2021. Causal additive models with unobserved variables. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence. Proceedings of Machine Learning Research. 161:97–106. https://proceedings.mlr.press/v161/maeda21a.html.
  • Marchand T, Pinho S. 2021. Leukemic stem cells: from leukemic niche biology to treatment opportunities. Front Immunol. 12:775128. doi: 10.3389/fimmu.2021.775128.
  • Marrubini G, Castoldi AF, Coccini T, Manzo L. 2003. Prolonged ethanol ingestion enhances benzene myelotoxicity and lowers urinary concentrations of benzene metabolite levels in CD-1 male mice. Toxicol Sci. 75(1):16–24. doi: 10.1093/toxsci/kfg163.
  • Mathialagan RD, Abd Hamid Z, Ng QM, Rajab NF, Shuib S, Binti Abdul Razak SR. 2020. Bone marrow oxidative stress and acquired lineage-specific genotoxicity in hematopoietic stem/progenitor cells exposed to 1,4-benzoquinone. Int J Environ Res Public Health. 17(16):5865. doi: 10.3390/ijerph17165865.
  • McDonald TA, Holland NT, Skibola C, Duramad P, Smith MT. 2001. Hypothesis: phenol and hydroquinone derived mainly from diet and gastrointestinal flora activity are causal factors in leukemia. Leukemia. 15(1):10–20. doi: 10.1038/sj.leu.2401981.
  • McHale CM, Zhang L, Smith MT. 2012. Current understanding of the mechanism of benzene-induced leukemia in humans: implications for risk assessment. Carcinogenesis. 33(2):240–252. doi: 10.1093/carcin/bgr297.
  • McNally K, Sams C, Loizou GD, Jones K. 2017. Evidence for non-linear metabolism at low benzene exposures? A reanalysis of data. Chem Biol Interact. 278:256–268. doi: 10.1016/j.cbi.2017.09.002.
  • Medinsky MA, Kenyon EM, Seaton MJ, Schlosser PM. 1996. Mechanistic considerations in benzene physiological model development. Environ Health Perspect. 104 Suppl 6(Suppl 6):1399–1404. doi: 10.1289/ehp.961041399.
  • Meek ME, Bucher JR, Cohen SM, Dellarco V, Hill RN, Lehman-McKeeman LD, Longfellow DG, Pastoor T, Seed J, Patton DE. 2003. A framework for human relevance analysis of information on carcinogenic modes of action. Crit Rev Toxicol. 33(6):591–653. doi: 10.1080/713608373.
  • Meyers AR, Pinkerton LE, Hein MJ. 2013. Cohort mortality study of garment industry workers exposed to formaldehyde: update and internal comparisons. Am J Ind Med. 56(9):1027–1039. doi: 10.1002/ajim.22199.
  • Mooij JM, Magliacane S, Claassen T. 2020. Joint causal inference from multiple contexts. Journal of Machine Learning Research. 21:1–108.
  • Mundt KA, Dell LD, Boffetta P, Beckett EM, Lynch HN, Desai VJ, Lin CK, Thompson WJ. 2021. The importance of evaluating specific myeloid malignancies in epidemiological studies of environmental carcinogens. BMC Cancer. 21(1):227. doi: 10.1186/s12885-021-07908-3.
  • Mundt KA, Gallagher AE, Dell LD, Natelson EA, Boffetta P, Gentry PR. 2017. Does occupational exposure to formaldehyde cause hematotoxicity and leukemia-specific chromosome changes in cultured myeloid progenitor cells? Crit Rev Toxicol. 47(7):592–602. Erratum in: Crit Rev Toxicol. 2017 Aug;47(7):i. doi: 10.1080/10408444.2017.1301878.
  • Mundt KA, Gentry PR, Dell LD, Rodricks JV, Boffetta P. 2018. Six years after the NRC review of EPA's Draft IRIS toxicological review of formaldehyde: regulatory implications of new science in evaluating formaldehyde leukemogenicity. Regul Toxicol Pharmacol. 92:472–490. doi: 10.1016/j.yrtph.2017.11.006.
  • Natelson EA. 2007. Benzene-induced acute myeloid leukemia: a clinician’s perspective. Am J Hematol. 82(9):826–830. doi: 10.1002/ajh.20934.
  • Nourozi MA, Neghab M, Bazzaz JT, Nejat S, Mansoori Y, Shahtaheri SJ. 2018. Association between polymorphism of GSTP1, GSTT1, GSTM1 and CYP2E1 genes and susceptibility to benzene-induced hematotoxicity. Arch Toxicol. 92(6):1983–1990. doi: 10.1007/s00204-017-2104-9.
  • Oishi K, Yokoi M, Maekawa S, Sodeyama C, Shiraishi T, Kondo R, Kuriyama T, Machida K. 1999. Oxidative stress and haematological changes in immobilized rats. Acta Physiol Scand. 165(1):65–69. doi: 10.1046/j.1365-201x.1999.00482.x.
  • Pearl J. 1988. Probabilistic reasoning in intelligent systems: networks of plausible inference. San Francisco, CA: Morgan Kaufmann.
  • Pearl J. 2000. Causality: models, reasoning and inference. 1st ed. Cambridge, England: Cambridge University Press.
  • Pearl J. 2009. Causal inference in statistics: an overview. Statist Surv. 3:96–146. doi: 10.1214/09-SS057.
  • Pullen FK, Yiliqi I, Lam Y, Lennett D, Singla V, Rotkin-Ellman M, Sass J. 2021. A cumulative framework for identifying overburdened populations under the toxic substances control act: formaldehyde case study. Int J Environ Res Public Health. 18(11):6002. doi: 10.3390/ijerph18116002.
  • Qu Q, Shore R, Li G, Chen JX, Cohen LC, Melikian B, Eastmond AA, Rappaport D, Li S, Rupa H, et al. 2003. Validation and evaluation of biomarkers in workers exposed to benzene in China. Res Rep Health Eff Inst. 115:1–72; discussion 73–87.
  • Rappaport SM, Kim S, Lan Q, Li G, Vermeulen R, Waidyanatha S, Zhang L, Yin S, Smith MT, Rothman N. 2010. Human benzene metabolism following occupational and environmental exposures. Chem Biol Interact. 184(1–2):189–195. doi: 10.1016/j.cbi.2009.12.017.
  • Reingruber H, Pontel LB. 2018. Formaldehyde metabolism and its impact on human health. Curr Opin Toxicol. 9:28–34. https://www.sciencedirect.com/science/article/abs/pii/S2468202017301456. doi: 10.1016/j.cotox.2018.07.001.
  • Riccomagno E, Smith JQ, Thwaites P. 2010. Algebraic discrete causal models. Working Paper. Coventry: University of Warwick. Centre for Research in Statistical Methodology. Working papers, Vol. 2010 (No.11).
  • Richardson DB. 2009. Multistage modeling of leukemia in benzene workers: a simple approach to fitting the 2-stage clonal expansion model. Am J Epidemiol. 169(1):78–85. doi: 10.1093/aje/kwn284.
  • Robertson SE, Steingrimsson JA, Joyce NR, Stuart EA, Dahabreh IJ. 2022. Estimating subgroup effects in generalizability and transportability analyses. Am J Epidemiol. 193(1):149–158. doi: 10.1093/aje/kwac036.
  • Robins JM, Greenland S. 1989. Estimability and estimation of excess and etiologic fractions. Stat Med. 8(7):845–859. doi: 10.1002/sim.4780080709.
  • Rockhill B, Newman B, Weinberg C. 1998. Use and misuse of population attributable fractions. Am J Public Health. 88(1):15–19. doi: 10.2105/ajph.88.1.15.
  • Rohmer J. 2020. Uncertainties in conditional probability tables of discrete Bayesian Belief Networks: A comprehensive review. Eng Appl Artif Intell. 88:103384. https://brgm.hal.science/hal-02386579/file/Rohmer_EAAI_HAL.pdf. doi: 10.1016/j.engappai.2019.103384.
  • Ross D, Zhou H. 2010. Relationships between metabolic and non-metabolic susceptibility factors in benzene toxicity. Chem Biol Interact. 184(1–2):222–228. doi: 10.1016/j.cbi.2009.11.017.
  • Rothman N, Smith MT, Hayes RB, Traver RD, Hoener B, Campleman S, Li GL, Dosemeci M, Linet M, Zhang L, et al. 1997. Benzene poisoning, a risk factor for hematological malignancy, is associated with the NQO1 609C→T mutation and rapid fractional excretion of chlorzoxazone. Cancer Res. 57(14):2839–2842.
  • Rothman N, Zhang L, Smith MT, Vermeulen R, Lan Q. 2018. Formaldehyde, hematotoxicity, and chromosomal changes-response. Cancer Epidemiol Biomarkers Prev. 27(1):120–121. doi: 10.1158/1055-9965.EPI-17-0804.
  • Runge J, Bathiany S, Bollt E, Camps-Valls G, Coumou D, Deyle E, Glymour C, Kretschmer M, Mahecha MD, Muñoz-Marí J, et al. 2019. Inferring causation from time series in Earth system sciences. Nat Commun. 10(1):2553. doi: 10.1038/s41467-019-10105-3.
  • Saberi Hosnijeh F, Christopher Y, Peeters P, Romieu I, Xun W, Riboli E, Raaschou-Nielsen O, Tjønneland A, Becker N, Nieters A, et al. 2013. Occupation and risk of lymphoid and myeloid leukaemia in the European Prospective Investigation into Cancer and Nutrition (EPIC). Occup Environ Med. 70(7):464–470. doi: 10.1136/oemed-2012-101135.
  • Samarghandian S, Azimi-Nezhad M, Farkhondeh T, Samini F. 2017. Anti-oxidative effects of curcumin on immobilization-induced oxidative stress in rat brain, liver and kidney. Biomed Pharmacother. 87:223–229. doi: 10.1016/j.biopha.2016.12.105.
  • Schnatter AR, Armstrong TW, Nicolich MJ, Thompson FS, Katz AM, Huebner WW, Pearlman ED. 1996. Lymphohaematopoietic malignancies and quantitative estimates of exposure to benzene in Canadian petroleum distribution workers. Occup Environ Med. 53(11):773–781. doi: 10.1136/oem.53.11.773.
  • Schnatter AR, Rosamilia K, Wojcik NC. 2005. Review of the literature on benzene exposure and leukemia subtypes. Chem Biol Interact. 153–154:9–21. doi: 10.1016/j.cbi.2005.03.039.
  • Shafer G, Shenoy PP. 1990. Probability propagation. Ann Math Artif Intell. 2(1–4):327–351. doi: 10.1007/BF01531015.
  • Shallis RM, Weiss JJ, Deziel NC, Gore SD. 2021. A clandestine culprit with critical consequences: benzene and acute myeloid leukemia. Blood Rev. 47:100736. doi: 10.1016/j.blre.2020.100736.
  • Shcherbinina VD, Petrova MV, Glinin TS, Daev EV. 2021. Genotoxic effect of restraint and stress pheromone on somatic and germ cells of mouse males Mus musculus L. Ecol Genet. 19(2):169–179. doi: 10.17816/ecogen65208.
  • Shlush LI, Zandi S, Mitchell A, Chen WC, Brandwein JM, Gupta V, Kennedy JA, Schimmer AD, Schuh AC, Yee KW, et al. 2014. Identification of pre-leukaemic haematopoietic stem cells in acute leukaemia. Nature. 506(7488):328–333. doi: 10.1038/nature13038.
  • Sonich-Mullin C, Fielder R, Wiltse J, Baetcke K, Dempsey J, Fenner-Crisp P, Grant D, Hartley M, Knaap A, Kroese DM; International Programme on Chemical Safety., et al. 2001. IPCS conceptual framework for evaluating a mode of action for chemical carcinogenesis. Regul Toxicol Pharmacol. 34(2):146–152. doi: 10.1006/rtph.2001.1493.
  • Subrahmanyam VV, Doane-Setzer P, Steinmetz KL, Ross D, Smith MT. 1990. Phenol-induced stimulation of hydroquinone bioactivation in mouse bone marrow in vivo: possible implications in benzene myelotoxicity. Toxicology. 62(1):107–116. doi: 10.1016/0300-483x(90)90035-f.
  • Subramaniam RP, Chen C, Crump KS, Devoney D, Fox JF, Portier CJ, Schlosser PM, Thompson CM, White P. 2008. Uncertainties in biologically-based modeling of formaldehyde-induced respiratory cancer risk: identification of key issues. Risk Anal. 28(4):907–923. doi: 10.1111/j.1539-6924.2008.01083.x.
  • Suzuki E, Yamamoto E, Tsuda T. 2012. On the relations between excess fraction, attributable fraction, and etiologic fraction. Am J Epidemiol. 175(6):567–575. doi: 10.1093/aje/kwr333.
  • Swenberg JA, Lu K, Moeller BC, Gao L, Upton PB, Nakamura J, Starr TB. 2011. Endogenous versus exogenous DNA adducts: their role in carcinogenesis, epidemiology, and risk assessment. Toxicol Sci. 120 Suppl 1(Suppl 1):S130–S145. doi: 10.1093/toxsci/kfq371.
  • Talibov M, Lehtinen-Jacks S, Martinsen JI, Kjærheim K, Lynge E, Sparén P, Tryggvadottir L, Weiderpass E, Kauppinen T, Kyyrönen P, et al. 2014. Occupational exposure to solvents and acute myeloid leukemia: a population-based, case-control study in four Nordic countries. Scand J Work Environ Health. 40(5):511–517. doi: 10.5271/sjweh.3436.
  • Textor J, van der Zander B, Gilthorpe MS, Liskiewicz M, Ellison GT. 2016. Robust causal inference using directed acyclic graphs: the R package 'dagitty'. Int J Epidemiol. 45(6):1887–1894.
  • Thomas R, Hubbard AE, McHale CM, Zhang L, Rappaport SM, Lan Q, Rothman N, Vermeulen R, Guyton KZ, Jinot J, et al. 2014. Characterization of changes in gene expression and biochemical pathways at low levels of benzene exposure. PLoS One. 9(5):e91828. doi: 10.1371/journal.pone.0091828.
  • Tillman H, Janke LJ, Funk A, Vogel P, Rehg JE. 2020. Morphologic and immunohistochemical characterization of spontaneous lymphoma/leukemia in NSG mice. Vet Pathol. 57(1):160–171. doi: 10.1177/0300985819882631.
  • Trombetti S, Cesaro E, Catapano R, Sessa R, Lo Bianco A, Izzo P, Grosso M. 2021. Oxidative stress and ROS-mediated signaling in leukemia: novel promising perspectives to eradicate chemoresistant cells in myeloid leukemia. Int J Mol Sci. 22(5):2470. doi: 10.3390/ijms22052470.
  • USEPA. 2000. IRIS Summary. Benzene. Last updated 01/19/2000. Accessed 11/01/2023: https://iris.epa.gov/ChemicalLanding/&substance_nmbr=276.
  • USEPA. 2022. IRIS Toxicological Review of Formaldehyde-Inhalation (External Review Draft, 2022). U.S. Environmental Protection Agency, Washington, DC, EPA/635/R-22/039, 2022.
  • Uzma N, Kumar BS, Hazari MA. 2010. Exposure to benzene induces oxidative stress, alters the immune response and expression of p53 in gasoline filling workers. Am J Ind Med. 53(12):1264–1270. doi: 10.1002/ajim.20901.
  • Valentine JL, Lee SS, Seaton MJ, Asgharian B, Farris G, Corton JC, Gonzalez FJ, Medinsky MA. 1996. Reduction of benzene metabolism and toxicity in mice that lack CYP2E1 expression. Toxicol Appl Pharmacol. 141(1):205–213. doi: 10.1006/taap.1996.0277.
  • Vermeulen R, Lan Q, Qu Q, Linet MS, Zhang L, Li G, Portengen L, Vlaanderen J, Sungkyoon K, Hayes RB, et al. 2023. Nonlinear low dose hematotoxicity of benzene; a pooled analyses of two studies among Chinese exposed workers. Environ Int. 177:108007. doi: 10.1016/j.envint.2023.108007.
  • Wang L, He X, Bi Y, Ma Q. 2012. Stem cell and benzene-induced malignancy and hematotoxicity. Chem Res Toxicol. 25(7):1303–1315. doi: 10.1021/tx3001169.
  • Ward JM. 2006. Lymphomas and leukemias in mice. Exp Toxicol Pathol. 57(5-6):377–381. doi: 10.1016/j.etp.2006.01.007.
  • Wei C, Wen H, Yuan L, McHale CM, Li H, Wang K, Yuan J, Yang X, Zhang L. 2017. Formaldehyde induces toxicity in mouse bone marrow and hematopoietic stem/progenitor cells and enhances benzene-induced adverse effects. Arch Toxicol. 91(2):921–933. doi: 10.1007/s00204-016-1760-5.
  • Weinberger N. 2023. Intervening and letting go: on the adequacy of equilibrium causal models. Erkenn. 88(6):2467–2491. doi: 10.1007/s10670-021-00463-0.
  • Whitcomb BW, Naimi AI. 2021. Defining, quantifying, and interpreting "noncollapsibility" in epidemiologic studies of measures of "effect". Am J Epidemiol. 190(5):697–700. doi: 10.1093/aje/kwaa267.
  • Wong BA. 2007. Inhalation exposure systems: design, methods and operation. Toxicol Pathol. 35(1):3–14. doi: 10.1080/01926230601060017.
  • Xin X, Jin Z, Gu H, Li Y, Wu T, Hua T, Wang H. 2016. Association between glutathione S-transferase M1/T1 gene polymorphisms and susceptibility to endometriosis: a systematic review and meta-analysis. Exp Ther Med. 11(5):1633–1646. doi: 10.3892/etm.2016.3110.
  • Yang L, Rau R, Goodell M. 2015. DNMT3A in haematological malignancies. Nat Rev Cancer. 15(3):152–165. doi: 10.1038/nrc3895.
  • Yaris F, Dikici M, Akbulut T, Yaris E, Sabuncu H. 2004. Story of benzene and leukemia: epidemiologic approach of Muzaffer Aksoy. J Occup Health. 46(3):244–247. doi: 10.1539/joh.46.244.
  • Yi M, Li A, Zhou L, Chu Q, Song Y, Wu K. 2020. The global burden and attributable risk factor analysis of acute myeloid leukemia in 195 countries and territories from 1990 to 2017: estimates based on the global burden of disease study 2017. J Hematol Oncol. 13(1):72. doi: 10.1186/s13045-020-00908-z.
  • Yin SN, Li GL, Tain FD, Fu ZI, Jin C, Chen YJ, Luo SJ, Ye PZ, Zhang JZ, Wang GC. 1987. Leukaemia in benzene workers: a retrospective cohort study. Br J Ind Med. 44(2):124–128. doi: 10.1136/oem.44.2.124.
  • Young AL, Tong RS, Birmann BM, Druley TE. 2019. Clonal hematopoiesis and risk of acute myeloid leukemia. Haematologica. 104(12):2410–2417. doi: 10.3324/haematol.2018.215269.
  • Yu GY, Song XF, Liu Y, Sun ZW. 2014. Inhaled formaldehyde induces bone marrow toxicity via oxidative stress in exposed mice. Asian Pac J Cancer Prev. 15(13):5253–5257. doi: 10.7314/apjcp.2014.15.13.5253.
  • Zarth AT, Murphy SE, Hecht SS. 2015. Benzene oxide is a substrate for glutathione S-transferases. Chem Biol Interact. 242:390–395. doi: 10.1016/j.cbi.2015.11.005.
  • Zeka A, Gore R, Kriebel D. 2011. The two-stage clonal expansion model in occupational cancer epidemiology: results from three cohort studies. Occup Environ Med. 68(8):618–624. doi: 10.1136/oem.2009.053983.
  • Zhan P, Wang Q, Qian Q, Wei SZ, Yu LK. 2011. CYP1A1 MspI and exon7 gene polymorphisms and lung cancer risk: an updated meta-analysis and review. J Exp Clin Cancer Res. 30(1):99. doi: 10.1186/1756-9966-30-99.
  • Zhang L, Eastmond DA, Smith MT. 2002. The nature of chromosomal aberrations detected in humans exposed to benzene. Crit Rev Toxicol. 32(1):1–42. doi: 10.1080/20024091064165.
  • Zhang W, Poole D. 1996. Exploiting causal independence in Bayesian network inference. J Artif Intell Res. 5(1):301–328. doi: 10.1613/jair.305.
  • Zhang L, Steinmaus C, Eastmond DA, Xin XK, Smith MT. 2009. Formaldehyde exposure and leukemia: a new meta-analysis and potential mechanisms. Mutat Res. 681(2–3):150–168. doi: 10.1016/j.mrrev.2008.07.002.
  • Zhang L, Tang X, Rothman N, Vermeulen R, Ji Z, Shen M, Qiu C, Guo W, Liu S, Reiss B, et al. 2010. Occupational exposure to formaldehyde, hematotoxicity, and leukemia-specific chromosome changes in cultured myeloid progenitor cells. Cancer Epidemiol Biomarkers Prev. 19(1):80–88. doi: 10.1158/1055-9965.EPI-09-0762.
  • Zhao J, Sui P, Wu B, Chen A, Lu Y, Hou F, Cheng X, Cui S, Song J, Huang G, et al. 2021. Benzene induces rapid leukemic transformation after prolonged hematotoxicity in a murine model. Leukemia. 35(2):595–600. doi: 10.1038/s41375-020-0894-x.