Abstract
A computational approach is adapted to analyze the parameter identifiability of a compartmental model. The model is intended to describe the progression of the COVID-19 pandemic in Chile during the initial phase in early 2020 when government declared quarantine measures. The computational approach to analyze the structural and practical identifiability is applied in two parts, one for synthetic data and another for some Chilean regional data. The first part defines the identifiable parameter sets when these recover the true parameters used to create the synthetic data. The second part compares the results derived from synthetic data, estimating the identifiable parameter sets from regional Chilean epidemic data. Experiments provide evidence of the loss of identifiability if some initial conditions are estimated, the period of time used to fit is before the peak, and if a significant proportion of the population is involved in quarantine periods.
1. Introduction
1.1. Scope
The COVID-19 pandemic has received overwhelming news coverage worldwide. We recall that the coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was declared a global pandemic by the World Health Organization (WHO) on March 11, 2020 [Citation11, Citation64]. This highly contagious unprecedented virus has impacted government and public institutions, strained the health care systems, restricted people in their homes, and caused country-wide lockdowns resulting in a global economic crisis [Citation20, Citation32, Citation61]. The impact of COVID-19 at the time of revision of this work (July 5, 2023) amounts to roughly 767 million cases moreover, 6.95 million deaths worldwide, of which 2.96 million have been reported in the Americas.
The morbidity, mortality, and economic indicators associated with the COVID-19 pandemic point to a devastating picture for Latin American countries. High poverty rates, poor leadership, high prevalence of underlying health conditions such as obesity and diabetes, and frail healthcare systems have exacerbated the impact of the novel coronavirus in this region [Citation9, Citation23, Citation30, Citation45]. In Chile, the Ministry of Health reported the first COVID-19 case on March 3, 2020, becoming the fifth country in Latin America after Brazil, Mexico, Ecuador and Argentina to report COVID-19 cases. Soon after, the Chilean government put in place a number of interventions including the closure of all daycares, schools, and universities (March 16), border controls, telework recommendations (March 18), closure of non-essential businesses (March 19), national night curfews (March 22), as well as targeted lockdowns at the level of municipalities since March 26, 2020. As of July 6, 2023, Chile has accumulated 5288437 cases and 61586 deaths [Citation40]. It is also worth noting that Chile has tested at a higher rate than any other Latin American country [Citation58]. Fortunately, Chile's mass COVID-19 vaccination campaign to immunize about 80% of the total population started in February 2021, and more than 90% of that target population has been fully vaccinated as of July 6, 2023.
One way to understand and explain these epidemic phenomena is by applying mathematical models and surveillance data, and the COVID-19 outbreak has been studied with different focuses, differing in their uses, their mathematical form, and objectives [Citation1, Citation6, Citation31, Citation43, Citation44, Citation47]. Since the first case reported of the COVID-19 disease, and the multiple models that have surged to respond to the emergency and its dynamics [Citation53, Citation54], we can understand their contributions, as well as their limitations [Citation2, Citation4, Citation62]. The mathematical models utilized to confront the coronavirus 2019 disease improve situational awareness, assess epidemiological characteristics, and inform the base of evidence for prevention strategies. Specifically, for models derived from the famous SEIR model, one must guarantee that information recovered by them is highly accurate and that one confidently understands the biological mechanisms to lay the ground for developing optimal strategies for public health. The successful application of such epidemic models depends upon our ability to estimate key transmission parameters. These estimates are subject to two principal sources of uncertainty: the noise of the data, and the assumptions built into the model. Both difficulties give rise to the problem of non-identifiability of parameters. That is, the information recovered by applying the model to available datasets is not representative of the dynamics that we desire to interpret. This problem has been studied intensively and there are various approaches to handle non-identifiability. For instance, in [Citation16, Citation22, Citation29, Citation48, Citation57, Citation69] guidelines and criteria are proposed to avoid such identifiability difficulties. Ignoring these factors can result in misleading inferences with potentially incorrect public health policy decisions [Citation12, Citation14, Citation51].
We are inspired by the dynamics of COVID-19 that occurred in Chile in the initial phase of 2020 to explore these factors in estimating parameters. Then, we propose a compartmental model with nonlinear differential equations for the Chilean dynamics, and we use it to distinguish the noise from the data and the structure. With this in mind, we developed an identifiability study employing a computational approach such as that described in [Citation51]. The identifiability analysis, as explained in [Citation50], determines how well the parameters of a model can be estimated from experimental data, so it is crucial for interpreting and determining confidence in model parameter values. Structural identifiability focuses on a mathematical model and the practical identifiability on data. Hence a model with structurally identifiable parameters may still, be nonidentifiable in practice due to a lack of information gleaned from real data. Thus, it is essential to understand both types of identifiability when applying an epidemic model to a specific situation [Citation27, Citation39, Citation51]. Therefore, we applied a computational approach to investigate the identifiability of the parameters from the compartmental model mentioned. This idea motivates an alternative parameter identifiability analysis to a complex compartment model, where other techniques can be rigorous. Specifically, we employ synthetic data from our model using various parameter sets. This parameter estimation is carried out with a defined optimization process with the simulated annealing method. Fitting all data points generated in the simulated epidemic curve, we aim to recover the parameters assumed to create synthetic data. If this approach is successful, we may confirm structural identifiability, which we aim to validate by adding a measurement error structure through a bootstrapping process [Citation14, Citation51]. For the case of practical identifiability, we only use a limited number of data points of the epidemic during the initial phase (e.g. pre-peak). We employed our sets of identifiable parameters to fit regional Chilean data to compare the conclusions from the previous analysis based on synthetic data.
1.2. Related work
We first recall that introductions to compartmental epidemiological models are provided in [Citation3, Citation5, Citation18, Citation35, Citation60, Citation66]. This class of models goes back to the well-known work by Kermack and McKendrick [Citation28]. Within a compartmental model the population is subdivided into at least two epidemiological compartments (e.g. susceptible and infected). The rates of progression between the compartments, as well as the incidence and possibly birth and death of individuals, need to be specified, which leads to a dynamical system, that is a system of coupled, nonlinear ordinary differential equations (ODEs) that describe the progression of the epidemic.
In many cases, the compartmental approach formulated as a dynamical system provides indications of the basic reproduction number , where this parameter plays the role of a threshold value for the system's dynamics. This role of as a ‘threshold value’ differs from the original epidemiological significance, namely that gives the number of secondary cases one infectious individual will produce in a population consisting only of susceptible individuals [Citation35, p. 21].
The conclusions obtained by a compartmental model are derived from parameter estimations and allow describing the dynamics present in a data set. Then, the identifiability analysis is crucial to guaranteeing that the parameter estimates are adequate. Specifically, the identifiability analysis addresses whether the parameters of a compartmental model can be uniquely identified. There are different strategies for the identifiability, for example, for models expressed by linear ODEs [Citation38, Citation65], nonlinear ODEs [Citation39], or partial differential equations [Citation50]. The type of data to fit also plays an essential role. For example, one needs to select between incidence or prevalence curves [Citation12, Citation19, Citation56]. In the present theory two types of identifiability are considered: structural and practical. The first involves the model and its structure, and the study can be centred on theoretical or computational techniques (e.g. [Citation13, Citation16, Citation19, Citation25, Citation33, Citation39, Citation50, Citation57]), and the second involves the data (e.g. [Citation22, Citation27, Citation29, Citation56, Citation57]). The analysis is separate because a model can be structurally identifiable, but real-world data employed to fit the model results may lead to unidentifiable parameters for reasons related to the characteristics of the data available.
We extend the computational approach developed by Rossa and Chowell [Citation51] to assess parameter identifiability in a compartmental model defined by nonlinear ODEs. Then, the computational approach that combines the optimization process determined by the simulated annealing method and a bootstrap process allows us to assess the identifiability of the corresponding parameters. In particular, structural identifiability is assessed by employing synthetic data with known parameters and the practical one with the same synthetic data, but the model is fitted only to the initial phases of the data curve (e.g. pre-peak periods). In both computations we expect to recover the set of assumed parameters with small uncertainty and error. These properties ensure that the model is indeed associated with identifiable parameters. Finally, we apply the identifiability analysis to various sets of parameters to estimate. We vary their elements in increasing numbers and measure the ability of the model to estimate the parameter sets in each case by fitting the data curves generated with assumed parameters. This procedure can be understood as a sensitivity analysis.
In the application to Chilean data, we consider the parameter sets identifiable from our computational approach (with synthetic data) and fit some regional Chilean data. We hope to confirm the conclusions from the identifiability study with our application to the Chilean case.
In our study, a parameter set is identifiable if it satisfies some criteria defined in Section 2.5, which are an adaptation from the definition presented in [Citation51], considering analyzing the confidence intervals and the mean square error (MSE).
2. Methods
2.1. Compartmental model
We apply our methodology to a complex compartmental transmission model inspired by the strategies implemented by the Chilean government to mitigate the COVID-19 emergency in the initial phase of the outbreak. For this purpose, we consider a single population and adopt a simplified version of the model described in [Citation15], combined with the description of individuals in quarantine proposed in [Citation26]. Individuals within the model are classified as susceptible (S), in home quarantine (Q), latent (), partially infectious but not yet symptomatic (), asymptomatic (A), infectious and will not be tested (), infectious and will be tested and isolated (), hospitalized/isolated infectious (J), recovered (R), and deceased (D). Constant population size is assumed, i.e. (1) (1) Note that we stipulate one single class of asymptomatic individuals. At the same time, in [Citation15] a distinction is made between asymptomatic individuals who will not be tested and those who are asymptomatic and will be tested and isolated.
It is assumed that home-quarantined individuals do not have contact with infected individuals, due to severe travel restrictions and rigorous supervision by their local communities, do not have contact with infected individuals. The parameters p and represent the percentage of individuals in quarantine and the quarantine duration, respectively. Therefore, class Q removes susceptible individuals from the infection dynamics when or , and there is no quarantine when .
Five classes can contribute to new infections, namely , A, , , and J. Each class has a relative transmission linked with the parameters , , q and h. These parameters are the relative transmissibilities from partially infectious population , asymptomatic population A, asymptomatic/infected population that wear PPE, and the infected population isolated or in hospitals J. Consequently, the product hJ models the contribution of hospitalized individuals to transmission.
If we denote by the rate at which individuals move from class X to class Y, then susceptible individuals move to at the rate where β denotes the overall transmission rate. Individuals in progress to at rate . Individuals from are partially infectious, with relative transmissibility , and progress at rate , where a proportion become asymptomatic and partially infectious (relative transmissibility ), and become fully infectious. Among the proportion that become fully infectious, will be tested, while will be undetected. Asymptomatic individuals who are not tested and symptomatic individuals wear personal protective equipment (PPE) (such as wearing masks in public, handwashing, etc.) and thus have relative transmissibility q, which quantifies the effectiveness of those protective behaviours. Individuals within classes A and (who are not tested) recover at rate . Those tested (class ) will progress to the hospitalized and isolated class at diagnosis rate α. Relative transmission within hospitals and isolated places may occur, but our analysis assumes perfect isolation. Hospitalized and isolated indiividuals progress to the recovered class at rate or to the deceased class at rate δ. Therefore, the model is defined by the following system of non-linear ODEs for a single population, where all variables in capital letters are functions of time t, i.e. it is understood that , and so on, and a prime denotes the derivative, that is , etc.: (2) (2) The auxiliary variable C tracks the cumulative number of diagnosed/reported cases from the start of the outbreak, and represents the new diagnosed cases. The quantity C is not a state of the system of equations but simply a class to track the cumulative incidences cases; thus, individuals from population are not moving to class C. The initial values of these state variables are denoted by , where A schematic of the transmissions is provided in Figure .
The basic, control, and effective control reproduction numbers , and quantify the epidemic impact of infectious disease in a population and the effectiveness of control measures. The basic reproduction number defines the average number of secondary cases generated by primary infectious individuals during the early transmission period when the population is completely susceptible and without control interventions. The control reproduction number defines the average number of new cases generated by infectious individuals when the population susceptible is with some control measure or intervention. The effective control reproduction number is the reproduction number that varies with time, and it is defined as partially susceptible population, where if , the disease transmission continues, and if the disease transmission declines. Therefore, while the basic reproduction number and control reproduction number captures the transmission potential of infectious disease during the early epidemic, the effective control reproduction number , captures changes in the transmission potential over time.
To calculate these numbers, we employ the well-known next-generation matrix approach [Citation59]. This yields the expression where ρ denotes the spectral radius and for the present model, the matrices and are given by The result is (3) (3) Moreover, the corresponding effective reproduction number is defined as where is the proportion of susceptible individuals in the population at time t, where the total population N is assumed to be constant (see (Equation1(1) (1) )). The explicit expression (Equation3(3) (3) ) implies the following contributions of the individual compartments: where (4) (4) This parameter is a key epidemic parameter to measure the transmissibility of a pathogen; by (Equation3(3) (3) ) we see that is a function of several parameters of the epidemic model (Equation2(2) (2) ), including transmission rates and infectious periods of the epidemiological states that contribute to new contagions. In addition, we can observe that the basic reproduction number corresponds to the control reproduction number in the absence of interventions, i.e. when q = 1, h = 0, and (where q is interpreted as the level of no protection and 1−q is the level of protection due to the use of PPE). These assumptions lead to the expression
2.2. Simulated data
Using our model proposed for COVID-19, we simulate various scenarios fixing their parameters and initial conditions. These curves will help analyze the ability of our model to capture such dynamics. Then a parameter estimation process is necessary together with a bootstrapping procedure to examine the identifiability of several of its parameters, as will be explained in the following. For this, we assume the values and references summarized in Table for the parameters, and for the initial conditions the values for a total population N = 500, 000. These values are used to simulate the following scenarios.
2.2.1. Scenario 1: initial phase for three values of without quarantine
We consider three values of that generate three values of β via (Equation3(3) (3) ). These values along with other parameters are summarized in Table . The corresponding incidence curves are generated for , see Figure (a). These incidence curves will be our datasets for the identifiability study. Notice that the curve for effectively grows more slowly than the others, and its maximum is smaller than the curves with and .
2.2.2. Scenario 2: initial phase for three values of with quarantine.
For this scenario, we consider the same values for and corresponding values of β as in Scenario 1 since expression (Equation3(3) (3) ) does not depend on the parameters p or λ. The pairs (p = 0.01, ) and (p = 0.05, ), with the other parameters as given in Table , produce the datasets shown in Figure (b,c), respectively. Again, the curve with grows more slowly than the others, and its maximum is smaller than the curves with and . Besides, the maxima of the curves with p = 0.05 are consistently smaller than those of the corresponding curves with p = 0.01, which evidences the effect of the quarantine modelled.
Using the bootstrapping approach, we intend to use these curves defined for each scenario to generate multiple samples from the best-fit model. Before doing this, we need to define the parameter estimation process to explain the bootstrap approach and finally the identifiability study.
2.3. Parameter estimation
To estimate the parameter values of our COVID-19 model, we need to use one optimization process to fit the model to each simulated dataset. Then, the method used for this purpose is outlined as follows, executed in Matlab (Mathworks, Inc).
2.3.1. Mixed simulated annealing method with a least-squares approach (SA-LSQ)
The SA-LSQ approach is defined as a parameter estimation process that combines the simulated annealing and least-squares optimization algorithms to minimize the Euclidean distance between the incidence curve , defined by our COVID-19 model (Equation2(2) (2) ), and the time series data () that corresponds to time series data with , where T is the final time (date). Here Θ is the set of parameters to estimate of our COVID-19 model.
In our approach, we obtain the set of estimated parameters that satisfies with initial guess and the objective function for a given normalizing coefficient R.
We wish to avoid attaining local minima and to guarantee that the optimization process finds the global minimum. Consequently, due to the sensitivity of the solution with respect to the initial guess utilized for these processes, we decided, before starting the SA fitting, to select the best initial guess running several Latin hypercube sampling (LHS) runs. This is done by the built-in Matlab function lhsdesign(), applied to the bounds defined for each parameter to estimate.
Together with simulated annealing (SA), using the Matlab built-in function simulannealbnd(), and the best initial guess obtained with LHS, we estimated the parameters that are transformed as the initial guess to apply the least squares method (LSQ) with lsqcurvefit() Matlab function to speed up the steepest descent phase of the simulated annealing algorithm.
More details about the procedure are summarized in Algorithm 1.
This method is a powerful optimization tool, particularly for fitting models that describe epidemic phenomena with real data, such as those developed in [Citation7, Citation8, Citation36, Citation68, Citation70] Additionally, the SA technique was inspired by Metropolis et al. [Citation37] to select the new characterized solutions. Even so, its adaptation allows the application to a frequentist paradigm.
Specifically, to study parameter identifiability, we define various experiments that use the generated curves in Scenarios 1 and 2, and we intend to estimate or recover some parameters of interest from these curves. Therefore parameters of interest are selected from , while the parameters not selected are fixed to the respective value of Table .
2.3.2. Sets of parameters to be estimated
For the data generated using Scenarios 1 and 2, we define the sets of parameters to estimate given in Table . Each combination explored will be mentioned as an experiment. For example, Experiment 101 indicates to data from Scenario 1 to estimate the parameter set 1 and Experiment 201-1 corresponds to data from Scenario 2 to estimate the parameter set 1, with p = 0.01, where Experiment 201-2 will be with p = 0.05 (see Tables and ). Therefore, for Scenario 1, we have 24 experiments, that is, the number of values assumed of (three) times the number of parameter sets (eight). Likewise, for Scenario 2, we have 84 experiments corresponding to the number of values of (three) times the number of values of p (two) times the number of parameter sets (fourteen).
2.4. Bootstrapping method
This method uses the parametric bootstrapping approach with Poisson error structure (see [Citation14, Citation51]) to repeatedly sample observations from the best-fit obtained by the SA-LSQ fit of our COVID-19 model to an time series data. This process can be summarized in the following steps:
Obtain the deterministic model solution or total daily incidence series from the best-fit obtained using the SA-LSQ estimation.
Generate S replicated datasets assuming the Poisson error structure. To generate these simulated data, we incorporate the Poisson error structure using the incidence curve , where for each time we generate a new incidence value using a Poisson random variable with mean . Therefore, this new dataset represents an incidence curve for the system, assuming the time series follows a Poisson distribution centred on the mean at time points .
Re-estimate model parameters for each of the S simulated realizations (using the SA-LQS method), which are denoted by for .
Finally, using the set of for , we construct the confidence intervals for each estimated parameter. Also, for each set of estimated parameters, is calculated to obtain a distribution of values as well.
2.5. Parameter identifiability
2.5.1. Preliminaries
The complexity of our mathematical model involves ten epidemiological states or compartments and an auxiliary variable that indicates the cumulative number of symptomatic diagnosed or new cases reported. In addition to the ten system states, our model consists of 15 parameters (see Table ), where five classes contribute to a new infection , which also contribute to the representation of in these classes, see Equations (Equation3(3) (3) ) and (Equation4(4) (4) ). Then, the estimated individual parameters are used to compute . Some initial conditions of the model are unknown and possibly need to be estimated along with the other model parameters. For our case we consider the values and to estimate together to the parameters α, β, , and p. This defines the set from parameters to estimate , where the remaining parameters and initial conditions are known or assumed.
To estimate the parameter sets, we fit the model (with other parameters and initial values fixed) to each dataset using the auxiliary variable and the curve of the daily cases generated. The parameter estimation procedure employs the methods detailed in Section 2.3. The initial parameter prediction affects the solution for the model as local minima occur. To guarantee the global minimum obtained in the optimization processes, we generate a multi-start (using LHS in Algorithm 1) repeating these processes for a random amount of initial parameters, whose ranges are defined as follows: Here, the particular choice of the wide range for aims to avoid restrictions during the calibration of the model due to its sensitivity.
We focus the parameter identifiability study for (Equation2(2) (2) ) on characterizing the identifiable parameters using uncertainty studies to assess the stability of the parameters estimated. Then, the goal is to evaluate the identifiability using the mean value, the confidence intervals, and the mean square error (MSE). The step-by-step description of the identifiability analysis is detailed in the next subsection.
2.5.2. Identifiability analysis process
In the study of identifiability, we must follow the following steps that rely on definitions and computations presented in detail in [Citation14, Citation51]:
Assume that is the set of parameters to be estimated and is the initial guess.
Estimate Θ fitting the model to the dataset generated, using the SA-LSQ procedure with the values of the initial conditions and parameters given. The hat symbol indicates an estimated parameter. Then, we have the set
Using the bootstrapping process we obtain the confidence intervals and their frequency distributions for each parameter estimated.
Compute the mean squared error (MSE) for each parameter estimated. Here Θ represents the true parameter value, and represents the estimated value of the parameter for the jth bootstrap realization.
For each set of estimated parameters, is calculated to obtain a distribution of values as well.
Following the process described, we say that a model parameter is identifiable if its confidence interval lies in a finite range of values and contains the estimated (and real) parameter value. The preferred situations is that these values have a central position or are close to the mean value. Besides, a small confidence interval indicates that the parameter can be identifiable, while a broader range could indicate a lack of identifiability. When an estimated parameter has a low MSE and a small confidence interval, the we may assert that the parameter is identifiable from the model. On the other hand, larger confidence intervals or larger MSE values may suggest non-identifiability. Here confidence interval is considered to be small if its lower and upper bounds are not smaller or greater, respectively, than 10% of the value estimated. Besides, the smallness of the MSE depends on the nature of the estimated parameter. For instance, if the parameter represents a probability, then we say that MSE is small if it is a value close to zero, but if the parameter represents the quantity of population, one acceptable range can be considered no more than 0.01% of the total population.
Specifically, to analyze structural identifiability, we will fit the entire curves generated for each scenario to recover the values from the parameters assumed in each case. Moreover, to investigate practical identifiability, we will consider different fit times. This is done to address situations when only data of the initial phase (before the peak or maximum value) are available.
3. Results
3.1. Structural identifiability
By applying the identifiability methodology, we obtain results for each one of the steps 1 to 5 outlined in Section 2.5.2. The first type of results corresponds to the parameter estimation process that leads to the best-fit model solution. Secondly, we have 250 replicated datasets applying the bootstrap process to re-estimate model parameters. Figures and (Appendix) show the confidence intervals (represented by vertical lines) and the MSE bar plots for the experiments of Scenario 1, and Figures S1 and S2 (in the supplemental material) display analogous results for the experiments of Scenario 2. Observing the results obtained and registered in these figures, we concentrate our investigation on Tables and , in which we classify the parameters estimated between identifiable and non-identifiable. A parameter is considered identifiable with narrow confidence interval, a mean value centred and close to the real value, and the MSE is small. If at least one of these identifiability criteria is not satisfied, then the parameter is considered non-identifiable.
From Table we conclude that all estimated parameters are identifiable for Experiments 1 to 4, i.e. those without initial conditions to fit. For Experiments 5 to 7 in which the initial conditions are also estimated, we note that all the parameters except the initial condition are identifiable, maintaining mainly the identifiability of the parameter of interest . Only Experiment 8 exhibits non-identifiability situations for all parameters except for the dataset generated under the assumption . Here we can suspect that the identifiability improves when the force of infection is greater as in this last case. Nevertheless, we will continue to explore this part later. Another observation is that the estimated can be robust to variation or bias in the other estimated parameters.
Analogously, we deduce from Table that the parameters estimated in Experiments 1, 2, 3, and 5 are identifiable. The parameters in Experiments 8, 9, 10, and 12 are also identifiable (except for the initial condition ). For Experiments 4, 7, and 9, the parameter is identifiable even though no other parameters are identifiable. Particularly in Experiment 7, all the remaining parameters are non-identifiable. Other situations occur; for example, in Experiment 6, all estimated parameters except those based on data generated with and p = 0.05 are identifiable (red boxes in column Experiment 6 from Table ). For Experiment 13, all estimated parameters based on data generated with and 5 are identifiable. For Experiment 14, the results are not good, but the parameter estimated using data generated with p = 0.01 is identifiable. A similar situation occurs with in Experiment 11, where this quantity is identifiable only by fitting the data generated with and p = 0.1.
3.2. Practical identifiability
The results obtained for this part are in Supplemental Material, including coloured tables, namely Tables S2 and S3 that display information analogous to Tables and . Tables S2 and S3 allow one to analyze the identifiability. For this study, we investigate the performance of our model to capture the initial phases of an epidemic disease, examining the choices of the fit periods, namely 20, 30, or 40 days. According to Table S2, in Experiment 1 the estimated parameters are identifiable. In Experiments 3 and 5 with fit times 30 and 40 days, the parameters are identifiable, even though not all estimated parameters are identifiable. In contrast to this result, in Experiment 2 with a fitting period of 20 days, the parameters that define are identifiable, but the computed value is not. This situation could be linked with the inverse relation of with parameter α (see Equation (Equation3(3) (3) )). Remember that represents the time from symptom onset to isolation or hospitalization. Thus a variation in α of 0.1 to 0.2 implies a change of 5 days from the period of symptoms onset to isolation, a very sensitive parameter. Notably, for this experiment, the confidence intervals and the MSE are plotted in Figure S3, where we can see that these intervals are not very large but lack precision, mainly for the case of a fit period of 20 days, and the epidemic growth is given by and 4. For this and other experiments, the parameter tends to be identifiable when fit periods are greater than 20 days, and . The parameter and the initial condition and for all experiments of Scenario 1 are not identifiable. This suggests a lack of identifiability especially of the parameter when we have fewer data to fit.
Table S2 also shows that the estimated parameters for Experiment 1 are identifiable. This result differs from Experiments 12, 13, and 14, where all parameters are non-identifiable. For Scenario 2, we can observe that for most experiments with a fit period of 20 days (except for Experiments 1 and 9 with ), the parameter is non-identifiable. Moreover, all estimated parameters are also non-identifiable, especially for these cases, excluding Experiment 2. These results suggest using more fit data to guarantee identifiability, as it occurs more frequently in experiments with longer fit periods.
Another situation to comment on here is that for most experiments where the quarantine parameter p is estimated, the results mainly for result non-identifiable, excluding only Experiment 5 for the fit period 40 days, and assumed as 4 and 5, and Experiment 6 again for the fit period 40 days and and 4. Besides, for some quarantine parameters estimated from Experiments 5 and 6 with fit period 40 days, the result is identifiable, mostly when the fit data are generated assuming p = 0.01. We can think that the lack of identifiability increases when a larger part of the population is quarantined in the initial phase.
4. Application to regional data of Chile
We now apply our methodology to the data collected for some Chilean regions. From the structural and practical results obtained for our model, we apply the study to two cases, one to scan the initial phase (first 30 days) and another for an intermediate phase (first 120 days), considering the data before the change of quarantine criteria applied by the Chilean government. To explain the selection of regions and the phenomena of COVID-19 in Chile, we include the following subsection.
4.1. The early transmission of COVID-19 in Chile
We wish to apply the COVID-19 model of Section 2.1 to some administrative regions of Chile (see Figure ), where there are regions without quarantine periods, others with more than one quarantine period, a situation that the Chilean government named ‘dynamical quarantines’. This policy was in effect until July 27 2020. In addition, this measure was applied only to municipalities with a greater incidence of positive cases for COVID-19. For this reason, we consider a decoupled analysis to select the regions to analyze until July 27 2020, assuming the day of the first positive case as the first day for the timeline of each region and July 27 2020 is the last day. Since the onset of COVID-19 in Chile, the Ministry of Health has been reporting new cases of infection and death daily for each region and new cases of recovered persons at the national level. The official information (see [Citation24, Citation41] for details) also includes the daily numbers of polymerase chain reaction (PCR) tests applied, as well as the daily numbers of critical patients and occupancy of intensive care units (ICUs). A timeline of strategies and measures implemented to contain the pandemic is summarized in Table S1. Several times the Chilean Ministry of Health changed the case and death definitions. Table lists the most relevant of these changes along with their dates. Considering the difficulties associated with the changes in case definitions, we decided to include data reported for symptomatic and asymptomatic cases in the COVID-19 model. Total data correspond to the addition of symptomatic and asymptomatic cases reported because an asymptomatic (tested) case may become symptomatic later.
4.2. Experiments applied to regional data
Analyzing the type of information and data published by the Chilean government we identified regions with more than one quarantine period. Then, considering these multiple quarantine periods and the conclusions obtained from our experimental identifiability study, we decided to select only those regions with exactly one and without quarantine periods from the Chilean data. Specifically, we selected those administrative regions that satisfy at least one of these conditions, namely those with
exactly one unique quarantine period for the first 30 or 120 days,
no quarantine periods in the first 30 or 120 days.
We also prefer data with an increasing number of reported cases, avoiding having consecutive null case counts, i.e. we select robust datasets for capturing epidemic growth. Overall, in light of the experimental results based on simulated data to analyze the (structural and practical) identifiability of the parameters, we know that the best results are achieved for the sets of estimated parameters and . However, to observe the quarantine parameter, we also consider the parameter sets and , for the case that the region has a single quarantine period, even knowing that these parameters are not always identifiable from the experimental results in the initial phase. Table provides more details about parameter identifiability results.
Subsequently, we filter the regional data for two periods, one for the first 30 days and the other for 120 days, and we select some of those that satisfy the above mentioned conditions. We have established the following regions for our application.
Regions 1 (Tarapacá) and 8 (Bío-Bío), without quarantine periods in the initial phase and one quarantine in the final phase.
Regions 3 (Atacama) and 4 (Coquimbo) were without quarantine during the 120 days.
Region 10 (Los Lagos) had one quarantine in the initial phase.
Then, with this regional data, we apply our identifiability study to two events,
For regions without quarantine, we estimate the parameter sets (Experiment 1) and (Experiment 2). We select a fit period of 30 or 120 days depending on the phase analyzed.
For regions with quarantine, we estimate the parameter sets (Experiment 1) and (Experiment 2).
We are preserving the same consideration for the time fit.
The left plots in Figures S4 and S5 display the regional data distributions, including some dates of quarantine measures and important holidays.
Establishing the experiments and the Chilean data for the analysis, we need to fix the values of those parameters that are not estimated. In Table , we summarize the values assumed for the parameters. For example, we consider the parameters for the time from symptom onset to isolation or hospitalization , and the proportions of fully infectious individuals who undergo testing are averaging. In addition, we assume the quarantine parameters and p = 0 when the experiment does not involve quarantine periods. Alternatively, when there is quarantine, the quarantine parameter p is estimated depending on the region, being the parameter the quarantine duration applied to it (see Table ).
Establishing the experiments and the Chilean data for the analysis, we need to fix the values of those parameters that are not estimated. In Table , we summarize the values assumed for the parameters. For example, we consider the parameters for the time from symptom onset to isolation or hospitalization , and the proportions of fully infectious individuals who undergo testing are averaging. In addition, we assume the quarantine parameters and p = 0 when the experiment does not involve quarantine periods. Alternatively, when there is quarantine, the quarantine parameter p is estimated depending on the region, being the parameter the quarantine duration applied to it (see Table ).
The initial conditions satisfy the considerations assumed in the identifiability study, i.e. where for the case when and are estimated, these quantities are assumed to satisfy and . In those cases where and are assumed, the values and are utilized.
4.3. Application results
Finally, applying our methodology to Chilean regions data selected for the experiments defined, we obtained the parameter estimations (Table ), the fits plotted in Figures S4 and S5 and Table which we constructed following the same procedure as for Tables and . We observe the 95% confidence intervals obtained by the 250 replicates of the bootstrapping process and the MSE (see Figures S6 to S14 in Supplemental Material). According to the coloured table, the parameters β and (computed) are usually identifiable, except for Region 10, which has a quarantine period in the first 30 days. Quarantine leads some points to fit, so perhaps our model is unsuccessful in this case, and the estimates generated are non-identifiable. As expected, given the summary of results in Table , Region 8 has a short quarantine period in the middle of the final phase, and it turns out difficult to fit data from this region during 120 days to estimate the sets of identifiable parameters. On the other hand, for Region 1, with a prolonged quarantine period in the final phase, conceivably, the initial conditions incorporate more noise in the estimations because it is difficult to estimate the parameters in Experiment 2.
In short of these two last regions, we can see that the duration of quarantine and the phase analyzed can be decisive when estimating parameters; these examples suggest that we need to explore more combinations of quarantine periods in our COVID-19 model. In addition, the parameters estimated from experiments applied to Regions 3 and 4 are identifiable, as expected, given the structural and practical identifiability demonstrated for our model using our methodology. In the same way, we verify the identifiability obtained for Experiments 1 and 2 applied to Region 8, and Region 1 has identifiable parameters only when we use Experiment 2. However, their confidence intervals are not distant when we apply Experiment 1. The noise from initial conditions is possibly more prevalent in Region 1 than Region 8.
5. Discussion and conclusions
We present and apply a computational approach developed by Roosa and Chowell [Citation51] for simple epidemic models. However, we propose adapting it to a complex compartmental model inspired by the first wave of the COVID-19 outbreak in Chile, where quarantines were declared involving different Chilean regions in early phases. Specifically, we propose a methodology to analyze the structural and practical parameter identifiability for a COVID-19 model. We explore the parameter uncertainty computationally through a parametric bootstrapping approach to achieve that goal. In that exploration, we involve synthetic data generated with the same COVID-19 model to measure the capability of our model to recover the parameters assumed to satisfy the identifiability structural and practical. Those criteria involve observing the confidence intervals and the MSE (mean square error), constructed from empirical distributions resulting from the bootstrapping process. In this first attempt, we selected some experiments that would allow us to validate the structural and practical parameter identifiability. We consider different variables, for example, the number of parameters to fit, the time to fit, and the quarantine periods. Then, in our methodology, we construct coloured tables (e.g. Table ) to visualize these qualities. Overall in the results obtained, we observe that the parameter β, and in consequence , are identifiable (when the rest of the parameters are known). The following best parameter sets in terms of structural identifiability are the sets without initial conditions to estimate, especially in the case without quarantines involved. Wherein the cases with quarantine are more sensitive as a percentage of the population in quarantine increases and the epidemic growth is , evidence of a lack of identifiability in the model when the quarantine dynamics are considered. Then, we would take quarantine periods studied in the models with caution. Expanding the analysis to the initial phases, we find a significant loss of parameter identifiability. It compared with the structural results. Possibly verifying the conclusion shown in [Citation52] indicates the non-identifiability in pre-peak moments. However, we are interested in analyzing the initial period to recover the control reproduction number. Therefore, we consider applying our computational approach to various initial epidemic periods to evaluate our COVID-19 model. In our experiments, we confirm that the parameter β is recovered, and in the cases when the initial conditions are also estimated and fit time is days (see Tables S2 and S3 in the supplemental material) identifiability is guaranteed. Showing a characteristic dependence between the fit times and the growth rates, besides quarantine periods, which also affect identifiability, but maintain the parameter β identifiable.
Finally, collecting the conclusions from the experiment with synthetic data, we select the parameter sets (β), and () for periods without quarantine, and (), and () for periods with quarantine, to fit with Chilean data for some regions. The model parameters are not identifiable in the Chilean regions with a quarantine period in the initial phase. On the other hand, regions without quarantine periods have better results. In particular, when more data () is fit to estimate initial conditions is guaranteed practical identifiability. (see Table ). For future work, we see the necessity of selecting the fit times for each Chilean region because each one has a particular quarantine declared; besides, the quality data also is variable. The study developed by Freire-Flores et al. in [Citation21] also analyzes the quarantine periods in some Chilean regions using computations of reproduction numbers on different periods characterized by extraordinary government measures. Then we can use these times of fitting with our approach and compare with these results, where, in particular, the results for the parameter β in Region 8 are close.
Our findings further underscore the importance of investigating the identifiability of model parameters through structural and practical identifiability analysis to interpret the results better and avoid misleading model inferences during an epidemic outbreak [Citation22, Citation51, Citation56]. On the one hand, structural identifiability analysis should be conducted before the model is applied to real data since it can help guide the modeller about any parameters in the model that are not identifiable even in the scenario that the entire epidemic trajectory has been observed. Parameters deemed not structurally identifiable should be flagged by the modeller and determine whether the model can still be used to infer other quantities of interest. Analyzing a parameter's practical identifiability is unnecessary if it is not structurally identifiable. On the other hand, our results indicate that modellers should be especially careful when calibrating complex epidemic models using limited data of the early epidemic phases, as one or more parameters are often difficult to identify from limited data depending on the model's complexity level. Parameter identifiability occurs more frequently in cases involving more complex epidemic dynamics where parameters change over time due to the implementation and relaxation of social distancing interventions. Unsurprisingly, a parameter is unidentifiable when its function is to model dynamics for an event (e.g. quarantine measures) that has not occurred. As more data on an unfolding epidemic becomes available, model parameters tend to become identifiable, especially when the epidemic has already reached its peak. This highlights the challenges modellers face when using models to generate real-time epidemic predictions as the epidemic evolves.
Supplemental Material
Download PDF (5.9 MB)Disclosure statement
No potential conflict of interest was reported by the author(s).
Availability of data and materials
The data were recorded by the Chilean Government [Citation24], which can be accessed via a GitHub repository supported by Ministery of Science, Technology, Knowledge, and Innovation of Chile [Citation41].
Additional information
Funding
References
- A. Adiga, D. Dubhashi, B. Lewis, M. Marathe, S. Venkatramanan, and A. Vullikanti, Mathematical models for COVID-19 pandemic: A comparative analysis, J. Indian Inst. Sci. 100 (2020), pp. 793–807.
- A. Afzal, C. Saleel, S. Bhattacharyya, N. Satish, O.D. Samuel, and I.A. Badruddin, Merits and limitations of mathematical modeling and computational simulations in mitigation of COVID-19 pandemic: A comprehensive review, Arch. Comput. Meth. Eng. 29 (2022), pp. 1311–1337.
- R.M. Anderson and R.M. May, Infectious Diseases of Humans: Dynamics and Control, Oxford University Press, 1991.
- M. Biggerstaff, R. Slayton, M. Johansson, and J. Butler, Improving pandemic response: Employing mathematical modeling to confront xoronavirus disease 2019, Clin. Infect. Dis. 74 (2022), pp. 913–917.
- F. Brauer and C. Castillo-Chavez, Mathematical Models in Population Biology and Epidemiology, Springer, New York, 2012.
- J.H. Buckner, G. Chowell, and M.R. Springborn, Dynamic prioritization of COVID-19 vaccines when social distancing is limited for essential workers, Proc. Nat. Acad. Sci. 118(16) (2021), p. e2025786118.
- R. Bürger, G. Chowell, and L.Y. Lara-Díaz, Comparative analysis of phenomenological growth models applied to epidemic outbreaks, Math. Biosci. Eng. 16 (2019), pp. 4250–4273.
- R. Bürger, G. Chowell, and L.Y. Lara-Díaz, Measuring differences between phenomenological growth models applied to epidemiology, Math. Biosci. 334 (2021), p. 108558. https://doi.org/10.1016/j.mbs.2021.108558
- T. Burki, COVID-19 in Latin America, Lancet Infect. Dis. 20(5) (2020), pp. 547–548. https://doi.org/10.1016/s1473-3099(20)30303-0
- Census Chile, Census Chile (2017). Accessed February 7, 2021. Available at http://www.censo2017.cl/descargas/home/sintesis-de-resultados-censo2017.pdf.
- J.W. Chan, S. Yuan, K. Kok, K.W. To, H. Chu, J. Yang, F. Xing, J. Liu, C.Y. Yip, R.W.S. Poon, H.W. Tsoi, S.K.F. Lo, K.H. Chan, V.K.M. Poon, W.M. Chan, J.D. Ip, J.P. Cai, V.C.C. Cheng, H. Chen, C.K.M. Hui, and K.Y. Yuen, A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: A study of a family cluster, Lancet 395 (2020), pp. 514–523.
- J. Chapman and N. Evans, The structural identifiability of susceptible–infective–recovered type epidemic models with incomplete immunity and birth targeted vaccination, Biomed. Signal Process. Control 4 (2009), pp. 278–284.
- O.T. Chis, J.R. Banga, and E. Balsa-Canto, Structural identifiability of systems biology models: A critical comparison of methods, PLoS One 6(11) (2011), p. e27755.
- G. Chowell, Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecast, Infect. Disease Model. 2 (2017), pp. 379–398.
- G. Chowell, D. Chowell, K. Roosa, R. Dhillon, and S. Devabhaktuni, Sustainable social distancing through facemask use and testing during the COVID-19 pandemic, preprint (2020). Available at medRXiv.
- G. Chowell, S. Dahal, Y. Liyanage, A. Tariq, and N. Tuncer, Structural identifiability analysis of epidemic models based on differential equations: A primer, preprint (2022). Available at arXiv.
- G. Chowell, P. Fenimore, M. Castillo-Garsow, and C. Castillo-Chavez, SARS outbreaks in Ontario, Hong Kong and Singapore: The role of diagnosis and isolation as a control mechanism, J. Theor. Biol. 224 (2003), pp. 1–8.
- O. Diekmann, H. Heesterbeek, and T. Britton, Mathematical Tools for Understanding Infectious Disease Dynamics, Vol. 7. Princeton University Press, 2013.
- N. Evans, L. White, M. Chapman, K. Godfrey, and M. Chappell, The structural identifiability of the susceptible infected recovered model with seasonal forcing, Math. Biosci. 194 (2005), pp. 175–197.
- S. Flaxman, S. Mishra, A. Gandy, H.J.T. Unwin, T.A. Mellan, H. Coupland,E Eeeetal, H. Zhu, T. Berah, J.W. Eaton, M. Monod, A.C. Ghani, C.A. Donnelly, S. Riley, M.A.C. Vollmer, N.M. Ferguson, L.C. Okell, S. Bhatt, and Imperial College COVID-19 Response Team, Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe, Nature 194 (2005), pp. 175–197.
- D. Freire-Flores, N. Llanovarced-Kawles, A. Sanchez-Daza, and Á. Olivera-Nappa, On the heterogeneous spread of COVID-19 in Chile, Chaos Solit. Fract. 150 (2021), p. 111156. Available at https://www.sciencedirect.com/science/article/pii/S0960077921005105.
- L. Gallo, M. Frasca, V. Latora, and G. Russo, Lack of practical identifiability may hamper reliable predictions in COVID-19 epidemic models, Sci. Adv. 8 (2022), p. eabg5234. https://doi.org/10.1126/sciadv.abg5234.
- P.J. Garcia, A. Alarcón, A. Bayer, P. Buss, G. Guerra, H. Ribeiro, K. Rojas, R. Saenz, N. Salgado de Snyder, G. Solimano, R. Torres, S. Tobar, R. Tuesca, G. Vargas, and R. Atun, COVID-19 response in Latin America, Am. J. Trop. Med. Hyg. 103 (2020), pp. 1765–1772.
- Government of Chile, COVID-19 data. Accessed February 16, 2021. Official COVID-19. Available at https://www.gob.cl/coronavirus/cifrasoficiales/#reportes.
- H. Hong, A. Ovchinnikov, G. Pogudin, and C. Yap, SIAN: Software for structural identifiability analysis of ODE models, Bioinformatics 35 (2019), pp. 2873–2874. https://doi.org/10.1093/bioinformatics/bty1069.
- J. Jia, J. Ding, S. Liu, G. Liao, J. Li, B. Duan, G. Wang, and R. Zhang, Modeling the control of COVID-19: Impact of policy interventions and meteorological factors (2020).
- Y.H. Kao and M.C. Eisenberg, Practical unidentifiability of a simple vector-borne disease model: Implications for parameter estimation and intervention assessment, Epidemics 25 (2018), pp. 89–100.Available at https://www.sciencedirect.com/science/article/pii/S1755436517301627.
- W. Kermack and A. McKendrick, A contribution to the mathematical theory of epidemics, Proc. Roy. Soc. A 115 (1927), pp. 700–721.
- E. Kharazmi, M. Cai, X. Zheng, Z. Zhang, G. Lin, and G.E. Karniadakis, Identifiability and predictability of integer- and fractional-order epidemiological models using physics-informed neural networks, Nature Comput. Sci. 1 (2021), pp. 744–753.
- T. Kirby, South America prepares for the impact of COVID-19, Lancet Respir. Med. 8(6) (2020), pp. 551–552.
- H.J. Li, L. Wan, Z. Wang, C.X.Z. u, A. Moustakas, and S. Pei, Mathematical modelling of the pandemic of 2019 novel coronavirus (COVID-19): Patterns, dynamics, prediction, and control, Front. Phys. 9 (2021), p. 738602.
- Y. Li, H. Campbell, D. Kulkarni, A. Harpur, M. Nundy, X. Wang, and H. Nair, The temporal association of introducing and lifting non-pharmaceutical interventions with the time-varying reproduction number (R) of SARS-CoV-2: A modelling study across 131 countries, Lancet Infect. Dis. 21 (2021), pp. 193–202. Available at https://www.sciencedirect.com/science/article/pii/S1473309920307854.
- T.S. Ligon, F. Fröhlich, O.T. Chiş, J.R. Banga, E. Balsa-Canto, and J. Hasenauer, GenSSI 2.0: Multi-experiment structural identifiability analysis of SBML models, Bioinformatics 34 (2017), pp. 1421–1423. https://doi.org/10.1093/bioinformatics/btx735.
- N. Linton, T. Kobayashi, Y. Yang, K. Hayashi, A. Akhmetzhanov, S.M. Jung, B. Yuan, R. Kinoshita, and H. Nishiura, Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data, J. Clin. Med. 9 (2020), p. 538. https://doi.org/10.3390/jcm9020538.
- M. Martcheva, An Introduction to Mathematical Epidemiology, Springer, New York, 2015.
- T. McKinley, A.R. Cook, and R. Deardon, Inference in epidemic models without likelihoods, Int. J. Biostat. 5(1) 2009. https://doi.org/10.2202/1557-4679.1171.
- N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21 (2004), pp. 1087–1092. https://doi.org/10.1063/1.1699114.
- H. Miao, C. Dykes, L.M. Demeter, J. Cavenaugh, S.Y. Park, A.S. Perelson, and H. Wu, Modeling and estimation of kinetic parameters and replicative fitness of HIV-1 from flow-cytometry-based growth competition experiments, Bull. Math. Biol. 70 (2008), pp. 1749–1771.
- H. Miao, X. Xia, A.S. Perelson, and H. Wu, On identifiability of nonlinear ODE models and applications in viral dynamics, SIAM Rev. 53 (2011), pp. 3–39.
- Ministerio de Salud, Chile, Ministerio de Salud, Chile. Accessed July 29, 2021. Available at http://www.minsal.cl.
- Ministery of Science, Technology, Knowledge and Innovation of Chile, Website of the official database for COVID-19 research. Accessed February 16, 2021. Available at http://www.minciencia.gob.cl/covid19.
- K. Mizumoto, K. Kagaya, A. Zarebski, and G. Chowell, Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020, Euro Surveill. 25(10) (2020). https://doi.org/10.2807/1560-7917.es.2020.25.10.2000180.
- A. Morciglio, B. Zhang, G. Chowell, J.M. Hyman, and Y. Jiang, Mask-ematics: Modeling the effects of masks in COVID-19 transmission in high-risk environments, Epidemiologia 2 (2021), pp. 207–226. https://doi.org/10.3390/epidemiologia2020016.
- V.K. Murty and J. Wu, Mathematics of Public Health, Springer, Cham, 2022.
- J.C. Navarro, J. Arrivillaga-Henríquez, J. Salazar-Loor, and A.J. Rodriguez-Morales, COVID-19 and dengue, co-epidemics in Ecuador and other countries in Latin America: Pushing strained health care systems over the edge, Travel Med. Infect. Dis. 37 (2020), p. 101656. Available at https://www.sciencedirect.com/science/article/pii/S1477893920301241.
- H. Nishiura, T. Kobayashi, T. Miyama, A. Suzuki, S.M. Jung, K. Hayashi, R. Kinoshita, Y. Yang, B. Yuan, A.R. Akhmetzhanov, and N.M. Linton, Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19), Int. J. Infect. Dis 94 (2020), pp. 154–155.
- R. Padmanabhan, H.S. Abed, N. Meskin, T. Khattab, M. Shraim, and M.A. Al-Hitmi, A review of mathematical model-based scenario analysis and interventions for COVID-19, Comput. Meth. Programs Biomed. 209 (2021), p. 106301. Available at https://www.sciencedirect.com/science/article/pii/S0169260721003758.
- J. Paireau, A. Andronico, N. Hozé, M. Layan, P. Crépey, A. Roumagnac, M. Lavielle, P.Y. Boëlle, and S. Cauchemez, An ensemble model based on early predictors to forecast COVID-19 health care demand in France, Proc. Nat. Acad. Sci. 119 (2022), p. e2103302119. https://doi.org/10.1073/pnas.2103302119.
- L. Peng, W. Yang, D. Zhang, C. Zhuge, and L. Hong, Epidemic analysis of COVID-19 in China by dynamical modeling, (2020). Available at medRxiv https://www.medrxiv.org/content/early/2020/02/18/2020.02.16.20023465.
- M. Renardy, D. Kirschner, and M. Eisenberg, Structural identifiability analysis of age-structured pde epidemic models, J. Math. Biol. 84 (2022), pp. 1–30.
- K. Roosa and G. Chowell, Assessing parameter identifiability in compartmental dynamic models using a computational approach: application to infectious disease transmission models, Theor. Biol. Med. Model. 16(1) (2019). https://doi.org/10.1186/s12976-018-0097-6.
- T. Sauer, T. Berry, D. Ebeigbe, M.M. Norton, A.J. Whalen, and S.J. Schiff, Identifiability of infection model parameters early in an epidemic, SIAM J. Control Optim. 60(2) (2022), pp. S27–S48.
- S. Shankar, S.S. Mohakuda, A. Kumar, P. Nazneen, A.K. Yadav, K. Chatterjee, and K. Chatterjee, Systematic review of predictive mathematical models of COVID-19 epidemic, Medical J. Armed Forc. India 77 (2021), pp. S385–S392. Special issue: COVID-19 – navigating through challenges. Available at https://www.sciencedirect.com/science/article/pii/S0377123721001258.
- A. Tariq, T. Chakhaia, S. Dahal, A. Ewing, X. Hua, S.K. Ofori, O. Prince, A.D. Salindri, A.E. Adeniyi, J.M. Banda, P. Skums, R. Luo, L.Y. Lara-Díaz, R. Bürger, I.C.H. Fung, E. Shim, A. Kirpich, A. Srivastava, and G. Chowell, An investigation of spatial-temporal patterns and predictions of the coronavirus 2019 pandemic in colombia, 2020–2021, PLOS Negl. Trop. Dis. 16 (2022), p. e0010228. https://doi.org/10.1371/journal.pntd.0010228.
- E.C.C.W. T.N.C.P.E.R.E. Team, The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) −china (2020).
- N. Tuncer and T.T. Le, Structural and practical identifiability analysis of outbreak models, Math. Biosci. 299 (2018), pp. 1–18. Available at https://www.sciencedirect.com/science/article/pii/S0025556417303164.
- N. Tuncer, A. Timsina, M. Nuno, G. Chowell, and M. Martcheva, Parameter identifiability and optimal control of an SARS-CoV-2 model early in the pandemic, J. Biol. Dyn. 16 (2022), pp. 412–438.
- E.A. Undurraga, G. Chowell, and K. Mizumoto, COVID-19 case fatality risk by age and gender in a high testing setting in Latin America: Chile, March–August 2020, Infect. Dis. Pov. 10(1) (2021).
- P. van den Driessche and J. Watmough, Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission, Math. Biosci. 180 (2002), pp. 29–48. Available at https://www.sciencedirect.com/science/article/pii/S0025556402001086.
- E. Vynnycky, and R.E. White, An Introduction to Infectious Disease Modelling. OUP Oxford, 2010.
- P.G.T. Walker, C. Whittaker, O.J. Watson, M. Baguelin, P. Winskill, A. Hamlet, B.A. Djafaara, Z. Cucunubá, D.O. Mesa, W. Green, H. Thompson, S. Nayagam, K.E.C. Ainslie, S. Bhatia, S. Bhatt, A. Boonyasiri, O. Boyd, N.F. Brazeau, L. Cattarino, G. Cuomo-Dannenburg, A. Dighe, C.A. Donnelly, I. Dorigatti, S.L. van Elsland, R. FitzJohn, H. Fu, K.A.M. Gaythorpe, L. Geidelberg, N. Grassly, D. Haw, S. Hayes, W. Hinsley, N. Imai, D. Jorgensen, E. Knock, D. Laydon, S. Mishra, G. Nedjati-Gilani, L.C. Okell, H.J. Unwin, R. Verity, M. Vollmer, C.E. Walters, H. Wang, Y. Wang, X. Xi, D.G. Lalloo, N.M. Ferguson, and A.C. Ghani, The impact of COVID-19 and strategies for mitigation and suppression in low- and middle-income countries, Science 369 (2020), pp. 413–422. https://doi.org/10.1126/science.abc0035.
- J. Wang, Mathematical models for COVID-19: applications, limitations, and potentials, J. Publ. Health Emerg. 4 (2020), pp. 9–9, https://doi.org/10.21037/jphe-2020-05.
- Wikipedia Commons, Map of Chile. Accessed February 7, 2021. Available at https://commons.wikimedia.org/wiki/File:Mapa-chile.svg.
- World Health Organization, WHO Director-General's opening remarks at the media briefing on COVID-19-11. March 2020 World Health Organization 2020 [April 23]. Available at https://bit.ly/2A8aCIO.
- H. Wu, H. Zhu, H. Miao, and A. Perelson, Parameter identifiability and estimation of HIV/AIDS dynamic models, Bull. Math. Biol. 70 (2008), pp. 785–799.
- P. Yan and G. Chowell, Quantitative Methods for Investigating Infectious Disease Outbreaks, Texts in Applied Mathematics, Springer, Cham, 2019.
- C. You, Y. Deng, W. Hu, J. Sun, Q. Lin, F. Zhou, C.H. Pang, Y. Zhang, Z. Chen, and X.H. Zhou, Estimation of the time-varying reproduction number of COVID-19 outbreak in China, Int. J. Hyg. Environ. Health 228 (2020), p. 113555. Available at https://www.sciencedirect.com/science/article/pii/S1438463920302133.
- C. Zhan, Y. Zheng, T.H.Z. ai, and B. Li, Identifying epidemic spreading dynamics of COVID-19 by pseudocoevolutionary simulated annealing optimizers, Neural Comput. Applic. 33 (2021), pp. 4915–4928.
- S. Zhang, J. Ponce, Z. Zhang, G. Lin, and G. Karniadakis, An integrated framework for building trustworthy data-driven epidemiological models: Application to the COVID-19 outbreak in New York City, PLOS Comput. Biol. 17 (2021), pp. 1–29. https://doi.org/10.1371/journal.pcbi.1009334.
- Z. Zhao, X. Li, F. Liu, G. Zhu, C. Ma, and L. Wang, Prediction of the COVID-19 spread in African countries and implications for prevention and control: A case study in South Africa, Egypt, Algeria, Nigeria, Senegal and Kenya, Sci. Tot. Environ. 729 (2020), p. 138959. Available at https://www.sciencedirect.com/science/article/pii/S0048969720324761.
- C. Zhou, Evaluating new evidence in the early dynamics of the novel coronavirus COVID-19 outbreak in Wuhan, China with real time domestic traffic and potential asymptomatic transmissions, (2020). Available at medRxiv https://www.medrxiv.org/content/early/2020/02/20/2020.02.15.20023440.
- L. Zou, F. Ruan, M. Huang, L. Liang, H. Huang, Z. Hong, J. Yu, M. Kang, Y. Song, J. Xia, Q. Guo, T. Song, J. He, H.L. Yen, M. Peiris, and J. Wu, SARS-CoV-2 viral load in upper respiratory specimens of infected patients, New Engl. J. Med. 382 (2020), pp. 1177–1179. PMID: 32074444. https://doi.org/10.1056/NEJMc2001737.