Full article: A computational approach to identifiability analysis for a model of the propagation and control of COVID-19 in Chile

Formulae display: $MathJax Logo$ ?Mathematical formulae have been encoded as MathML and are displayed in this HTML version using MathJax in order to improve their display. Uncheck the box to turn MathJax off. This feature requires Javascript. Click on a formula to zoom.

Abstract

A computational approach is adapted to analyze the parameter identifiability of a compartmental model. The model is intended to describe the progression of the COVID-19 pandemic in Chile during the initial phase in early 2020 when government declared quarantine measures. The computational approach to analyze the structural and practical identifiability is applied in two parts, one for synthetic data and another for some Chilean regional data. The first part defines the identifiable parameter sets when these recover the true parameters used to create the synthetic data. The second part compares the results derived from synthetic data, estimating the identifiable parameter sets from regional Chilean epidemic data. Experiments provide evidence of the loss of identifiability if some initial conditions are estimated, the period of time used to fit is before the peak, and if a significant proportion of the population is involved in quarantine periods.

Keywords:

1. Introduction

1.1. Scope

The COVID-19 pandemic has received overwhelming news coverage worldwide. We recall that the coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was declared a global pandemic by the World Health Organization (WHO) on March 11, 2020 [Citation11, Citation64]. This highly contagious unprecedented virus has impacted government and public institutions, strained the health care systems, restricted people in their homes, and caused country-wide lockdowns resulting in a global economic crisis [Citation20, Citation32, Citation61]. The impact of COVID-19 at the time of revision of this work (July 5, 2023) amounts to roughly 767 million cases moreover, 6.95 million deaths worldwide, of which 2.96 million have been reported in the Americas.

The morbidity, mortality, and economic indicators associated with the COVID-19 pandemic point to a devastating picture for Latin American countries. High poverty rates, poor leadership, high prevalence of underlying health conditions such as obesity and diabetes, and frail healthcare systems have exacerbated the impact of the novel coronavirus in this region [Citation9, Citation23, Citation30, Citation45]. In Chile, the Ministry of Health reported the first COVID-19 case on March 3, 2020, becoming the fifth country in Latin America after Brazil, Mexico, Ecuador and Argentina to report COVID-19 cases. Soon after, the Chilean government put in place a number of interventions including the closure of all daycares, schools, and universities (March 16), border controls, telework recommendations (March 18), closure of non-essential businesses (March 19), national night curfews (March 22), as well as targeted lockdowns at the level of municipalities since March 26, 2020. As of July 6, 2023, Chile has accumulated 5288437 cases and 61586 deaths [Citation40]. It is also worth noting that Chile has tested at a higher rate than any other Latin American country [Citation58]. Fortunately, Chile's mass COVID-19 vaccination campaign to immunize about 80% of the total population started in February 2021, and more than 90% of that target population has been fully vaccinated as of July 6, 2023.

One way to understand and explain these epidemic phenomena is by applying mathematical models and surveillance data, and the COVID-19 outbreak has been studied with different focuses, differing in their uses, their mathematical form, and objectives [Citation1, Citation6, Citation31, Citation43, Citation44, Citation47]. Since the first case reported of the COVID-19 disease, and the multiple models that have surged to respond to the emergency and its dynamics [Citation53, Citation54], we can understand their contributions, as well as their limitations [Citation2, Citation4, Citation62]. The mathematical models utilized to confront the coronavirus 2019 disease improve situational awareness, assess epidemiological characteristics, and inform the base of evidence for prevention strategies. Specifically, for models derived from the famous SEIR model, one must guarantee that information recovered by them is highly accurate and that one confidently understands the biological mechanisms to lay the ground for developing optimal strategies for public health. The successful application of such epidemic models depends upon our ability to estimate key transmission parameters. These estimates are subject to two principal sources of uncertainty: the noise of the data, and the assumptions built into the model. Both difficulties give rise to the problem of non-identifiability of parameters. That is, the information recovered by applying the model to available datasets is not representative of the dynamics that we desire to interpret. This problem has been studied intensively and there are various approaches to handle non-identifiability. For instance, in [Citation16, Citation22, Citation29, Citation48, Citation57, Citation69] guidelines and criteria are proposed to avoid such identifiability difficulties. Ignoring these factors can result in misleading inferences with potentially incorrect public health policy decisions [Citation12, Citation14, Citation51].

We are inspired by the dynamics of COVID-19 that occurred in Chile in the initial phase of 2020 to explore these factors in estimating parameters. Then, we propose a compartmental model with nonlinear differential equations for the Chilean dynamics, and we use it to distinguish the noise from the data and the structure. With this in mind, we developed an identifiability study employing a computational approach such as that described in [Citation51]. The identifiability analysis, as explained in [Citation50], determines how well the parameters of a model can be estimated from experimental data, so it is crucial for interpreting and determining confidence in model parameter values. Structural identifiability focuses on a mathematical model and the practical identifiability on data. Hence a model with structurally identifiable parameters may still, be nonidentifiable in practice due to a lack of information gleaned from real data. Thus, it is essential to understand both types of identifiability when applying an epidemic model to a specific situation [Citation27, Citation39, Citation51]. Therefore, we applied a computational approach to investigate the identifiability of the parameters from the compartmental model mentioned. This idea motivates an alternative parameter identifiability analysis to a complex compartment model, where other techniques can be rigorous. Specifically, we employ synthetic data from our model using various parameter sets. This parameter estimation is carried out with a defined optimization process with the simulated annealing method. Fitting all data points generated in the simulated epidemic curve, we aim to recover the parameters assumed to create synthetic data. If this approach is successful, we may confirm structural identifiability, which we aim to validate by adding a measurement error structure through a bootstrapping process [Citation14, Citation51]. For the case of practical identifiability, we only use a limited number of data points of the epidemic during the initial phase (e.g. pre-peak). We employed our sets of identifiable parameters to fit regional Chilean data to compare the conclusions from the previous analysis based on synthetic data.

1.2. Related work

We first recall that introductions to compartmental epidemiological models are provided in [Citation3, Citation5, Citation18, Citation35, Citation60, Citation66]. This class of models goes back to the well-known work by Kermack and McKendrick [Citation28]. Within a compartmental model the population is subdivided into at least two epidemiological compartments (e.g. susceptible and infected). The rates of progression between the compartments, as well as the incidence and possibly birth and death of individuals, need to be specified, which leads to a dynamical system, that is a system of coupled, nonlinear ordinary differential equations (ODEs) that describe the progression of the epidemic.

In many cases, the compartmental approach formulated as a dynamical system provides indications of the basic reproduction number $R_{0}$ , where this parameter plays the role of a threshold value for the system's dynamics. This role of $R_{0}$ as a ‘threshold value’ differs from the original epidemiological significance, namely that $R_{0}$ gives the number of secondary cases one infectious individual will produce in a population consisting only of susceptible individuals [Citation35, p. 21].

The conclusions obtained by a compartmental model are derived from parameter estimations and allow describing the dynamics present in a data set. Then, the identifiability analysis is crucial to guaranteeing that the parameter estimates are adequate. Specifically, the identifiability analysis addresses whether the parameters of a compartmental model can be uniquely identified. There are different strategies for the identifiability, for example, for models expressed by linear ODEs [Citation38, Citation65], nonlinear ODEs [Citation39], or partial differential equations [Citation50]. The type of data to fit also plays an essential role. For example, one needs to select between incidence or prevalence curves [Citation12, Citation19, Citation56]. In the present theory two types of identifiability are considered: structural and practical. The first involves the model and its structure, and the study can be centred on theoretical or computational techniques (e.g. [Citation13, Citation16, Citation19, Citation25, Citation33, Citation39, Citation50, Citation57]), and the second involves the data (e.g. [Citation22, Citation27, Citation29, Citation56, Citation57]). The analysis is separate because a model can be structurally identifiable, but real-world data employed to fit the model results may lead to unidentifiable parameters for reasons related to the characteristics of the data available.

We extend the computational approach developed by Rossa and Chowell [Citation51] to assess parameter identifiability in a compartmental model defined by nonlinear ODEs. Then, the computational approach that combines the optimization process determined by the simulated annealing method and a bootstrap process allows us to assess the identifiability of the corresponding parameters. In particular, structural identifiability is assessed by employing synthetic data with known parameters and the practical one with the same synthetic data, but the model is fitted only to the initial phases of the data curve (e.g. pre-peak periods). In both computations we expect to recover the set of assumed parameters with small uncertainty and error. These properties ensure that the model is indeed associated with identifiable parameters. Finally, we apply the identifiability analysis to various sets of parameters to estimate. We vary their elements in increasing numbers and measure the ability of the model to estimate the parameter sets in each case by fitting the data curves generated with assumed parameters. This procedure can be understood as a sensitivity analysis.

In the application to Chilean data, we consider the parameter sets identifiable from our computational approach (with synthetic data) and fit some regional Chilean data. We hope to confirm the conclusions from the identifiability study with our application to the Chilean case.

In our study, a parameter set is identifiable if it satisfies some criteria defined in Section 2.5, which are an adaptation from the definition presented in [Citation51], considering analyzing the confidence intervals and the mean square error (MSE).

2. Methods

2.1. Compartmental model

We apply our methodology to a complex compartmental transmission model inspired by the strategies implemented by the Chilean government to mitigate the COVID-19 emergency in the initial phase of the outbreak. For this purpose, we consider a single population and adopt a simplified version of the model described in [Citation15], combined with the description of individuals in quarantine proposed in [Citation26]. Individuals within the model are classified as susceptible (S), in home quarantine (Q), latent ( $E_{1}$ ), partially infectious but not yet symptomatic ( $E_{2}$ ), asymptomatic (A), infectious and will not be tested ( $I_{n}$ ), infectious and will be tested and isolated ( $I_{s}$ ), hospitalized/isolated infectious (J), recovered (R), and deceased (D). Constant population size is assumed, i.e. (1) $N := S + Q + E_{1} + E_{2} + A + I_{n} + I_{s} + J + R + D = c o n s t .$ (1) Note that we stipulate one single class of asymptomatic individuals. At the same time, in [Citation15] a distinction is made between asymptomatic individuals who will not be tested and those who are asymptomatic and will be tested and isolated.

It is assumed that home-quarantined individuals do not have contact with infected individuals, due to severe travel restrictions and rigorous supervision by their local communities, do not have contact with infected individuals. The parameters p and $1 / λ$ represent the percentage of individuals in quarantine and the quarantine duration, respectively. Therefore, class Q removes susceptible individuals from the infection dynamics when $p \neq 0$ or $λ \neq 0$ , and there is no quarantine when $p = λ = 0$ .

Five classes can contribute to new infections, namely $E_{2}$ , A, $I_{n}$ , $I_{s}$ , and J. Each class has a relative transmission linked with the parameters $q_{e}$ , $q_{a}$ , q and h. These parameters are the relative transmissibilities from partially infectious population $E_{2}$ , asymptomatic population A, asymptomatic/infected population that wear PPE, and the infected population isolated or in hospitals J. Consequently, the product hJ models the contribution of hospitalized individuals to transmission.

If we denote by $r_{X \to Y}$ the rate at which individuals move from class X to class Y, then susceptible individuals move to $E_{1}$ at the rate $r_{S \to E_{1}} = \frac{β}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J),$ where β denotes the overall transmission rate. Individuals in $E_{1}$ progress to $E_{2}$ at rate $κ_{1}$ . Individuals from $E_{2}$ are partially infectious, with relative transmissibility $q_{e}$ , and progress at rate $κ_{2}$ , where a proportion $ρ_{a}$ become asymptomatic and partially infectious (relative transmissibility $q_{a}$ ), and $1 - ρ_{a}$ become fully infectious. Among the proportion $1 - ρ_{a}$ that become fully infectious, $ρ_{s}$ will be tested, while $1 - ρ_{s}$ will be undetected. Asymptomatic individuals who are not tested and symptomatic individuals wear personal protective equipment (PPE) (such as wearing masks in public, handwashing, etc.) and thus have relative transmissibility q, which quantifies the effectiveness of those protective behaviours. Individuals within classes A and $I_{n}$ (who are not tested) recover at rate $γ_{1}$ . Those tested (class $I_{s}$ ) will progress to the hospitalized and isolated class at diagnosis rate α. Relative transmission within hospitals and isolated places may occur, but our analysis assumes perfect isolation. Hospitalized and isolated indiividuals progress to the recovered class at rate $γ_{2}$ or to the deceased class at rate δ. Therefore, the model is defined by the following system of non-linear ODEs for a single population, where all variables in capital letters are functions of time t, i.e. it is understood that $S = S (t)$ , $E_{1} = E_{1} (t)$ and so on, and a prime denotes the derivative, that is $S^{'} \equiv d S / d t$ , etc.: (2) $\begin{aligned} S^{'} & = - \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - p S + λ Q, \\ Q^{'} & = p S - λ Q, \\ E_{1}^{'} & = \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - κ_{1} E_{1}, \\ E_{2}^{'} & = κ_{1} E_{1} - κ_{2} E_{2}, \\ A^{'} & = κ_{2} ρ_{a} E_{2} - γ_{1} A, \\ I_{n}^{'} & = κ_{2} (1 - ρ_{a}) (1 - ρ_{s}) E_{2} - γ_{1} I_{n}, \\ I_{s}^{'} & = κ_{2} (1 - ρ_{a}) ρ_{s} E_{2} - α I_{s}, \\ J^{'} & = α I_{s} - (γ_{2} + δ) J, \\ R^{'} & = γ_{1} (A + I_{n}) + γ_{2} J, \\ D^{'} & = δ J, \\ C^{'} & = α I_{s} . \end{aligned}$ (2) The auxiliary variable C tracks the cumulative number of diagnosed/reported cases from the start of the outbreak, and $C^{'}$ represents the new diagnosed cases. The quantity C is not a state of the system of equations but simply a class to track the cumulative incidences cases; thus, individuals from population are not moving to class C. The initial values of these state variables are denoted by $S (0), Q (0), E_{1} (0), \dots, C (0)$ , where $S (0) = N - Q (0) - E_{1} (0) - E_{2} (0) - A (0) - I_{n} (0) - I_{s} (0) - J (0) - R (0) - D (0) .$ A schematic of the transmissions is provided in Figure .

Figure 1. Schematic of the transmission of COVID-19 for one population.

The basic, control, and effective control reproduction numbers $R_{0}$ , $R_{0}^{c}$ and $R_{e}^{c}$ quantify the epidemic impact of infectious disease in a population and the effectiveness of control measures. The basic reproduction number $R_{0}$ defines the average number of secondary cases generated by primary infectious individuals during the early transmission period when the population is completely susceptible and without control interventions. The control reproduction number $R_{0}^{c}$ defines the average number of new cases generated by infectious individuals when the population susceptible is with some control measure or intervention. The effective control reproduction number $R_{e}^{c}$ is the reproduction number that varies with time, and it is defined as partially susceptible population, where if $R_{e}^{c} > 1$ , the disease transmission continues, and if $R_{e}^{c} < 1$ the disease transmission declines. Therefore, while the basic reproduction number $R_{0}$ and control reproduction number $R_{0}^{c}$ captures the transmission potential of infectious disease during the early epidemic, the effective control reproduction number $R_{e}^{c}$ , captures changes in the transmission potential over time.

To calculate these numbers, we employ the well-known next-generation matrix approach [Citation59]. This yields the expression $R_{0}^{c} = ρ (F V^{- 1}),$ where ρ denotes the spectral radius and for the present model, the matrices $F$ and $V$ are given by $\begin{aligned} F & = [\begin{matrix} 0 & β q_{e} & β q_{a} q & β q & β q & β h \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{matrix}], \\ V & = [\begin{matrix} κ_{1} & 0 & 0 & 0 & 0 & 0 \\ - κ_{1} & κ_{2} & 0 & 0 & 0 & 0 \\ 0 & - κ_{2} ρ_{a} & γ_{1} & 0 & 0 & 0 \\ 0 & - κ_{2} (1 - ρ_{a}) (1 - ρ_{s}) & 0 & γ_{1} & 0 & 0 \\ 0 & - κ_{2} (1 - ρ_{a}) ρ_{s} & 0 & 0 & α & 0 \\ 0 & 0 & 0 & 0 & - α & γ_{2} + δ \end{matrix}] . \end{aligned}$ The result is (3) $R_{0}^{c} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} q ρ_{a}}{γ_{1}} + \frac{q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}} + \frac{q ρ_{s} (1 - ρ_{a})}{α} + \frac{h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}) .$ (3) Moreover, the corresponding effective reproduction number is defined as $R_{e}^{c} (t) = R_{0}^{c} S (t) / N,$ where $S (t) / N$ is the proportion of susceptible individuals in the population at time t, where the total population N is assumed to be constant (see (Equation1(1) $N := S + Q + E_{1} + E_{2} + A + I_{n} + I_{s} + J + R + D = c o n s t .$ (1) )). The explicit expression (Equation3(3) $R_{0}^{c} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} q ρ_{a}}{γ_{1}} + \frac{q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}} + \frac{q ρ_{s} (1 - ρ_{a})}{α} + \frac{h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}) .$ (3) ) implies the following contributions of the individual compartments: $\begin{aligned} R^{E_{2}} & = \frac{β q_{e}}{κ_{2}}, R^{A} = \frac{β q_{a} q ρ_{a}}{γ_{1}}, R^{I_{n}} = \frac{β q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}}, \\ R^{I_{s}} & = \frac{β q ρ_{s} (1 - ρ_{a})}{α}, R^{J} = \frac{β h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}, \end{aligned}$ where (4) $R_{0}^{c} = R^{E_{2}} + R^{A} + R^{I_{n}} + R^{I_{s}} + R^{J} .$ (4) This parameter $R_{0}^{c}$ is a key epidemic parameter to measure the transmissibility of a pathogen; by (Equation3(3) $R_{0}^{c} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} q ρ_{a}}{γ_{1}} + \frac{q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}} + \frac{q ρ_{s} (1 - ρ_{a})}{α} + \frac{h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}) .$ (3) ) we see that $R_{0}^{c}$ is a function of several parameters of the epidemic model (Equation2(2) $\begin{aligned} S^{'} & = - \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - p S + λ Q, \\ Q^{'} & = p S - λ Q, \\ E_{1}^{'} & = \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - κ_{1} E_{1}, \\ E_{2}^{'} & = κ_{1} E_{1} - κ_{2} E_{2}, \\ A^{'} & = κ_{2} ρ_{a} E_{2} - γ_{1} A, \\ I_{n}^{'} & = κ_{2} (1 - ρ_{a}) (1 - ρ_{s}) E_{2} - γ_{1} I_{n}, \\ I_{s}^{'} & = κ_{2} (1 - ρ_{a}) ρ_{s} E_{2} - α I_{s}, \\ J^{'} & = α I_{s} - (γ_{2} + δ) J, \\ R^{'} & = γ_{1} (A + I_{n}) + γ_{2} J, \\ D^{'} & = δ J, \\ C^{'} & = α I_{s} . \end{aligned}$ (2) ), including transmission rates and infectious periods of the epidemiological states that contribute to new contagions. In addition, we can observe that the basic reproduction number $R_{0}$ corresponds to the control reproduction number $R_{0}^{c}$ in the absence of interventions, i.e. when q = 1, h = 0, and $ρ_{s} = 0$ (where q is interpreted as the level of no protection and 1−q is the level of protection due to the use of PPE). These assumptions lead to the expression $R_{0} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} ρ_{a}}{γ_{1}} + \frac{(1 - ρ_{a})}{γ_{1}}) .$

2.2. Simulated data

Using our model proposed for COVID-19, we simulate various scenarios fixing their parameters and initial conditions. These curves will help analyze the ability of our model to capture such dynamics. Then a parameter estimation process is necessary together with a bootstrapping procedure to examine the identifiability of several of its parameters, as will be explained in the following. For this, we assume the values and references summarized in Table for the parameters, and for the initial conditions the values $\begin{aligned} J (0) & = C (0) = 1, Q (0) = D (0) = R (0) = 0, I_{n} (0) = 1, I_{s} (0) = 20, \\ E_{1} (0) & = 2 J (0), E_{2} (0) = 4 J (0) - E_{1} (0), A (0) = 4, \\ S (0) & = N - E_{1} (0) - E_{2} (0) - I_{n} (0) - I_{s} (0) - A (0) - J (0) - R (0) - D (0) \end{aligned}$ for a total population N = 500, 000. These values are used to simulate the following scenarios.

Table 1. Values of model parameters selected to generate simulated data.

Display Table

2.2.1. Scenario 1: initial phase for three values of $R_{0}^{c}$ without quarantine

We consider three values of $R_{0}^{c}$ that generate three values of β via (Equation3(3) $R_{0}^{c} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} q ρ_{a}}{γ_{1}} + \frac{q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}} + \frac{q ρ_{s} (1 - ρ_{a})}{α} + \frac{h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}) .$ (3) ). These values along with other parameters are summarized in Table . The corresponding incidence curves $t \mapsto C^{'} (t)$ are generated for $t \in [0, 200]$ , see Figure (a). These incidence curves will be our datasets for the identifiability study. Notice that the curve for $R_{0}^{c} = 3$ effectively grows more slowly than the others, and its maximum is smaller than the curves with $R_{0}^{c} = 4$ and $R_{0}^{c} = 5$ .

Figure 2. Incidence curves $C^{'} = C^{'} (t)$ for synthetic data based on the parameters of Table (a) of Scenario 1, (b) of Scenario 2 with p = 0.01, (c) of Scenario 2 with p = 0.05. Magenta, green, and blue curves correspond to $R_{0}^{c} = 3$ , 4, and 5, respectively.

Figure 2. Incidence curves C′=C′(t) for synthetic data based on the parameters of Table 1(a) of Scenario 1, (b) of Scenario 2 with p = 0.01, (c) of Scenario 2 with p = 0.05. Magenta, green, and blue curves correspond to R0c=3, 4, and 5, respectively.

2.2.2. Scenario 2: initial phase for three values of $R_{0}^{c}$ with quarantine.

For this scenario, we consider the same values for $R_{0}^{c}$ and corresponding values of β as in Scenario 1 since expression (Equation3(3) $R_{0}^{c} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} q ρ_{a}}{γ_{1}} + \frac{q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}} + \frac{q ρ_{s} (1 - ρ_{a})}{α} + \frac{h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}) .$ (3) ) does not depend on the parameters p or λ. The pairs (p = 0.01, $λ = 1 / 15$ ) and (p = 0.05, $λ = 1 / 15$ ), with the other parameters as given in Table , produce the datasets shown in Figure (b,c), respectively. Again, the curve with $R_{0}^{c} = 3$ grows more slowly than the others, and its maximum is smaller than the curves with $R_{0}^{c} = 4$ and $R_{0}^{c} = 5$ . Besides, the maxima of the curves with p = 0.05 are consistently smaller than those of the corresponding curves with p = 0.01, which evidences the effect of the quarantine modelled.

Using the bootstrapping approach, we intend to use these curves defined for each scenario to generate multiple samples from the best-fit model. Before doing this, we need to define the parameter estimation process to explain the bootstrap approach and finally the identifiability study.

2.3. Parameter estimation

To estimate the parameter values of our COVID-19 model, we need to use one optimization process to fit the model to each simulated dataset. Then, the method used for this purpose is outlined as follows, executed in Matlab (Mathworks, Inc).

2.3.1. Mixed simulated annealing method with a least-squares approach (SA-LSQ)

The SA-LSQ approach is defined as a parameter estimation process that combines the simulated annealing and least-squares optimization algorithms to minimize the Euclidean distance between the incidence curve $C^{'} (t, Θ)$ , defined by our COVID-19 model (Equation2(2) $\begin{aligned} S^{'} & = - \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - p S + λ Q, \\ Q^{'} & = p S - λ Q, \\ E_{1}^{'} & = \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - κ_{1} E_{1}, \\ E_{2}^{'} & = κ_{1} E_{1} - κ_{2} E_{2}, \\ A^{'} & = κ_{2} ρ_{a} E_{2} - γ_{1} A, \\ I_{n}^{'} & = κ_{2} (1 - ρ_{a}) (1 - ρ_{s}) E_{2} - γ_{1} I_{n}, \\ I_{s}^{'} & = κ_{2} (1 - ρ_{a}) ρ_{s} E_{2} - α I_{s}, \\ J^{'} & = α I_{s} - (γ_{2} + δ) J, \\ R^{'} & = γ_{1} (A + I_{n}) + γ_{2} J, \\ D^{'} & = δ J, \\ C^{'} & = α I_{s} . \end{aligned}$ (2) ), and the time series data ( ${d a t a}_{t}$ ) that corresponds to time series data with $t \in [0, T]$ , where T is the final time (date). Here Θ is the set of parameters to estimate of our COVID-19 model.

In our approach, we obtain the set of estimated parameters $\hat{Θ}$ that satisfies $\hat{Θ} = \underset{Θ \in S_{Θ}}{a r g m i n} f (Θ, R),$ with initial guess $S_{Θ}$ and the objective function $f (Θ, R) := \frac{1}{R} \sum_{t_{i} \leq T} {| C^{'} (t_{i}, Θ) - {d a t a}_{t_{i}} |}^{2},$ for a given normalizing coefficient R.

We wish to avoid attaining local minima and to guarantee that the optimization process finds the global minimum. Consequently, due to the sensitivity of the solution with respect to the initial guess utilized for these processes, we decided, before starting the SA fitting, to select the best initial guess running several Latin hypercube sampling (LHS) runs. This is done by the built-in Matlab function lhsdesign(), applied to the bounds defined for each parameter to estimate.

Together with simulated annealing (SA), using the Matlab built-in function simulannealbnd(), and the best initial guess obtained with LHS, we estimated the parameters that are transformed as the initial guess to apply the least squares method (LSQ) with lsqcurvefit() Matlab function to speed up the steepest descent phase of the simulated annealing algorithm.

More details about the procedure are summarized in Algorithm 1.

This method is a powerful optimization tool, particularly for fitting models that describe epidemic phenomena with real data, such as those developed in [Citation7, Citation8, Citation36, Citation68, Citation70] Additionally, the SA technique was inspired by Metropolis et al. [Citation37] to select the new characterized solutions. Even so, its adaptation allows the application to a frequentist paradigm.

Specifically, to study parameter identifiability, we define various experiments that use the generated curves in Scenarios 1 and 2, and we intend to estimate or recover some parameters of interest from these curves. Therefore parameters of interest are selected from ${α, β, ρ_{s}, I_{s} (0), J (0), p}$ , while the parameters not selected are fixed to the respective value of Table .

2.3.2. Sets of parameters to be estimated

For the data generated using Scenarios 1 and 2, we define the sets of parameters to estimate given in Table . Each combination explored will be mentioned as an experiment. For example, Experiment 101 indicates to data from Scenario 1 to estimate the parameter set 1 and Experiment 201-1 corresponds to data from Scenario 2 to estimate the parameter set 1, with p = 0.01, where Experiment 201-2 will be with p = 0.05 (see Tables and ). Therefore, for Scenario 1, we have 24 experiments, that is, the number of values assumed of $R_{0}^{c}$ (three) times the number of parameter sets (eight). Likewise, for Scenario 2, we have 84 experiments corresponding to the number of values of $R_{0}^{c}$ (three) times the number of values of p (two) times the number of parameter sets (fourteen).

2.4. Bootstrapping method

This method uses the parametric bootstrapping approach with Poisson error structure (see [Citation14, Citation51]) to repeatedly sample observations from the best-fit obtained by the SA-LSQ fit of our COVID-19 model to an time series data. This process can be summarized in the following steps:

Obtain the deterministic model solution $C (t, \hat{Θ})$ or total daily incidence series from the best-fit obtained using the SA-LSQ estimation.
Generate S replicated datasets assuming the Poisson error structure. To generate these simulated data, we incorporate the Poisson error structure using the incidence curve $C^{'} (t, \hat{Θ})$ , where for each time $t_{i}$ we generate a new incidence value using a Poisson random variable with mean $C^{'} (t_{i}, \hat{Θ})$ . Therefore, this new dataset represents an incidence curve for the system, assuming the time series follows a Poisson distribution centred on the mean at time points $t_{i}$ .
Re-estimate model parameters for each of the S simulated realizations (using the SA-LQS method), which are denoted by ${\hat{Θ}}_{i}$ for $i = 1, \dots, S$ .
Finally, using the set of ${\hat{Θ}}_{i}$ for $i = 1, \dots, S$ , we construct the confidence intervals for each estimated parameter. Also, for each set of estimated parameters, $R_{0}^{c}$ is calculated to obtain a distribution of $R_{0}^{c}$ values as well.

2.5. Parameter identifiability

2.5.1. Preliminaries

The complexity of our mathematical model involves ten epidemiological states or compartments and an auxiliary variable $C (t)$ that indicates the cumulative number of symptomatic diagnosed or new cases reported. In addition to the ten system states, our model consists of 15 parameters (see Table ), where five classes contribute to a new infection $(E_{2}, A, I_{n}, I_{s}, J)$ , which also contribute to the representation of $R_{0}^{c}$ in these classes, see Equations (Equation3(3) $R_{0}^{c} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} q ρ_{a}}{γ_{1}} + \frac{q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}} + \frac{q ρ_{s} (1 - ρ_{a})}{α} + \frac{h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}) .$ (3) ) and (Equation4(4) $R_{0}^{c} = R^{E_{2}} + R^{A} + R^{I_{n}} + R^{I_{s}} + R^{J} .$ (4) ). Then, the estimated individual parameters are used to compute $R_{0}^{c}$ . Some initial conditions of the model are unknown and possibly need to be estimated along with the other model parameters. For our case we consider the values $I_{s} (0)$ and $J (0)$ to estimate together to the parameters α, β, $ρ_{s}$ , and p. This defines the set from parameters to estimate ${α, β, ρ_{s}, p, I_{s} (0), J (0)}$ , where the remaining parameters and initial conditions are known or assumed.

To estimate the parameter sets, we fit the model (with other parameters and initial values fixed) to each dataset using the auxiliary variable $C^{'}$ and the curve of the daily cases generated. The parameter estimation procedure employs the methods detailed in Section 2.3. The initial parameter prediction affects the solution for the model as local minima occur. To guarantee the global minimum obtained in the optimization processes, we generate a multi-start (using LHS in Algorithm 1) repeating these processes for a random amount of initial parameters, whose ranges are defined as follows: $\begin{aligned} 0 < β < 5, 0.1 < α < 2, 0 < J (0) < C (0), \\ 0 < I_{s} (0) < 300, 0 < ρ_{s} < 1, a n d 0 < p < 1. \end{aligned}$ Here, the particular choice of the wide range for $I_{s} (0)$ aims to avoid restrictions during the calibration of the model due to its sensitivity.

We focus the parameter identifiability study for (Equation2(2) $\begin{aligned} S^{'} & = - \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - p S + λ Q, \\ Q^{'} & = p S - λ Q, \\ E_{1}^{'} & = \frac{β S}{N} (q_{e} E_{2} + q_{a} q A + q I_{n} + q I_{s} + h J) - κ_{1} E_{1}, \\ E_{2}^{'} & = κ_{1} E_{1} - κ_{2} E_{2}, \\ A^{'} & = κ_{2} ρ_{a} E_{2} - γ_{1} A, \\ I_{n}^{'} & = κ_{2} (1 - ρ_{a}) (1 - ρ_{s}) E_{2} - γ_{1} I_{n}, \\ I_{s}^{'} & = κ_{2} (1 - ρ_{a}) ρ_{s} E_{2} - α I_{s}, \\ J^{'} & = α I_{s} - (γ_{2} + δ) J, \\ R^{'} & = γ_{1} (A + I_{n}) + γ_{2} J, \\ D^{'} & = δ J, \\ C^{'} & = α I_{s} . \end{aligned}$ (2) ) on characterizing the identifiable parameters using uncertainty studies to assess the stability of the parameters estimated. Then, the goal is to evaluate the identifiability using the mean value, the $95 %$ confidence intervals, and the mean square error (MSE). The step-by-step description of the identifiability analysis is detailed in the next subsection.

2.5.2. Identifiability analysis process

In the study of identifiability, we must follow the following steps that rely on definitions and computations presented in detail in [Citation14, Citation51]:

Assume that $Θ = {α, β, ρ_{s}, p, I_{s} (0), J (0)}$ is the set of parameters to be estimated and $Θ_{0}$ is the initial guess.
Estimate Θ fitting the model to the dataset generated, using the SA-LSQ procedure with the values of the initial conditions and parameters given. The hat symbol indicates an estimated parameter. Then, we have the set $\hat{Θ} = {\hat{α}, \hat{β}, {\hat{ρ}}_{s}, \hat{p}, \hat{I_{s}} (0), \hat{J} (0)} .$
Using the bootstrapping process we obtain the $95 %$ confidence intervals and their frequency distributions for each parameter estimated.
Compute the mean squared error (MSE) $M S E := \frac{1}{S} \sum_{j = 1}^{S} (Θ - {\hat{Θ}}_{j})^{2}$ for each parameter estimated. Here Θ represents the true parameter value, and ${\hat{Θ}}_{j}$ represents the estimated value of the parameter for the jth bootstrap realization.
For each set of estimated parameters, $R_{0}^{c}$ is calculated to obtain a distribution of $R_{0}^{c}$ values as well.

Following the process described, we say that a model parameter is identifiable if its confidence interval lies in a finite range of values and contains the estimated (and real) parameter value. The preferred situations is that these values have a central position or are close to the mean value. Besides, a small confidence interval indicates that the parameter can be identifiable, while a broader range could indicate a lack of identifiability. When an estimated parameter has a low MSE and a small confidence interval, the we may assert that the parameter is identifiable from the model. On the other hand, larger confidence intervals or larger MSE values may suggest non-identifiability. Here confidence interval is considered to be small if its lower and upper bounds are not smaller or greater, respectively, than 10% of the value estimated. Besides, the smallness of the MSE depends on the nature of the estimated parameter. For instance, if the parameter represents a probability, then we say that MSE is small if it is a value close to zero, but if the parameter represents the quantity of population, one acceptable range can be considered no more than 0.01% of the total population.

Specifically, to analyze structural identifiability, we will fit the entire curves generated for each scenario to recover the values from the parameters assumed in each case. Moreover, to investigate practical identifiability, we will consider different fit times. This is done to address situations when only data of the initial phase (before the peak or maximum value) are available.

3. Results

3.1. Structural identifiability

By applying the identifiability methodology, we obtain results for each one of the steps 1 to 5 outlined in Section 2.5.2. The first type of results corresponds to the parameter estimation process that leads to the best-fit model solution. Secondly, we have 250 replicated datasets applying the bootstrap process to re-estimate model parameters. Figures and (Appendix) show the confidence intervals (represented by vertical lines) and the MSE bar plots for the experiments of Scenario 1, and Figures S1 and S2 (in the supplemental material) display analogous results for the experiments of Scenario 2. Observing the results obtained and registered in these figures, we concentrate our investigation on Tables and , in which we classify the parameters estimated between identifiable and non-identifiable. A parameter is considered identifiable with narrow confidence interval, a mean value centred and close to the real value, and the MSE is small. If at least one of these identifiability criteria is not satisfied, then the parameter is considered non-identifiable.

From Table we conclude that all estimated parameters are identifiable for Experiments 1 to 4, i.e. those without initial conditions to fit. For Experiments 5 to 7 in which the initial conditions are also estimated, we note that all the parameters except the initial condition $J (0)$ are identifiable, maintaining mainly the identifiability of the parameter of interest $R_{0}^{c}$ . Only Experiment 8 exhibits non-identifiability situations for all parameters except for the dataset generated under the assumption $R_{0}^{c} = 5$ . Here we can suspect that the identifiability improves when the force of infection is greater as in this last case. Nevertheless, we will continue to explore this part later. Another observation is that the estimated $R_{0}^{c}$ can be robust to variation or bias in the other estimated parameters.

Analogously, we deduce from Table that the parameters estimated in Experiments 1, 2, 3, and 5 are identifiable. The parameters in Experiments 8, 9, 10, and 12 are also identifiable (except for the initial condition $J (0)$ ). For Experiments 4, 7, and 9, the parameter $R_{0}^{c}$ is identifiable even though no other parameters are identifiable. Particularly in Experiment 7, all the remaining parameters are non-identifiable. Other situations occur; for example, in Experiment 6, all estimated parameters except those based on data generated with $R_{0}^{c} = 3$ and p = 0.05 are identifiable (red boxes in column Experiment 6 from Table ). For Experiment 13, all estimated parameters based on data generated with $R_{0}^{c} = 4$ and 5 are identifiable. For Experiment 14, the results are not good, but the parameter $R_{0}^{c}$ estimated using data generated with p = 0.01 is identifiable. A similar situation occurs with $R_{0}^{c}$ in Experiment 11, where this quantity is identifiable only by fitting the data generated with $R_{0}^{c} = 5$ and p = 0.1.

3.2. Practical identifiability

The results obtained for this part are in Supplemental Material, including coloured tables, namely Tables S2 and S3 that display information analogous to Tables and . Tables S2 and S3 allow one to analyze the identifiability. For this study, we investigate the performance of our model to capture the initial phases of an epidemic disease, examining the choices of the fit periods, namely 20, 30, or 40 days. According to Table S2, in Experiment 1 the estimated parameters are identifiable. In Experiments 3 and 5 with fit times 30 and 40 days, the parameters $R_{0}^{c}$ are identifiable, even though not all estimated parameters are identifiable. In contrast to this result, in Experiment 2 with a fitting period of 20 days, the parameters that define $R_{0}^{c}$ are identifiable, but the computed value $R_{0}^{c}$ is not. This situation could be linked with the inverse relation of $R_{0}$ with parameter α (see Equation (Equation3(3) $R_{0}^{c} = β (\frac{q_{e}}{κ_{2}} + \frac{q_{a} q ρ_{a}}{γ_{1}} + \frac{q (1 - ρ_{s}) (1 - ρ_{a})}{γ_{1}} + \frac{q ρ_{s} (1 - ρ_{a})}{α} + \frac{h ρ_{s} (1 - ρ_{a})}{γ_{2} + δ}) .$ (3) )). Remember that $1 / α$ represents the time from symptom onset to isolation or hospitalization. Thus a variation in α of 0.1 to 0.2 implies a change of 5 days from the period of symptoms onset to isolation, a very sensitive parameter. Notably, for this experiment, the confidence intervals and the MSE are plotted in Figure S3, where we can see that these intervals are not very large but lack precision, mainly for the case of a fit period of 20 days, and the epidemic growth is given by $R_{0}^{c} = 3$ and 4. For this and other experiments, the parameter $R_{0}^{c}$ tends to be identifiable when fit periods are greater than 20 days, and $R_{0}^{c} \leq 4$ . The parameter $ρ_{s}$ and the initial condition $I_{s} (0)$ and $J (0)$ for all experiments of Scenario 1 are not identifiable. This suggests a lack of identifiability especially of the parameter $ρ_{s}$ when we have fewer data to fit.

Table S2 also shows that the estimated parameters for Experiment 1 are identifiable. This result differs from Experiments 12, 13, and 14, where all parameters are non-identifiable. For Scenario 2, we can observe that for most experiments with a fit period of 20 days (except for Experiments 1 and 9 with $R_{0}^{c} = 3$ ), the parameter $R_{0}^{c}$ is non-identifiable. Moreover, all estimated parameters are also non-identifiable, especially for these cases, excluding Experiment 2. These results suggest using more fit data to guarantee identifiability, as it occurs more frequently in experiments with longer fit periods.

Another situation to comment on here is that for most experiments where the quarantine parameter p is estimated, the results mainly for $R_{0}^{c}$ result non-identifiable, excluding only Experiment 5 for the fit period 40 days, and $R_{0}^{c}$ assumed as 4 and 5, and Experiment 6 again for the fit period 40 days and $R_{0}^{c} = 3$ and 4. Besides, for some quarantine parameters estimated from Experiments 5 and 6 with fit period 40 days, the result is identifiable, mostly when the fit data are generated assuming p = 0.01. We can think that the lack of identifiability increases when a larger part of the population is quarantined in the initial phase.

4. Application to regional data of Chile

We now apply our methodology to the data collected for some Chilean regions. From the structural and practical results obtained for our model, we apply the study to two cases, one to scan the initial phase (first 30 days) and another for an intermediate phase (first 120 days), considering the data before the change of quarantine criteria applied by the Chilean government. To explain the selection of regions and the phenomena of COVID-19 in Chile, we include the following subsection.

4.1. The early transmission of COVID-19 in Chile

We wish to apply the COVID-19 model of Section 2.1 to some administrative regions of Chile (see Figure ), where there are regions without quarantine periods, others with more than one quarantine period, a situation that the Chilean government named ‘dynamical quarantines’. This policy was in effect until July 27 2020. In addition, this measure was applied only to municipalities with a greater incidence of positive cases for COVID-19. For this reason, we consider a decoupled analysis to select the regions to analyze until July 27 2020, assuming the day of the first positive case as the first day for the timeline of each region and July 27 2020 is the last day. Since the onset of COVID-19 in Chile, the Ministry of Health has been reporting new cases of infection and death daily for each region and new cases of recovered persons at the national level. The official information (see [Citation24, Citation41] for details) also includes the daily numbers of polymerase chain reaction (PCR) tests applied, as well as the daily numbers of critical patients and occupancy of intensive care units (ICUs). A timeline of strategies and measures implemented to contain the pandemic is summarized in Table S1. Several times the Chilean Ministry of Health changed the case and death definitions. Table lists the most relevant of these changes along with their dates. Considering the difficulties associated with the changes in case definitions, we decided to include data reported for symptomatic and asymptomatic cases in the COVID-19 model. Total data correspond to the addition of symptomatic and asymptomatic cases reported because an asymptomatic (tested) case may become symptomatic later.

Figure 3. The 16 administrative regions of Chile [Citation63] and their population according to the 2017 census [Citation10]. The Roman numbers are the official administrative numbers of the geographic regions. The greater Santiago area is the Metropolitan region (RM) and is counted as Region 13. Note that numbering is not strictly ordered from north to south; regions 14 to 16 have been created by dividing existing regions. The total population at the end of 2020 is estimated at 19.1 million.

Table 5. Changes in criteria of reported data.

Download CSV Display Table

4.2. Experiments applied to regional data

Analyzing the type of information and data published by the Chilean government we identified regions with more than one quarantine period. Then, considering these multiple quarantine periods and the conclusions obtained from our experimental identifiability study, we decided to select only those regions with exactly one and without quarantine periods from the Chilean data. Specifically, we selected those administrative regions that satisfy at least one of these conditions, namely those with

exactly one unique quarantine period for the first 30 or 120 days,
no quarantine periods in the first 30 or 120 days.

We also prefer data with an increasing number of reported cases, avoiding having consecutive null case counts, i.e. we select robust datasets for capturing epidemic growth. Overall, in light of the experimental results based on simulated data to analyze the (structural and practical) identifiability of the parameters, we know that the best results are achieved for the sets of estimated parameters $(β, R_{0}^{c})$ and $(β, J (0), I_{s} (0), R_{0}^{c})$ . However, to observe the quarantine parameter, we also consider the parameter sets $(β, p, R_{0}^{c})$ and $(β, p, J (0), I_{s} (0), R_{0}^{c})$ , for the case that the region has a single quarantine period, even knowing that these parameters are not always identifiable from the experimental results in the initial phase. Table provides more details about parameter identifiability results.

Table 6. Summary of the identifiable parameters concluded from the structural and practical identifiability study used to Chilean data.

Display Table

Subsequently, we filter the regional data for two periods, one for the first 30 days and the other for 120 days, and we select some of those that satisfy the above mentioned conditions. We have established the following regions for our application.

Regions 1 (Tarapacá) and 8 (Bío-Bío), without quarantine periods in the initial phase and one quarantine in the final phase.
Regions 3 (Atacama) and 4 (Coquimbo) were without quarantine during the 120 days.
Region 10 (Los Lagos) had one quarantine in the initial phase.

Then, with this regional data, we apply our identifiability study to two events,

For regions without quarantine, we estimate the parameter sets $(β, R_{0}^{c})$ (Experiment 1) and $(β, J (0), I_{s} (0), R_{0}^{c})$ (Experiment 2). We select a fit period of 30 or 120 days depending on the phase analyzed.
For regions with quarantine, we estimate the parameter sets $(β, p, R_{0}^{c})$ (Experiment 1) and $(β, p, J (0), I_{s} (0), R_{0}^{c})$ (Experiment 2).

We are preserving the same consideration for the time fit.

The left plots in Figures S4 and S5 display the regional data distributions, including some dates of quarantine measures and important holidays.

Establishing the experiments and the Chilean data for the analysis, we need to fix the values of those parameters that are not estimated. In Table , we summarize the values assumed for the parameters. For example, we consider the parameters for the time from symptom onset to isolation or hospitalization $1 / α$ , and the proportions of fully infectious individuals who undergo testing $ρ_{s}$ are averaging. In addition, we assume the quarantine parameters $λ = 0$ and p = 0 when the experiment does not involve quarantine periods. Alternatively, when there is quarantine, the quarantine parameter p is estimated depending on the region, being the parameter $1 / λ$ the quarantine duration applied to it (see Table ).

Table 7. Fixed parameters for running experiments and duration of quarantine periods.

Display Table

The initial conditions satisfy the considerations assumed in the identifiability study, i.e. $\begin{aligned} R (0) = Q (0) = D (0) = 0, E_{1} (0) = 2 J (0), \\ E_{2} (0) = 4 J (0) - E_{1} (0), I_{n} (0) = min {J (0), 1}, \\ S (0) = N - E_{1} (0) - E_{2} (0) - I_{n} (0) - I_{s} (0) - A (0) - J (0), \end{aligned}$ where for the case when $J (0)$ and $I_{s} (0)$ are estimated, these quantities are assumed to satisfy $0 < J (0) < C (0)$ and $0 < I_{s} (0) < 300$ . In those cases where $J (0)$ and $I_{s} (0)$ are assumed, the values $J (0) = 1$ and $I_{s} (0) = 10$ are utilized.

Table 8. Parameter estimation results for the application with Chilean data.

Display Table

4.3. Application results

Finally, applying our methodology to Chilean regions data selected for the experiments defined, we obtained the parameter estimations (Table ), the fits plotted in Figures S4 and S5 and Table which we constructed following the same procedure as for Tables and . We observe the 95% confidence intervals obtained by the 250 replicates of the bootstrapping process and the MSE (see Figures S6 to S14 in Supplemental Material). According to the coloured table, the parameters β and $R_{0}^{c}$ (computed) are usually identifiable, except for Region 10, which has a quarantine period in the first 30 days. Quarantine leads some points to fit, so perhaps our model is unsuccessful in this case, and the estimates generated are non-identifiable. As expected, given the summary of results in Table , Region 8 has a short quarantine period in the middle of the final phase, and it turns out difficult to fit data from this region during 120 days to estimate the sets of identifiable parameters. On the other hand, for Region 1, with a prolonged quarantine period in the final phase, conceivably, the initial conditions incorporate more noise in the estimations because it is difficult to estimate the parameters in Experiment 2.

Table 9. Parameter identifiability verification. A green box indicates that the parameter estimated satisfies the parameter identifiability conditions (confidence intervals obtained via bootstrapping are narrow; their mean values are centered; the MSE is small). A red box means the parameter that does not meet the identifiability criteria. Cases not studied are painted in gray.

Display Table

In short of these two last regions, we can see that the duration of quarantine and the phase analyzed can be decisive when estimating parameters; these examples suggest that we need to explore more combinations of quarantine periods in our COVID-19 model. In addition, the parameters estimated from experiments applied to Regions 3 and 4 are identifiable, as expected, given the structural and practical identifiability demonstrated for our model using our methodology. In the same way, we verify the identifiability obtained for Experiments 1 and 2 applied to Region 8, and Region 1 has identifiable parameters only when we use Experiment 2. However, their confidence intervals are not distant when we apply Experiment 1. The noise from initial conditions is possibly more prevalent in Region 1 than Region 8.

5. Discussion and conclusions

We present and apply a computational approach developed by Roosa and Chowell [Citation51] for simple epidemic models. However, we propose adapting it to a complex compartmental model inspired by the first wave of the COVID-19 outbreak in Chile, where quarantines were declared involving different Chilean regions in early phases. Specifically, we propose a methodology to analyze the structural and practical parameter identifiability for a COVID-19 model. We explore the parameter uncertainty computationally through a parametric bootstrapping approach to achieve that goal. In that exploration, we involve synthetic data generated with the same COVID-19 model to measure the capability of our model to recover the parameters assumed to satisfy the identifiability structural and practical. Those criteria involve observing the $95 %$ confidence intervals and the MSE (mean square error), constructed from empirical distributions resulting from the bootstrapping process. In this first attempt, we selected some experiments that would allow us to validate the structural and practical parameter identifiability. We consider different variables, for example, the number of parameters to fit, the time to fit, and the quarantine periods. Then, in our methodology, we construct coloured tables (e.g. Table ) to visualize these qualities. Overall in the results obtained, we observe that the parameter β, and in consequence $R_{0}^{c}$ , are identifiable (when the rest of the parameters are known). The following best parameter sets in terms of structural identifiability are the sets without initial conditions to estimate, especially in the case without quarantines involved. Wherein the cases with quarantine are more sensitive as a percentage of the population in quarantine increases and the epidemic growth is $R_{0} < 4$ , evidence of a lack of identifiability in the model when the quarantine dynamics are considered. Then, we would take quarantine periods studied in the models with caution. Expanding the analysis to the initial phases, we find a significant loss of parameter identifiability. It compared with the structural results. Possibly verifying the conclusion shown in [Citation52] indicates the non-identifiability in pre-peak moments. However, we are interested in analyzing the initial period to recover the control reproduction number. Therefore, we consider applying our computational approach to various initial epidemic periods to evaluate our COVID-19 model. In our experiments, we confirm that the parameter β is recovered, and in the cases when the initial conditions are also estimated and fit time is $T_{f i t} > 20$ days (see Tables S2 and S3 in the supplemental material) identifiability is guaranteed. Showing a characteristic dependence between the fit times and the growth rates, besides quarantine periods, which also affect identifiability, but maintain the parameter β identifiable.

Finally, collecting the conclusions from the experiment with synthetic data, we select the parameter sets (β), and ( $β, J (0), I_{s} (0)$ ) for periods without quarantine, and ( $β, p$ ), and ( $β, p, J (0), I_{s} (0)$ ) for periods with quarantine, to fit with Chilean data for some regions. The model parameters are not identifiable in the Chilean regions with a quarantine period in the initial phase. On the other hand, regions without quarantine periods have better results. In particular, when more data ( $T_{f i t} > 20$ ) is fit to estimate initial conditions is guaranteed practical identifiability. (see Table ). For future work, we see the necessity of selecting the fit times for each Chilean region because each one has a particular quarantine declared; besides, the quality data also is variable. The study developed by Freire-Flores et al. in [Citation21] also analyzes the quarantine periods in some Chilean regions using computations of reproduction numbers on different periods characterized by extraordinary government measures. Then we can use these times of fitting with our approach and compare with these results, where, in particular, the results for the parameter β in Region 8 are close.

Our findings further underscore the importance of investigating the identifiability of model parameters through structural and practical identifiability analysis to interpret the results better and avoid misleading model inferences during an epidemic outbreak [Citation22, Citation51, Citation56]. On the one hand, structural identifiability analysis should be conducted before the model is applied to real data since it can help guide the modeller about any parameters in the model that are not identifiable even in the scenario that the entire epidemic trajectory has been observed. Parameters deemed not structurally identifiable should be flagged by the modeller and determine whether the model can still be used to infer other quantities of interest. Analyzing a parameter's practical identifiability is unnecessary if it is not structurally identifiable. On the other hand, our results indicate that modellers should be especially careful when calibrating complex epidemic models using limited data of the early epidemic phases, as one or more parameters are often difficult to identify from limited data depending on the model's complexity level. Parameter identifiability occurs more frequently in cases involving more complex epidemic dynamics where parameters change over time due to the implementation and relaxation of social distancing interventions. Unsurprisingly, a parameter is unidentifiable when its function is to model dynamics for an event (e.g. quarantine measures) that has not occurred. As more data on an unfolding epidemic becomes available, model parameters tend to become identifiable, especially when the epidemic has already reached its peak. This highlights the challenges modellers face when using models to generate real-time epidemic predictions as the epidemic evolves.

Supplemental material

Supplemental Material

Download PDF (5.9 MB)

Disclosure statement

No potential conflict of interest was reported by the author(s).

Availability of data and materials

The data were recorded by the Chilean Government [Citation24], which can be accessed via a GitHub repository supported by Ministery of Science, Technology, Knowledge, and Innovation of Chile [Citation41].

Additional information

Funding

RB is supported by project MATH-Amsud 22-MATH-05 ‘NOTION: NOn-local conservaTION laws for engineering, biological and epidemiological applications: theoretical and numerical’ and from ANID (Chile) through Fondecyt project 1210610; Anillo project ANID/ACT210030; Centro de Modelamiento Matemático (CMM), project FB210005 of BASAL funds for Centers of Excellence; and Centro de Recursos Hídricos para la Agricultura y la Minería (CRHIAM), project ANID/FONDAP/15130015. GC is partially supported by National Science Foundation (NSF) grants #2026797, #2034003, and National Institutes of Health (NIH) R01 GM 130900, and by project ANID/PCI/MEC80170119. IK would like to thank the German Research Foundation (DFG) for financial support of the project within the Cluster of Excellence ‘Data-Integrated Simulation Science’ (EXC 2075), DFG Project 432343452. LYLD is supported by Agencia Nacional de Investigación y Desarrollo (ANID) scholarship ANID-PCHA/Doctorado Nacional/2019-21190640 and Centro de Modelamiento Matemático (CMM), project FB210005 of BASAL funds for Centers of Excellence.

References

A. Adiga, D. Dubhashi, B. Lewis, M. Marathe, S. Venkatramanan, and A. Vullikanti, Mathematical models for COVID-19 pandemic: A comparative analysis, J. Indian Inst. Sci. 100 (2020), pp. 793–807.
PubMed Web of Science ®Google Scholar
A. Afzal, C. Saleel, S. Bhattacharyya, N. Satish, O.D. Samuel, and I.A. Badruddin, Merits and limitations of mathematical modeling and computational simulations in mitigation of COVID-19 pandemic: A comprehensive review, Arch. Comput. Meth. Eng. 29 (2022), pp. 1311–1337.
PubMed Web of Science ®Google Scholar
R.M. Anderson and R.M. May, Infectious Diseases of Humans: Dynamics and Control, Oxford University Press, 1991.
Google Scholar
M. Biggerstaff, R. Slayton, M. Johansson, and J. Butler, Improving pandemic response: Employing mathematical modeling to confront xoronavirus disease 2019, Clin. Infect. Dis. 74 (2022), pp. 913–917.
PubMed Web of Science ®Google Scholar
F. Brauer and C. Castillo-Chavez, Mathematical Models in Population Biology and Epidemiology, Springer, New York, 2012.
Google Scholar
J.H. Buckner, G. Chowell, and M.R. Springborn, Dynamic prioritization of COVID-19 vaccines when social distancing is limited for essential workers, Proc. Nat. Acad. Sci. 118(16) (2021), p. e2025786118.
PubMed Web of Science ®Google Scholar
R. Bürger, G. Chowell, and L.Y. Lara-Díaz, Comparative analysis of phenomenological growth models applied to epidemic outbreaks, Math. Biosci. Eng. 16 (2019), pp. 4250–4273.
PubMed Web of Science ®Google Scholar
R. Bürger, G. Chowell, and L.Y. Lara-Díaz, Measuring differences between phenomenological growth models applied to epidemiology, Math. Biosci. 334 (2021), p. 108558. https://doi.org/10.1016/j.mbs.2021.108558
PubMed Web of Science ®Google Scholar
T. Burki, COVID-19 in Latin America, Lancet Infect. Dis. 20(5) (2020), pp. 547–548. https://doi.org/10.1016/s1473-3099(20)30303-0
PubMed Web of Science ®Google Scholar
Census Chile, Census Chile (2017). Accessed February 7, 2021. Available at http://www.censo2017.cl/descargas/home/sintesis-de-resultados-censo2017.pdf.
Google Scholar
J.W. Chan, S. Yuan, K. Kok, K.W. To, H. Chu, J. Yang, F. Xing, J. Liu, C.Y. Yip, R.W.S. Poon, H.W. Tsoi, S.K.F. Lo, K.H. Chan, V.K.M. Poon, W.M. Chan, J.D. Ip, J.P. Cai, V.C.C. Cheng, H. Chen, C.K.M. Hui, and K.Y. Yuen, A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: A study of a family cluster, Lancet 395 (2020), pp. 514–523.
PubMed Web of Science ®Google Scholar
J. Chapman and N. Evans, The structural identifiability of susceptible–infective–recovered type epidemic models with incomplete immunity and birth targeted vaccination, Biomed. Signal Process. Control 4 (2009), pp. 278–284.
Web of Science ®Google Scholar
O.T. Chis, J.R. Banga, and E. Balsa-Canto, Structural identifiability of systems biology models: A critical comparison of methods, PLoS One 6(11) (2011), p. e27755.
PubMed Web of Science ®Google Scholar
G. Chowell, Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecast, Infect. Disease Model. 2 (2017), pp. 379–398.
PubMedGoogle Scholar
G. Chowell, D. Chowell, K. Roosa, R. Dhillon, and S. Devabhaktuni, Sustainable social distancing through facemask use and testing during the COVID-19 pandemic, preprint (2020). Available at medRXiv.
Google Scholar
G. Chowell, S. Dahal, Y. Liyanage, A. Tariq, and N. Tuncer, Structural identifiability analysis of epidemic models based on differential equations: A primer, preprint (2022). Available at arXiv.
Google Scholar
G. Chowell, P. Fenimore, M. Castillo-Garsow, and C. Castillo-Chavez, SARS outbreaks in Ontario, Hong Kong and Singapore: The role of diagnosis and isolation as a control mechanism, J. Theor. Biol. 224 (2003), pp. 1–8.
PubMed Web of Science ®Google Scholar
O. Diekmann, H. Heesterbeek, and T. Britton, Mathematical Tools for Understanding Infectious Disease Dynamics, Vol. 7. Princeton University Press, 2013.
Google Scholar
N. Evans, L. White, M. Chapman, K. Godfrey, and M. Chappell, The structural identifiability of the susceptible infected recovered model with seasonal forcing, Math. Biosci. 194 (2005), pp. 175–197.
PubMed Web of Science ®Google Scholar
S. Flaxman, S. Mishra, A. Gandy, H.J.T. Unwin, T.A. Mellan, H. Coupland,E Eeeetal, H. Zhu, T. Berah, J.W. Eaton, M. Monod, A.C. Ghani, C.A. Donnelly, S. Riley, M.A.C. Vollmer, N.M. Ferguson, L.C. Okell, S. Bhatt, and Imperial College COVID-19 Response Team, Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe, Nature 194 (2005), pp. 175–197.
Google Scholar
D. Freire-Flores, N. Llanovarced-Kawles, A. Sanchez-Daza, and Á. Olivera-Nappa, On the heterogeneous spread of COVID-19 in Chile, Chaos Solit. Fract. 150 (2021), p. 111156. Available at https://www.sciencedirect.com/science/article/pii/S0960077921005105.
PubMed Web of Science ®Google Scholar
L. Gallo, M. Frasca, V. Latora, and G. Russo, Lack of practical identifiability may hamper reliable predictions in COVID-19 epidemic models, Sci. Adv. 8 (2022), p. eabg5234. https://doi.org/10.1126/sciadv.abg5234.
PubMed Web of Science ®Google Scholar
P.J. Garcia, A. Alarcón, A. Bayer, P. Buss, G. Guerra, H. Ribeiro, K. Rojas, R. Saenz, N. Salgado de Snyder, G. Solimano, R. Torres, S. Tobar, R. Tuesca, G. Vargas, and R. Atun, COVID-19 response in Latin America, Am. J. Trop. Med. Hyg. 103 (2020), pp. 1765–1772.
PubMed Web of Science ®Google Scholar
Government of Chile, COVID-19 data. Accessed February 16, 2021. Official COVID-19. Available at https://www.gob.cl/coronavirus/cifrasoficiales/#reportes.
Google Scholar
H. Hong, A. Ovchinnikov, G. Pogudin, and C. Yap, SIAN: Software for structural identifiability analysis of ODE models, Bioinformatics 35 (2019), pp. 2873–2874. https://doi.org/10.1093/bioinformatics/bty1069.
PubMed Web of Science ®Google Scholar
J. Jia, J. Ding, S. Liu, G. Liao, J. Li, B. Duan, G. Wang, and R. Zhang, Modeling the control of COVID-19: Impact of policy interventions and meteorological factors (2020).
Google Scholar
Y.H. Kao and M.C. Eisenberg, Practical unidentifiability of a simple vector-borne disease model: Implications for parameter estimation and intervention assessment, Epidemics 25 (2018), pp. 89–100.Available at https://www.sciencedirect.com/science/article/pii/S1755436517301627.
PubMed Web of Science ®Google Scholar
W. Kermack and A. McKendrick, A contribution to the mathematical theory of epidemics, Proc. Roy. Soc. A 115 (1927), pp. 700–721.
Google Scholar
E. Kharazmi, M. Cai, X. Zheng, Z. Zhang, G. Lin, and G.E. Karniadakis, Identifiability and predictability of integer- and fractional-order epidemiological models using physics-informed neural networks, Nature Comput. Sci. 1 (2021), pp. 744–753.
PubMedGoogle Scholar
T. Kirby, South America prepares for the impact of COVID-19, Lancet Respir. Med. 8(6) (2020), pp. 551–552.
PubMed Web of Science ®Google Scholar
H.J. Li, L. Wan, Z. Wang, C.X.Z. u, A. Moustakas, and S. Pei, Mathematical modelling of the pandemic of 2019 novel coronavirus (COVID-19): Patterns, dynamics, prediction, and control, Front. Phys. 9 (2021), p. 738602.
Web of Science ®Google Scholar
Y. Li, H. Campbell, D. Kulkarni, A. Harpur, M. Nundy, X. Wang, and H. Nair, The temporal association of introducing and lifting non-pharmaceutical interventions with the time-varying reproduction number (R) of SARS-CoV-2: A modelling study across 131 countries, Lancet Infect. Dis. 21 (2021), pp. 193–202. Available at https://www.sciencedirect.com/science/article/pii/S1473309920307854.
PubMed Web of Science ®Google Scholar
T.S. Ligon, F. Fröhlich, O.T. Chiş, J.R. Banga, E. Balsa-Canto, and J. Hasenauer, GenSSI 2.0: Multi-experiment structural identifiability analysis of SBML models, Bioinformatics 34 (2017), pp. 1421–1423. https://doi.org/10.1093/bioinformatics/btx735.
Web of Science ®Google Scholar
N. Linton, T. Kobayashi, Y. Yang, K. Hayashi, A. Akhmetzhanov, S.M. Jung, B. Yuan, R. Kinoshita, and H. Nishiura, Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data, J. Clin. Med. 9 (2020), p. 538. https://doi.org/10.3390/jcm9020538.
PubMed Web of Science ®Google Scholar
M. Martcheva, An Introduction to Mathematical Epidemiology, Springer, New York, 2015.
Google Scholar
T. McKinley, A.R. Cook, and R. Deardon, Inference in epidemic models without likelihoods, Int. J. Biostat. 5(1) 2009. https://doi.org/10.2202/1557-4679.1171.
Google Scholar
N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21 (2004), pp. 1087–1092. https://doi.org/10.1063/1.1699114.
Google Scholar
H. Miao, C. Dykes, L.M. Demeter, J. Cavenaugh, S.Y. Park, A.S. Perelson, and H. Wu, Modeling and estimation of kinetic parameters and replicative fitness of HIV-1 from flow-cytometry-based growth competition experiments, Bull. Math. Biol. 70 (2008), pp. 1749–1771.
PubMed Web of Science ®Google Scholar
H. Miao, X. Xia, A.S. Perelson, and H. Wu, On identifiability of nonlinear ODE models and applications in viral dynamics, SIAM Rev. 53 (2011), pp. 3–39.
PubMed Web of Science ®Google Scholar
Ministerio de Salud, Chile, Ministerio de Salud, Chile. Accessed July 29, 2021. Available at http://www.minsal.cl.
Google Scholar
Ministery of Science, Technology, Knowledge and Innovation of Chile, Website of the official database for COVID-19 research. Accessed February 16, 2021. Available at http://www.minciencia.gob.cl/covid19.
Google Scholar
K. Mizumoto, K. Kagaya, A. Zarebski, and G. Chowell, Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020, Euro Surveill. 25(10) (2020). https://doi.org/10.2807/1560-7917.es.2020.25.10.2000180.
PubMedGoogle Scholar
A. Morciglio, B. Zhang, G. Chowell, J.M. Hyman, and Y. Jiang, Mask-ematics: Modeling the effects of masks in COVID-19 transmission in high-risk environments, Epidemiologia 2 (2021), pp. 207–226. https://doi.org/10.3390/epidemiologia2020016.
PubMedGoogle Scholar
V.K. Murty and J. Wu, Mathematics of Public Health, Springer, Cham, 2022.
Google Scholar
J.C. Navarro, J. Arrivillaga-Henríquez, J. Salazar-Loor, and A.J. Rodriguez-Morales, COVID-19 and dengue, co-epidemics in Ecuador and other countries in Latin America: Pushing strained health care systems over the edge, Travel Med. Infect. Dis. 37 (2020), p. 101656. Available at https://www.sciencedirect.com/science/article/pii/S1477893920301241.
PubMed Web of Science ®Google Scholar
H. Nishiura, T. Kobayashi, T. Miyama, A. Suzuki, S.M. Jung, K. Hayashi, R. Kinoshita, Y. Yang, B. Yuan, A.R. Akhmetzhanov, and N.M. Linton, Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19), Int. J. Infect. Dis 94 (2020), pp. 154–155.
PubMed Web of Science ®Google Scholar
R. Padmanabhan, H.S. Abed, N. Meskin, T. Khattab, M. Shraim, and M.A. Al-Hitmi, A review of mathematical model-based scenario analysis and interventions for COVID-19, Comput. Meth. Programs Biomed. 209 (2021), p. 106301. Available at https://www.sciencedirect.com/science/article/pii/S0169260721003758.
PubMed Web of Science ®Google Scholar
J. Paireau, A. Andronico, N. Hozé, M. Layan, P. Crépey, A. Roumagnac, M. Lavielle, P.Y. Boëlle, and S. Cauchemez, An ensemble model based on early predictors to forecast COVID-19 health care demand in France, Proc. Nat. Acad. Sci. 119 (2022), p. e2103302119. https://doi.org/10.1073/pnas.2103302119.
PubMed Web of Science ®Google Scholar
L. Peng, W. Yang, D. Zhang, C. Zhuge, and L. Hong, Epidemic analysis of COVID-19 in China by dynamical modeling, (2020). Available at medRxiv https://www.medrxiv.org/content/early/2020/02/18/2020.02.16.20023465.
Google Scholar
M. Renardy, D. Kirschner, and M. Eisenberg, Structural identifiability analysis of age-structured pde epidemic models, J. Math. Biol. 84 (2022), pp. 1–30.
Web of Science ®Google Scholar
K. Roosa and G. Chowell, Assessing parameter identifiability in compartmental dynamic models using a computational approach: application to infectious disease transmission models, Theor. Biol. Med. Model. 16(1) (2019). https://doi.org/10.1186/s12976-018-0097-6.
PubMedGoogle Scholar
T. Sauer, T. Berry, D. Ebeigbe, M.M. Norton, A.J. Whalen, and S.J. Schiff, Identifiability of infection model parameters early in an epidemic, SIAM J. Control Optim. 60(2) (2022), pp. S27–S48.
PubMed Web of Science ®Google Scholar
S. Shankar, S.S. Mohakuda, A. Kumar, P. Nazneen, A.K. Yadav, K. Chatterjee, and K. Chatterjee, Systematic review of predictive mathematical models of COVID-19 epidemic, Medical J. Armed Forc. India 77 (2021), pp. S385–S392. Special issue: COVID-19 – navigating through challenges. Available at https://www.sciencedirect.com/science/article/pii/S0377123721001258.
PubMedGoogle Scholar
A. Tariq, T. Chakhaia, S. Dahal, A. Ewing, X. Hua, S.K. Ofori, O. Prince, A.D. Salindri, A.E. Adeniyi, J.M. Banda, P. Skums, R. Luo, L.Y. Lara-Díaz, R. Bürger, I.C.H. Fung, E. Shim, A. Kirpich, A. Srivastava, and G. Chowell, An investigation of spatial-temporal patterns and predictions of the coronavirus 2019 pandemic in colombia, 2020–2021, PLOS Negl. Trop. Dis. 16 (2022), p. e0010228. https://doi.org/10.1371/journal.pntd.0010228.
PubMed Web of Science ®Google Scholar
E.C.C.W. T.N.C.P.E.R.E. Team, The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) −china (2020).
Google Scholar
N. Tuncer and T.T. Le, Structural and practical identifiability analysis of outbreak models, Math. Biosci. 299 (2018), pp. 1–18. Available at https://www.sciencedirect.com/science/article/pii/S0025556417303164.
PubMed Web of Science ®Google Scholar
N. Tuncer, A. Timsina, M. Nuno, G. Chowell, and M. Martcheva, Parameter identifiability and optimal control of an SARS-CoV-2 model early in the pandemic, J. Biol. Dyn. 16 (2022), pp. 412–438.
PubMed Web of Science ®Google Scholar
E.A. Undurraga, G. Chowell, and K. Mizumoto, COVID-19 case fatality risk by age and gender in a high testing setting in Latin America: Chile, March–August 2020, Infect. Dis. Pov. 10(1) (2021).
PubMed Web of Science ®Google Scholar
P. van den Driessche and J. Watmough, Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission, Math. Biosci. 180 (2002), pp. 29–48. Available at https://www.sciencedirect.com/science/article/pii/S0025556402001086.
PubMed Web of Science ®Google Scholar
E. Vynnycky, and R.E. White, An Introduction to Infectious Disease Modelling. OUP Oxford, 2010.
Google Scholar
P.G.T. Walker, C. Whittaker, O.J. Watson, M. Baguelin, P. Winskill, A. Hamlet, B.A. Djafaara, Z. Cucunubá, D.O. Mesa, W. Green, H. Thompson, S. Nayagam, K.E.C. Ainslie, S. Bhatia, S. Bhatt, A. Boonyasiri, O. Boyd, N.F. Brazeau, L. Cattarino, G. Cuomo-Dannenburg, A. Dighe, C.A. Donnelly, I. Dorigatti, S.L. van Elsland, R. FitzJohn, H. Fu, K.A.M. Gaythorpe, L. Geidelberg, N. Grassly, D. Haw, S. Hayes, W. Hinsley, N. Imai, D. Jorgensen, E. Knock, D. Laydon, S. Mishra, G. Nedjati-Gilani, L.C. Okell, H.J. Unwin, R. Verity, M. Vollmer, C.E. Walters, H. Wang, Y. Wang, X. Xi, D.G. Lalloo, N.M. Ferguson, and A.C. Ghani, The impact of COVID-19 and strategies for mitigation and suppression in low- and middle-income countries, Science 369 (2020), pp. 413–422. https://doi.org/10.1126/science.abc0035.
PubMed Web of Science ®Google Scholar
J. Wang, Mathematical models for COVID-19: applications, limitations, and potentials, J. Publ. Health Emerg. 4 (2020), pp. 9–9, https://doi.org/10.21037/jphe-2020-05.
PubMedGoogle Scholar
Wikipedia Commons, Map of Chile. Accessed February 7, 2021. Available at https://commons.wikimedia.org/wiki/File:Mapa-chile.svg.
Google Scholar
World Health Organization, WHO Director-General's opening remarks at the media briefing on COVID-19-11. March 2020 World Health Organization 2020 [April 23]. Available at https://bit.ly/2A8aCIO.
Google Scholar
H. Wu, H. Zhu, H. Miao, and A. Perelson, Parameter identifiability and estimation of HIV/AIDS dynamic models, Bull. Math. Biol. 70 (2008), pp. 785–799.
PubMed Web of Science ®Google Scholar
P. Yan and G. Chowell, Quantitative Methods for Investigating Infectious Disease Outbreaks, Texts in Applied Mathematics, Springer, Cham, 2019.
Google Scholar
C. You, Y. Deng, W. Hu, J. Sun, Q. Lin, F. Zhou, C.H. Pang, Y. Zhang, Z. Chen, and X.H. Zhou, Estimation of the time-varying reproduction number of COVID-19 outbreak in China, Int. J. Hyg. Environ. Health 228 (2020), p. 113555. Available at https://www.sciencedirect.com/science/article/pii/S1438463920302133.
PubMed Web of Science ®Google Scholar
C. Zhan, Y. Zheng, T.H.Z. ai, and B. Li, Identifying epidemic spreading dynamics of COVID-19 by pseudocoevolutionary simulated annealing optimizers, Neural Comput. Applic. 33 (2021), pp. 4915–4928.
PubMed Web of Science ®Google Scholar
S. Zhang, J. Ponce, Z. Zhang, G. Lin, and G. Karniadakis, An integrated framework for building trustworthy data-driven epidemiological models: Application to the COVID-19 outbreak in New York City, PLOS Comput. Biol. 17 (2021), pp. 1–29. https://doi.org/10.1371/journal.pcbi.1009334.
Web of Science ®Google Scholar
Z. Zhao, X. Li, F. Liu, G. Zhu, C. Ma, and L. Wang, Prediction of the COVID-19 spread in African countries and implications for prevention and control: A case study in South Africa, Egypt, Algeria, Nigeria, Senegal and Kenya, Sci. Tot. Environ. 729 (2020), p. 138959. Available at https://www.sciencedirect.com/science/article/pii/S0048969720324761.
PubMed Web of Science ®Google Scholar
C. Zhou, Evaluating new evidence in the early dynamics of the novel coronavirus COVID-19 outbreak in Wuhan, China with real time domestic traffic and potential asymptomatic transmissions, (2020). Available at medRxiv https://www.medrxiv.org/content/early/2020/02/20/2020.02.15.20023440.
Google Scholar
L. Zou, F. Ruan, M. Huang, L. Liang, H. Huang, Z. Hong, J. Yu, M. Kang, Y. Song, J. Xia, Q. Guo, T. Song, J. He, H.L. Yen, M. Peiris, and J. Wu, SARS-CoV-2 viral load in upper respiratory specimens of infected patients, New Engl. J. Med. 382 (2020), pp. 1177–1179. PMID: 32074444. https://doi.org/10.1056/NEJMc2001737.
PubMed Web of Science ®Google Scholar

Appendix. Tables and Figures

Figure A1. Summary of results using conditions of experiment of scenario 1 (Parameters (α, β,

ρ_{s}

)). Here and in Figure in per row, the left plot corresponds to the 95% confidence intervals (vertical lines) for the distributions of each estimated parameter obtained through 250 realizations of the synthetic data generated for Scenario 1. Each red point denotes the mean estimated parameter value. The light-blue dashed horizontal line represents the true value (or assumed) for each parameter estimated; the experiments applied to each parameter are fixed on the horizontal axis. The right plot corresponds to the mean squared error (MSE) distribution of parameter estimates considering each experiment and dataset. Finally, Yellow, blue, and green lines or bars are grouped for each experiment and represent each

R_{0}^{c}

assumed for creating the synthetic data, i.e.

R_{0}^{c} = 3

, 4, and 5, respectively.

Figure A1. Summary of results using conditions of experiment of scenario 1 (Parameters (α, β, ρs)). Here and in Figure A2 in per row, the left plot corresponds to the 95% confidence intervals (vertical lines) for the distributions of each estimated parameter obtained through 250 realizations of the synthetic data generated for Scenario 1. Each red point denotes the mean estimated parameter value. The light-blue dashed horizontal line represents the true value (or assumed) for each parameter estimated; the experiments applied to each parameter are fixed on the horizontal axis. The right plot corresponds to the mean squared error (MSE) distribution of parameter estimates considering each experiment and dataset. Finally, Yellow, blue, and green lines or bars are grouped for each experiment and represent each R0c assumed for creating the synthetic data, i.e. R0c=3, 4, and 5, respectively.

Figure A2. Summary of results using conditions of scenario 1 (Parameters ( $I_{s} (0)$ , $J (0)$ , $R_{0}^{c}$ )).

A computational approach to identifiability analysis for a model of the propagation and control of COVID-19 in Chile

Abstract

1. Introduction

1.1. Scope

1.2. Related work

2. Methods

2.1. Compartmental model

2.2. Simulated data

Table 1. Values of model parameters selected to generate simulated data.

2.2.1. Scenario 1: initial phase for three values of R0c without quarantine

2.2.2. Scenario 2: initial phase for three values of R0c with quarantine.

2.3. Parameter estimation

2.3.1. Mixed simulated annealing method with a least-squares approach (SA-LSQ)

2.3.2. Sets of parameters to be estimated

Table 2. Set of parameters to be estimated in Scenarios 1 and 2.

Table 3. Summary of results obtained for Scenario 1.

Table 4. Summary of results for Scenario 2.

2.4. Bootstrapping method

2.5. Parameter identifiability

2.5.1. Preliminaries

2.5.2. Identifiability analysis process

3. Results

3.1. Structural identifiability

3.2. Practical identifiability

4. Application to regional data of Chile

4.1. The early transmission of COVID-19 in Chile

Table 5. Changes in criteria of reported data.

4.2. Experiments applied to regional data

Table 6. Summary of the identifiable parameters concluded from the structural and practical identifiability study used to Chilean data.

Table 7. Fixed parameters for running experiments and duration of quarantine periods.

Table 8. Parameter estimation results for the application with Chilean data.

4.3. Application results

5. Discussion and conclusions

Supplemental Material

Disclosure statement

Availability of data and materials

Additional information

Funding

References

Appendix. Tables and Figures

Related research

To cite this article:

Download citation

Your download is now in progress and you may close this window

Login or register to access this feature

Information for

Open access

Opportunities

Help and information

Keep up to date

2.2.1. Scenario 1: initial phase for three values of $R_{0}^{c}$ without quarantine

2.2.2. Scenario 2: initial phase for three values of $R_{0}^{c}$ with quarantine.