1,348
Views
23
CrossRef citations to date
0
Altmetric
Original Articles

Modelling heterogeneity in malaria transmission using large sparse spatio-temporal entomological data

, , , &
Article: 22682 | Received 26 Aug 2013, Accepted 10 May 2014, Published online: 24 Jun 2014

Abstract

Background

Malaria transmission is measured using entomological inoculation rate (EIR), number of infective mosquito bites/person/unit time. Understanding heterogeneity of malaria transmission has been difficult due to a lack of appropriate data. A comprehensive entomological database compiled by the Malaria Transmission Intensity and Mortality Burden across Africa (MTIMBA) project (2001–2004) at several sites is the most suitable dataset for studying malaria transmission–mortality relations. The data are sparse and large, with small-scale spatial–temporal variation.

Objective

This work demonstrates a rigorous approach for analysing large and highly variable entomological data for the study of malaria transmission heterogeneity, measured by EIR, within the Rufiji Demographic Surveillance System (DSS), MTIMBA project site in Tanzania.

Design

Bayesian geostatistical binomial and negative binomial models with zero inflation were fitted for sporozoite rates (SRs) and mosquito density, respectively. The spatial process was approximated from a subset of locations. The models were adjusted for environmental effects, seasonality and temporal correlations and assessed based on their predictive ability. EIR was calculated using model-based predictions of SR and density.

Results

Malaria transmission was mostly influenced by rain and temperature, which significantly reduces the probability of observing zero mosquitoes. High transmission was observed at the onset of heavy rains. Transmission intensity reduced significantly during Year 2 and 3, contrary to the Year 1, pronouncing high seasonality and spatial variability. The southern part of the DSS showed high transmission throughout the years. A spatial shift of transmission intensity was observed where an increase in households with very low transmission intensity and significant reduction of locations with high transmission were observed over time. Over 68 and 85% of the locations selected for validation for SR and density, respectively, were correctly predicted within 95% credible interval indicating good performance of the models.

Conclusion

Methodology introduced here has the potential for efficient assessment of the contribution of malaria transmission in mortality and monitoring performance of control and intervention strategies.

Malaria is still endemic in more than 100 countries worldwide, leaving children and pregnant mothers being the most vulnerable groups for infections (Citation1). Global estimates report 219 million malaria cases (range 154–289 million) with about 660 thousands deaths (range 610–971), most of these (~90%) occurring in Africa. The impact of the malaria burden on the achievement of Millennium Development Goals is enormous, and its control is a potential contribution towards significant progress (Citation1).

Malaria is transmitted by female Anopheles mosquitoes. The transmission intensity is therefore highly sensitive to environmental variations that affect the densities of these vectors and their ability to transmit the infection (Citation2Citation4). Up to 10-fold variations in transmission intensity have been observed within very small localities due to geographical, biological or socio-economic factors (Citation5Citation8). Understanding the heterogeneity in transmission and human exposure to malaria infection is critical for optimizing control programs and targeting interventions (Citation9Citation12).

Malaria disease burden and transmission can be assessed using incidence or prevalence in human hosts. However, the entomological inoculation rate (EIR) most directly quantifies the exposure of the human population to the infectious stages of the parasite (Citation12Citation16). EIR is the product of the human-biting rate, for example, mosquito bites/person/night (which can also be estimated using mosquito density) and the sporozoite rate (SR), which is the proportion of infective mosquitoes (Citation7, Citation17). The measure expresses the average number of infective bites a person receives in a specified unit of time. It can be also used to predict other measures of transmission, which are used to evaluate effectiveness of malaria control program (Citation5, Citation18). Uncertainty due to small sample, low values and variability in the SR and cost complicate precise estimation of EIR requiring standardized entomological surveys conducted over large areas (Citation5, Citation6) (Citation13, Citation14). Accurate estimation of EIR requires longitudinal surveys within the study area to take into account spatio-temporal variations and seasonality trends. However, there is a paucity of this type of data due to cost and resources needed to collect them (Citation19Citation21).

The Malaria Transmission Intensity and Mortality Burden across Africa (MTIMBA) project was initiated by the INDEPTH Network (Citation22, Citation23) and conducted over a period of 2001–2004 in several countries in Africa including Tanzania, Kenya, Mozambique, Senegal, Ghana and Burkina Faso. The main objective of the initiative was to assess the relation between the intensity of malaria transmission and all-cause as well as malaria-specific mortality across Africa, taking into account the influence of malaria control activities. The MTIMBA entomological data have been collected fortnightly over large number of locations (households) and to date this is the only available entomological database appropriate to study space–time heterogeneity of malaria transmission in Africa. These data are sparse with seasonal variations and spatio-temporal correlations. High dependence of climate, environment and ecological factors in the life of mosquito and seasonality any of the survey locations had zero mosquitoes or proportion of infected ones. In standard modelling approaches, EIR is treated as a continuous outcome, logarithmically transformed to fulfil the assumption of normality (Citation21, Citation24Citation27). However, when EIR is estimated as a product of the SR and mosquito density, which are generated from the binomial and a count distribution like Poisson or negative binomial, respectively, normality assumptions are void. To our knowledge, Kasasa et al. (Citation28) is the only literature report analysis of EIR data considering the two sources of data separately. In addition, due to the amount of zeros which is larger than what can be generated by the standard distributions, the data are over/under dispersed and zero inflated (Citation21, Citation29Citation32). Statistical analysis which accounts for these characteristics is essential to obtain unbiased estimates for the regression coefficients (Citation33Citation36).

Moreover, the MTIMBA-EIR data have been collected at fixed locations and they are typically geostatistical data. Similar exposures of environmental and climatic conditions to locations which are geographically close introduce spatial correlation between them. Geostatistical models take into account spatial correlation by introducing location-specific random effects as latent observations from a multivariate spatial Gaussian process (Citation37). Spatial correlation between any pair of locations is often considered as a function of distance on the covariance matrix of the process. These models have a large number of parameters. Bayesian formulations (Citation38) allow model fit via Markov Chain Monte Carlo (MCMC) simulation methods (Citation39). However, the estimation process involves covariance matrix computations which are infeasible when the number of locations is too large (Citation40, Citation41). A computational flexible way to overcome this problem is the approximation of the spatial process from a subset of locations using properties of conditional multivariate Gaussian distribution of the process (Citation40Citation42). Most of these techniques have been applied in simulated data, observed in regular grid and mainly with Gaussian characteristics. In this study, selection of subset of locations is implemented using methods proposed in our previous work (Citation40, Citation43).

We now demonstrate a rigorous modelling way of analysing large spatio-temporal EIR data and study the heterogeneity, space and temporal patterns of malaria transmission within one MTIMBA site, the Rufiji DSS area in Tanzania (Citation44). The Gaussian process approximation proposed by Banerjee et al. (Citation40) is applied to binomial (SRs) and negative binomial (density) data with zero inflation. The models are fitted using Bayesian MCMC simulation and assessed on the basis of their predictive ability. Model-based predictions of SR and density were multiplied to compute EIR. Model formulation details are given in the methodology section and selected results are presented afterwards. The discussion and conclusion of the findings consider the implications for timing and allocation of resources for malaria interventions.

Methodology

Study site

The study utilized data collected from one of the MTIMBA sites in Tanzania, the Rufiji DSS (RDSS). The RDSS is located in Rufiji District, Coast Region, Tanzania, about 178 km south of Dar es Salaam. The RDSS area extends from 7.47° to 8.03° south latitude and 38.62°–39.17° east longitude and operates in six contiguous wards and 31 villages. The surveillance area covers an area of 1,813 km2 and monitors 85,000 people, which is about 47% of the total population of the Rufiji District (INDEPTH Monogram). Rufiji District has an overall mean altitude of <500 metres. Its vegetation is mainly formed of tropical forests and grassland. The district has hot weather throughout the year and two rainy seasons: short rains (October–December) and long rains (February–May). The average annual precipitation in the district is between 800 and 1,000 mm. A prominent feature in the District is the Rufiji River with its large flood plain and delta, the most extensive in the country (INDEPTH Monogram; Rufiji DSS Profile, 2000). The majority of the people in the Rufiji District are subsistence farmers.

The main responsible malaria vectors in the area include A. funestus, and members of the A. gambiae complex, including A. gambiae (sensu stricto) and A. arabiensis. Mosquito populations usually peak during the rain seasons especially in areas where rice cultivation is taking place and during the dry months, a high population was usually observed in areas with permanent water bodies (Citation23).

Mosquito data

The entomological data were collected for the period of 3 years, October 2001–September 2004 (Source: http://www.indepth-network.org/dss_site_profiles/rufiji.pdf). The MTIMBA entomological protocol has been well defined in MTIMBA documentation (unpublished). In a snapshot, mosquitoes were captured at least twice every month using Centers for Disease Control (CDC) miniature light traps. The human population in the RDSS was classified into geographical clusters (100–1,000 people), then for each round a simple random sampling (without replacement) was employed within clusters to select between 20 and 100 ‘index’ people (households) for the set-up of mosquito catches (traps). The traps were fitted indoors with incandescent bulbs and laid close to a human volunteer (randomly selected from members of the household) sleeping under an untreated bednet. Light traps operated from sundown to sunrise (i.e. 6 pm–6 am) for two consecutive nights in each household and bags were emptied every morning. A total of 2,479 unique locations (households) involved were geo-referenced. Collected mosquitoes were counted and sorted into vector species to allow for separate assessment of transmission intensity.

Environmental and climatic data

Remote sensing data were extracted from different sources with different spatial, SpR, and temporal, TR, resolutions. These include normalized difference vegetation index (NDVI) (SpR: 250 m2; TR: 16 days; Source: MODIS), day and night temperature (SpR: 1 km2; TR: 8 days; Source: MODIS), rainfall (SpR: 8 km2; TR: 10 days; Source: ADDS) and distance to the nearest water bodies (SpR: 1 km2; Source: Health Mapper).

Statistical analysis

Geostatistical zero inflated negative binomial and logistic regression models were fitted on the mosquito density and SR data, respectively. The models accounted for the effect of environmental and climatic predictors, annual trends, seasonal patterns, and spatial and temporal correlations. The predictive process was used to approximate the spatial process using a subset of locations. Model-based prediction of SR and density were multiplied to obtain estimates of monthly and annual EIR. Details of the model formulation and its implementation are described in the subsections below. Programs used for this analysis are available via contact with the corresponding author.

Model formulation for density data

Let Y it be the number of female mosquitoes and Xit(1) be a vector of environmental predictors (extracted from satellite data) observed at location s i , i =1,…, n , and calendar month t=1,…,36 for a specific species. Y it is assumed to follow a negative binomial distribution, Y it ~NB(r, p it ), where p it =r/(r+µ it ). r is an over-dispersion parameter and µ it is the mean mosquito density. Covariates Xit(1), seasonal trends f(t)(1), spatial Ui (1), temporal ɛ t (1)=(e 1,e 2,…,e t ) and non-spatial φ i (1) random effects are introduced on the log scale of the mean count via the equation log(µ it )=X T(1) β (1)+f(t)(1)+U i (1)+e t (1)+φ i (1), where β (1) is the vector of regression coefficients, φ i (1) is a residual error term capturing the remaining variability in the data. f(t)(1) is modelled via trigonometric function with a mixture of cycle, C f(t)=c=1C{δ1c(1)*cos(2πTct)+δ2c(1)*sin(2πTct)},C=2;t=1,...,12/36

where T c is the period of the season for cycle C (i.e. T 1=12 and T 2=6) and δ1c(1) and δ2c(1) are regression parameters used to describe the amplitude and phase within a period (Citation45, Citation46). Separate models were fitted assuming: (i) a constant seasonal pattern across the 3 years of the study by taking t=1,…,12; or (ii) a continuous time for the entire study period by taking t=1,…,36. The seasonal pattern considering dry/wet categorization of the data was also assessed.

A zero inflated model formulation was adopted to take into account the excess zeros in the count data. The model is defined as a mixture of a degenerate distribution with mass at zero and a non-degenerate count distribution. The log-likelihood is therefore a sum of the log-likelihood for the non-zero and the zero counts. The distribution of the data is now defined as:P(Y=0p*,θ)=p*+(1-p*)π(0θ)P(Y=yp*,θ)=(1-p*)π(yθ),y>0

where p* is the probability for a count to arise from the zero mass and 1–p* is the probability to observe a sample from a count distribution (i.e. π(y|θ)≡NB for our case, and θ is the vector of parameters associated with the distribution). This probability can be assigned a value between 0 and 1, usually approximates the proportion of zero counts in the sample or can be a function of covariates similar or different from those used in the full model (Citation21, Citation33) (Citation34, Citation47). Involving possible sources of zero inflation (e.g. covariates) reduces bias in parameter estimation of p* and other sources of uncertainty. In our case p* is modelled with a logit link as a function of all climatic predictors Xi observed at location s i , i.e. logit logit(pi*)=X*Tα, where α is the corresponding vector of regression coefficients.

Bayesian model formulation requires the specification of prior distributions for all unknown parameters. For the regression coefficients, β (1), δ (1) and α, a standard non-informative uniform prior is adopted, i.e. β (1)~Unif(–∞,∞), δ (1)~Unif(–∞,∞) and α ~Unif(–∞,∞), respectively. The latent observations Ui introduced at each location s i are assumed to be derived from a multivariate normal distribution with a covariance matrix Σnxn, i.e. U(1)MVN(0,Σnxn). The Σ(1) is a matrix with elements Σij(1) and quantify the covariance Cov(U i ,U j ) between the pair of locations s i and s j , respectively. Its distribution defines the Gaussian spatial process. Under the assumption of stationarity, the spatial correlation is taken to be a function of distance between locations. An exponential correlation structure for the covariance matrix of the spatial process is adopted, that is Σij(1)=σsp2(1)exp(-dijρ(1)), where σsp2(1) is the spatial variance, d ij is the distance between locations s i and s j and ρ (1) measuring the correlation decay and also known as the effective range (3/ρ (1)) and estimates the distance where the spatial correlation is <5%. The decay parameter ρ (1) assumed to follow a gamma distribution.

Computation of the Gaussian process requires the inversion of the covariance matrix, Σ(1), which for a very large number of locations is not feasible. To enable model fit we approximate the spatial process by a subset of locations, knots, {s i *,i=1,…,m} (m<<n) with latent observations U *(1)=(U(s1 *),…,U(sm *)) T . U *(1) is considered to arise from the same Gaussian process as U (1) and thus U *(1)~N(0,Σ*), where Σ* is the mxm covariance matrix of the sub-process. These latent observations U of the original process can be approximated by the ‘predictions’ of the sub-process via the mean of Gaussian conditional distribution U(s)U*N(QTΣ*-1U*,σ2-QTΣ*-1Q), that is TΣ*-1U*, where Cov(U*,U) is an mxn matrix of the covariance functions between the full and the sub-process (Citation48, Citation49). Selection of subset of location was done using the minimax space filling design implemented in R software (Citation50). The approach optimizes the selection of the best subset by minimizing the maximum of the nearest-neighbour distance between the original survey and the subset locations.

The εt model temporal correlation via a stationary autoregressive process of order one, i.e. e 1~ e1Normal(0,σT2(1)/(1-γ2)) and e te 1,…t−1~ Normal ete1,t-1Normal(γ(1)et-1,σT2(1)),t2, where γ(1) is an autocorrelation parameter γ(1)<1 which adopts a bounded uniform distribution, U*N(0,Σ*) and σT2(1) is the temporal error (51). The φi is assumed to follow a normal distribution with mean zero and a homoscedastic variance σe2(1). Inverse gamma priors are adopted for the variance parameters σsp2(1), σT2(1)and σe2(1).

Model for SR

Let N it and Z it be the number of mosquitoes tested and number infected, respectively at location s i and calendar month t. Z it is assumed to arise from a binomial distribution, Z it ~Bin(N it ,π it ) , where π it measures the SR at location s i and time t. The regression function links the SR with other terms of the model (as shown for the density data) and is given as logit logit(πit)=XT(2)β(2)+f(t)(2)+Ui+εt+φi. A similar specification described for the density model is followed in this model.

Data management and environmental lags

To facilitate the assessment of the seasonal pattern, data were summarized by location and calendar month. That implies that all repeated surveys from a specific location within the same month were collapsed (sum of mosquito density/tested and positive) to a single observation.

To account for the environmental-lag-effect on mosquito density or SR, non-spatial (negative) binomial models (with/without zero inflation) were fitted and best lags were assessed. Lags refer to a climate/ environment value at different time intervals prior to the study date that might influence the amount of mosquitoes collected or the SR. Lags considered include the current month (month of collection of mosquitoes); 1/2/3 month(s) prior to the collection; average of current and one previous month; average of one and two previous months; and lastly average of current, one and two previous months. The analysis took into account seasonality, distance from water bodies and time (annual effect) which was incorporated as a binary variable indicating the year of study. Analysis was conducted separately for each species. Fitted values from models with all possible combinations of the environmental lags were calculated and plotted against the observed values (mosquito counts or SR). The combination which best fits the data was used for further analysis. This was implemented in STATA 10 (Stata Corps).

Model validation and prediction

Models were fitted using a training set (85% of the data) randomly selected from the entire data. Validation of the model performance was done on the test locations (the remaining 15% of the data) The predictive ability of the model was assessed by specifically calculate different credible intervals with different probability coverage of the posterior predictive distribution and compare the percentage of test locations correctly predicted within these credible intervals (Citation52). The best predictive ability of the model is observed when higher the number of test locations falls within the narrowest credible interval. The predicted power of the model at 95% credible interval is reported.

Using the estimates obtained from the models, SR and mosquito density were predicted for the whole Rufiji site. The prediction was done at the 250 m resolution.

Calculation of EIR

The EIR can be estimated as a product of the SR and human-biting rate. Depending on the mosquito collection method used (human landing, light trap, etc.), the human-biting rate can be correctly approximated either by the number of blood meals taken on humans/mosquito/day or by the mosquito density. Established correlation between number of mosquitoes captured by light traps and human landing catches is usually used to adjust light trap collection to equivalence of biting catches and avoid collection bias (Citation53). For this study, EIR was calculated as a product of SR and mosquito density and then adjusted using a correction factor of 1.605 to calibrate estimates obtained from light trap collection (Citation28, Citation53) (Citation54).

At a specific pixel j and month t the predicted values of SR, πˆjt and mosquito density, μˆjt were obtained for A. funestus and A. gambiae species. EIR estimates representing the infectious bite/person/day were calculated as:

where 1.605 is the correction factor.

The EˆRjt was then multiplied by 30.5 and 365 to obtain monthly and annual estimates, respectively. Monthly and annual maps were produced to show seasonal and temporal trends of the transmission.

Geostatistical model implementation

The final model was implemented in OpenBUGS and parameters were estimated using the Gibbs sampler MCMC algorithm. The spatial variance parameter was sampled directly from its inverse gamma full conditional distributions using Gibbs sampling (Citation39). The remaining parameters were simulated using Metropolis algorithm with a normal proposal distribution. The mean of the proposal distribution was the parameter estimated from the previous iteration with a fixed variance (Citation55, Citation56). Two separate chains were run in parallel with a total of 150,000 iterations each. A burn-in of 20,000 iterations was done and the last 5,000 and 1,000 samples were used for posterior inference and prediction, respectively. The Gelman-Rubin model diagnostic tool (Citation57) was used to assess convergence of chains before summarizing the results. The package ‘fields’ in R was used for selection of knots. For practical implementation of the geostatistical model 281 knots (2,479 unique locations) were selected for the density data (both species), 177 (415 unique locations) for SR analysis of A. funestus and 219 (639 unique locations) for SR of A. gambiae. Predictions and calculations of EIR were done in Fortran 95 (Compaq Visual Fortran Professional 6.6.0).

Results

Data description

In total of 2,479 unique locations were visited for the collection of the mosquitoes. A total of 15,983 A. funestus (from 18% of the surveyed locations, n=447) and 17,885 A. gambiae (from 27.3% of the surveyed locations, n=678) mosquitoes were captured. About 83 and 74.3% of the visits for mosquito collection for A. funestus and A. gambiae received zero counts. The crude annual SRs were 3.3, 2.8 and 3.2% for Year 1 (October 01–September 02), Year 2 (October 02–September 03) and Year 3 (October 03–September 04), respectively. The crude EIR were 507, 72.8 and 146 infectious bites/person/year for 3 years respectively. In , the relation between rainfall, temperature and mosquito density is shown (data collapsed in a period of one calendar year).

Fig. 1 Seasonal variations of (A) rainfall, temperature and (B) mosquitoes densities of A. gambiae and A. funestus in the Rufiji DSS October 2001–September 2004.

Fig. 1 Seasonal variations of (A) rainfall, temperature and (B) mosquitoes densities of A. gambiae and A. funestus in the Rufiji DSS October 2001–September 2004.

Most A. gambiae mosquitoes were captured during the months of April and May while most A. funestus were collected in the period of July–September. The number of A. gambiae collected was higher during the heavy rains while short rains with high temperature favour the population of A. funestus ().

Geostatistical model results

summarizes the results of parameter estimations from a multivariate geostatistical models on SRs and mosquito density.

Table 1 Results of association of environment/climate variables on sporozoite rate and mosquito density and spatio-temporal parameters

The effect of environmental variables differs significantly between species. Rain and temperature are the most influencing factors for density and sporozoite with higher effect on the A. funestus species. No significant effect of distance to the water bodies was obtained. highly pronounced with a significant decrease of mosquito population in Year 2 as compared to Year 1 and later an increase in the Year 3 as compared to Year 2. Spatial ranges are quite high especially for the SRs. The estimate of the over-dispersion parameter of A. funestus is twice as large as that of A. gambiae which could be influenced by the amount of zero counts in the data. However, the estimate of r is larger than 1 indicating that the data are not highly overdispersed (Citation58). Day temperature significantly reduces the probability of observing zero mosquito counts. Spatial variability accounts more for the total variability in the data as compared to the non-spatial and temporal variability.

For a total of 63, 99, 368 and 368 test locations selected for validation of SR-AF, SR-AG, Density-AF and Density-AG models respectively, 68.3, 63.6, 84.1 and 89.9% of the locations were correctly predicted within 95% credible interval. Gelman-Rubin diagnostics indicated good convergence of all model parameters.

Mapping of EIR

presents selected EIR maps for the Rufiji DSS site for the A. funestus and A. gambiae.

Fig. 2 Selected EIR maps showing the spatial distribution and the seasonal pattern, for the period of Oct 2001–Sept 2004. (A) Dry months followed by the period of short rains, (B) Months immediately after the onset of heavy rains during the first year (very wet), (C) Months immediately after the onset of heavy rains during the second year (dry) and (D) Months immediately after the onset of heavy rain season during the third year (normal rains).

Fig. 2 Selected EIR maps showing the spatial distribution and the seasonal pattern, for the period of Oct 2001–Sept 2004. (A) Dry months followed by the period of short rains, (B) Months immediately after the onset of heavy rains during the first year (very wet), (C) Months immediately after the onset of heavy rains during the second year (dry) and (D) Months immediately after the onset of heavy rain season during the third year (normal rains).

The southern part of the DSS showed high transmission throughout the years. High transmission was observed immediately at the onset of rains, especially during the heavy rain period. At the end of the rainy season (May–June), the transmission spread throughout the region ().

In , monthly time series (median) predicted EIR are plotted for the entire study period. Attributes of each species are also indicated.

Fig. 3 Predicted monthly EIR median and attribute of each species in Rufiji DSS.

Fig. 3 Predicted monthly EIR median and attribute of each species in Rufiji DSS.

The transmission starts peaking in the month of April (just after rains) and gradually drops in July (first year of the study). There was a reduction in the second year of the study and EIR increased again during the last year. A similar monthly trend is observed across years, which emphasizes seasonality. A. funestus are more prominent during the dry months while A. gambiae are more prominent during the rainy periods. The spatial temporal distribution of year-by-year EIR is shown in with maps of prediction error. The prediction error for the EIR estimates was obtained my multiplying the prediction errors obtained from SR and density models.

Fig. 4 Spatial temporal distribution of annual EIR with prediction error maps.

Fig. 4 Spatial temporal distribution of annual EIR with prediction error maps.

Patterns in show that few surveyed households are located in areas with EIR<1; however, a large proportion of household presented high transmission intensity. Higher prediction errors are seen in areas with few surveyed locations. The errors also capture the effect of heterogeneity arising from unmeasured factors.

Population-adjusted EIR

The annual and species-specific population-adjusted EIR were calculated by averaging predicted inoculation rates at all households (N=14,516) within the RDSS () excluding all of the other pixels. Results are presented in .

Fig. 5 Distribution of households in the Rufiji DSS area (Source: TEHIP, 2002).

Fig. 5 Distribution of households in the Rufiji DSS area (Source: TEHIP, 2002).

Table 2 Overall predicted EIR with the percent attribute of each species

Overall transmission intensity reduced significantly during Year 2 and 3 as compared to Year 1 of the study. A. funestus was the main responsible vector for transmission in the first (68%) and second (78%) year, while the last year transmission was mainly driven by A. gambiae (63%).

In addition, we assessed the spatial shift (distribution) of transmission intensity over time, as illustrated in . EIR were categorized into five transmission intensities which were: no transmission (EIR=0), very low (EIR≥0.0–1), low (EIR≥1–10), average (EIR≥10–100), and high (EIR≥100). The change in the percentage of households exposed to a specific level of transmission was then studied.

Table 3 Distribution of predicted EIR over the RDSS area by Year, N* (%)

The proportion of households predicted with very low transmission intensity increased between the first year and the third year of the study, from 4.0 to 7.2%. A significant reduction (over 68%) of locations with high transmission is seen during the last year of the study (i.e. 12.6% in the first year to 4% in the third year).

Discussion

In this study, we assessed spatial–temporal variation and heterogeneity of malaria transmission in the Rufiji DSS site using a large geo-referenced biweekly entomological dataset collected over 3 years, and rigorous Bayesian geostatistical models. Our work is amongst the few to address spatial modelling of Entomology inoculation rate (EIR) based on sparse data by applying current Bayesian methodologies approximating spatial processes for large data. The INDEPTH-MTIMBA data, which was used in our application, is the most comprehensive entomological database in Africa. Bayesian spatio-temporal binomial and zero inflated negative binomial regression models were developed to produce monthly maps of EIR taking into account the malaria–climate relation and seasonality in transmission (Citation35, Citation36) (Citation59Citation61).

Geostatistical models have been widely used in malaria mapping in recent years (Citation3, Citation38) (Citation52, Citation62Citation64). Most of these analysis involved standard geostatistical models which are relevant for a moderate number of locations. Computation involved in these models is not feasible for data collected over a large number of survey locations. In this study, we used methods proposed by Barnejee et al. (Citation40) and Finley et al. (Citation42) to approximate the spatial process using a subset of survey locations selected via space filling design implemented in R software. Additive temporal correlations with autoregressive structure were also incorporated in all models. The predictive power of the model suggests good performance of the spatial correlation approximated from a subset of observed location. That might indicate that the subset selected was significantly appropriate. This work adds to the few in literature that indirect evaluates performance of using subsets to approximate the spatial process in real-life field data.

Changes in climate conditions, natural inhabitants and other human activities, which depend on the environment, alter the intensity of malaria transmission (Citation21, Citation65). Our results depict temporal and seasonal variation in EIR along the study period and study area. Transmission was higher during the rainy periods with high temperatures and very low during the dry season or year. Two species A. funestus and A. gambiae are mainly responsible for malaria transmission in this region. Differences on the effect of environmental factors on the mosquito abundance and SRs of the species were observed. The population of A. gambiae increases at the onset of heavy rains while that of A. funestus peaks during the short rains season. Similar results have been reported in the Kilombero valley and other areas with similar climate in Africa and are associated with the preferential conditions of breeding sites of these species (Citation13, Citation16) (Citation66Citation70). A study, which assessed spatio-temporal variation of EIR in Navrongo DSS, showed similar patterns of seasonality that differed by species (Citation28). Highly significant effects of temperature on the SR and density of A. funestus were observed. Contrary to A. gambiae which has relatively exophilic behaviour, this species is strictly endophilic, which could facilitate choice of conducive a resting environment favouring the gonotrophic cycle resulting in higher survival and hence longer infectivity (Citation71Citation73). Knowledge of these characteristics can be important for understanding disease dynamics and for efficient implementation of interventions (Citation5, Citation6) (Citation66, Citation74).

There was considerable variation over short distances in the intensity of transmission. Small-scale variations in malaria transmission are common in sub-Saharan Africa and create complexity in implementing strategies to combat malaria (Citation8, Citation28) (Citation59, Citation75Citation77). The spatial correlation was still present over a substantial distance and the spatial variation comprised of about 90% of the total data variance. The spatial correlation arises partly due to spatial pattern in environmental drivers of transmission, partly due to effects of limited mosquito dispersion, and is also affected by human factors such as migration and human population densities (Citation41, Citation42). We had an abundance of data on both mosquito and human populations; however, due to the relative small DSS area, it is difficult to separate the contributions of these different factors to the spatial correlation, which explains the higher spatial range. Such heterogeneity arising from unmeasured factors is captured by the prediction errors.

The methodology described in this study allows estimation of EIR while adjusting for both, temporal and small area spatial variations in a systematic and thorough manner. It acknowledges key characteristic of the data, considers computation difficulties and correlation among potential drivers of malaria transmission. It could be seen that the crude EIR were underestimated as compared to model-based estimations by over 55%. This underlines the importance of utilizing efficient methodologies while estimating epidemiological parameters to allow for proper decisions.

Our formulation allows further expansion and easy incorporation of other covariates in the main structure of the model either as specific covariates or their interaction. The complex component of our proposal is how to separately model SR and density data, incorporate seasonality, choosing environment lags and lastly approximate the spatial correlation when the large number of location has been observed. All these have been worked on. Moreover, DSS sites including Rufiji, collected comprehensive records of all-cause and disease mortality in the human population at the time of this entomological surveillance. The exposure surfaces estimated using this approach can be linked to mortality data to assess the malaria-specific mortality burden. Through that, much more accurate estimates of the benefits to be gained by reducing malaria transmission can be estimated than if it would have been possible from analyses that aggregate EIR over large areas and time periods or those fitted assuming normally distributed EIR.

Conflict of interest and funding

The authors declare that they have no competing interests.

Acknowledgements

We are grateful to the community in the RDSS and staff from IHI who devoted their time and enormous energy to the study. The study received funding from the Swiss National Science Foundation (Nr 325200-118379/1).

References

  • WHO. World malaria report. 2012; Geneva: World Health Organization.
  • Hay SI, Snow RW. The malaria atlas project: developing global maps of malaria risk. PLoS Med. 2006; 3: e473. [PubMed Abstract] [PubMed CentralFull Text].
  • Hay SI, Guerra CA, Gething PW, Patil AP, Tatem AJ, Noor AM, etal. A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med. 2009; 6: e1000048. [PubMed Abstract] [PubMed CentralFull Text].
  • Parham PE, Michael E. Modeling the effects of weather and climate change on malaria transmission. Environ Health Perspect. 2010; 118: 620–6. [PubMed Abstract] [PubMed CentralFull Text].
  • Fontenille D, Lochouarn L, Diatta M, Sokhna C, Dia I, Diagne N, etal. Four years’ entomological study of the transmission of seasonal malaria in Senegal and the bionomics of Anopheles gambiae and A. arabiensis. Trans R Soc Trop Med Hyg. 1997; 91: 647–52. [PubMed Abstract].
  • Beier JC, Killeen GF, Githure JI. Short report: entomologic inoculation rates and Plasmodium falciparum malaria prevalence in Africa. Am J Trop Med Hyg. 1999; 61: 109–13. [PubMed Abstract].
  • Hay SI, Rogers DJ, Toomer JF, Snow RW. Annual Plasmodium falciparum entomological inoculation rates (EIR) across Africa: literature survey, Internet access and review. Trans R Soc Trop Med Hyg. 2000; 94: 113–27. [PubMed Abstract] [PubMed CentralFull Text].
  • Mboera LEG, Senkoro KP, Mayala BK, Rumisha SF, Rwegoshora RT, Mlozi MRS, etal. Spatio-temporal variation in malaria transmission intensity in five agro-ecosystems in Mvomero district, Tanzania. Geospat Health. 2010; 4: 167–78. [PubMed Abstract].
  • Nedelman J. A negative binomial model for sampling mosquitoes in a malaria survey. Biometrics. 1983; 39: 1009–20. [PubMed Abstract].
  • Alexander N, Moyeed R, Stander J. Spatial modelling of individual-level parasite counts using the negative binomial distribution. Biostatistics. 2000; 1: 453–63. [PubMed Abstract].
  • Michael E. Quantifying mosquito biting patterns on humans by DNA fingerprinting of bloodmeals. Am J Trop Med Hyg. 2001; 65: 722. [PubMed Abstract].
  • Shaukat AM, Breman JG, McKenzie FE. Using the entomological inoculation rate to assess the impact of vector control on malaria parasite transmission and elimination. Malar J. 2010; 9: 122. [PubMed Abstract].
  • Smith T, Charlwood JD, Kihonda J, Mwankusye S, Billingsley P, Meuwissen J, etal. Absence of seasonal variation in malaria parasitaemia in an area of intense seasonal transmission. Acta Tropica. 1993; 54: 55–72. [PubMed Abstract].
  • Killeen GF, McKenzie FE, Foy BD, Schieffelin C, Billingsley PF, Beier JC. A simplified model for predicting malaria entomologic inoculation rates based on entomologic and parasitologic parameters relevant to control. Am J Trop Med Hyg. 2000; 62: 535–44. [PubMed Abstract] [PubMed CentralFull Text].
  • Lee HI, Lee JS, Shin EH, Lee WJ, Kim YY, Lee KR. Malaria transmission potential by Anopheles sinensis in the Republic of Korea. Korean J Parasitol. 2001; 39: 185–92. [PubMed Abstract] [PubMed CentralFull Text].
  • Kelly-Hope LA, McKenzie FE. The multiplicity of malaria transmission: a review of entomological inoculation rate measurements and methods across sub-Saharan Africa. Malar J. 2009; 8: 19. [PubMed Abstract] [PubMed CentralFull Text].
  • Snow RW, Craig MH, Deichmann U, le Sueur D. A preliminary continental risk map for malaria mortality among African children. Parasitol Today (Regul Ed). 1999; 15: 99–104.
  • Woolhouse MEJ, Dye C, Etard J-F, Smith T, Charlwood JD, Garnett GP, etal. Heterogeneities in the transmission of infectious agents: implications for the design of control programs. Proc Natl Acad Sci U S A. 1997; 94: 338–42. [PubMed Abstract] [PubMed CentralFull Text].
  • Brogan R, Zhao C. Designing a longitudinal study [microform]: issues, problems & concerns. [S.l.]: distributed by ERIC Clearinghouse. 1992. Available from: http://www.eric.ed.gov/contentdelivery/servlet/ERICServlet?accno=ED401316 [cited 23 August 2013]..
  • Smith TA, Leuenberger R, Lengeler C. Child mortality and malaria transmission intensity in Africa. Trends Parasitol. 2001; 17: 145–9. [PubMed Abstract].
  • Thomson MC, Connor SJ. The development of malaria early warning systems for Africa. Trends Parasitol. 2001; 17: 438–45. [PubMed Abstract].
  • Sankoh OA, Binka F. Takken W, Martens P, Bogers RJ. INDEPTH Network: a viable platform for the assessment of malaria risk in developing countries. Environmental change and malaria risk-global and local implications. 2005; Berlin: Springer Verlag.
  • Gatton M. Environmental factors and malaria transmission risk. Lancet Infect Dis. 2010; 10: 15–6.
  • Killeen GF, Tami A, Kihonda J, Okumu FO, Kotas ME, Grundmann H, etal. Cost-sharing strategies combining targeted public subsidies with private-sector delivery achieve high bednet coverage and reduced malaria transmission in Kilombero Valley, southern Tanzania. BMC Infect Dis. 2007; 7: 121. [PubMed Abstract] [PubMed CentralFull Text].
  • Leisnham PT, Slaney DP, Lester PJ, Weinstein P, Heath ACG. Mosquito density, macroinvertebrate diversity, and water chemistry in water-filled containers: relationships to land use. New Zeal J Zool. 2007; 34: 203–18.
  • Chase JM, Shulman RS. Wetland isolation facilitates larval mosquito density through the reduction of predators. Ecol Entomol. 2009; 34: 741–7.
  • Kweka EJ, Mwang'onde BJ, Mahande AM. Optimization of odour-baited resting boxes for sampling malaria vector, Anopheles arabiensis Patton, in arid and highland areas of Africa. Parasit Vectors. 2010; 3: 75. [PubMed Abstract] [PubMed CentralFull Text].
  • Kasasa S, Asoala V, Gosoniu L, Anto F, Adjuik M, Tindana C, etal. Spatio-temporal malaria transmission patterns in Navrongo demographic surveillance site, northern Ghana. Malar J. 2013; 12: 63. [PubMed Abstract] [PubMed CentralFull Text].
  • Greene WH. Accounting for excess zeros and sample selection in Poisson and negative binomial regression models. 1994; Leonard N. Stern School of Business: Technical report. New York: New York University. 34.
  • Cheung YB. Zero-inflated models for regression analysis of count data: a study of growth and development. Stat Med. 2002; 21: 1461–9. [PubMed Abstract].
  • Yau KKW, Wang K, Lee AH. Zero-inflated negative binomial mixed regression modeling of over-dispersed count data with extra zeros. Biometrical J. 2003; 45: 437–52.
  • Ryan PA, Lyons SA, Alsemgeest D, Thomas P, Kay BH. Spatial statistical analysis of adult mosquito (Diptera: Culicidae) counts: an example using light trap data, in Redland Shire, Southeastern Queensland, Australia. J Med Entomol. 2004; 41: 1143–56. [PubMed Abstract].
  • Ridout M, Hinde J, DeméAtrio CGB. A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives. Biometrics. 2001; 57: 219–23. [PubMed Abstract].
  • Agarwal DK, Gelfand AE, Citron-Pousty S. Zero-inflated models with application to spatial count data. Environ Ecol Stat. 2002; 9: 341–55.
  • Warton DI. Many zeros does not mean zero inflation: comparing the goodness-of-fit of parametric models to multivariate abundance data. Environmetrics. 2005; 16: 275–89.
  • Sogoba N, Vounatsou P, Doumbia S, Bagayoko M, Touré MB, Sissoko IM, etal. Spatial analysis of malaria transmission parameters in the rice cultivation area of Office du Niger, Mali. Am J Trop Med Hyg. 2007; 76: 1009–15. [PubMed Abstract].
  • Cressie NAC. Statistics for spatial data. 1993; New York: Wiley-Interscience.
  • Diggle PJ, Tawn JA, Moyeed RA. Model-based geostatistics. J Roy Stat Soc C Appl Stat. 1998; 47: 299–350.
  • Gelfand AE, Smith AFM. Sampling-based approaches to calculating marginal densities. J Am Stat Assoc. 1990; 85: 398–409.
  • Banerjee S, Gelfand AE, Finley AO, Sang H. Gaussian predictive process models for large spatial data sets. J R Stat Soc Series B Stat Methodol. 2008; 70: 825–48. [PubMed Abstract] [PubMed CentralFull Text].
  • Eidsvik J, Finley AO, Banerjee S, Rue H. Approximate Bayesian inference for large spatial datasets using predictive process models. Comput Stat Data Anal. 2012; 56: 1362–80.
  • Finley AO, Sang H, Banerjee S, Gelfand AE. Improving the performance of predictive process modeling for large datasets. Comput Stat Data Anal. 2009; 53: 2873–84. [PubMed Abstract] [PubMed CentralFull Text].
  • Rumisha SF. Modelling the seasonal and spatial variation of malaria transmission in relation to mortality in Africa. 2013; PhD thesis, University of Basel, Basel, Switzerland.
  • Mwageni E, Momburi D, Juma Z, Irema M, Masanja H, TEHIP and AMMP Teams. INDEPTH monograph series: demographic surveillance systems for assessing populations and their health in developing countries. 2002; Ottawa: IDRC/CRDI.
  • Stolwijk AM, Straatman H, Zielhuis GA. Studying seasonality by using sine and cosine functions in regression analysis. J Epidemiol Community Health. 1999; 53: 235–8. [PubMed Abstract] [PubMed CentralFull Text].
  • Rau R. Seasonality in human mortality – a demographic approach. 1st ed. 2006; Springer. Available from: http://www.springer.com/economics/population/book/978-3-540-44900-3 [cited 23 Aug 2013]..
  • Lambert D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics. 1992; 34: 1–14.
  • Seeger M. Bayesian Gaussian Process Models: PAC-Bayesian generalisation error bounds and sparse approximations. 2003. Available from: http://www.era.lib.ed.ac.uk/handle/1842/321 [cited 23 Aug 2013]..
  • Xia G, Gelfand AE. Stationary process approximation for the analysis of large spatial datasets. Technical Report. 2005; Durham, NC: ISDS, Duke University.
  • Johnson ME, Moore LM, Ylvisaker D. Minimax and maximin distance designs. J Stat Plann Infer. 1990; 26: 131–48.
  • Hay JL, Pettitt AN. Bayesian analysis of a time series of counts with covariates: an application to the control of an infectious disease. Biostatistics. 2001; 2: 433–44. [PubMed Abstract].
  • Gosoniu L, Vounatsou P, Sogoba N, Smith T. Bayesian modelling of geostatistical malaria risk data. Geospat Health. 2006; 1: 127–39. [PubMed Abstract].
  • Lines JD, Wilkes TJ, Lyimo EO. Human malaria infectiousness measured by age-specific sporozoite rates in Anopheles gambiae in Tanzania. Parasitology. 1991; 102 Pt 2: 167–77. [PubMed Abstract].
  • Amek N, Bayoh N, Hamel M, Lindblade KA, Gimnig JE, Odhiambo F, etal. Spatial and temporal dynamics of malaria transmission in rural Western Kenya. Parasites & Vectors. 2012; 5: 86. [PubMed Abstract] [PubMed CentralFull Text].
  • Hastings WK. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. 1970; 57: 97–109.
  • Metropolis N. The beginning of the Monte Carlo method. Los Alamos Science. 1987; 15: 125.
  • Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Statist Sci. 1992; 7: 457–72.
  • Lloyd-Smith JO. Maximum likelihood estimation of the negative binomial dispersion parameter for highly overdispersed data, with applications to infectious diseases. PLoS One. 2007; 2: e180. [PubMed Abstract] [PubMed CentralFull Text].
  • Thomas CJ, Lindsay SW. Local-scale variation in malaria infection amongst rural Gambian children estimated by satellite remote sensing. Trans R Soc Trop Med Hyg. 2000; 94: 159–63. [PubMed Abstract].
  • Gosoniu L. Development of Bayesian geostatistical models with applications in malaria epidemiology [doctoral]. 2008; University of Basel. Available from: http://ihi.eprints.org/1131/ [cited 23 August 2013]..
  • Reid H, Vallely A, Taleo G, Tatem AJ, Kelly G, Riley I, etal. Baseline spatial distribution of malaria prior to an elimination programme in Vanuatu. Malar J. 2010; 9: 150. [PubMed Abstract] [PubMed CentralFull Text].
  • Amek N, Bayoh N, Hamel M, Lindblade KA, Gimnig J, Laserson KF, etal. Spatio-temporal modeling of sparse geostatistical malaria sporozoite rate data using a zero inflated binomial model. Spat Spatiotemporal Epidemiol. 2011; 2: 283–90. [PubMed Abstract].
  • Gething PW, Patil AP, Smith DL, Guerra CA, Elyazar IRF, Johnston GL, etal. A new world malaria map: Plasmodium falciparum endemicity in 2010. Malar J. 2011; 10: 378. [PubMed Abstract] [PubMed CentralFull Text].
  • Giardina F, Gosoniu L, Konate L, Diouf MB, Perry R, Gaye O, etal. Estimating the Burden of Malaria in Senegal: Bayesian Zero-Inflated Binomial Geostatistical Modeling of the MIS 2008 Data. PLoS ONE. 2012; 7: e32625. [PubMed Abstract] [PubMed CentralFull Text].
  • Snow RW, Nahlen B, Palmer A, Donnelly CA, Gupta S, Marsh K. Risk of severe malaria among African infants: direct evidence of clinical protection during early infancy. J Infect Dis. 1998; 177: 819–22. [PubMed Abstract].
  • Gillies MT, de Meillon B. The Anophelinae of Africa south of the Sahara (Ethiopian Zoogeographical Region). Publ S Afr Inst Med Res. 1968; 54: 3+343.
  • Charlwood JD, Vij R, Billingsley PF. Dry season refugia of malaria-transmitting mosquitoes in a dry savannah zone of east Africa. Am J Trop Med Hyg. 2000; 62: 726–32. [PubMed Abstract].
  • Warrell DA, Gilles HM. Essential malariology. 2003; London, UK: Arnold. 4th edition.
  • Guelbeogo WM, Sagnon N, Grushko O, Yameogo MA, Boccolini D, Besansky NJ, etal. Seasonal distribution of Anopheles funestus chromosomal forms from Burkina Faso. Malar J. 2009; 8: 239. [PubMed Abstract] [PubMed CentralFull Text].
  • Adja AM, N'goran EK, Koudou BG, Dia I, Kengne P, Fontenille D, etal. Contribution of Anopheles funestus, An. gambiae and An. nili (Diptera: Culicidae) to the perennial malaria transmission in the southern and western forest areas of Côte d'Ivoire. Ann Trop Med Parasitol. 2011; 105: 13–24. [PubMed Abstract].
  • Charlwood JD, Qassim M, Elnsur EI, Donnelly M, Petrarca V, Billingsley PF, etal. The impact of indoor residual spraying with malathion on malaria in refugee camps in eastern Sudan. Acta Trop. 2001; 80: 1–8. [PubMed Abstract].
  • Kent RJ, Coetzee M, Mharakurwa S, Norris DE. Feeding and indoor resting behaviour of the mosquito Anopheles longipalpis in an area of hyperendemic malaria transmission in southern Zambia. Med Vet Entomol. 2006; 20: 459–63. [PubMed Abstract].
  • Atieli H, Menya D, Githeko A, Scott T. House design modifications reduce indoor resting malaria vector densities in rice irrigation scheme area in western Kenya. Malar J. 2009; 8: 108. [PubMed Abstract] [PubMed CentralFull Text].
  • Thomson MC, Mason SJ, Phindela T, Connor SJ. Use of rainfall and sea surface temperature monitoring for malaria early warning in Botswana. Am J Trop Med Hyg. 2005; 73: 214–21. [PubMed Abstract].
  • Drakeley CJ, Carneiro I, Reyburn H, Malima R, Lusingu JPA, Cox J, etal. Altitude-dependent and -independent variations in Plasmodium falciparum prevalence in northeastern Tanzania. J Infect Dis. 2005; 191: 1589–98. [PubMed Abstract].
  • Stewart L, Gosling R, Griffin J, Gesase S, Campo J, Hashim R, etal. Rapid assessment of malaria transmission using age-specific sero-conversion rates. PLoS ONE. 2009; 4: e6083. [PubMed Abstract] [PubMed CentralFull Text].
  • Bousema T, Okell L, Shekalaghe S, Griffin JT, Omar S, Sawa P, etal. Revisiting the circulation time of Plasmodium falciparum gametocytes: molecular detection methods to estimate the duration of gametocyte carriage and the effect of gametocytocidal drugs. Malar J. 2010; 9: 136. [PubMed Abstract] [PubMed CentralFull Text].