Research Article

A regional study to evaluate the impact of coal-fired power plants on lung cancer incident rates

Article: 2348469 | Received 28 Dec 2023, Accepted 24 Apr 2024, Published online: 06 Jun 2024

Abstract

Background: Lung cancer is the leading cause of cancer-related deaths. In Kansas, where coal-fired power plants account for 34% of power generation, we investigated whether counties hosting these plants had higher age-adjusted lung cancer incidence rates. We also examined demographics, poverty levels, percentage of smokers, and environmental conditions using spatial analysis.

Methods: Data from Kansas Health Matters and the Behavioral Risk Factor Surveillance System (2010–2014) for 105 counties in Kansas were analyzed. Multiple Linear Regression (MLR) assessed associations between risk factors and age-adjusted lung cancer incidence rates, while Geographically Weighted Regression (GWR) examined regional risk factors.

Results: Moran’s I test confirmed spatial autocorrelation in age-adjusted lung cancer incidence rates (p < 0.0003). MLR identified percentage of smokers, population size, and proportion of elderly population as significant predictors of age-adjusted lung cancer incidence rates (p < 0.05). GWR showed positive associations between percentage of smokers and age-adjusted lung cancer incidence rates in the majority of counties.

Conclusion: Proximity to a coal-fired power plant was not a significant predictor of age-adjusted lung cancer incidence rates. Instead, percentage of smokers was a consistent global and regional risk factor. Regional lung cancer outcomes in Kansas are influenced by wind patterns and elderly population.

Introduction

Lung cancer symptoms usually appear when the cancer is at an advanced stage, at which point chemotherapy may not be effective [Citation1]. The survival rate for lung cancer patients is 25.4%, which is low compared to other cancer types, mainly because lung cancer is diagnosed later than other cancers such as breast cancer [Citation2]. Lung cancer is a leading cause of cancer-related death worldwide among both men and women [Citation3]. In 2020, lung and bronchus cancer was the leading cause of cancer-related deaths in the United States, with a mortality rate of 31.8 per 100,000 people (95% CI 31.7, 31.9). It also ranked as the third most common cancer in terms of new cases, with an incidence rate of 47.1 per 100,000 people (95% CI 46.9, 47.3), highlighting its significant impact on public health. In 2020, Kansas faced a notable lung cancer burden, with an age-adjusted incidence rate of 52.5 per 100,000 people (95% CI 51.4, 53.6) and 1,892 reported cases. This rate surpassed the national average [Citation4].

Many risk factors are associated with lung cancer development. Cigarette smoking is an established cause of lung cancer; other significant risk factors include age, radon exposure, environmental pollution, occupational factors, gender, race, and pre-existing lung conditions [Citation3]. Cigarette smoking is conclusively established as the leading cause of lung cancer in both men and women and accounts for almost 85% of all lung cancer cases [Citation5–8]. The association between socioeconomic status and lung cancer occurrence is well documented: income, education, and economic status are each significantly associated with lung cancer risk [Citation9,Citation10]. These studies indicate that individuals in lower socioeconomic position (SEP) categories exhibit higher risks than those in the highest SEP level. In high Human Development Index (HDI) countries, lung cancer is one of the most common causes of cancer deaths, indicating significant geographical variation in lung cancer mortality rates [Citation11].

Various factors contribute to lung cancer, including exposure to environmental hazards such as asbestos, metals (such as chromium, cadmium, and arsenic), certain chemicals, radiation, coal smoke, and indoor emissions from fuel burning [Citation12,Citation13]. Noteworthy risk factors include firsthand smoking, exposure to toxic gases like radon, and exposure to environmental pollution, as highlighted by Schabath [Citation14] and Bernatsky et al. [Citation15]. Because air pollution and smoking are two major factors, much research focuses on understanding how pollution spreads geographically and how pollution levels influence cancer incidence [Citation16–18]. There is clear evidence that polycyclic aromatic hydrocarbons (PAHs), which are primarily emitted from traffic exhausts and fuel combustion, increase the risk of lung cancer [Citation19,Citation20]. Air pollution in the form of PAHs is also positively geographically correlated with coal-fired power plants [Citation21,Citation22]. Workers and residents near power plants are therefore exposed to PAH pollutants at a higher rate than others. One study has shown that coal-fired power plant workers have an increased cancer risk due to airborne PAH inhalation and dermal contact [Citation23]. People living near power plants also suffer from higher rates of premature mortality, respiratory diseases, lung cancer, cardiovascular disease, poorer child health, and higher infant mortality [Citation24]. On a related note, coal-fired power plants often pollute both the counties where they are located and adjacent counties. This is due to legislation that incentivizes placing power plants on county/state borders to minimize the monetary liability of intrastate pollution [Citation25].

Within this conceptual framework, it is critical to understand how coal-fired power plants impact lung cancer incidence rates across Kansas [Citation26]. Kansas contains 105 counties spread across 82,278 mi², with a state population of 2.93 million. Kansas has less air pollution than other highly populated states in the Midwestern United States [Citation27]. Coal-fired power plants are present in several counties across Kansas and contribute 34% of the state's overall power generation [Citation28]. Relating those power plants to county lung cancer rates requires a spatial statistical tool [Citation29].

The literature has explored the geographical variation in lung cancer incidence and mortality, along with the associated risk factors, by applying spatial analysis methodologies. A 1973–2013 study in China [Citation30] revealed a spatial pattern of lung cancer mortality that differs from Western countries. It identified a positive association between lung cancer mortality and urbanized areas with increased industrial development. Studies across diverse geographical locations and time spans consistently indicate a significant positive association between Particulate Matter 2.5 (PM2.5) and lung cancer mortality. This suggests that areas characterized by elevated environmental pollution face an increased risk of lung cancer [Citation31–34].

Our primary research aim was to conduct a spatial analysis to determine whether counties with coal-fired power plants exhibited a higher age-adjusted incidence of lung cancer compared to those without, using regression methodologies. Additionally, we explored the association between lung cancer age-adjusted incidence rates per 100,000 people in Kansas from 2010 to 2014 and various risk factors, such as smoking behavior, population demographics, and poverty level, also employing regression methodologies. The outcomes of this analysis have implications for resource allocation in healthcare, the necessity of legislation to protect the health of residents in these areas, and the intersection of socioeconomic factors with illness associated with coal-fired power plants.

Materials and methods

Data source

For this analysis, we utilized the dataset of age-adjusted lung cancer incidence rates per 100,000 people in Kansas, spanning five years (2010–2014). The data was obtained from the Kansas Health Matters (KHM) webpage, administered by the Kansas Department of Health and Environment (KDHE) [Citation35]. Age-adjusted incidence rates per 100,000 are calculated to allow fair comparisons of lung cancer incidence rates between populations with different age distributions [Citation36]. The dataset provides age-adjusted lung cancer incidence rates only for counties with 20 cases or more. In Kansas, there are 105 counties; however, our dataset only included 99 counties, with 6 counties censored for age-adjusted lung cancer incidence rates. For the 6 missing counties, age-adjusted lung cancer incidence rates were estimated using the K-Nearest Neighbors (KNN) algorithm, incorporating latitude and longitude to add a spatial feature to the estimation process [Citation37].
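The KNN estimation for the censored counties can be illustrated with a minimal sketch. The number of neighbors (k = 3), the unweighted mean, and Euclidean distance on latitude and longitude are assumptions made here for illustration; the study does not report these settings, and the coordinates and rates below are hypothetical.

```python
import numpy as np

def knn_impute(coords, rates, k=3):
    """Fill NaN rates with the mean rate of the k nearest observed counties."""
    coords = np.asarray(coords, dtype=float)
    rates = np.asarray(rates, dtype=float)
    filled = rates.copy()
    observed = ~np.isnan(rates)
    for i in np.where(~observed)[0]:
        # Euclidean distance in (latitude, longitude) space to observed counties
        d = np.linalg.norm(coords[observed] - coords[i], axis=1)
        nearest = np.argsort(d)[:k]
        filled[i] = rates[observed][nearest].mean()
    return filled

# Hypothetical county centroids (latitude, longitude) and rates per 100,000;
# the last county is censored (NaN) and receives an estimate.
coords = [(39.0, -95.0), (39.1, -95.2), (38.9, -95.1), (37.5, -100.0), (39.0, -95.1)]
rates = [60.0, 64.0, 62.0, 40.0, np.nan]
imputed = knn_impute(coords, rates, k=3)
print(imputed)
```

Because the estimate is an average of neighboring counties, it smooths the imputed values toward the local mean, which is one reason such estimates are not the true rates of the censored counties.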

Additionally, the study incorporates variables reflecting demographic, behavioral, and economic aspects of Kansas. These include the 2012 population (the midpoint of the study period) and the percentage of individuals aged 55 to 79, representing the demographic profile, as well as the percentage of people below the poverty level and the urban index as economic indicators. The 55 to 79 age range was chosen in accordance with the recommendation from the U.S. Preventive Services Task Force (USPSTF), which advises annual lung cancer screening for adults with a history of smoking within this demographic [Citation38]. The urban index is a continuous variable that takes values between 0 and 1, where a value closer to 1 implies greater urbanization. The percentage of smokers serves as a behavioral indicator and is defined as the proportion of Kansas adults aged 18 years and older who smoke. Data on the percentage of smokers, poverty level, and age distribution were taken from the Kansas Behavioral Risk Factor Surveillance System (BRFSS) annual reports for 2010–2014, published by KDHE [Citation39]. The urban index was taken from the Organize and Prioritize Trends to Inform KU Cancer Center (OPTIK) application [Citation40].

Kansas residents experience typical Midwest weather, with four major seasons. The Midwest climate can be described as humid continental, where temperatures vary drastically from summer to winter and appreciable precipitation occurs year-round. Because no large mountains block it, the wind moves freely across Kansas [Citation41]. To account for the movement of air pollutants across geographical areas due to wind effects, data on wind speed (W) and wind direction (θ), measured from north, were obtained from the Kansas Mesonet [Citation42]. These data were represented as vectors in the Cartesian coordinate system. Using trigonometric functions, the circular wind data were decomposed into northward and eastward wind effects in Kansas [Citation43]. The northward and eastward wind components are represented by W cos(θ) and W sin(θ), respectively. Additional information about wind decomposition is provided in Supplementary Materials S1.
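The decomposition into W cos(θ) and W sin(θ) can be sketched as follows. The function name and the assumption that θ is given in degrees, measured clockwise from north, are illustrative choices; the exact conventions used in the study are described in its Supplementary Materials.

```python
import math

def decompose_wind(speed, direction_deg):
    """Split a wind vector into northward and eastward components.

    Assumes direction_deg is the angle from north, in degrees (clockwise).
    """
    theta = math.radians(direction_deg)
    north = speed * math.cos(theta)   # W cos(theta)
    east = speed * math.sin(theta)    # W sin(theta)
    return north, east

# Hypothetical readings: 10 mph at 0, 90, and 225 degrees from north
for direction in (0, 90, 225):
    north, east = decompose_wind(10.0, direction)
    print(direction, round(north, 3), round(east, 3))
```

Working with the two components avoids the discontinuity of circular data, where 359 degrees and 1 degree are numerically far apart but physically almost the same direction.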

Air contamination in Kansas can be caused by the byproducts of coal-fired power plants. We utilized the minn_2020_Power_Plants dataset from ArcGIS online, derived from the Energy Information Administration (EIA), to determine the distance to coal-fired power plants. The dataset includes information on power plants, specifying their primary fuel source (i.e. coal, hydroelectric, wind, petroleum, biomass, solar, batteries, geothermal, and other sources used to generate electricity), location, names, utility details, geographical coordinates, and the amount of energy generated. In line with our study goals, we narrowed the dataset to include only coal-fired plants in Kansas. The analysis includes the distance, measured in miles, from each county to the five identified coal-fired power plants as an explanatory variable.

Method

The analysis involved several steps. Initially, we created a layer in ArcGIS focusing on lung cancer incidence rates. Subsequently, using the national power plant dataset, we identified and added a new layer for coal-fired power plants in Kansas, pinpointing the five plants in the state. Employing a spatial join, we combined these layers to explore areas affected by the dispersion of smoke from these power plants. The Buffer tool in ArcGIS was then used to generate a 25-mile radius around each coal-fired power plant. This visualization depicts the area potentially affected by emissions from the power plants. The 25-mile radius was chosen based on several critical factors outlined in prior research [Citation44–46].

In our spatial analysis, we primarily utilized two regression models: the Multiple Linear Regression (MLR) and Geographically Weighted Regression (GWR) models. The GWR method, introduced by Brunsdon et al. [Citation47], is a frequently employed spatial tool for modeling lung cancer incidence and mortality, particularly in cases where the spatial data show non-stationarity [Citation48–50]. These models were applied to assess the relationship between lung cancer age-adjusted incidence rates per 100,000 in Kansas and associated risk factors. The following sections provide brief explanations of these two regression methodologies.
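Outside of ArcGIS, the distance to the nearest plant and the 25-mile buffer membership could be approximated as in the sketch below. The great-circle (haversine) formula, the use of a single representative point per county, and all coordinates shown are assumptions for illustration only; they are not the study's ArcGIS workflow or data.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def nearest_plant_distance(county, plants):
    """Distance in miles from a county point to the closest plant."""
    return min(haversine_miles(county[0], county[1], p[0], p[1]) for p in plants)

# Hypothetical plant and county coordinates (latitude, longitude)
plants = [(39.0, -95.0), (38.0, -100.9)]
counties = {"A": (39.2, -95.0), "B": (38.5, -98.0)}
for name, point in counties.items():
    d = nearest_plant_distance(point, plants)
    print(name, round(d, 1), d <= 25.0)   # distance and 25-mile buffer flag
```

The continuous distance can enter a regression directly, while the boolean flag reproduces the inside/outside grouping implied by the buffer.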

Multiple linear regression

The Multiple Linear Regression (MLR) model is employed to establish a linear relationship between the lung cancer age-adjusted incidence rate per 100,000 population in Kansas and a set of predictor variables. Note that the linear regression model is global and does not consider the spatial distribution of the data. To incorporate spatial aspects into the MLR, this study introduces the distance to the nearest coal-fired power plant as an additional explanatory variable. The proposed multiple linear regression model is formulated as follows:

$$Y_i = \beta_0 + \sum_{k=1}^{p} \beta_k x_{k,i} + \epsilon_i, \tag{1}$$

where $x_{k,i}$ is the observed value of the $k$th predictor variable at the $i$th spatial location and $\beta_k$ is the parameter of $x_{k,i}$. Here, $\epsilon_i$ is the random error term of the $i$th spatial location. It is assumed to follow a normal distribution with zero mean and constant variance $\sigma^2$, such that $\epsilon_i \sim N(0, \sigma^2)$. Here, $k = 1, 2, \ldots, p$ and $i = 1, 2, \ldots, 105$, where $p$ is the number of predictor variables in the MLR model. The ordinary least squares estimate of the parameter vector $\beta$ is $\hat{\beta} = (\hat{\beta}_0, \ldots, \hat{\beta}_p)^T$, expressed as follows:

$$\hat{\beta} = (X^T X)^{-1} X^T Y,$$

where

$$X = \begin{bmatrix} 1 & x_{1,1} & \cdots & x_{p,1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,n} & \cdots & x_{p,n} \end{bmatrix}_{n \times (p+1)}$$

and $Y = (y_1, \ldots, y_n)^T$.
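The ordinary least squares estimate can be computed directly from the normal equations. The following is a minimal sketch on synthetic, noise-free data; the function name and the generated values are hypothetical and unrelated to the Kansas dataset.

```python
import numpy as np

def ols_fit(x, y):
    """Return the OLS estimate (X^T X)^{-1} X^T Y with an intercept column added."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), x])   # n x (p + 1) design matrix
    # Solve the normal equations rather than forming the inverse explicitly
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical data generated from y = 2 + 3*x1 - 1*x2 for n = 105 locations
rng = np.random.default_rng(0)
x = rng.normal(size=(105, 2))
y = 2.0 + 3.0 * x[:, 0] - 1.0 * x[:, 1]
beta_hat = ols_fit(x, y)
print(np.round(beta_hat, 4))
```

Solving the linear system is numerically preferable to inverting X^T X, although both correspond to the same estimator.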

Model selection, guided by the Akaike Information Criterion (AIC) and adjusted R² values, identifies the best-fit model. Subsequently, a residual analysis is conducted to assess whether the fitted model satisfies the model assumptions, including the Jarque-Bera (JB) test for normality. The Global Moran’s I statistic examines spatial autocorrelation in the residuals, while hotspot analysis visualizes spatial clusters of predictions. Koenker's studentized Breusch–Pagan (KB) test checks whether the model's explanatory variables relate consistently to the dependent variable in both geographic space and data space. Confirming this consistency (stationarity) helps detect and address potential issues such as heteroskedasticity.
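For readers unfamiliar with the Global Moran's I statistic, a minimal sketch of its computation is given below. The binary adjacency weights and the four-location example are assumptions for illustration; the spatial weights used in the study are not specified here, and the significance test (p-value) is omitted.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values at n locations and an n x n spatial weight matrix."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()                      # deviations from the mean
    w = np.asarray(weights, dtype=float)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)

# Hypothetical example: four locations on a line, neighbors share an edge
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
trend = morans_i([1, 2, 3, 4], w)          # similar values are adjacent
alternating = morans_i([1, -1, 1, -1], w)  # dissimilar values are adjacent
print(trend, alternating)
```

Positive values indicate that neighboring locations tend to have similar values, and negative values indicate that neighbors tend to differ.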

Geographically weighted regression

The multiple linear regression (MLR) model in Equation (1) is a global model that predicts lung cancer age-adjusted incidence rates from the predictor variables. The MLR model's reliability depends on adhering to key assumptions such as stationarity, homoscedasticity, and spatial randomness of regression errors. Non-stationarity in spatial data challenges conventional regression models, making Geographically Weighted Regression (GWR) a suitable alternative. The GWR accommodates regional variations by incorporating neighboring features and establishing a unique equation for each location. The GWR model is given below:

$$y_i = \beta_{0,i} + \sum_{k=1}^{p} \beta_{k,i} x_{k,i} + \epsilon_i, \tag{2}$$

where $x_{k,i}$ is the observed value of the $k$th predictor variable at the $i$th spatial location and $\beta_{k,i}$ is the parameter of $x_{k,i}$. Here, $\epsilon_i$ is the random error term of the $i$th spatial location. It is assumed to follow a normal distribution with zero mean and constant variance $\sigma^2$, such that $\epsilon_i \sim N(0, \sigma^2)$. Here, $k = 1, 2, \ldots, p$ and $i = 1, 2, \ldots, 105$, where $p$ is the number of predictor variables in the GWR model. According to Equation (2), each location in GWR has its own regression model. The parameter set $\beta_i$ of the $i$th location is estimated as:

$$\hat{\beta}_i = (X^T W_i X)^{-1} X^T W_i Y,$$

where

$$X = \begin{bmatrix} 1 & x_{1,1} & \cdots & x_{p,1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,n} & \cdots & x_{p,n} \end{bmatrix}_{n \times (p+1)},$$

$Y = (y_1, \ldots, y_n)^T$, and $W_i = \mathrm{diag}[w_{i1}, \ldots, w_{in}]$ is the $n$-by-$n$ weight matrix with weight $w_{ij}$ defined as:

$$w_{ij} = \exp\left[-\frac{1}{2}\left(\frac{d_{ij}}{\gamma}\right)^2\right].$$

Here, $d_{ij}$ is the geographical distance between the $i$th and $j$th spatial positions, and $\gamma$ is the bandwidth parameter, which controls the decay of spatial correlation.
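The local GWR estimator with Gaussian weights can be sketched in a few lines. This is an illustrative implementation under simplifying assumptions: a fixed bandwidth supplied by the user, Euclidean distances in the units of the coordinates, and synthetic data. It is not the software used in the study.

```python
import numpy as np

def gwr_fit(coords, x, y, bandwidth):
    """Return an n x (p + 1) array of local coefficients, one row per location."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    betas = np.empty((len(y), X.shape[1]))
    for i in range(len(y)):
        d = np.linalg.norm(coords - coords[i], axis=1)   # d_ij for fixed i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian kernel weights
        XtW = X.T * w                                    # equals X^T W_i
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Hypothetical data: 30 locations with a spatially constant relation y = 1 + 2x
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(30, 2))
x = rng.normal(size=30)
y = 1.0 + 2.0 * x
betas = gwr_fit(coords, x, y, bandwidth=3.0)
print(np.round(betas[:3], 4))
```

When the underlying relationship does not vary over space, as in this synthetic example, every local fit returns the same coefficients as the global model; spatial variation in the rows of `betas` is what signals non-stationarity.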

Results

Figure 1 shows that lung cancer age-adjusted incidence rates were high towards the east and southeast sides of Kansas. Further, Moran’s I spatial autocorrelation test (M = 0.2561, p-value < 0.0003) indicates, with 95% confidence, that spatial autocorrelation exists in the lung cancer age-adjusted incidence rate per 100,000 people for the state of Kansas during the five-year study period. Furthermore, the analysis of lung cancer age-adjusted incidence rates alongside the locations of coal-fired power plants reveals that counties without such plants and those with a single coal-fired power plant showed similar lung cancer incidence rates. However, counties within the 25-mile radius of influence of two coal-fired power plants exhibited a slightly higher lung cancer incidence rate compared to counties unaffected by the emission radius.

Figure 1. Lung cancer incident rate per 100,000 people for the state of Kansas, five-year period 2010–2014.


The counties in the state of Kansas are divided into two groups based on their distance to the nearest coal-fired power plant. There are 19 counties within 25 miles of the nearest power plant, while 86 counties lie outside the 25-mile radius. According to Figure 2, the median lung cancer age-adjusted incidence rate within the 25-mile radius is slightly higher than that of the counties outside it. The two-sample t-test (t = 1.8754, p-value = 0.0690) indicates, at the 5% significance level, that there is no significant difference between the mean lung cancer age-adjusted incidence rates per 100,000 people in these two geographical areas during the period 2010–2014. This implies that lung cancer age-adjusted incidence rates do not change significantly with proximity to a coal-fired power plant.
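A two-sample comparison of this kind can be sketched as follows. The study does not state whether a pooled-variance or unequal-variance test was used; the sketch assumes Welch's unequal-variance form, and the two small samples are hypothetical rather than the county data. The p-value would then be obtained from a t distribution with the returned degrees of freedom.

```python
import math

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((v - mean_a) ** 2 for v in a) / (na - 1)   # sample variances
    var_b = sum((v - mean_b) ** 2 for v in b) / (nb - 1)
    se2 = var_a / na + var_b / nb
    t = (mean_a - mean_b) / math.sqrt(se2)
    df = se2 ** 2 / ((var_a / na) ** 2 / (na - 1) + (var_b / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical rates for counties inside and outside a buffer
inside = [2.0, 3.0, 4.0, 5.0, 6.0]
outside = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, dof = welch_t(inside, outside)
print(t_stat, dof)
```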

Figure 2. Boxplot representing the relationship between Coal-Fired Power Plants and Lung Cancer Incident rate per 100,000.


Figure 3 illustrates the correlation between the lung cancer age-adjusted incidence rate per 100,000 individuals and the predictor variables examined in this study. The data indicate positive correlations with the percentage of smokers, the northward wind speed, the percentage of the population below the poverty line, the urban index, and the logarithm of the 2012 county population. The logarithm of the county population was used to reduce the skewness of the population distribution. Conversely, negative correlations are observed with the percentage of the population aged 55 to 79, the eastward wind speed, and the distance to the nearest coal-fired power plant. The histogram indicates that the lung cancer age-adjusted incidence rate per 100,000 people approximately follows a normal distribution.

Figure 3. Correlation between variables.


In practical terms, counties with a higher percentage of smokers and a higher poverty rate tend to exhibit higher lung cancer incidence rates, consistent with the positive correlations between these factors and the incidence rate. Urban counties in Kansas likewise tend to have higher lung cancer incidence rates, reflecting the positive correlation with the urban index. Similarly, counties farther away from coal-fired power plants exhibit lower lung cancer incidence rates, as shown in Figure 3.

Multiple linear regression analysis

The multiple linear regression models aimed to predict the lung cancer age-adjusted incidence rates per 100,000 in Kansas using various predictor variables. Table 1 presents a summary of the findings from the selected regression models. The results suggest, with a 95% confidence level, that the proportion of smokers remains a significant factor, positively associated with the lung cancer age-adjusted incidence rate while holding other factors constant. Therefore, a higher proportion of smokers indicates a higher lung cancer age-adjusted incidence rate in the population.

Table 1. Multiple Linear Regression Models, Study on Lung Cancer Age-Adjusted Incidence Rates in Kansas 2010–2014.

Except for Model 3, all models indicate that the logarithm of the 2012 population is also significant at the 5% significance level and is positively associated with lung cancer age-adjusted incidence rates. Model 3 provides intriguing insights into how wind effects are associated with lung cancer age-adjusted incidence. With a 95% confidence level, higher northward wind speed is positively associated with elevated lung cancer age-adjusted incidence rates during the study period, holding other factors constant.

Considering both the AIC and adjusted R-squared values, Model 4 was selected as the most suitable model. Note that the study spanned 2010 to 2014, and these conclusions are based on the data collected during that period. The selected model is formulated as follows:

$$\hat{Y}_i = 49.5798 + 3.1410 X_{1,i} + 8.1400 X_{2,i} + 0.7498 X_{3,i} - 0.1544 X_{4,i}, \tag{3}$$

where $X_{1,i}$, $X_{2,i}$, $X_{3,i}$, and $X_{4,i}$ are the predictor variables representing the proportion of smokers, the logarithm of the population in the year 2012, the percentage of the population between the ages of 55 and 79, and the eastward wind effect for the $i$th county, respectively. Further, $\hat{Y}_i$ represents the estimated lung cancer age-adjusted incidence rate per 100,000 individuals in the $i$th county in Kansas, where $i = 1, 2, \ldots, 105$.

The residual analysis was performed on the residuals from Model 4. The global Moran’s I test for spatial autocorrelation (M = 0.0304, p-value = 0.5872) indicates that the residuals are scattered randomly. The predictor variables in Model 4 exhibit no multicollinearity, as indicated by Variance Inflation Factor (VIF) values below 7.5. The Jarque-Bera statistic (JB = 3.9163, p-value = 0.1411) suggests that the residuals are normally distributed, and the KB test (KB = 5.8395, p-value = 0.2115) concludes, with 95% confidence, that there is not enough evidence of heteroscedasticity in the residuals. Together, these results indicate that the standardized residuals are independent and normally distributed with constant variance, meaning that the selected regression model satisfies the model assumptions.
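The multicollinearity check can be illustrated with a short sketch of the Variance Inflation Factor, computed as 1 / (1 - R²) from regressing each predictor on the others. The function name and the synthetic predictors are hypothetical; the 7.5 cutoff is the one quoted above.

```python
import numpy as np

def vif(x):
    """Variance Inflation Factor for each column of an n x p predictor matrix."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    out = []
    for j in range(p):
        # Regress predictor j on the remaining predictors plus an intercept
        others = np.column_stack([np.ones(n), np.delete(x, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, x[:, j], rcond=None)
        resid = x[:, j] - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((x[:, j] - x[:, j].mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Hypothetical predictors: c is nearly a copy of a, b is unrelated to both
rng = np.random.default_rng(2)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.1 * rng.normal(size=200)
values = vif(np.column_stack([a, b, c]))
print(np.round(values, 2), values < 7.5)
```

In this synthetic example the two nearly collinear predictors exceed the cutoff while the unrelated predictor stays close to 1, which is the pattern the check is designed to reveal.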

Geographically weighted regression analysis

One of the key advantages of the GWR model lies in its ability to leverage sub-sample information from neighboring counties to conduct local regression. This allows for the estimation of variable parameters in each county, facilitating spatial comparative analysis. The GWR model was employed to analyze the lung cancer age-adjusted incidence dataset spanning 2010 to 2014. As described in Equation (2), each predictor variable has 105 estimated coefficients, corresponding to the 105 spatial locations in Kansas. The predictor variables for the GWR model were those selected for the MLR model in Equation (3). To facilitate comparison, Table 2 reports the mean, median, minimum, and maximum values of the estimated coefficients. Model comparison was conducted using AIC and adjusted R² values, revealing that the GWR model effectively explains the dataset. Note that, except for two counties (Cheyenne County and Sherman County), the coefficient for the proportion of smokers is positive for all counties. This suggests that an increase in the percentage of smokers is associated with a higher lung cancer age-adjusted incidence rate in the geographical area, assuming all other risk factors remain unchanged.

Table 2. Geographically Weighted Regression Model Estimation Results for Study on Lung Cancer Age-adjusted Incidence Rates in Kansas 2010–2014.

Figure 4 illustrates the estimated coefficients and their significance at the 5% significance level based on the fitted GWR model. In our study of the lung cancer age-adjusted incidence rate per 100,000 in Kansas, specific regional patterns emerged. The percentage of smokers was significant in the north and south corridors, with higher coefficients compared to the east and west sides of Kansas. On the other hand, the global regression model in Equation (3) suggests that the percentage of smokers is significant at the 5% significance level across the entire state. Interestingly, the logarithm of the population was not significant at the 5% significance level in the GWR model, although it had a significant coefficient in the global regression model. East wind speed is significant only for nine counties (Decatur County, Norton County, Phillips County, Smith County, Jewell County, Graham County, Rooks County, Osborne County, and Mitchell County) in Kansas, with negative coefficients implying that high wind toward the east is negatively associated with lung cancer age-adjusted incidence in these counties. Practically speaking, this east wind moves pollutants from coal-fired power plants away from the counties. Additionally, the percentage of the population between the ages of 55 and 79 emerged as a significant factor in three counties (Sheridan County, Graham County, and Trego County), indicating that an increase in the percentage of the population in this age group corresponds to higher lung cancer age-adjusted incidence rates.

Figure 4. Geographical Weighted Regression model parameter estimations and significance of the parameters by the county.


The county-level age-adjusted incidence predictions from the GWR model, presented in Figure 5, highlight a concentration of higher estimated lung cancer age-adjusted incidence rates in southwestern Kansas and lower rates in northwestern Kansas. Notably, southwestern Kansas has only one coal-fired power plant. However, the sparse and small population in western Kansas might affect the model predictions. Furthermore, the standardized residuals for the GWR model show no significant cluster, supporting the conclusion that the fitted GWR model adequately explains the association between lung cancer age-adjusted incidence rates and risk factors while accounting for spatial variation.

Figure 5. GWR prediction of lung cancer age-adjusted incidence rates per 100,000 and standardized residuals.


Discussion

Several studies suggest an association between lung cancer incidence rates and the use of coal-fired power plants for energy generation. Lin et al. [Citation51] noted that increasing coal capacity by 1 kW per person in a country is associated with a 59% higher risk of lung cancer in males and an 85% higher risk in females. Our analysis aimed to determine whether proximity to a coal-fired power plant was a significant predictor of lung cancer age-adjusted incidence rates per 100,000 population in Kansas while controlling for established demographic, behavioral, economic, and climatological risk factors such as age, smoking status, poverty, and wind effects. However, contrary to Lin et al. [Citation51], our analysis indicates that there is no significant difference in lung cancer age-adjusted incidence rates between counties near a coal-fired power plant and those farther away. This difference may arise because they found an association between the lung cancer incidence rate and the energy generated from coal-fired power plants, whereas our study considered proximity to the power plant. This suggests that the vicinity of the power plant was not significantly associated with lung cancer age-adjusted incidence when controlling for other major risk factors.

Smoking, age, poverty, and population are significant global risk factors positively associated with the lung cancer age-adjusted incidence rate based on the MLR model. According to the GWR model, the coefficient for the prevalence of smokers was significant in the north and south corridors, with higher coefficients than on the east and west sides of the state. Although the population was not significant in the local regression model, its significance in the global model suggests that population size may be a primary determinant statewide, while its role becomes more nuanced when localized variations are considered.

The impact of wind on lung cancer age-adjusted incidence rates was found to be significant only in the northern part of Kansas. The negative coefficient for eastward wind suggests that when winds blow to the west, pollutants generated from the coal-fired power plants are carried towards the northern counties in Kansas. This intriguing finding emphasizes the importance of including environmental factors in lung cancer-related studies. Furthermore, the percentage of the population between the ages of 55 and 79 emerged as a significant factor in three counties, revealing an association between larger populations in this age group and increased lung cancer age-adjusted incidence rates.

Limitations

The strength of this study is that it evaluated the environmental impact on a niche Midwest population for the most aggressive cancer type. However, this study has several limitations. First, the study could not use gender as a stratification variable. Second, the incidence rates for the 6 missing counties were estimated using K-Nearest Neighbors, so they are not the true lung cancer incidence rates of these counties.

Conclusion

In conclusion, our analysis aimed to discern the relationship between proximity to coal-fired power plants and lung cancer age-adjusted incidence rates in Kansas, considering various established risk factors. Contrary to our initial hypothesis, the results suggest that proximity to a coal-fired power plant is not a significant predictor of lung cancer age-adjusted incidence rates when controlling for major demographic, behavioral, economic, and climatological factors. Instead, our study highlights the role of behavioral factors, particularly smoking, which emerged as a consistently significant global risk factor associated with higher lung cancer age-adjusted incidence rates. The findings related to wind speed and direction in certain counties further emphasize the relevance of environmental factors in regional health outcomes, as pollutants carried by wind may contribute to lung cancer age-adjusted incidence. In Kansas, certain areas show a heightened risk among the elderly population. Although proximity to coal-fired power plants did not emerge as a significant predictor in this study, it is critical to acknowledge their potential impact on local air quality and their confounding effect on lung cancer outcomes. Future analyses should examine how local coal-fired power plants affect the health of workers, who are likely exposed to high levels of pollutants. Comparing their lung cancer incidence rates to those of the general population would provide valuable insights.

Competing interests

The authors report there are no competing interests to declare.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Additional information

Funding

This study was supported by the National Cancer Institute (NCI) Cancer Center Support Grant Funding P30CA168524 and used the Biostatistics and Informatics Shared Resource (BISR).

References