ABSTRACT
This article presents a machine learning-based approach for predicting 7-day maximum and 1-day minimum air temperatures in India. A gradient-boosting model is developed using latitude, longitude and altitude as model features. The raw data consisted of hourly air temperatures spanning 30 years at 20,168 grid points within India. As the first step of the predictive methodology, cluster analysis was performed to obtain homogeneous regions. Three clustering approaches were used: empirical clustering, statewise clustering and k-means clustering. A gradient-boosting model was then fitted to the features of each individual cluster. Statistical analysis was conducted to assess the accuracy of the model predictions under the different clustering techniques. In all cases, models fitted at the global level gave poor predictions. In most cases, the results obtained with k-means clustering showed that increasing the number of clusters improved the predictive accuracy. Furthermore, predictive accuracy with statewise or k-means clustering depended on several of the features involved. The proposed predictive models have a very simple structure that requires minimal input (i.e. geographical indicators). Hence, they can be used for fast and accurate computation of 7-day maximum and 1-day minimum air temperatures.
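The cluster-then-predict workflow summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the IMDAA grid, standardises the features before k-means (an assumption made here so that altitude in metres does not dominate the distances), and keeps default gradient-boosting hyperparameters. The function names fit_clustered_gbm and predict_clustered_gbm are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import StandardScaler


def fit_clustered_gbm(X, y, n_clusters=4, seed=0):
    """Cluster grid points with k-means, then fit one gradient-boosting model per cluster."""
    # Assumption: features are standardised for clustering only.
    scaler = StandardScaler().fit(X)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    km.fit(scaler.transform(X))
    models = {}
    for c in range(n_clusters):
        mask = km.labels_ == c
        # Each cluster model sees the raw (unscaled) latitude, longitude, altitude.
        models[c] = GradientBoostingRegressor(random_state=seed).fit(X[mask], y[mask])
    return scaler, km, models


def predict_clustered_gbm(scaler, km, models, X):
    """Assign each point to its nearest cluster and predict with that cluster's model."""
    labels = km.predict(scaler.transform(X))
    preds = np.empty(len(X))
    for c in np.unique(labels):
        mask = labels == c
        preds[mask] = models[int(c)].predict(X[mask])
    return preds


# Synthetic stand-in for the grid: latitude (deg), longitude (deg), altitude (m).
rng = np.random.default_rng(0)
n = 2000
lat = rng.uniform(8.0, 37.0, n)
lon = rng.uniform(68.0, 97.0, n)
alt = rng.uniform(0.0, 4000.0, n)
X = np.column_stack([lat, lon, alt])

# Hypothetical target: temperature falls with latitude and altitude, plus noise.
y = 45.0 - 0.3 * (lat - 8.0) - 0.0065 * alt + rng.normal(0.0, 1.0, n)

scaler, km, models = fit_clustered_gbm(X, y, n_clusters=4)
preds = predict_clustered_gbm(scaler, km, models, X)
rmse = float(np.sqrt(np.mean((preds - y) ** 2)))  # in-sample error, for illustration only
```

Empirical or statewise clustering would replace the k-means step with a fixed assignment of grid points to regions; the per-cluster fitting and prediction steps would remain the same. A fair comparison of clustering schemes would use held-out grid points rather than the in-sample error computed here.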
Acknowledgements
The authors gratefully acknowledge NCMRWF, Ministry of Earth Sciences, Government of India, for the IMDAA reanalysis.
Disclosure statement
No potential conflict of interest was reported by the authors.
Data availability statement
The meteorological data that support the findings of this study are openly available from the National Centre for Medium Range Weather Forecasting (NCMRWF), a centre of excellence in weather and climate modelling under the Ministry of Earth Sciences, at https://rds.ncmrwf.gov.in/home.