Research Articles

Taxi drivers’ traffic violations detection using random forest algorithm: A case study in China

Pages 362-370 | Received 08 Sep 2022, Accepted 12 Mar 2023, Published online: 28 Mar 2023
 

Abstract

Objective

To explore the impacts of several key factors on taxi drivers’ traffic violations and to provide traffic management departments with a scientific basis for decisions that reduce traffic fatalities and injuries.

Methods

A total of 43,458 electronic enforcement records of taxi drivers’ traffic violations in Nanchang City, Jiangxi Province, China, from July 1, 2020, to June 30, 2021, were used to explore the characteristics of traffic violations. A random forest algorithm was used to predict the severity of taxi drivers’ traffic violations, and 11 factors affecting traffic violations, including time, road conditions, environment, and taxi companies, were analyzed using the SHapley Additive exPlanations (SHAP) framework.
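
As a rough illustration of how per-sample SHAP attributions are commonly summarized into a single importance score per factor, the sketch below averages the absolute attribution of each feature over all samples. It assumes the importance scores are mean absolute SHAP values, which is a common convention but is not confirmed by the abstract. The attribution values, the feature names, and the helper `rank_features_by_mean_abs_shap` are hypothetical and are not taken from the study.

```python
from statistics import mean


def rank_features_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute value of their per-sample attributions.

    shap_rows: list of rows, one per sample, each holding one attribution per feature.
    feature_names: names matching the column order of shap_rows.
    """
    scores = {}
    for j, name in enumerate(feature_names):
        scores[name] = mean(abs(row[j]) for row in shap_rows)
    # Largest mean |attribution| first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical attributions (NOT from the paper): 4 samples x 3 features
shap_rows = [
    [0.50, -0.20, 0.10],
    [-0.30, 0.40, 0.00],
    [0.40, -0.30, -0.10],
    [-0.40, 0.10, 0.20],
]
feature_names = ["functional_district", "violation_location", "road_grade"]

ranking = rank_features_by_mean_abs_shap(shap_rows, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a factor that pushes predictions strongly in both directions still ranks as influential.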

Results

First, the ensemble method Balanced Bagging Classifier (BBC) was applied to balance the dataset; the imbalance ratio (IR) fell from 6.61% in the original imbalanced dataset to 2.60%. Next, a prediction model for the severity of taxi drivers’ traffic violations was built using Random Forest, which achieved accuracy, m_F1, m_G-mean, m_AUC, and m_AP of 0.877, 0.849, 0.599, 0.976, and 0.957, respectively. On these performance measures, the Random Forest model outperformed the Decision Tree, XGBoost, AdaBoost, and Neural Network algorithms. Finally, the SHAP framework was used to improve the interpretability of the model and identify important factors affecting taxi drivers’ traffic violations. Functional district, location of the violation, and road grade had a high impact on the probability of traffic violations, with mean SHAP values of 0.39, 0.36, and 0.26, respectively.
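
To make the multi-class performance measures more concrete, the following sketch computes accuracy, a macro-averaged F1, and a G-mean from a confusion matrix. It assumes that the `m_` prefix denotes macro-averaging over severity classes and that the G-mean is the geometric mean of per-class recalls; the abstract does not define either, so both are assumptions. The confusion matrix is hypothetical and does not reproduce the reported results.

```python
import math


def per_class_counts(cm):
    """Return (tp, fp, fn) per class for a square confusion matrix,
    where cm[i][j] counts samples of true class i predicted as class j."""
    n = len(cm)
    counts = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # rest of row k
        fp = sum(cm[i][k] for i in range(n)) - tp   # rest of column k
        counts.append((tp, fp, fn))
    return counts


def accuracy(cm):
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


def macro_f1(cm):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for tp, fp, fn in per_class_counts(cm):
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def g_mean(cm):
    """Geometric mean of per-class recalls (one common multi-class definition)."""
    recalls = []
    for tp, _fp, fn in per_class_counts(cm):
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
    return math.prod(recalls) ** (1 / len(recalls))


# Hypothetical 3-class confusion matrix (NOT from the paper)
cm = [
    [8, 2, 0],
    [1, 8, 1],
    [0, 2, 8],
]

print(f"accuracy={accuracy(cm):.3f}  macro-F1={macro_f1(cm):.3f}  G-mean={g_mean(cm):.3f}")
```

Because the G-mean multiplies per-class recalls, a single poorly recognized class pulls it down sharply, which is one way a model can pair a high accuracy with a much lower G-mean on imbalanced data.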

Conclusions

The findings of this paper may help reveal the relationship between the influencing factors and the severity of traffic violations, and provide a theoretical basis for reducing taxi drivers’ traffic violations and improving road safety management.

Acknowledgments

The authors thank the Traffic Administration Bureau of Nanchang Public Security Bureau for providing the data.

Disclosure statement

The authors report there are no competing interests to declare.

Additional information

Funding

This research was supported by the National Natural Science Foundation of China (grant numbers 52162049, 51805169, 52062014, 52062015), the Natural Science Foundation of Jiangxi Province (grant numbers 20202BABL212009, 20212ABC03A07), and the Natural Science Foundation of Guangdong Province (grant number 2022A1515011040).
