ABSTRACT
The global electric vehicle (EV) market has grown impressively in recent years. Understanding consumer preferences for this cleaner, more eco-friendly mobility option could help guide public policy toward accelerating EV adoption and building sustainable transportation systems. Previous studies suggest that individual and external factors strongly influence EV adoption decisions. In this study, we apply machine learning techniques to EV stated preference survey data to predict EV adoption using attitudinal factors, ridesourcing factors (e.g., frequency of Uber/Lyft rides), and underlying sociodemographic and vehicle factors. To overcome the low interpretability of machine learning models, we adopt the innovative Local Interpretable Model-Agnostic Explanations (LIME) method to explain each factor’s contribution to the predicted outcomes. Beyond the findings of the previous EV preference literature, we find that frequent use of ridesourcing, knowledge about EVs, and awareness of environmental protection are important factors in explaining a high willingness to adopt EVs.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Author contributions
The authors confirm contribution to the paper as follows: study conception and design: Javier Bas; data collection: Javier Bas, Cinzia Cirillo; analysis and interpretation of results: Javier Bas, Zhenpeng Zou, Cinzia Cirillo; draft manuscript preparation: Javier Bas, Zhenpeng Zou. All authors reviewed the results and approved the final version of the manuscript.
Notes
1. We run AutoML in the R environment using the application programming interface (API) developed by H2O. We run the ML models on an Amazon Web Services instance with 16 vCPUs and 64 GiB of memory.
2. Specifically, AutoML includes the following algorithms: five pre-specified Gradient Boosting Machines (GBMs), three pre-specified Extreme Gradient Boosting Machines (XGBoost GBMs), a default Random Forest (DRF), a near-default Deep Neural Network (DNN), an Extremely Randomized Forest (XRT), a fixed grid of Generalized Linear Models (GLMs), a random grid of XGBoost GBMs, a random grid of GBMs, and a random grid of DNNs.
3. The values of the coefficients are equivalent to their relative importance: the coefficient with the highest value can be interpreted as the most important (and is therefore normalized to 1), and the rest can be scaled accordingly, which would not alter the shape of the graph shown in (b). In this case, for the sake of a more traditional interpretation of the GLM, we keep the value and sign (blue/orange color) of the coefficients without transforming them into their importance counterparts.
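The scaling described in Note 3 can be illustrated with a minimal sketch. It assumes that importance is based on the magnitude (absolute value) of each coefficient, so that the sign is dropped when scaling; the function name `relative_importance` and the coefficient names and values are hypothetical and are not taken from the study.

```python
def relative_importance(coefficients):
    """Scale coefficient magnitudes so that the largest one equals 1.

    Assumption: importance is measured by absolute value, so the sign
    of each coefficient is discarded in the scaled result.
    """
    largest = max(abs(value) for value in coefficients.values())
    if largest == 0:
        raise ValueError("all coefficients are zero; importance is undefined")
    return {name: abs(value) / largest for name, value in coefficients.items()}


# Hypothetical coefficients for illustration only (not results from the paper).
example_coefficients = {
    "ev_knowledge": 0.8,
    "ridesourcing_frequency": 0.4,
    "vehicle_age": -0.2,
}

importance = relative_importance(example_coefficients)

# Ranking by coefficient magnitude and ranking by scaled importance agree,
# which is why the shape of the bar chart is unchanged by the scaling.
ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

Because every coefficient is divided by the same positive constant, the ordering and the relative bar lengths are preserved; only the sign information is lost, which is the reason given in the note for reporting the raw coefficients instead.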