ABSTRACT
As predictive analytics increasingly applies supervised machine learning (SML) models to inform mission-critical decision-making, adversaries are incentivized to exploit the vulnerabilities of these SML models and mislead predictive analytics into erroneous decisions. Because understanding and awareness of such adversarial attacks remain limited, both the knowledge base and the deployment of predictive analytics need a principled technique for assessing and enhancing adversarial robustness. In this research, we adopt the technology threat avoidance theory as the kernel theory and propose a research framework for assessing and enhancing the adversarial robustness of predictive analytics applications. We instantiate the proposed framework by developing a robust text classification system, the ARText system. The proposed system was rigorously evaluated against benchmark methods on two tasks extensively enabled by SML, spam review detection and spam email detection, and the evaluation confirmed the utility and effectiveness of the ARText system. Results from numerous experiments revealed that the proposed framework could significantly enhance the adversarial robustness of predictive analytics applications.
Supplementary information
Supplemental data for this article can be accessed on the publisher’s website.
Disclosure Statement
No potential conflict of interest was reported by the authors.
Notes
1. The norm similarity distance between a successful adversarial sample and is defined as, , where and .
2. We intended to include most of the commonly used text classification models, except for the K-Nearest Neighbors algorithm, which is not a supervised machine learning model built upon the ERM principle.
4. We chose k = 5 in k-fold cross validation to ensure that the testing set in each iteration was sufficiently large to represent the broader dataset.
Additional information
Notes on contributors
Weifeng Li
Weifeng Li received his Ph.D. in Management Information Systems from the University of Arizona. He is an assistant professor in the Department of Management Information Systems at the University of Georgia. Dr. Li’s research interests include the security of artificial intelligence systems and the development of artificial intelligence systems for cybersecurity applications. His methodological foci are machine learning, text mining, and social media analytics.
Yidong Chai
Yidong Chai (corresponding author) received his Ph.D. from Tsinghua University, China. He is a researcher in the School of Management, Hefei University of Technology, and the Key Laboratory of Process Optimization and Intelligence Decision Making, Ministry of Education, Hefei, China. Dr. Chai’s research interests include machine learning, cybersecurity, business intelligence, and health informatics.