Adversarial data poisoning attacks against the PC learning algorithm

Emad Alsuwat (corresponding author), Computer Science and Engineering Department, University of South Carolina, Columbia, SC, USA

Hatim Alsuwat, Computer Science and Engineering Department, University of South Carolina, Columbia, SC, USA

Marco Valtorta, Computer Science and Engineering Department, University of South Carolina, Columbia, SC, USA

Csilla Farkas, Computer Science and Engineering Department, University of South Carolina, Columbia, SC, USA

ABSTRACT

Data integrity is a key component of the effective design and use of Bayesian network structure learning algorithms, in particular the PC algorithm. This research demonstrates the importance of data integrity in machine learning tools, in order to emphasize the need to consider data integrity carefully during tool development and use. To this end, we study how an adversary could generate a desired network with the PC algorithm. Given a Bayesian network $B_{1}$, a database $DB_{1}$ generated by $B_{1}$, and a second Bayesian network $B_{2}$ that is identical to $B_{1}$ except for a minor change such as a missing link, a reversed link, or an additional link, we explore and analyze the minimal number of changes (additions, deletions, or substitutions) to $DB_{1}$ that yield a database $DB_{2}$ that, when given as input to the PC algorithm, results in $B_{2}$.
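
As a rough illustration of the additional-link case, the sketch below counts how many rows an adversary would need to add to a database before an independence test reports a dependence between two variables. It is not the authors' method. The abstract specifies neither the independence test nor the search strategy, so the Pearson chi-square test at significance level 0.05, the greedy row-insertion loop, and all names (`chi_square_2x2`, `edge_present`, `min_additions_to_add_edge`, `clean_db`) are assumptions made for illustration. It also covers only the marginal (zero-order) test between two binary variables, whereas the PC algorithm goes on to test conditional independence given larger conditioning sets.

```python
from itertools import product

# Critical value of the chi-square distribution with 1 degree of freedom
# at significance level 0.05 (assumed test and level; not from the source).
CHI2_CRIT_DF1_ALPHA05 = 3.841


def chi_square_2x2(rows):
    """Pearson chi-square statistic for a list of (x, y) pairs of binary values."""
    n = len(rows)
    counts = {cell: 0 for cell in product((0, 1), repeat=2)}
    for x, y in rows:
        counts[(x, y)] += 1
    stat = 0.0
    for (x, y), observed in counts.items():
        x_total = counts[(x, 0)] + counts[(x, 1)]  # rows with this x value
        y_total = counts[(0, y)] + counts[(1, y)]  # rows with this y value
        expected = x_total * y_total / n
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    return stat


def edge_present(rows):
    """True if the test rejects independence, so the X - Y link would be kept."""
    return chi_square_2x2(rows) > CHI2_CRIT_DF1_ALPHA05


def min_additions_to_add_edge(rows, poison_row=(1, 1), limit=1000):
    """Greedily append copies of one poison row until the test reports dependence.

    Returns the number of appended rows, or None if `limit` is reached.
    The count is only an upper bound on the true minimum, because other
    poison rows, deletions, and substitutions are not explored.
    """
    poisoned = list(rows)
    for added in range(limit + 1):
        if edge_present(poisoned):
            return added
        poisoned.append(poison_row)
    return None


# Hypothetical clean database: X and Y are independent, 25 rows per combination.
clean_db = [cell for cell in product((0, 1), repeat=2) for _ in range(25)]

print("link in clean data:", edge_present(clean_db))
print("rows added to introduce the link:", min_additions_to_add_edge(clean_db))
```

The same loop could in principle be adapted to the missing-link case by appending rows that weaken an existing dependence until the test no longer rejects independence.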

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Emad Alsuwat

Emad Alsuwat is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of South Carolina (USC). He received his B.S. degree in computer science from Taif University, Saudi Arabia in 2008 and his M.S. degree in computer science and engineering from the Department of Computer Science and Engineering at the University of South Carolina in 2014. His research interests include Probabilistic Graphical Models (esp. Bayesian Networks), Artificial Intelligence, information security, and secure database systems.

Hatim Alsuwat

Hatim Alsuwat is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of South Carolina (USC). He received his B.S. degree in computer science from Taif University, Saudi Arabia in 2008 and his M.E. degree in computer science and engineering from the Department of Computer Science and Engineering at the University of South Carolina in 2015. His research interests include Bayesian Networks, information security, secure database systems, and concept drift.

Marco Valtorta

Marco Valtorta (Ph.D., Duke University, 1987) is a professor of Computer Science and Engineering at the University of South Carolina. His research interests are in Artificial Intelligence. His first research result, known as “Valtorta's theorem” (1980), was recently (2011) described as “seminal” and “an important theoretical limit of usefulness” for heuristics computed by problem relaxation. Most of his later research has been in the area of uncertainty in artificial intelligence. He has around 75 peer-reviewed publications in journals and highly selective conferences such as Artificial Intelligence, International Journal of Approximate Reasoning, ACM Journal of Data and Information Quality, IEEE Transactions on Instrumentation and Measurement, International Joint Conference on Artificial Intelligence, and Conference on Uncertainty in Artificial Intelligence. His students have been best paper award winners at the Conference on Uncertainty in Artificial Intelligence (1993, 2006) and the International Conference on Information Quality (2006). He is a senior member of ACM, IEEE, and AAAI.

Csilla Farkas

Csilla Farkas is a Professor in the Department of Computer Science and Engineering and Director of the Information Security Laboratory at the University of South Carolina (USC). Dr. Farkas received her Ph.D. from George Mason University, Virginia in 2000. She led the efforts at USC to develop nationally recognized information assurance (IA) programs. Under her leadership, USC obtained the designations of National Center of Academic Excellence in Information Assurance Education and in Research. Dr. Farkas actively promotes IA awareness and builds academic and industry liaisons. She is the cybersecurity lead of the SC Department of Commerce initiative to promote applied research between SC industry, academia, and Fraunhofer USA. Dr. Farkas actively publishes and participates in peer-reviewed international conferences and journals. Her research interests include information security, data inference problems, economic and legal analysis of cyber crime, and security for high performance computing.
