ABSTRACT
Data integrity is a key component in the design and use of effective Bayesian network structure learning algorithms, namely the PC algorithm. Given the role that data integrity plays in learning outcomes, this research demonstrates its importance in machine learning tools and emphasizes the need to consider data integrity carefully during tool development and use. To this end, we study how an adversary could generate a desired network with the PC algorithm. Given a Bayesian network, a database generated by that network, and a second Bayesian network that equals the first except for a minor change (a missing link, a reversed link, or an additional link), we explore and analyze the minimal number of changes (additions, deletions, or substitutions) to the database that yield a modified database which, when given as input to the PC algorithm, results in the second network.
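The question posed in the abstract can be illustrated on a toy scale. The sketch below is not the authors' method: it assumes two binary variables, a plain Pearson chi-square test of marginal independence at the 0.05 level (the PC algorithm relies on conditional independence tests of this general kind), and an attacker who may only append copies of one fixed record. All names (`chi2_stat`, `dependent`, `min_additions_to_link`) are hypothetical, and the count it returns is minimal only under these restrictions.

```python
from collections import Counter

# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRIT_DF1 = 3.841


def chi2_stat(rows):
    """Pearson chi-square statistic for a list of (x, y) records with x, y in {0, 1}."""
    n = len(rows)
    counts = Counter(rows)
    stat = 0.0
    for x in (0, 1):
        for y in (0, 1):
            row_total = counts[(x, 0)] + counts[(x, 1)]
            col_total = counts[(0, y)] + counts[(1, y)]
            expected = row_total * col_total / n
            if expected > 0:
                stat += (counts[(x, y)] - expected) ** 2 / expected
    return stat


def dependent(rows):
    """True if the test rejects independence, i.e. a learner would keep the X-Y link."""
    return chi2_stat(rows) > CHI2_CRIT_DF1


def min_additions_to_link(rows, poison=(1, 1), limit=1000):
    """Smallest number of appended copies of `poison` that flips the test to 'dependent'.

    Assumption: the attacker adds only this one record type; returns None if
    the flip is not reached within `limit` additions.
    """
    poisoned = list(rows)
    for k in range(limit + 1):
        if dependent(poisoned):
            return k
        poisoned.append(poison)
    return None


# Clean database: 25 records of each combination, so X and Y look independent.
clean = [(x, y) for x in (0, 1) for y in (0, 1)] * 25
print(dependent(clean))              # False: no link is learned from the clean data
print(min_additions_to_link(clean))  # 27 added records introduce a spurious link
```

Even in this restricted setting, a modest number of injected records relative to the database size is enough to change the learned structure, which is the kind of sensitivity the abstract sets out to quantify.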
Disclosure statement
No potential conflict of interest was reported by the authors.
Additional information
Notes on contributors
Emad Alsuwat
Emad Alsuwat is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of South Carolina (USC). He received his B.S. degree in computer science from Taif University, Saudi Arabia in 2008 and his M.S. degree in computer science and engineering from the Department of Computer Science and Engineering at the University of South Carolina in 2014. His research interests include Probabilistic Graphical Models (esp. Bayesian Networks), Artificial Intelligence, information security, and secure database systems.
Hatim Alsuwat
Hatim Alsuwat is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of South Carolina (USC). He received his B.S. degree in computer science from Taif University, Saudi Arabia in 2008 and his M.E. degree in computer science and engineering from the Department of Computer Science and Engineering at the University of South Carolina in 2015. His research interests include Bayesian Networks, information security, secure database systems, and concept drift.
Marco Valtorta
Marco Valtorta (Ph.D., Duke University, 1987) is a professor of Computer Science and Engineering at the University of South Carolina. His research interests are in Artificial Intelligence. His first research result, known as “Valtorta's theorem” (1980), was recently (2011) described as “seminal” and “an important theoretical limit of usefulness” for heuristics computed by problem relaxation. Most of his later research has been in the area of uncertainty in artificial intelligence. He has around 75 peer-reviewed publications in journals and highly selective conferences such as Artificial Intelligence, International Journal of Approximate Reasoning, ACM Journal of Data and Information Quality, IEEE Transactions on Instrumentation and Measurement, International Joint Conference on Artificial Intelligence, and Conference on Uncertainty in Artificial Intelligence. His students have been best paper award winners at the Conference on Uncertainty in Artificial Intelligence (1993, 2006) and the International Conference on Information Quality (2006). He is a senior member of ACM, IEEE, and AAAI.
Csilla Farkas
Csilla Farkas is a Professor in the Department of Computer Science and Engineering and Director of the Information Security Laboratory at the University of South Carolina (USC). Dr. Farkas received her Ph.D. from George Mason University, Virginia in 2000. She led the efforts at USC to develop a nationally recognized information assurance (IA) program. Under her leadership, USC obtained the designations of National Center of Academic Excellence in Information Assurance Education and in Research. Dr. Farkas actively promotes IA awareness and builds academic and industry liaisons. She is the cybersecurity lead of the SC Department of Commerce initiative to promote applied research between SC industry, academia, and Fraunhofer USA. Dr. Farkas actively publishes and participates in peer-reviewed international conferences and journals. Her research interests include information security, data inference problems, economic and legal analysis of cyber crime, and security for high performance computing.