Original Articles

Malware Detection Using Nonparametric Bayesian Clustering and Classification Techniques

Pages 535-546 | Received 01 Dec 2013, Published online: 18 Nov 2015

