Original Articles

Machine Learning Applications in Baseball: A Systematic Literature Review


References

  • Attarian, A., et al. 2013. A comparison of feature selection and classification algorithms in identifying baseball pitches. International MultiConference of Engineers and Computer Scientists, 263–68.
  • Attarian, A., et al. 2014. Baseball pitch classification: A Bayesian method and dimension reduction investigation. IAENG Transactions on Engineering Sciences: Special Issue of the International MultiConference of Engineers and Computer Scientists 2013 and World Congress on Engineering 2013, 392–99, CRC Press.
  • Barnes, S. L., and M. V. Bjarnadóttir. 2016. Great expectations: An analysis of major league baseball free agent performance. Statistical Analysis and Data Mining: the ASA Data Science Journal 9 (5):295–309. doi:10.1002/sam.11311.
  • Baumer, B., and A. Zimbalist. 2013. The sabermetric revolution: Assessing the growth of analytics in baseball. University of Pennsylvania Press, Philadelphia, PA.
  • Bishop, C. M. 2006. Pattern recognition and machine learning. Springer.
  • Blog, L. 2009. Bayesian estimators for the beta-binomial model of batting ability. Accessed January 8, 2018. https://lingpipe-blog.com/2009/09/23/bayesian-estimators-for-the-beta-binomial-model-of-batting-ability/.
  • Bock, J. R. 2015. Pitch sequence complexity and long-term pitcher performance. Sports 3 (1):40–55.
  • Costa, G. B., M. R. Huber, and J. T. Saccoman. 2007. Understanding sabermetrics: An introduction to the science of baseball statistics. McFarland, Jefferson, NC.
  • Costa, G. B., M. R. Huber, and J. T. Saccoman. 2012. Reasoning with sabermetrics: Applying statistical science to baseball’s tough questions. McFarland, Jefferson, NC.
  • Das, R., and S. Das. 1994. Catching a baseball: A reinforcement learning perspective using a neural network. AAAI, 688–93.
  • Elo, A. E. 1978. The rating of chessplayers, past and present. Arco Publishing, New York, NY.
  • Everman, B. 2015. Analyzing baseball statistics using data mining. Accessed January 9, 2018. http://truculent.org/papers/DB%20Paper.pdf.
  • Fichman, M., and M. A. Fichman. 2012. From Darwin to the diamond: How baseball and Billy Beane arrived at Moneyball. Accessed January 9, 2018. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2112109.
  • Firsick, Z. 2013. Predicting Major League Baseball playoff outcomes through multiple linear regression. PhD thesis, University of South Dakota.
  • Freiman, M. H. 2010. Using random forests and simulated annealing to predict probabilities of election to the Baseball Hall of Fame. Journal of Quantitative Analysis in Sports 6 (2):1–35. doi:10.2202/1559-0410.1245.
  • Ganeshapillai, G., and J. Guttag. 2012. Predicting the next pitch. Sloan Sports Analytics Conference.
  • Ganeshapillai, G., and J. Guttag. 2014. A data-driven method for in-game decision making in MLB. In Sports Analytics Conference.
  • George, E. I., and R. E. McCulloch. 1993. Variable selection via Gibbs sampling. Journal of the American Statistical Association 88 (423):881–89. doi:10.1080/01621459.1993.10476353.
  • Hamilton, M., et al. 2014. Applying machine learning techniques to baseball pitch prediction. Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods, 520–27, SCITEPRESS-Science and Technology Publications, Lda.
  • Hammons, C. 2006. A Bayesian approach to Markov chain baseball analysis. PhD thesis, Georgetown College.
  • Hardy, R. L. 1977. Least squares prediction. Photogrammetric Engineering and Remote Sensing 43 (4):1905–15.
  • Healey, G. 2015. Modeling the probability of a strikeout for a batter/pitcher matchup. IEEE Transactions on Knowledge and Data Engineering 27 (9):2415–23. doi:10.1109/TKDE.2015.2416735.
  • Healey, G. 2017. Matchup models for the probability of a ground ball and a ground ball hit. Journal of Sports Analytics 3 (1):21–35. doi:10.3233/JSA-160025.
  • Herrlin, D. L. 2015. Forecasting MLB performance utilizing a Bayesian approach in order to optimize a fantasy baseball draft. PhD thesis, San Diego State University.
  • Hoang, P. 2015. Supervised learning in baseball pitch prediction and Hepatitis C Diagnosis. North Carolina State University, Raleigh, NC.
  • Hoang, P., et al. 2015. A dynamic feature selection based LDA approach to baseball pitch prediction. In Trends and applications in knowledge discovery and data mining, 125–37. Springer, New York, NY.
  • Huddleston, S. D. 2012. Hitters vs. pitchers: A comparison of fantasy baseball player performances using hierarchical Bayesian models. PhD thesis, Brigham Young University-Provo.
  • Ishii, T. 2016. Using machine learning algorithms to identify undervalued baseball players. Accessed January 9, 2018. http://cs229.stanford.edu/proj2016/report/Ishii-UsingMachineLearningAlgorithmsToIdentifyUndervaluedBaseballPlayers-report.pdf.
  • James, B. 1987. The Bill James baseball abstract 1987. Ballantine Books, New York, NY.
  • Jang, W.-I., A. Nasridinov, and Y.-H. Park. 2014. Analyzing and predicting patterns in baseball data using machine learning techniques. Advanced Science and Technology Letters 62:37–40.
  • Jensen, S. T., B. M. Blakeley, A. J. Wyner, et al. 2009. Hierarchical Bayesian modeling of hitting performance in baseball. Bayesian Analysis 4 (4):631–52. doi:10.1214/09-BA424.
  • Jiang, W., C.-H. Zhang, et al. 2010. Empirical Bayes in-season prediction of baseball batting averages. In Borrowing strength: Theory powering applications–A Festschrift for Lawrence D. Brown, 263–73. Institute of Mathematical Statistics.
  • Keele, S., et al. 2007. Guidelines for performing systematic literature reviews in software engineering. Technical report, Ver. 2.3. EBSE Technical Report. EBSE.
  • Kelleher, J. D., B. M. Namee, and A. D’Arcy. 2015. Fundamentals of machine learning for predictive data analytics: Algorithms, worked examples, and case studies. MIT Press, Cambridge, MA.
  • Lewis, M. 2004. Moneyball: The art of winning an unfair game. WW Norton & Company, New York, NY.
  • Lyle, A. 2007. Baseball prediction using ensemble learning. PhD thesis, University of Georgia.
  • Miller, S. J. 2005. A derivation of the Pythagorean win-loss formula in baseball. ArXiv Mathematics e-prints.
  • Moy, D. 2006. Regression planes to improve the Pythagorean percentage: A regression model using common baseball statistics to project offensive and defensive efficiency. Master’s thesis, University of California, Berkeley.
  • Panda, M. L. 2014. Penalized regression models for Major League Baseball metrics. PhD thesis, University of Georgia.
  • Petticrew, M., and H. Roberts. 2008. Systematic reviews in the social sciences: A practical guide. John Wiley & Sons, Hoboken, NJ.
  • Prospectus, B. 2012. Baseball Think Factory. Online resource: “www.baseballthinkfactory.com”.
  • Reeves, J. 2010. Major league baseball performance prediction. Accessed January 9, 2018. http://www.cs.dartmouth.edu/~lorenzo/teaching/cs134/Archive/Spring2010/proposal/cs134Proposal2/cs134.html.
  • Savant, B. Statcast catch rates. Accessed January 8, 2018. http://baseballsavant.mlb.com/statcast_catch_probability.
  • Sawchik, T. 2015. Big data baseball: Math, miracles, and the end of a 20-year losing streak. Flatiron, New York, NY.
  • Schumaker, R. P., O. K. Solieman, and H. Chen. 2010. Sports data mining methodology. Springer, New York, NY.
  • Sidle, G. D. 2017. Using multi-class machine learning methods to predict Major League Baseball pitches. PhD thesis, North Carolina State University.
  • Sidran, D. E. 2005. A method of analyzing a baseball pitcher’s performance based on statistical data mining. University of Iowa, Iowa City, IA. Accessed January 9, 2018. https://www.researchgate.net/publication/267918769_A_Method_of_Analyzing_a_Baseball_Pitcher’s_Performance_Based_on_Statistical_Data_Mining.
  • Silver, N. 2003. Introducing PECOTA. Baseball Prospectus 2003:507–14.
  • Smola, A., and S. V. N. Vishwanathan. 2008. Introduction to machine learning. Cambridge University Press, Cambridge, United Kingdom.
  • Soto Valero, C. 2016. Predicting Win-Loss outcomes in MLB regular season games–A comparative study using data mining methods. International Journal of Computer Science in Sport 15 (2):91–112. doi:10.1515/ijcss-2016-0007.
  • Stevens, G. 2013. Bayesian statistics and baseball. PhD thesis, Pomona College.
  • Swan, G., and A. Scime. 2010. Winning baseball through data mining. DMIN, 151–57.
  • Tolbert, B., and T. Trafalis. 2016. Predicting Major League Baseball Championship Winners through data mining. Athens Journal of Sports 3 (4):239–52.
  • Tung, D. D. 2012. Data mining career batting performances in baseball. Accessed January 9, 2018. http://vixra.org/pdf/1205.0104v1.pdf.
  • Yang, T. Y., and T. Swartz. 2004. A two-stage Bayesian model for predicting winners in major league baseball. Journal of Data Science 2 (1):61–73.
