Using multi-class classification methods to predict baseball pitch types

Since the introduction of PITCHf/x in 2006, there has been a plethora of data available for anyone who wants access to the minute details of every baseball pitch thrown over the past nine seasons. Everything from the initial velocity and release point to the break angle and strike zone placement is tracked, recorded, and used to classify the pitch according to an algorithm developed by MLB Advanced Media (MLBAM). Given these classifications, we developed a model that predicts the next type of pitch thrown by a given pitcher, using only data that would be available before he even stepped onto the mound. We used data from three recent MLB seasons (2013-2015) to compare individual pitcher predictions based on multi-class linear discriminant analysis, support vector machines, and classification trees, with the goal of developing a real-time, live-game predictor. Using training data from the 2013 and 2014 seasons and part of the 2015 season, our best method achieved a mean out-of-sample predictive accuracy of 66.62% and a real-time success rate of over 60%.
© Copyright 2018 Journal of Sports Analytics. IOS Press. All rights reserved.
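
The headline figure in the abstract is a mean of per-pitcher out-of-sample accuracies. As a rough illustration of how such a figure can be computed, the sketch below scores each pitcher separately on held-out pitches and then averages the results. It is not the authors' method: a majority-class baseline (always predict the pitcher's most frequent training pitch type) stands in for the linear discriminant analysis, support vector machine, and classification tree models, and the pitcher names, pitch labels, and helper functions are all hypothetical.

```python
from collections import Counter
from statistics import mean


def fit_majority_baseline(train_labels):
    """Return the most frequent pitch type among the training labels.

    Stand-in for a real classifier; assumed here for illustration only.
    """
    return Counter(train_labels).most_common(1)[0][0]


def out_of_sample_accuracy(train_labels, test_labels):
    """Fit on the training pitches, then score on held-out pitches."""
    predicted = fit_majority_baseline(train_labels)
    hits = sum(1 for actual in test_labels if actual == predicted)
    return hits / len(test_labels)


# Made-up pitch sequences for two hypothetical pitchers (not real data).
pitchers = {
    "pitcher_a": {
        "train": ["FF", "FF", "SL", "FF", "CU", "FF"],
        "test": ["FF", "SL", "FF", "FF"],
    },
    "pitcher_b": {
        "train": ["SI", "CH", "SI", "SI", "CH", "SL"],
        "test": ["SI", "CH", "CH", "SI"],
    },
}

# One model per pitcher, evaluated only on that pitcher's held-out pitches.
per_pitcher = {
    name: out_of_sample_accuracy(data["train"], data["test"])
    for name, data in pitchers.items()
}

# Average the per-pitcher accuracies to get a single summary number.
mean_accuracy = mean(per_pitcher.values())

print(per_pitcher)
print(f"mean out-of-sample accuracy: {mean_accuracy:.2%}")
```

With this made-up data the two pitchers score 75% and 50%, for a mean of 62.50%. Averaging per pitcher rather than pooling all pitches gives every pitcher equal weight regardless of how many pitches he threw; whether the article weights pitchers this way is an assumption of this sketch.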

Subjects: baseball throws prognosis mathematic-logical model
Notations: technical and natural sciences sport games
Tagging: Machine Learning PITCHf/x
DOI: 10.3233/JSA-170171
Published in: Journal of Sports Analytics
Published: 2018
Volume: 4
Issue: 1
Pages: 85-93
Document types: article
Language: English
Level: advanced