Authors
Jason Wu¹, Evan Gunnell² and Yu Sun²; ¹USA, ²California State Polytechnic University, USA
Abstract
The offensive strategy in American football strives to be enigmatic. A strong offense has a well-rounded playbook and rotates its plays in an attempt to disrupt any predictable patterns. It has therefore always been in the interest of defensive coordinators to predict the upcoming play accurately and so minimize offensive yardage gain. A well-advised defense can change its positioning and coverage schemes knowing only whether the next play will be a run or a pass. Although coaches have developed traditional heuristics for tendency-based play prediction, these heuristics are limited to patterns that humans can consciously discern.
This paper takes advantage of recent professional football databases to develop a machine learning classification model for predicting opponent play-calling, and to integrate that model into a novel application intended for deployment at all levels of play.
We conducted research on various classification models and features. Utilizing 10 past seasons from the National Football League (NFL), we devised random forest classification models with optimally chosen features. We also investigated the importance of the Synthetic Minority Oversampling Technique (SMOTE) in training with inherently imbalanced datasets. The final model achieved an NFL league-average accuracy of 89.52%, with 91.69% as the highest team-specific accuracy. This accuracy is substantially higher than that reported by past projects with similar goals.
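As a rough illustration of the oversampling idea behind SMOTE, the sketch below creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbors. It is a minimal, standard-library-only sketch and not the implementation used in this paper: the function name `smote`, the toy features (down, yards to go), and all sample values are hypothetical, and which play type forms the minority class is left unspecified.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical toy data: each play is (down, yards to go). Illustrative only.
minority_plays = [(1, 10), (2, 3), (3, 1), (1, 5)]
majority_plays = [(3, 8), (2, 10), (3, 12), (1, 10),
                  (2, 7), (3, 6), (4, 9), (2, 15)]

# Oversample the minority class until both classes are the same size.
new_minority = smote(minority_plays,
                     n_new=len(majority_plays) - len(minority_plays), k=3)
balanced_minority = minority_plays + new_minority
```

In practice, oversampling of this kind is applied to the training split only, so that synthetic samples do not leak into the data used to measure accuracy.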
Keywords
Football Analytics, Machine Learning, Classification, Play Prediction.