Publication:

Beyond the Binder: A Machine Learning Exploration of Decision-Making in College Football

Files

SeniorThesis.pdf (1.4 MB)

Date

2025-04-10

Abstract

This thesis investigates the decision-making behavior of college football coaches through the lens of data-driven optimal strategy. The literature has shown that, despite the growing influence of analytics in sports, football lags behind: many coaching decisions, such as fourth-down and two-point conversion attempts, continue to reflect conservative tendencies that may reduce a team’s chances of winning. To analyze this phenomenon, we develop a machine learning win probability model using XGBoost, trained on play-by-play data from the four most recent NCAA FBS football seasons, beginning in 2021. Uniquely, we develop three models to estimate the likelihood of winning. All three are based on game state variables, including score differential, time remaining, field position, and possession; the second and third, more advanced models also incorporate the point spread and advanced statistics, giving each successive model more information to learn from. Taking another novel approach, we then use an expected win probability framework to determine optimal decisions in high-leverage fourth-down situations and compare them against the actual coaching choices, quantifying “coaching aggression” by measuring deviations from optimality. The results contradict the consensus that college coaches often exhibit excessive risk aversion and instead find that excessive aggression leads to the loss of about a quarter of a game above expectation every year. These findings dissent from the growing body of sports analytics literature and offer insight into how traditional intuition may still override data-driven logic on the sidelines, though additional data validation and confirmation are still necessary.
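
The expected win probability comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the `Option` class, the `win_prob_cost` measure, and every probability below are hypothetical placeholders chosen for illustration. In the thesis, the win probabilities would come from the trained XGBoost models rather than being hard-coded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Option:
    """One fourth-down choice (hypothetical structure, not from the thesis)."""
    name: str
    # Each outcome is (probability of the outcome, win probability after it).
    outcomes: List[Tuple[float, float]]


def expected_wp(option: Option) -> float:
    """Probability-weighted average of the win probabilities of each outcome."""
    total = sum(p for p, _ in option.outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"outcome probabilities for {option.name} must sum to 1")
    return sum(p * wp for p, wp in option.outcomes)


def best_option(options: List[Option]) -> Option:
    """The option with the highest expected win probability."""
    return max(options, key=expected_wp)


def win_prob_cost(chosen: Option, options: List[Option]) -> float:
    """Win probability given up by the actual choice relative to the optimum.

    An assumed way to measure 'deviation from optimality'; the thesis may
    define its aggression metric differently.
    """
    return expected_wp(best_option(options)) - expected_wp(chosen)


# Illustrative numbers only; a real analysis would query a fitted model.
go_for_it = Option("go_for_it", [(0.55, 0.62), (0.45, 0.40)])
field_goal = Option("field_goal", [(0.40, 0.58), (0.60, 0.42)])
punt = Option("punt", [(1.00, 0.50)])
options = [go_for_it, field_goal, punt]

for opt in options:
    print(f"{opt.name}: expected WP = {expected_wp(opt):.3f}")
print("optimal:", best_option(options).name)
print(f"cost of punting: {win_prob_cost(punt, options):.3f}")
```

Summing such per-decision costs over a season, split by whether the coach was more or less aggressive than the optimum, is one plausible way to arrive at a figure expressed in games per year, though the abstract does not spell out the exact aggregation.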
