
Publication:

Finding Skill Changes in Major League Baseball Player Development: A Bayesian Approach with LSTM Networks and Hidden Markov Models

dc.contributor.advisor: Fan, Jianqing
dc.contributor.author: Kram, Kaden
dc.date.accessioned: 2025-08-06T15:53:40Z
dc.date.available: 2025-08-06T15:53:40Z
dc.date.issued: 2025-04-07
dc.description.abstract: Major League Baseball (MLB) organizations invest heavily in player evaluation and development, often relying on end-of-season statistics and traditional regression-to-the-mean models to assess talent. However, regression-to-the-mean assumes fixed skill levels and fails to account for the dynamic nature of player performance over a season. My thesis presents a novel approach to evaluating and forecasting MLB player development using Bayesian inference and changepoint detection models, including CUSUM, Bayesian Online Changepoint Detection (BOCPD), Hidden Markov Models (HMMs), and Long Short-Term Memory (LSTM) networks. I use a Bayesian framework to iteratively update beliefs about a player's true skill level across various performance metrics such as batting average, slugging percentage, and weighted on-base-average. This approach incorporates uncertainty and provides richer comparisons between players than single-point estimates. I tested my models on both synthetic and real MLB play-by-play data, with synthetic data used to benchmark changepoint detection accuracy across controlled scenarios. My analysis shows that while Bayesian inference effectively captures player skill trends and variation, the changepoint detection models struggle to identify subtle but significant shifts in skill due to the high noise inherent in binary baseball outcomes. The LSTM model initially showed promise but ultimately failed to outperform simpler methods in accuracy or consistency. Nevertheless, this work provides a foundation for future efforts to disentangle random fluctuations from true skill changes in athlete performance. By offering a probabilistic framework for evaluating player development, this thesis contributes a more nuanced perspective to player scouting and performance forecasting, with implications for team decision-making, player strategy, and contract valuation.
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp018k71nm55k
dc.language.iso: en_US
dc.title: Finding Skill Changes in Major League Baseball Player Development: A Bayesian Approach with LSTM Networks and Hidden Markov Models
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-04-08T00:11:23.578Z
pu.contributor.authorid: 920245403
pu.date.classyear: 2025
pu.department: Ops Research & Financial Engr

Files

Original bundle

Name: Kram_Thesis (1).pdf
Size: 2.9 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 100 B
Format: Item-specific license agreed to upon submission