Russakovsky, Olga
Chaturvedi, Saarthak
2026-01-05
2026-01-05
2025
https://theses-dissertations.princeton.edu/handle/88435/dsp01v405sd851
Predicting the outcome of sports games is a large and engaging problem. Sports analytics is constantly evolving and finding better ways to understand and break down a sport. Basketball, being a dynamic sport, offers many avenues for analysis and predictive modeling. Most previous approaches have either used rudimentary, descriptive data or built expensive, complex models. This thesis leverages dynamic and complementary feature engineering to model matchup-specific strengths and weaknesses between competing teams. Key methodological innovations include the use of rolling averages to capture temporal trends, complementary metrics (offensive vs. defensive efficiencies, rebounding differential) to account for interactions, and era-based segmentation to analyze the evolution of feature importance across basketball history. Logistic regression with L1 regularization was employed, achieving 70% prediction accuracy, a significant improvement over other models, and uncovering interpretable insights into feature contributions. The most accurate model was trained on 40 seasons (35,000 games) of NBA data from 1985 to 2023.
en-US
Breaking Basketball: Using Logistic Regression and SVMs to Predict Basketball Game Outcomes
Princeton University Senior Theses
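
The approach summarized in the abstract (rolling averages, complementary offense-versus-defense features, and L1-regularized logistic regression) can be illustrated with a minimal sketch. This is not the thesis's implementation: the record keys (`home_off`, `away_def`, and so on), the window length, the feature encoding, and the optimizer settings are all hypothetical, and the model is fit with a plain proximal-gradient loop using only the Python standard library, where an established library would more plausibly be used in practice.

```python
import math
from collections import defaultdict, deque


def matchup_features(games, window=5):
    """Build pre-game matchup features from rolling team averages.

    `games` must be in chronological order. Each game is a dict with the
    hypothetical keys home, away, home_off, home_def, away_off, away_def,
    and home_win (1 or 0). Averages use only earlier games, so the game
    being predicted never leaks into its own features.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    X, y = [], []
    for g in games:
        h, a = history[g["home"]], history[g["away"]]
        if len(h) == window and len(a) == window:
            h_off = sum(s[0] for s in h) / window
            h_def = sum(s[1] for s in h) / window
            a_off = sum(s[0] for s in a) / window
            a_def = sum(s[1] for s in a) / window
            # Complementary pairing: each offense against the opposing defense.
            X.append([h_off - a_def, a_off - h_def])
            y.append(g["home_win"])
        # Update histories only after the features have been computed.
        h.append((g["home_off"], g["home_def"]))
        a.append((g["away_off"], g["away_def"]))
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    """Proximal step for the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def fit_l1_logistic(X, y, lam=0.01, lr=0.1, epochs=500):
    """Fit logistic regression with an L1 penalty by proximal gradient descent.

    Returns (weights, bias). The bias is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                gw[j] += err * xi[j] / n
            gb += err / n
        # Gradient step on the log-loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b
```

Features would normally be standardized before fitting so that the L1 penalty treats them comparably. Weights that the penalty drives to exactly zero mark features the model ignores, which is one way such a model yields the interpretable feature contributions the abstract mentions.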