Publication: Beyond the Stats: Quantifying Intangible Qualities in NFL Draft Prospects
Abstract
NFL teams invest heavily in scouting and evaluating college players before the draft, yet many high draft picks underperform while later-round picks emerge as stars. This thesis investigates whether intangible traits, such as leadership, competitiveness, and work ethic, can be quantified from scouting reports and used to better predict a player’s success in the NFL. Trait-specific sentiment scores are computed across multiple positions using a combination of zero-shot classification and sentiment analysis. These scores are then combined with quantitative combine data and college career statistics as inputs to K-Means clustering, which groups players with similar profiles. For each position where incorporating intangible traits was more explanatory than using strictly quantitative statistics, six regression models were trained to predict a custom-defined career success metric based on positional performance and Approximate Value (AV). Cluster assignments were one-hot encoded to determine their predictive impact on career success. Clusters with high cluster coefficients and late average draft positions were identified as “undervalued,” while clusters with early average draft positions and lower coefficients were considered “overvalued.” Results indicated that clustering that incorporated the qualitative traits often yielded higher explanatory power than models based purely on quantitative features. In several cases, clusters with higher coefficients had later average draft positions, suggesting that NFL teams may be systematically overlooking certain high-potential players. This thesis demonstrates the potential of Natural Language Processing and qualitative data in professional football scouting, and shows that this approach can prove more effective than the traditional quantitative approach.
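The cluster-labeling step described in the abstract can be illustrated with a minimal sketch, which is not the thesis's implementation. It assumes a simplified setting in which the one-hot cluster indicators are the only regressors and there is no intercept, so each cluster's least-squares coefficient reduces to the mean success score of its members. The thesis's models also include other features, and its cutoffs are not stated in the abstract. The toy records, the median cutoffs, and the names `one_hot`, `cluster_summary`, and `label_clusters` are all hypothetical.

```python
from statistics import mean, median

# Toy, made-up records: (cluster_id, career_success, draft_pick).
# These are NOT data from the thesis.
players = [
    (0, 8.0, 150), (0, 9.0, 180), (0, 7.0, 120),
    (1, 3.0, 10), (1, 4.0, 25), (1, 2.0, 40),
    (2, 6.0, 60), (2, 5.0, 90),
]


def one_hot(cluster_id, cluster_ids):
    """Return a 0/1 indicator vector for one cluster assignment."""
    return [1 if k == cluster_id else 0 for k in cluster_ids]


def cluster_summary(records):
    """Per-cluster coefficient and average draft pick.

    Assumption: with only one-hot dummies and no intercept, the
    least-squares coefficient of a cluster equals the mean outcome
    of the players assigned to it.
    """
    cluster_ids = sorted({c for c, _, _ in records})
    rows = [(one_hot(c, cluster_ids), s, p) for c, s, p in records]
    summary = {}
    for j, k in enumerate(cluster_ids):
        members = [(s, p) for x, s, p in rows if x[j] == 1]
        summary[k] = {
            "coef": mean(s for s, _ in members),
            "avg_pick": mean(p for _, p in members),
        }
    return summary


def label_clusters(summary):
    """Flag clusters relative to median cutoffs (a hypothetical rule)."""
    coef_cut = median(v["coef"] for v in summary.values())
    pick_cut = median(v["avg_pick"] for v in summary.values())
    labels = {}
    for k, v in summary.items():
        if v["coef"] > coef_cut and v["avg_pick"] > pick_cut:
            labels[k] = "undervalued"  # strong outcomes, drafted late
        elif v["coef"] < coef_cut and v["avg_pick"] < pick_cut:
            labels[k] = "overvalued"   # weak outcomes, drafted early
        else:
            labels[k] = "neutral"
    return labels


summary = cluster_summary(players)
labels = label_clusters(summary)
```

In this toy data, cluster 0 pairs the highest mean success score with the latest average pick, so the median rule flags it as undervalued. Cluster 1 shows the opposite pattern and is flagged as overvalued.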