
Publication:

Characterizing National Soccer Identity via K-means Clustering of World Cup Match Performances

datacite.rights: restricted
dc.contributor.advisor: Moretti, Christopher M.
dc.contributor.author: Steinert, Max
dc.date.accessioned: 2026-01-05T22:14:16Z
dc.date.available: 2026-01-05T22:14:16Z
dc.date.issued: 2025
dc.description.abstract: This study investigates national playing style identity in professional soccer by applying unsupervised machine learning techniques to match statistics from the 2018 and 2022 FIFA World Cups. Motivated by countries like Spain and Brazil with well-known, signature playing styles, we aim to explore whether other countries exhibit national playing styles in the World Cup and to what extent these styles have cultural and historical ties. Our study uses a dataset of 200 match performances from 24 countries with 21 features that represent in-depth match statistics relating to possession, passing, defensive actions, goalkeeping, and shooting from FBRef.com. We implement four variations of k-means clustering assisted by principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP) for clustering and visualization. Our results show seven clusters in the match data, each corresponding to a well-known playing style or strategic approach. We find that countries with strong national soccer identity more frequently use one playing style while other countries vary their playing styles between matches. While we observe some correlation between chosen playing style and geopolitical factors like income, population size, and geographical region, the globalization of soccer markets appears to have diminished these effects. This study demonstrates how national playing styles can be quantitatively identified and used to understand how countries express their identity through professional soccer.
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp01mg74qq59d
dc.language.iso: en_US
dc.title: Characterizing National Soccer Identity via K-means Clustering of World Cup Match Performances
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-12-15T15:45:40.205Z
pu.contributor.authorid: 920246188
pu.date.classyear: 2025
pu.department: Computer Science
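
Illustrative sketch (not taken from the thesis): the abstract describes k-means clustering of per-match statistics. The following minimal Python example shows the general idea on a toy table, assuming each row is one match performance and each column is a standardized feature. The three columns (possession %, passes completed, clearances) and all six rows are made-up values for illustration, not data from FBRef.com or from the thesis. The dimensionality-reduction steps named in the abstract (PCA, t-SNE, UMAP) are omitted here, and the helper names `standardize` and `kmeans` are hypothetical.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Made-up toy rows: [possession %, passes completed, clearances].
matches = [
    [68, 720, 9],
    [65, 690, 11],
    [70, 750, 8],
    [38, 310, 22],
    [35, 290, 25],
    [41, 340, 20],
]

scaled = standardize(matches)
labels, centroids = kmeans(scaled, k=2)
print(labels)
```

In this toy setup the first three rows (high possession, many passes) and the last three rows (low possession, many clearances) should fall into separate clusters. Standardizing first keeps a large-valued column such as pass counts from dominating the distance computation. The thesis itself reports seven clusters over 21 features, which this sketch does not attempt to reproduce.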

Files

Original bundle

Name:
mks3_written_final_report-1.pdf
Size:
6.23 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
100 B
Format:
Item-specific license agreed to upon submission