Starting Small: Using Machine Learning Techniques to Identify Physically Plausible Tracks in High-Pileup Collision Events

Macosko, Joah J.

Files

Macosko_Senior_Thesis.pdf (5.29 MB)

Date

2025-04-28

Authors

Macosko, Joah J.

Abstract

Analyzing the physical properties of particles scattered in high-speed collisions is an important component of particle physics research, but reconstructing particle tracks from the position data of thousands of scattered particles in high-pileup events is difficult. A model that assesses how physically plausible a candidate track is, given the set of points that form it, could serve as the final step in a machine learning pipeline that identifies possible reconstructed tracks, so an accurate plausibility assessment is critical for training the earlier steps of that pipeline. We therefore attempt to create a neural network that classifies a set of position points as belonging to a single track or to multiple different tracks. To make the classifier robust, we generate the sets of points that do not come from one true track by slightly perturbing a true track, either by randomly moving points by an amount proportional to the deviation of the points from a circle fit of the track or by swapping some of the points for one of their nearest neighbors. We then train the neural network on these true and perturbed tracks and search for the model that most accurately identifies the true tracks, working to ensure that the classifier is effective in the high-momentum regimes that are most relevant for track reconstruction. We find that transfer learning, in which a model is first trained on fake tracks that are easy to identify and then trained on more difficult fake tracks, is markedly more effective than training directly on the difficult tracks. Using this strategy, we create a classifier with a total momentum-weighted accuracy of 0.6608 on the most difficult category of fake tracks and an area under the receiver operating characteristic curve of 0.7280. Finally, we suggest improvements and alternate methods that could raise this performance and move closer to a classifier that can be reliably incorporated into a training pipeline.
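
The two perturbation strategies described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the thesis. It assumes hits are 2D points in the transverse plane, uses an algebraic least-squares circle fit, and takes "proportional to the deviation" to mean Gaussian noise whose width is a multiple of the spread of the fit residuals. The function names and the `scale` and `n_swap` parameters are hypothetical.

```python
import numpy as np


def fit_circle(xy):
    """Algebraic least-squares circle fit; returns (cx, cy, r).

    Solves x^2 + y^2 = 2*cx*x + 2*cy*y + c with c = r^2 - cx^2 - cy^2.
    """
    x, y = xy[:, 0], xy[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones(len(xy))])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx**2 + cy**2)
    return cx, cy, r


def perturb_by_residual(xy, scale, rng):
    """Fake track, strategy 1 (assumed form): jitter every hit with Gaussian
    noise whose width is `scale` times the std of the circle-fit residuals."""
    cx, cy, r = fit_circle(xy)
    resid = np.hypot(xy[:, 0] - cx, xy[:, 1] - cy) - r
    sigma = scale * np.std(resid)
    return xy + rng.normal(0.0, sigma, size=xy.shape)


def swap_with_neighbors(xy, all_hits, n_swap, rng):
    """Fake track, strategy 2 (assumed form): replace `n_swap` randomly chosen
    hits with their nearest neighbor among all hits in the event."""
    out = xy.copy()
    idx = rng.choice(len(xy), size=n_swap, replace=False)
    for i in idx:
        d = np.linalg.norm(all_hits - xy[i], axis=1)
        d[d == 0.0] = np.inf  # exclude the hit itself
        out[i] = all_hits[np.argmin(d)]
    return out
```

In this sketch a small `scale` or `n_swap` yields fake tracks that are hard to tell apart from true ones, and larger values yield easier ones, which is one way the easy-then-difficult training schedule from the abstract could be set up.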

URI

https://theses-dissertations.princeton.edu/handle/88435/dsp01wp988p26v

Collections

Physics, 1936-2025

