Kernighan, Brian W.
Fulton, Shelby
2026-01-05
2026-01-05
2025
https://theses-dissertations.princeton.edu/handle/88435/dsp016d570109q
This study leverages the capabilities of Large Language Models (LLMs) to analyze and assess 100 of the most popular menstrual cycle tracking apps on the App Store. The behavior of LLMs in analytical tasks involving large volumes of complex documents is examined. After four iterations of analysis and prompt engineering, Anthropic’s Claude 3 Opus model was able to consistently generate numerical strength scores based on a set of ten key privacy features, while also producing sound reasoning for these scores. The results of the automated evaluations reveal that, on average, the privacy policies of these apps are weak. This underscores the need for clearer guidelines regarding privacy and the management of sensitive health data that, if mishandled, could have grave consequences for users.
en-US
Flo State: Automated Evaluation of Menstrual Cycle Tracking App Privacy Policies Using Large Language Models
Princeton University Senior Theses