Computer Science, 1987-2025

Permanent URI for this collection: https://theses-dissertations.princeton.edu/handle/88435/dsp01mp48sc83w


Now showing 1 - 20 of 57

A Comparative Study of Syntax and Word Usage Between Standard French and Cameroonian French Using Natural Language Processing

(2025-04-10) Hines, Julia R.; Fellbaum, Christiane Dorothea
This study uses natural language processing (NLP) techniques to analyze the syntactic and lexical differences between Standard French and Cameroonian French, and to examine how the dialect evolves when used by the Cameroonian diaspora in France. The central methodology involves training and evaluating two distinct NLP models: one fine-tuned on a corpus of Standard French, and the other on Cameroonian French. The LSTM model outperformed the Logistic Regression model in all key metrics, including accuracy, precision, recall, and F1-score. The results of this study illustrate the limitations of traditional NLP methods, such as logistic regression, when applied to dialects with syntactical and linguistic differences, and they highlight the potential of deep learning approaches to better handle these variations. The findings point to the importance of fostering linguistic diversity within computational models.
A Comparison of Model Predictive Control and Reinforcement Learning Methods for Building Energy Storage Management

(2025-04-10) Toh, Yi Jin; Eysenbach, Benjamin
The residential building sector is a major contributor to energy consumption and greenhouse gas emissions, making electrification and intelligent energy management essential for decarbonization. However, increased electricity demand can strain the power grid, leading to higher costs and emissions. Demand-side flexibility, enabled by on-site power generation, energy storage, and optimized control algorithms, can mitigate this problem by shifting electricity consumption to times when electricity is cheaper and cleaner.

This study evaluates three methods for centralized building energy storage management using CityLearn, an open-source environment for simulating and benchmarking building energy control. The evaluation compares Model Predictive Control (MPC) with two Reinforcement Learning (RL) methods: Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO). The methods are assessed across three dimensions: (1) energy performance, including cost, carbon emissions, electricity consumption, and stability of electricity use over time; (2) computational efficiency, including training time, memory usage, and inference speed; and (3) scalability, measured across different district sizes of two, four, and eight buildings.

Overall, SAC achieved the strongest performance on cost and energy metrics, performing slightly better than PPO in those areas. PPO, however, produced smoother control behavior with more stable electricity use over time while requiring significantly less memory than SAC and less computation than MPC. Both RL methods outperformed MPC across most metrics, with MPC particularly struggling to scale. Nonetheless, MPC remained more interpretable and required no training data, though it involved substantial engineering effort to develop an accurate system model.

These findings highlight trade-offs between performance, stability, and deployability. PPO emerged as the most balanced controller, offering strong performance with scalability and computational efficiency, making it well-suited for real-world use.
A Monte-Carlo Hearts Engine

(2025-04) Bendory, Eden R.; Kincaid, Zachary
The card game Hearts is a stochastic, sequential, non-zero-sum, 4-player, partial-information game. These qualities prevent standard game algorithms from finding optimal play in reasonable time. Monte-Carlo tree search partially addresses this issue by approximating the payoff resulting from optimal play, so that not every potential move in the game has to be searched to evaluate an action’s value. However, standard Monte-Carlo tree search does not address imperfect-information, stochastic, or N-player games. My approach aims to close this gap by integrating other algorithms, such as maxn [2] and Monte-Carlo sampling [3], to address the aspects of the game that Monte-Carlo tree search does not. The combination of these techniques results in a Hearts engine that is able to beat many base-level algorithms, existing Hearts engines, and advanced human Hearts players.
Algorithmic Auditing under Data Access Mandates: A risk limiting framework for third party evaluations of AI fairness

(2025-04-27) DeLucia, Lacey Rose L.; Liu, Lydia Tingruo
As AI systems become more prominent in decision-making for domains such as employment and advertising, ensuring fairness in these models is increasingly important. In this work, we design a black-box, risk-limiting audit framework for assessing fairness with the four-fifths rule. Inspired by election auditing techniques and sequential hypothesis testing, we propose two-group and multi-group algorithms that maintain risk-limiting guarantees and stop early for innocent models. Unlike fixed-sample methods, our approach evaluates fairness while continuously sampling, allowing auditors to repeatedly request more data when needed. We demonstrate the effectiveness of our algorithms through empirical evaluations on real-world employment datasets collected for New York City's Local Law 144. The audits detect fairness violations correctly 100% of the time and verify fairness after sampling on average 66% of the data in the multi-group setting. Our approach enables third-party auditors to efficiently and confidently evaluate fairness claims, even in settings with limited transparency.
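The four-fifths rule referenced above flags adverse impact when any group's selection rate falls below 80% of the highest group's selection rate. A minimal fixed-sample sketch of that check follows. It is not the sequential, risk-limiting procedure the thesis proposes, and the function names and example counts are hypothetical.

```python
def selection_rates(outcomes):
    """Map each group to its selection rate.

    outcomes: dict of group name -> (number selected, number of applicants).
    """
    rates = {}
    for group, (selected, total) in outcomes.items():
        if total <= 0:
            raise ValueError(f"group {group!r} has no applicants")
        rates[group] = selected / total
    return rates


def four_fifths_check(outcomes, threshold=0.8):
    """Return (passes, impact_ratios) for the four-fifths rule.

    Each impact ratio is a group's selection rate divided by the highest
    selection rate across groups. The rule passes when every ratio is at
    least `threshold` (0.8 for the standard four-fifths rule).
    """
    rates = selection_rates(outcomes)
    best = max(rates.values())
    if best == 0:
        # Nobody was selected in any group, so no group is disadvantaged.
        return True, {group: 1.0 for group in rates}
    ratios = {group: rate / best for group, rate in rates.items()}
    return all(ratio >= threshold for ratio in ratios.values()), ratios


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    passes, ratios = four_fifths_check(
        {"group_a": (50, 100), "group_b": (30, 100)}
    )
    print(passes, ratios)
```

A sequential audit of the kind described in the abstract would re-evaluate a test statistic as samples arrive rather than computing these ratios once on a fixed sample.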
All About Discourse Particles in Gen Z Text Messages... lol

(2025-04-10) Liu, Michelle; Kalin, Laura; Chazelle, Bernard
Whether it’s with our closest BFFs or long-distance lovers, texting has become an integral part of how we communicate with each other in the 21st century. As it becomes an increasingly powerful substitute for chatting in real life, it also begins to naturally develop advanced mechanisms for communicating nuances that have not previously existed in standard writing systems. Discourse particles like lol, lmao, and emojis don’t just fill space; they are a modern way to convey the subtleties of body language, facial expression, and tone. In this paper, we evaluate a corpus of about 200,000 real text messages from one speaker and investigate the typology, pragmatics, and modifiers of these particles. By connecting these phenomena to existing linguistic theories of discourse particles, variation, and digital communication, we explore how texting language has evolved to fit the fast-paced, screen-based communication we engage in every day, becoming a detailed, blossoming structured system of its own that is a fruitful field for extensive exploration.
Auditing Google Ad Delivery Optimization for Gender-Based Discrimination in Job Advertisements

(2025-04-10) Wu, Alexis J.; Korolova, Aleksandra
Despite growing awareness of potential discrimination in ad delivery, Google's ad delivery optimization has received limited attention. While researchers have hypothesized that platform-driven decisions in Google-served advertising can lead to discriminatory outcomes, no current methodologies isolate the impact of Google's algorithms in skewing delivery. To address this gap, we evaluate Google's Display Network (GDN) from both an advertiser’s and a third-party auditor’s perspective. Understanding the feasibility of diverse outreach as an advertiser and the feasibility of detecting algorithmic bias in delivery is critical, especially in employment advertising, where fairness concerns have both legal and societal implications. We begin by detailing the functionality for creating ads, targeting audiences, and analyzing delivery on GDN. We then develop a novel framework that can assess gender-based skew in Google's Display Network through 61 experiments. We find that ad placements, defined as the websites and apps where ads appear, have the most prominent impact on the breakdown of demographics in delivery, consistently resulting in delivery to an audience of 63% male users when shown on the entire GDN, and a more balanced audience of 50% male users when shown only on YouTube's website. Importantly, the typical advertiser trying to reach a balanced audience may not be aware of the potential for such gender skew in the default placement option on GDN and may not have the tools or budget to identify ad placements that would lead to more balanced outreach. Finally, when varying ad images by implied gender, we observe no consistent delivery trends; further research is needed to understand the role of ad creatives in shaping delivery outcomes.
Beyond Algorithms: Autonomous Agentic Systems for Personalized Recommendations

(2025-04-10) Khalid, Roshaan; Adams, Ryan P.
This paper explores generative AI-based agents for autonomous, personalized content recommendations, utilizing state-of-the-art software for high-performance custom workflows, high-dimensional vector storage and searching, language-based tasks, and autonomous 24/7 running capabilities. Using unstructured, unseen, real-time data from YouTube, we utilize large language models to quantitatively handle subjective tasks and evaluate the outcomes. In essence, we created a recommendation system that uses artificial intelligence to autonomously find content and reduce the time spent on manual search. Content recommendation is a prominent problem in the industry, and we find that the performance of our system is satisfactory, and the scope of such systems is substantial. If used in conjunction with default recommendation systems, the system can provide an improved interactive recommendation experience.
Bridging Physical and Digital Spaces through AR Zone Triggers in Capybara

(2025-04-10) Mak, Tinney; Monroy-Hernandez, Andres
Augmented Reality (AR) is uniquely positioned to bridge the virtual and physical worlds, enabling new forms of creative expression and interaction. In this work, we introduce the touches zone block in Capybara, an AR-based visual programming environment, that allows children to define spatial regions in their surroundings and program behaviors when virtual characters enter those areas. This feature expands on Capybara’s block-based system by providing a more flexible and intuitive alternative to predefined object detection. We demonstrate the expressive potential of zones through a series of user studies with 20 children in the United States and Argentina. Our findings suggest that zones support rich storytelling tied to physical space, encourage embodied exploration, and enable spatial reasoning through trial-and-error debugging. Participants also proposed future directions such as scanning custom objects, more intelligent interaction with the environment, and designing goal-driven narratives. These insights highlight how spatial triggers like zones can support more active, creative, and personally meaningful AR experiences for children.
Celebrating Excellence, Equally? A Quantitative Analysis of Social Media Posts during the 2024 Paris Olympics and Paralympics

(2025-04-10) Toujas-Bernaté, Clara L.; Fellbaum, Christiane Dorothea
This study examines X posts from the 2024 Paris Olympics and Paralympics using natural language processing (NLP) techniques to conduct a comparative analysis of public discourse. While much existing work has focused on language surrounding the Olympics, studies on the Paralympics remain scarce, and none provide a direct comparison of public perception between the two events. This research addresses this gap by applying both topical and sentiment analysis through a diverse set of NLP methods, including Latent Dirichlet Allocation (LDA) for topic modeling, word frequency analysis, Word2Vec for contextual word relationships, and Valence Aware Dictionary and sEntiment Reasoner (VADER) for sentiment classification and temporal trends. Six datasets are created, consisting of X posts from the Olympic and Paralympic Games, covering both English- and French-language discussions, as well as posts from the general public and the official Olympic and Paralympic X accounts. By analyzing differences in language and sentiment across these datasets, this study explores how perceptions of the two global sporting events vary across cultures and between public discourse and institutional narratives.
ChangeIn: A Dynamic Camera Intrinsics Benchmark for Robust Computer Vision Systems

(2025-04-10) Bhattacharjee, Roma; Deng, Jia
Modern Simultaneous Localization and Mapping (SLAM) systems assume that camera intrinsics remain fixed throughout a video. This assumption breaks down in real-world, “in-the-wild” scenarios where zoom and focus can vary dynamically, meaning existing SLAM systems do not work on real-world videos. To address this, we introduce ChangeIn, a benchmark tailored for per-frame dynamic intrinsics prediction, laying the groundwork for more robust SLAM and other vision systems that can handle dynamic intrinsics.

To properly label ChangeIn’s dataset, we recorded a set of calibration videos using traditional calibration boards and resorting to drone-based methods for wider FOVs. Then, a comprehensive lookup table (LUT) was built that interpolates between our collected calibration data to map any lens focal length and focus distance to intrinsic parameters. Using this table, we produced ground truth intrinsics for a diverse collection of 389 real-world videos (126 indoor, 263 outdoor), featuring focal lengths from 17mm to 250mm and captured across varied environments—from Princeton’s campuses to urban centers. In addition to featuring changing zooms and focus distances, these videos include both static and moving cameras, as well as dynamic scene elements like people, vehicles, and interaction with objects, adding to the realism and complexity of the dataset. Evaluation of three existing intrinsic prediction methods on this benchmark demonstrates much room for improvement.

We hope ChangeIn will serve as a valuable resource for developing and evaluating models that estimate intrinsics on a per-frame basis and ultimately foster new research into SLAM and other computer vision applications on real-world videos.
Computational Approaches to Oral History: Exploring the Densho Digital Repository

(2025-04-10) Taylor, Molly; Kernighan, Brian W.
Oral history collections often contain hundreds—if not thousands—of lengthy interviews. How can computational tools help researchers explore and analyze oral history collections? I apply two topic modeling techniques, BERTopic and Latent Dirichlet Allocation, to address research questions about the Densho Digital Repository, which documents Japanese American incarceration during World War II. I demonstrate how these tools can illuminate connections between ideas, facilitate comparison of interviews, and offer new ways to navigate the collection.
Computational Models of Goal Inference in Open-Ended Domains

(2025-04-10) Siegel, Zachary S.; Griffiths, Tom
How are people able to quickly infer the goals of others when observing just a few of their actions? This work investigates the cognitive mechanisms underlying human goal inference by building and evaluating computational models that capture how people make these goal inferences in unstructured domains. Drawing on the framework of Bayesian inference, I formalize the idea that people interpret others’ actions by determining how consistent their observed evidence is with given goal hypotheses. I built a cooking domain called Recipe-Graph and ran human experiments to understand how well human predictions agree with those of our computational models. I find that people’s goal predictions correlate with those of our models, suggesting that Bayesian inference can capture how people predict the goals of others around them.
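As a rough illustration of the Bayesian idea described above, the posterior over goals given observed actions is proportional to the prior times the likelihood of each action under that goal. The sketch below is a toy example under stated assumptions: actions are treated as conditionally independent given the goal, and the goals, actions, and probabilities are hypothetical rather than taken from Recipe-Graph.

```python
def goal_posterior(prior, likelihood, actions, floor=1e-6):
    """Compute P(goal | actions) by Bayes' rule.

    prior: dict of goal -> P(goal).
    likelihood: dict of goal -> dict of action -> P(action | goal).
    actions: list of observed actions, assumed independent given the goal.
    floor: small probability assigned to actions a goal does not list,
           so that one unexpected action does not zero out a hypothesis.
    """
    unnormalized = {}
    for goal, p_goal in prior.items():
        score = p_goal
        for action in actions:
            score *= likelihood[goal].get(action, floor)
        unnormalized[goal] = score
    total = sum(unnormalized.values())
    return {goal: score / total for goal, score in unnormalized.items()}


if __name__ == "__main__":
    # Hypothetical two-goal cooking example.
    prior = {"salad": 0.5, "soup": 0.5}
    likelihood = {
        "salad": {"chop_lettuce": 0.6, "boil_water": 0.1},
        "soup": {"chop_lettuce": 0.1, "boil_water": 0.6},
    }
    print(goal_posterior(prior, likelihood, ["boil_water"]))
```

Observing more actions multiplies in more likelihood terms, so the posterior sharpens as evidence accumulates.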
Constraint-based modeling of tissue-specific metabolism for circulating nutrients

(2025-05-06) Peng, Joanne Z.; Rabinowitz, Joshua D.
Metabolism is a complex network of biochemical reactions essential for maintaining physiological homeostasis and enabling adaptive responses to environmental changes. While traditional metabolomics techniques like LC-MS, GC-MS, and NMR provide static snapshots of metabolite abundances, they fail to capture the dynamic fluxes that reflect true metabolic activity. To address this gap, we present a minimal whole-body metabolic model that integrates thermodynamic constraints and tissue-specific data to reconstruct systemic metabolic fluxes. Our approach bridges the strengths of genome-scale models and curated networks by defining 64 core anabolic and catabolic tasks and extracting minimal reaction subnetworks from the RECON3D model using optimization algorithms. These tasks are then mapped to 12 key organs - including liver, muscle, adipose tissue, kidney, and brain - to create tissue-specific models, which are subsequently merged through defined circulatory compartments to simulate whole-body metabolism. We validate our model through flux variability and sampling analyses under fasted physiological conditions, demonstrating its ability to capture tissue-level metabolism and cross-tissue interactions. This framework provides a scalable, interpretable, and physiologically grounded platform for exploring systemic metabolic regulation, with broad applications in precision medicine and metabolic disease research.
Construction and Evaluation of Celltype-Specific Protein-Protein Interaction Networks

(2025-04-10) Mhrous, Emmanuel N.; Zhong, Ellen
Protein-protein interaction (PPI) networks serve as critical tools for probing the molecular mechanisms that define function and connect genotype to phenotype, yet context-agnostic PPI databases fail to capture the celltype-specific contexts in which these interactions occur. This thesis addresses this limitation by integrating single-cell RNA sequencing (scRNA-seq) data into these context-agnostic human PPI networks using two dominant methods in the literature: SCINET (parametric) and PINN (non-parametric). Using a dataset of dopaminergic midbrain neurons implicated in Parkinson's disease, we construct networks with these methods and offer ways to evaluate these networks to ensure they preserve important properties at multiple scales of biology. These include evaluations at the level of functional protein complexes, pathways, celltype-specific processes, and systematic interactions within tissues. Our analyses show that genes implicated in Parkinson's Disease play a significant role in the topology of their respective networks, highlighting the essentiality of these proteins. Furthermore, we construct contextual embeddings using PINNACLE, a graph neural network model for single-cell biology, to represent proteins at a systems-level scale. Despite limitations inherent to PPI representations of biological processes, this thesis emphasizes the importance of context-specificity in these networks, compares different methods of their construction, and offers a robust system of evaluations that show the strengths of different construction methods at various dimensions of biology.
Cosmic Computation: Applying an Astrophysics Lens to COS 126 Assignments

(2025-04-10) Slisher, Alex; Moretti, Christopher M.
Over the last couple of years, Princeton has seen a rise in department-specific, introductory computer science (COS) courses offered. This rise demonstrates a potential interest in subject-specific alternatives to COS 126, the introductory course in the department. However, it is unclear if these courses serve as an adequate alternative to COS 126 for students interested in continuing the COS sequence. As a result, I propose astrophysics adaptations of COS 126 assignments and fully implement three of them. Full-scale implementation includes an assignment specification document, an assignment rubric, a sample solution, and student scaffolding code. After surveying nine students and two COS department instructors, responses indicate that the modified assignments were clear, engaging, and similarly challenging to current COS 126 assignments. Further research should explore how assignments can be adapted using other fields of study.
Court v. Classifier: A Data-Driven Evaluation of Language and Decision-Making on the U.S. Supreme Court

(2025-04-10) Lee, Erin; Kernighan, Brian W.
This thesis investigates the language, behavior, and decision-making of U.S. Supreme Court justices through a computational lens. Grounding my study in structured and curated datasets—including justice- and case-level variables, authored opinions, and over 1,600 transcribed oral arguments—I analyze how justices speak, write, and vote.

I begin with an empirical study of voting patterns, opinion authorship, and judicial trends across natural court eras. I then turn to oral argument behavior, quantifying the participation of justices across alignments and outcomes. Building on these insights, I implement a series of predictive classifiers, replicating and extending a previous statistical model to include oral argument features. While the inclusion of these features yields modest and at times inconclusive improvements in accuracy, they underscore the complexity of predicting voting patterns based on oral argument behavior, given the distinct rhetorical styles and engagement patterns of individual justices. Nonetheless, the findings allude to promising directions for future modeling of case outcomes using alternative features derived from oral arguments.

Finally, I experiment with prompting large language models (LLMs) to classify tones of judicial questioning due to the limitations of more traditional natural language processing techniques. I also simulate justice voting behavior with LLMs on unseen cases, assessing the capabilities of generative AI for legal reasoning. Through our experimentation, the LLMs proved to be limited in their capacity for legal judgement, though they also demonstrate opportunity to be better leveraged when provided additional guidance through fine-tuning.

Altogether, this study offers a data-driven portrait of the Supreme Court and its justices, rooted in empirical data and powered by modern machine learning methods.
Design and Analysis of Planar Linkage Mechanisms With Machine Learning and Other Computational Methods

(2025-05-06) Palaparthi, Adityasai V.; Adams, Ryan P.
Planar linkage mechanisms, or linkages, are systems of rigid links and joints that translate an input motion into desirable output motions. In doing so, linkages enable us to perform complex tasks in fields such as manufacturing automation, robotics, computer graphics, and more with minimal input complexity. Recently, deep generative modeling solutions have been applied to generate linkage designs since these designs live in an intractable distribution; however, no scalable, conditional generative model has been found yet that can generate a set of optimal planar mechanical linkage designs, ranging in complexity, that best fit any type of generated path of motion by the user. In this work, we develop a generative flow network conditioned on linkage mechanism specifications to sample a diverse set of planar mechanisms. While developing this generative model, we also gain a much better understanding of the vast design space of linkages with linear and geometric algebra, graph neural networks, and implicit differentiation.
Design and Evaluation of a Modular Architecture To Assess LLMs in Summarizing Electronic Health Records

(2025-05-02) Chen, Rachel; Kaplan, Alan
Low health literacy affects nearly half of Americans and poses a major barrier to effective healthcare, particularly when interpreting complex electronic health records (EHRs). While large language models (LLMs) offer promising capabilities for simplifying medical information, little research has explored their performance on personalized patient data. This thesis presents a modular framework for evaluating five state-of-the-art LLMs (GPT-4o-mini, Gemini 2.0 Flash, Claude 3.7 Sonnet, DeepSeek V3, and MiniMax-01 Text) on their ability to generate readable, patient-friendly summaries of structured EHRs. Using synthetic data from the Synthea dataset and prompts targeting sixth-grade (AMA) and eighth-grade (NIH) reading levels, the framework measures outputs using quantitative readability metrics, including Flesch-Kincaid, SMOG, and Gunning Fog scores. The results reveal significant variation in tone, complexity, and prompt adherence across models. GPT-4o-mini consistently produced the most readable summaries, while Claude struggled with prompt sensitivity and cost-effectiveness. The findings highlight the importance of prompt engineering, context length, and model choice in improving health communication. This work contributes a replicable evaluation pipeline and underscores the potential of LLMs to enhance health literacy and patient empowerment.
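For reference, the Flesch-Kincaid grade level named above is computed as 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below is a minimal illustration, not the thesis's evaluation pipeline. The vowel-group syllable counter is a crude assumption (real tools use better heuristics or pronunciation dictionaries), and the function names are hypothetical.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of consecutive vowels (crude heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level of `text`.

    Sentences are split on '.', '!' and '?'; words are runs of letters,
    optionally containing one internal apostrophe.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * (len(words) / len(sentences))
        + 11.8 * (syllables / len(words))
        - 15.59
    )


if __name__ == "__main__":
    simple = "The cat sat."
    dense = "Hypertension necessitates pharmacological intervention."
    print(round(flesch_kincaid_grade(simple), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

Lower scores indicate easier text, so a summary aimed at a sixth-grade reading level would target a score near 6 under this metric.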
Designing for Efficiency: Enhancing the Gig Driver Experience with Driver’s Seat

(2025-04-29) Ahmed, Abani; Monroy-Hernandez, Andres
Driving for apps like Uber, Lyft, and DoorDash has become a central part of the modern-day gig economy. Yet, drivers for these apps often struggle to make ends meet, stay safe, and feel respected, whether due to inefficient routing, long unpaid wait times, information asymmetry (e.g., withheld route details), or difficulty accurately tracking earnings from multiple platforms. In 2019, the Driver’s Seat Cooperative launched an app called Driver's Seat to help drivers make informed decisions by giving them access to their own data in an industry that often limits driver control. Features included mileage tracking across multiple platforms, crowdsourced earnings-per-hour data, and expense tracking. Since then, the app has lacked regular updates, faces usability challenges, and no longer fully meets the evolving needs of its users.

This study proposes a redesign of Driver’s Seat, through targeted features and a more user-friendly interface, to better support gig drivers. We conducted interviews with both current users of Driver’s Seat and drivers who had never used the app, identifying key needs such as access to reliable data, personalized insights, and support for safer working conditions. Based on these findings, we created wireframes incorporating features that could address the identified needs, such as an incident dashboard, a driver resource map, and more individualized insights. We then evaluated these features in workshops and additional semi-structured interviews with drivers. We also consulted with a data ethics expert to assess the broader implications of data sharing and visibility in safety reporting systems. This feedback informed our usability priorities and helped shape future directions for the app’s development, to ensure the redesign effectively supports drivers' daily driving routines.
Elucidating Public Sentiment Toward Chinese Platform Workers via LLM-driven Analysis of Weibo Data

(2025-04-10) Pua, Kok Wei; Xie, Yu; Huang, Junming
This thesis investigates public perceptions of four fast-growing platform-based occupations in China (food delivery riders, rideshare drivers, e-commerce retailers, and internet influencers) using Weibo data from 2017 to 2024. This research uses attribute-based sentiment analysis (ABSA) to evaluate five occupational dimensions (i.e., economic returns, work environment, work autonomy, career stability, and occupational prestige) and links them to societal perceptions of warmth and competence. Using the Stereotype Content Model framework, the findings show that while rideshare drivers and internet influencers are perceived negatively in terms of warmth and competence, e-commerce retailers are viewed favorably on both dimensions. Meanwhile, food delivery riders are perceived as warm yet lacking competence. This study indicates that occupational prestige stands out as the most influential factor in shaping these perceptions. It also demonstrates that public sentiment can react either temporarily or permanently to external disruptions, such as the COVID-19 pandemic. Further analysis reveals that the key determinant of whether perception shifts are transient or permanent is the degree of structural challenges faced by the occupation. Methodologically, this study introduces a novel and scalable framework that leverages large language models (LLMs) to perform ABSA, offering more granular insights in both horizontal (cross-occupational) and vertical (within-occupation) dimensions. This interdisciplinary approach not only enriches our understanding of platform workers in China but also provides a replicable model applicable to various domains in computational social science that can greatly benefit policymakers, platforms, and scholars.
