
Publication:

Design and Evaluation of a Modular Architecture To Assess LLMs in Summarizing Electronic Health Records

dc.contributor.advisor: Kaplan, Alan
dc.contributor.author: Chen, Rachel
dc.date.accessioned: 2025-08-06T15:49:23Z
dc.date.available: 2025-08-06T15:49:23Z
dc.date.issued: 2025-05-02
dc.description.abstract: Low health literacy affects nearly half of Americans and poses a major barrier to effective healthcare, particularly when interpreting complex electronic health records (EHRs). While large language models (LLMs) offer promising capabilities for simplifying medical information, little research has explored their performance on personalized patient data. This thesis presents a modular framework for evaluating five state-of-the-art LLMs (GPT-4o-mini, Gemini 2.0 Flash, Claude 3.7 Sonnet, DeepSeek V3, and MiniMax-01 Text) on their ability to generate readable, patient-friendly summaries of structured EHRs. Using synthetic data from the Synthea dataset and prompts targeting sixth-grade (AMA) and eighth-grade (NIH) reading levels, the framework measures outputs using quantitative readability metrics, including Flesch-Kincaid, SMOG, and Gunning Fog scores. The results reveal significant variation in tone, complexity, and prompt adherence across models. GPT-4o-mini consistently produced the most readable summaries, while Claude struggled with prompt sensitivity and cost-effectiveness. The findings highlight the importance of prompt engineering, context length, and model choice in improving health communication. This work contributes a replicable evaluation pipeline and underscores the potential of LLMs to enhance health literacy and patient empowerment.
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp015t34sp032
dc.language.iso: en_US
dc.title: Design and Evaluation of a Modular Architecture To Assess LLMs in Summarizing Electronic Health Records
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-05-02T23:42:55.923Z
pu.contributor.authorid: 920252550
pu.date.classyear: 2025
pu.department: Computer Science
pu.minor: Statistics and Machine Learning
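
The abstract names three readability metrics (Flesch-Kincaid, SMOG, and Gunning Fog) but this record does not show how the thesis computes them. As a rough illustration only, the standard published formulas can be sketched in Python as below. The function names, the vowel-group syllable heuristic, and the treatment of every word with three or more syllables as "complex" are assumptions made for this sketch, not the author's implementation; SMOG in particular is defined for samples of about 30 sentences, so scores on short texts are only indicative.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Hypothetical heuristic: count vowel groups, drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability_scores(text: str) -> dict:
    """Return Flesch-Kincaid grade, Gunning Fog, and SMOG for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    # Simplification: any word with 3+ syllables counts as complex.
    polysyllables = sum(1 for s in syllables if s >= 3)
    words_per_sentence = len(words) / len(sentences)

    return {
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * sum(syllables) / len(words)
        - 15.59,
        "gunning_fog": 0.4 * (words_per_sentence + 100 * polysyllables / len(words)),
        "smog": 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291,
    }


if __name__ == "__main__":
    summary = "Your blood pressure is high. Take your pills each day."
    for name, value in readability_scores(summary).items():
        print(f"{name}: {value:.2f}")
```

Each score approximates a U.S. school grade level, so a summary aimed at the sixth-grade or eighth-grade targets mentioned in the abstract would be expected to score near 6 or 8 respectively.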

Files

Original bundle

Name: final_thesis.pdf
Size: 921.93 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 100 B
Format: Item-specific license agreed to upon submission