Princeton University users: to view a senior thesis while away from campus, connect to the campus network via the GlobalProtect virtual private network (VPN). Unaffiliated researchers: please note that requests for copies are handled manually by staff and may take time to process.
 

Publication:

Bayesian Adaptive Clinical Trials: A Soft Actor-Critic Reinforcement Learning Approach

datacite.rights: restricted
dc.contributor.advisor: Rigobon, Daniel
dc.contributor.author: Willer, Matt
dc.date.accessioned: 2025-08-07T12:56:31Z
dc.date.available: 2025-08-07T12:56:31Z
dc.date.issued: 2025-04-13
dc.description.abstract: Adaptive clinical trial designs aim to improve efficiency and ethical outcomes by dynamically allocating patients to treatments as evidence accrues. In this thesis, we formulate an adaptive clinical trial as a finite-horizon Markov Decision Process (MDP). The trial state comprises patient outcomes and Bayesian-updated treatment success probabilities, and is updated sequentially at each decision point. To solve the resulting treatment-allocation problem, we implement a Soft Actor-Critic (SAC) framework that leverages maximum-entropy reinforcement learning to balance exploration and exploitation. To further capture this balance, we add a weight-adjusted Total Variation Distance (TVD) component to the reward function, quantifying the value of the information gathered between decision points. We conducted numerical simulations under two training schemes: one in which outcomes were generated from the true treatment success probabilities, and another in which outcomes were generated from the agent's estimated probabilities. Across diverse hypothetical scenarios varying in cohort size, trial length, and prior knowledge, our SAC-based policy consistently approximated the ideal (oracle) policy in the true-probability setting: the agent achieved success proportions close to those of the optimal policy while judiciously allocating more patients to the superior treatment. When the model was trained on estimated probabilities, performance degraded under high uncertainty or poorly specified priors, sometimes falling behind a fixed, non-adaptive design. These results underscore both the potential and the limitations of employing SAC in adaptive trial design. Our proposed model provides a foundation for applying reinforcement learning in clinical trial settings and highlights the need for accurate prior information to fully realize its benefits. Our framework establishes a rigorous testbed for adaptive patient allocation, providing both theoretical insights and practical guidelines for future clinical trial designs.
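The mechanics described in the abstract — conjugate Bayesian updating of each arm's success probability, a TVD-based information term added to the reward, and entropy-regularized (softmax) allocation in the spirit of maximum-entropy RL — can be sketched as below. This is an illustrative reconstruction, not the thesis's actual code: the function names, the TVD weight `w`, and the softmax temperature are assumptions, and the full SAC training loop is omitted.

```python
import math
import numpy as np

def posterior_update(alpha, beta, successes, failures):
    """Conjugate Beta-Bernoulli update for one treatment arm's belief."""
    return alpha + successes, beta + failures

def beta_pdf(x, a, b):
    """Beta(a, b) density on a grid, via log-gamma for numerical stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return np.exp(log_norm + (a - 1) * np.log(x) + (b - 1) * np.log(1 - x))

def tvd(a1, b1, a2, b2, n_grid=20_000):
    """Total variation distance between two Beta beliefs:
    0.5 * integral of |p - q| over (0, 1), midpoint rule."""
    x = (np.arange(n_grid) + 0.5) / n_grid
    return 0.5 * np.mean(np.abs(beta_pdf(x, a1, b1) - beta_pdf(x, a2, b2)))

def softmax_allocation(posterior_means, temperature=0.1):
    """Entropy-regularized allocation: a higher temperature keeps more
    probability on the apparently inferior arm (exploration)."""
    z = np.asarray(posterior_means, dtype=float) / temperature
    z -= z.max()                      # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

# One decision point of a two-arm trial with uniform Beta(1, 1) priors.
priors = [(1.0, 1.0), (1.0, 1.0)]
outcomes = [(7, 3), (2, 8)]           # (successes, failures) this cohort
posteriors = [posterior_update(a, b, s, f)
              for (a, b), (s, f) in zip(priors, outcomes)]

# Reward = observed successes + weighted information gain, where the
# gain is the TVD between beliefs before and after the cohort.
w = 0.5                               # hypothetical TVD weight
info_gain = sum(tvd(a0, b0, a1, b1)
                for (a0, b0), (a1, b1) in zip(priors, posteriors))
reward = sum(s for s, _ in outcomes) + w * info_gain

# Allocation probabilities for the next cohort.
means = [a / (a + b) for a, b in posteriors]
probs = softmax_allocation(means)
```

With these numbers, arm 0's posterior mean (8/12) exceeds arm 1's (3/12), so the next-cohort allocation tilts toward arm 0 while the temperature preserves some exploration — the same exploration/exploitation trade-off the entropy term in SAC controls during training.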
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp01mp48sh22v
dc.language.iso: en_US
dc.title: Bayesian Adaptive Clinical Trials: A Soft Actor-Critic Reinforcement Learning Approach
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-04-13T04:04:16.564Z
dspace.workflow.startDateTime: 2025-04-16T20:09:25.942Z
pu.contributor.authorid: 920251573
pu.date.classyear: 2025
pu.department: Operations Research and Financial Engineering

Files

Original bundle
Name: Matthew_Willer_Thesis_Absolutely_Final_Version.pdf
Size: 4.25 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 100 B
Format: Item-specific license agreed to upon submission