Princeton University users: to view a senior thesis while away from campus, connect to the campus network via the GlobalProtect virtual private network (VPN). Unaffiliated researchers: please note that requests for copies are handled manually by staff and may take time to process.
 

Publication:

Bayesian Adaptive Clinical Trials: A Soft Actor-Critic Reinforcement Learning Approach

datacite.rights: restricted
dc.contributor.advisor: Rigobon, Daniel
dc.contributor.author: Willer, Matt
dc.date.accessioned: 2025-08-07T12:56:31Z
dc.date.available: 2025-08-07T12:56:31Z
dc.date.issued: 2025-04-13
dc.description.abstract: Adaptive clinical trial designs aim to improve efficiency and ethical outcomes by dynamically allocating patients to treatments as evidence accrues. In this thesis, we formulate an adaptive clinical trial as a finite-horizon Markov Decision Process (MDP). The trial state comprises patient outcomes and Bayesian-updated treatment success probabilities, and is updated sequentially at each decision point. To solve the resulting treatment-allocation problem, we implement a Soft Actor-Critic (SAC) framework that leverages maximum-entropy reinforcement learning to balance exploration and exploitation. To further capture this balance, we add a weight-adjusted Total Variation Distance (TVD) component to the reward function, quantifying the value of the information gathered between decision points. We conducted numerical simulations under two training schemes: one in which outcomes were generated from the true treatment success probabilities, and another in which outcomes were generated from the agent's estimated probabilities. Across diverse hypothetical scenarios varying in cohort size, trial length, and prior knowledge, our SAC-based policy consistently approximated the ideal (oracle) policy in the true-probability setting: the agent achieved success proportions close to those of the optimal policy while judiciously allocating more patients to the superior treatment. When the model was trained on estimated probabilities, performance degraded under high uncertainty or poorly specified priors, sometimes falling behind a fixed, non-adaptive design. These results underscore both the potential and the limitations of employing SAC in adaptive trial design. Our proposed model provides a foundation for applying reinforcement learning in clinical trial settings and highlights the need for accurate prior information to fully realize its benefits. Our framework establishes a rigorous testbed for adaptive patient allocation, providing both theoretical insights and practical guidelines for future clinical trial designs.
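The mechanics described in the abstract — conjugate Bayesian updating of each arm's success probability, a TVD-based information term added to the reward, and entropy-regularized (softmax) allocation in the spirit of maximum-entropy RL — can be sketched as below. This is an illustrative reconstruction, not the thesis's actual code: the function names, the TVD weight `w`, and the softmax temperature are assumptions, and the full SAC training loop is omitted.

```python
import math
import numpy as np

def posterior_update(alpha, beta, successes, failures):
    """Conjugate Beta-Bernoulli update for one treatment arm's belief."""
    return alpha + successes, beta + failures

def beta_pdf(x, a, b):
    """Beta(a, b) density on a grid, via log-gamma for numerical stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return np.exp(log_norm + (a - 1) * np.log(x) + (b - 1) * np.log(1 - x))

def tvd(a1, b1, a2, b2, n_grid=20_000):
    """Total variation distance between two Beta beliefs:
    0.5 * integral of |p - q| over (0, 1), midpoint rule."""
    x = (np.arange(n_grid) + 0.5) / n_grid
    return 0.5 * np.mean(np.abs(beta_pdf(x, a1, b1) - beta_pdf(x, a2, b2)))

def softmax_allocation(posterior_means, temperature=0.1):
    """Entropy-regularized allocation: a higher temperature keeps more
    probability on the apparently inferior arm (exploration)."""
    z = np.asarray(posterior_means, dtype=float) / temperature
    z -= z.max()                      # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

# One decision point of a two-arm trial with uniform Beta(1, 1) priors.
priors = [(1.0, 1.0), (1.0, 1.0)]
outcomes = [(7, 3), (2, 8)]           # (successes, failures) this cohort
posteriors = [posterior_update(a, b, s, f)
              for (a, b), (s, f) in zip(priors, outcomes)]

# Reward = observed successes + weighted information gain, where the
# gain is the TVD between beliefs before and after the cohort.
w = 0.5                               # hypothetical TVD weight
info_gain = sum(tvd(a0, b0, a1, b1)
                for (a0, b0), (a1, b1) in zip(priors, posteriors))
reward = sum(s for s, _ in outcomes) + w * info_gain

# Allocation probabilities for the next cohort.
means = [a / (a + b) for a, b in posteriors]
probs = softmax_allocation(means)
```

With these numbers, arm 0's posterior mean (8/12) exceeds arm 1's (3/12), so the next-cohort allocation tilts toward arm 0 while the temperature preserves some exploration — the same exploration/exploitation trade-off the entropy term in SAC controls during training.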
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp01mp48sh22v
dc.language.iso: en_US
dc.title: Bayesian Adaptive Clinical Trials: A Soft Actor-Critic Reinforcement Learning Approach
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-04-13T04:04:16.564Z
dspace.workflow.startDateTime: 2025-04-16T20:09:25.942Z
pu.contributor.authorid: 920251573
pu.date.classyear: 2025
pu.department: Operations Research and Financial Engineering

Files

Original bundle
Name: Matthew_Willer_Thesis_Absolutely_Final_Version.pdf
Size: 4.25 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 100 B
Format: Item-specific license agreed to upon submission