Princeton University users: To view a senior thesis while you are away from campus, you will need to connect to the campus network remotely via the Global Protect virtual private network (VPN). If you are not a member of the University and are requesting a copy of a thesis, please note that all requests are processed manually by staff and will require additional time.
 

Publication:

Bayesian Adaptive Clinical Trials: A Soft Actor-Critic Reinforcement Learning Approach


Files

Matthew_Willer_Thesis_Absolutely_Final_Version.pdf (4.25 MB)

Date

2025-04-13


Abstract

Adaptive clinical trial designs aim to improve efficiency and enhance ethical considerations by dynamically allocating patients to treatments based on accruing evidence. In this thesis, we formulate an adaptive clinical trial as a finite-horizon Markov Decision Process (MDP). The trial state comprises patient outcomes and Bayesian-updated treatment success probabilities, and is updated sequentially at each decision point. To solve the resulting treatment-allocation problem, we implement a Soft Actor-Critic (SAC) framework that leverages maximum-entropy reinforcement learning to balance exploration and exploitation. To further capture this balance, we add a weight-adjusted Total Variation Distance (TVD) component to the reward function, quantifying the value of the information gathered between decision points. We conducted numerical simulations under two training schemes: one in which outcomes were generated using the true treatment success probabilities, and another in which outcomes were based on the agent's estimated probabilities. Across diverse hypothetical scenarios varying in cohort size, trial length, and prior knowledge, our SAC-based policy consistently approximated the ideal (oracle) policy in the true-probability setting: the agent achieved success proportions close to those of the optimal policy while judiciously allocating more patients to the superior treatment. When the model was trained on estimated probabilities, performance degraded under high uncertainty or poorly specified priors, sometimes making a fixed, non-adaptive approach preferable. These results underscore both the potential and the limitations of SAC in adaptive trial design. Our proposed model provides a foundation for applying reinforcement learning in a clinical trial setting, highlighting the need for accurate prior information to fully realize its benefits. The framework establishes a rigorous testbed for adaptive patient allocation, providing both theoretical insights and practical guidelines for future clinical trial designs.
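The two ingredients the abstract describes — a conjugate Beta-Bernoulli update of the treatment success probabilities at each decision point, and a TVD term rewarding the information gained between successive posteriors — can be sketched as below. This is a minimal illustrative sketch, not code from the thesis: the weight `w`, the grid discretization, the uniform prior, and the cohort numbers are all assumptions chosen for the example.

```python
import numpy as np

def posterior_update(alpha, beta, successes, failures):
    """Conjugate Beta-Bernoulli update after observing one cohort's outcomes."""
    return alpha + successes, beta + failures

def discretized_beta(a, b, grid):
    """Beta(a, b) density evaluated on a grid and normalized, so that the
    TVD between successive posteriors can be computed numerically."""
    pdf = grid ** (a - 1) * (1 - grid) ** (b - 1)
    return pdf / pdf.sum()

def tvd(p, q):
    """Total Variation Distance between two discrete distributions."""
    return 0.5 * np.abs(p - q).sum()

# One decision point for a single arm: 10 patients treated, 7 successes.
grid = np.linspace(1e-3, 1 - 1e-3, 500)
a0, b0 = 1.0, 1.0                      # uniform Beta(1, 1) prior (assumed)
a1, b1 = posterior_update(a0, b0, 7, 3)

# Information gained between decision points, measured by TVD.
info_gain = tvd(discretized_beta(a0, b0, grid),
                discretized_beta(a1, b1, grid))

# Weight-adjusted reward: observed successes plus a scaled information term.
w = 0.5                                 # illustrative weight, not from the thesis
reward = 7 + w * info_gain
```

In this sketch the TVD term is largest when a cohort's outcomes move the posterior substantially, so the weight `w` trades off immediate treatment successes against learning about the arms, mirroring the exploration-exploitation balance the SAC agent is trained under.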
