Publication: A Comparison of Model Predictive Control and Reinforcement Learning Methods for Building Energy Storage Management
Abstract
The residential building sector is a major contributor to energy consumption and greenhouse gas emissions, making electrification and intelligent energy management essential for decarbonization. However, increased electricity demand can strain the power grid, leading to higher costs and emissions. Demand-side flexibility, enabled by on-site power generation, energy storage, and optimized control algorithms, can mitigate this problem by shifting electricity consumption to times when electricity is cheaper and cleaner.
This study evaluates three methods for centralized building energy storage management using CityLearn, an open-source environment for simulating and benchmarking building energy control. The evaluation compares Model Predictive Control (MPC) with two Reinforcement Learning (RL) methods: Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO). The methods are assessed along three dimensions: (1) energy performance, including cost, carbon emissions, electricity consumption, and stability of electricity use over time; (2) computational efficiency, including training time, memory usage, and inference speed; and (3) scalability, measured on districts of two, four, and eight buildings.
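The following Python sketch shows how an evaluation of this kind can be set up; it assumes CityLearn's centralized-agent interface, its Stable-Baselines3 wrappers, and an illustrative dataset schema and training budget, and is not the authors' implementation.

```python
# Minimal sketch (not the authors' exact code): a centralized CityLearn district
# controlled by SAC and PPO agents trained with Stable-Baselines3.
from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import NormalizedObservationWrapper, StableBaselines3Wrapper
from stable_baselines3 import PPO, SAC


def make_env(schema: str):
    # central_agent=True exposes one flat observation/action vector covering
    # every building's storage device, i.e. a single centralized controller.
    env = CityLearnEnv(schema, central_agent=True)
    env = NormalizedObservationWrapper(env)  # scale observations for learning
    return StableBaselines3Wrapper(env)      # adapt CityLearn to the SB3 API


SCHEMA = 'citylearn_challenge_2022_phase_1'  # illustrative dataset name

sac = SAC('MlpPolicy', make_env(SCHEMA))
sac.learn(total_timesteps=100_000)           # training budget is illustrative

ppo = PPO('MlpPolicy', make_env(SCHEMA))
ppo.learn(total_timesteps=100_000)

# CityLearn can then report district-level KPIs (cost, emissions, consumption,
# load stability) for each trained controller.
```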
Overall, SAC achieved the strongest performance on the cost and energy metrics, slightly ahead of PPO. PPO, however, produced smoother control behavior with more stable electricity use over time, while requiring significantly less memory than SAC and less computation than MPC. Both RL methods outperformed MPC across most metrics, and MPC in particular struggled to scale. Nonetheless, MPC remained more interpretable and required no training data, though it demanded substantial engineering effort to develop an accurate system model.
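To illustrate where that modeling effort goes, a storage-dispatch MPC of this kind typically solves, at every control step, a finite-horizon problem of roughly the following form (a generic sketch; the horizon, dynamics, and constraints shown here are illustrative, not the paper's exact formulation):

$$
\min_{a_t,\dots,a_{t+H-1}} \; \sum_{k=t}^{t+H-1} p_k \, E_k
\qquad \text{s.t.} \qquad
E_k = d_k - g_k + a_k, \quad
s_{k+1} = s_k + \eta\, a_k, \quad
0 \le s_k \le s_{\max},
$$

where $s_k$ is the storage state of charge, $a_k$ the charge/discharge decision, $d_k$ and $g_k$ the forecast demand and on-site generation, $p_k$ the electricity price, and $\eta$ the charging efficiency. The engineering effort lies in calibrating this system model and supplying accurate forecasts.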
These findings highlight the trade-offs between performance, stability, and deployability. PPO emerged as the most balanced controller, combining strong performance with scalability and computational efficiency, making it well suited for real-world use.