Zhiang Zhang to Present PhD Dissertation Defense Wednesday 11 September


Zhiang Zhang will present the defense for the dissertation, "Whole Building Energy Model Assisted HVAC Supervisory Control via Reinforcement Learning," to obtain the PhD in Building Performance & Diagnostics (PhD-BPD) on Wednesday 11 September.

Title: “Whole Building Energy Model Assisted HVAC Supervisory Control via Reinforcement Learning”
By Zhiang Zhang, PhD-BPD Candidate

Date: Wednesday, 11 September 2019
Time: 8:30am
Location: MMCH 415 IW Large Conference Room


PhD Advisory Committee

Khee Poh Lam, School of Architecture, Carnegie Mellon University (Chair)
Mario Berges, Civil and Environmental Engineering, Carnegie Mellon University
Gianni Di Caro, School of Computer Science, Carnegie Mellon University
Adrian Chong, Department of Building, National University of Singapore


Buildings account for a significant portion of the total energy consumption of many countries. Therefore, energy efficiency is one of the primary objectives of today's new building projects. The whole building energy model (BEM) is widely used by building designers to predict and improve the energy performance of building designs. As a detailed physics-based modeling method, BEM also has the potential for developing supervisory control strategies for HVAC systems. The derived control strategies may significantly improve HVAC energy efficiency compared to the widely used rule-based control strategies.

However, using BEM for HVAC control is challenging. First, BEM is a high-order model, so classical model-based control methods cannot be applied directly, and heuristic search algorithms, such as genetic algorithms, must be used for BEM-based control optimization. Second, BEM is computationally slow compared to black-box or grey-box models, which limits its application to large-scale optimization problems.

Model-free reinforcement learning (RL) is an alternative way to use BEM for HVAC control. Model-free RL is a “trial-and-error” learning method that is applicable to any complex system. Therefore, BEM can be used as a simulator to train an RL agent offline to learn energy-efficient supervisory control strategies. However, reinforcement learning control for HVAC systems has not been adequately studied. Most existing studies are based on over-simplified HVAC systems and a limited number of experiment scenarios.
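The idea of training an agent offline against a simulator can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-room `ToyRoomSimulator` merely stands in for a BEM, tabular Q-learning stands in for whichever RL algorithm the dissertation uses, and the setpoint, reward weights, and hyperparameters are hypothetical.

```python
import random


class ToyRoomSimulator:
    """Hypothetical stand-in for a BEM: one room with integer temperatures (degC)."""

    def __init__(self, setpoint=22, low=18, high=26):
        self.setpoint, self.low, self.high = setpoint, low, high
        self.temp = high

    def reset(self):
        # Every episode starts from a warm room.
        self.temp = self.high
        return self.temp

    def step(self, action):
        # action 1 = cooling on (temperature falls), action 0 = off (temperature rises)
        self.temp += -1 if action == 1 else 1
        self.temp = max(self.low, min(self.high, self.temp))
        energy = 1.0 if action == 1 else 0.0
        discomfort = abs(self.temp - self.setpoint)
        # Assumed reward: penalize energy use and deviation from the setpoint.
        reward = -(0.2 * energy + discomfort)
        return self.temp, reward


def train_q_learning(sim, episodes=500, steps=24, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Learn a Q-table by trial and error; only sim.reset() and sim.step() are used."""
    rng = random.Random(seed)
    q = {(t, a): 0.0 for t in range(sim.low, sim.high + 1) for a in (0, 1)}
    for _ in range(episodes):
        state = sim.reset()
        for _ in range(steps):
            # Epsilon-greedy: mostly exploit the current estimate, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice((0, 1))
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])
            next_state, reward = sim.step(action)
            best_next = max(q[(next_state, 0)], q[(next_state, 1)])
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = train_q_learning(ToyRoomSimulator())
policy = {t: max((0, 1), key=lambda a: q_table[(t, a)]) for t in range(18, 27)}
```

The agent never inspects the simulator's internal equations; it only observes states and rewards, which is why a slow, high-order model such as a BEM can in principle take the simulator's place.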

This study develops a BEM-assisted reinforcement learning control framework for energy-efficient HVAC supervisory control. The control framework uses a design-stage BEM to “learn” control strategies via model-free RL. Through computer simulations, the control framework is evaluated in different scenarios covering four typical commercial HVAC systems, four climates, and two building thermal mass levels. The robustness of the RL-derived control strategies is also evaluated by altering weather conditions and building operation schedules in different “perturbed” simulators.

The control framework has achieved satisfactory control performance in a variable-air-volume (VAV) system for both cooling and heating under different climates and building thermal mass levels. Compared to the baseline rule-based control strategies, the RL-derived strategies achieve clear energy savings and less setpoint-not-met time. The RL-derived strategies are also robust to changes in the weather conditions and the occupancy/plug-load schedules. However, the RL-derived control strategies have worse-than-baseline energy performance if the schedule of the indoor air temperature setpoint is changed.

The control framework has also achieved reduced heating demand and improved or similar thermal comfort (compared to the baseline rule-based control) for a slow-response radiant heating system in all the experiment scenarios. The RL-derived strategies have also achieved good control performance in the perturbed simulators. However, the reward function must include a specially designed heuristic to deal with the slow thermal response and the imperfect energy metric of this system. This indicates that reward function design is crucial for the control performance of this framework.

The control performance may be poor if the reward function is over-complicated, as shown in the experiments on a multi-chiller chilled water system. The reward function for this system consists of three complicated penalty functions corresponding to three operational constraints: chiller cycling time, chiller partial-load ratio, and the system supply water temperature. The RL-derived control strategies significantly violated the operational constraints and achieved only limited energy savings.
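To make the structure of such a reward concrete, the following is a rough sketch of a reward that combines an energy term with three constraint penalties. It is not the reward function used in the dissertation: the function name, the thresholds, the weights, and the linear penalty shapes are all hypothetical choices made for illustration.

```python
def chiller_plant_reward(energy_kw, switched_now, minutes_since_last_switch,
                         partial_load_ratio, supply_temp_c,
                         max_energy_kw=500.0, min_cycle_minutes=20.0,
                         min_plr=0.3, max_supply_temp_c=7.0,
                         weights=(1.0, 1.0, 1.0)):
    """Return a negative cost; all default thresholds and weights are assumed values."""
    # Energy term normalized to the range [0, 1].
    energy_term = min(energy_kw / max_energy_kw, 1.0)

    # Penalty 1: a chiller was switched before the minimum cycling time had elapsed.
    cycling_penalty = 0.0
    if switched_now and minutes_since_last_switch < min_cycle_minutes:
        cycling_penalty = (min_cycle_minutes - minutes_since_last_switch) / min_cycle_minutes

    # Penalty 2: the chiller runs below the minimum partial-load ratio.
    plr_penalty = max(0.0, (min_plr - partial_load_ratio) / min_plr)

    # Penalty 3: the supply water temperature exceeds its upper limit (per degC).
    temp_penalty = max(0.0, supply_temp_c - max_supply_temp_c)

    w_cycle, w_plr, w_temp = weights
    return -(energy_term
             + w_cycle * cycling_penalty
             + w_plr * plr_penalty
             + w_temp * temp_penalty)


compliant = chiller_plant_reward(250.0, False, 60.0, 0.8, 6.0)
violating = chiller_plant_reward(250.0, True, 5.0, 0.2, 9.0)
```

Even in this simplified form, the agent has to trade off four terms with different scales, which hints at why a reward built from several penalty functions can be difficult to learn from.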

This thesis also studies the effects of neural network model complexity on the control and convergence performance of reinforcement learning. It is found that a complex neural network model does not necessarily lead to better control performance than a simple one. On the contrary, a complex neural network model may make it harder for reinforcement learning to converge. Therefore, “deep” reinforcement learning is not necessary for HVAC control, even though it is a popular concept in recent literature. As a general guideline, this study recommends using a narrow and shallow non-linear neural network model for reinforcement learning.

In future work, the control framework should be evaluated in more scenarios, such as more types of buildings and HVAC systems and more climate zones. In addition, the framework should be validated in real-life implementation experiments. It is also necessary to conduct theoretical studies on the effects of hyperparameters on reinforcement learning, such as the neural network model architecture and the selection of the learning rate. Finally, future work should develop an adaptive RL control method that can adapt itself to the changing operational conditions of an HVAC system.

View dissertation here.