TY - JOUR
T1 - A Comparison of the State-of-the-Art Reinforcement Learning Algorithms for Health-Aware Energy & Emissions Management in Zero-Emission Ships
AU - Reddy, Namireddy Praveen
AU - Skjetne, Roger
AU - Os, Oliver Stugard
AU - Papageorgiou, Dimitrios
PY - 2024
Y1 - 2024
N2 - Zero-emission ships (ZES) have gained interest as a means to comply with the stringent regulations of the International Maritime Organization (IMO). One way to build a ZES is to hybridize fuel cells with batteries. Traditionally, for a newly built ship, the Energy & Emissions Management System (EEMS) is designed based on the initial condition of the fuel cells and batteries and is then executed with fixed parameters thereafter. However, for a fuel cell/battery ZES, such an EEMS gradually becomes sub-optimal, since the characteristics of the fuel cells and batteries change continuously due to aging and degradation. In this paper, a reinforcement learning (RL) based EEMS is developed such that it can learn and adapt continuously to changes in the fuel cell/battery characteristics. Within RL, several algorithms are implemented: double deep Q-learning (DDQL), soft actor-critic (SAC), and proximal policy optimization (PPO). The results are benchmarked against those of a typical rule-based EEMS. Each RL algorithm is trained with four reward function formulations: negative cost (r1), negative quadratic cost (r2), inverse cost (r3), and inverse quadratic cost (r4). The results demonstrate that the health-aware EEMS can minimize fuel consumption and component degradation costs. r1 led to the lowest operational expenses (OPEX), followed by r2, while r3 and r4 resulted in high OPEX. Among the three algorithms, DDQL led to the lowest reward, followed by SAC and then PPO, when trained with r1 and r2.
AB - Zero-emission ships (ZES) have gained interest as a means to comply with the stringent regulations of the International Maritime Organization (IMO). One way to build a ZES is to hybridize fuel cells with batteries. Traditionally, for a newly built ship, the Energy & Emissions Management System (EEMS) is designed based on the initial condition of the fuel cells and batteries and is then executed with fixed parameters thereafter. However, for a fuel cell/battery ZES, such an EEMS gradually becomes sub-optimal, since the characteristics of the fuel cells and batteries change continuously due to aging and degradation. In this paper, a reinforcement learning (RL) based EEMS is developed such that it can learn and adapt continuously to changes in the fuel cell/battery characteristics. Within RL, several algorithms are implemented: double deep Q-learning (DDQL), soft actor-critic (SAC), and proximal policy optimization (PPO). The results are benchmarked against those of a typical rule-based EEMS. Each RL algorithm is trained with four reward function formulations: negative cost (r1), negative quadratic cost (r2), inverse cost (r3), and inverse quadratic cost (r4). The results demonstrate that the health-aware EEMS can minimize fuel consumption and component degradation costs. r1 led to the lowest operational expenses (OPEX), followed by r2, while r3 and r4 resulted in high OPEX. Among the three algorithms, DDQL led to the lowest reward, followed by SAC and then PPO, when trained with r1 and r2.
KW - Batteries
KW - Costs
KW - Fuel cells
KW - Fuels
KW - Degradation
KW - Marine vehicles
KW - Energy management
U2 - 10.1109/JESTIE.2023.3331230
DO - 10.1109/JESTIE.2023.3331230
M3 - Journal article
SN - 2687-9743
VL - 5
SP - 149
EP - 166
JO - IEEE Journal of Emerging and Selected Topics in Industrial Electronics
JF - IEEE Journal of Emerging and Selected Topics in Industrial Electronics
IS - 1
ER -
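
Note: the abstract names four reward formulations (negative cost, negative quadratic cost, inverse cost, inverse quadratic cost). The following is a minimal Python sketch of how such rewards could be expressed as functions of a per-step operating cost; the cost decomposition into fuel and degradation terms and the epsilon guard are assumptions for illustration, not taken from the paper.

# Illustrative sketch of the four reward formulations r1-r4 from the abstract.
# The cost model (fuel + fuel cell/battery degradation) is a hypothetical example.

def step_cost(fuel_cost: float, fc_degradation_cost: float, batt_degradation_cost: float) -> float:
    """Hypothetical per-step operating cost: fuel plus component degradation."""
    return fuel_cost + fc_degradation_cost + batt_degradation_cost

def r1(cost: float) -> float:
    """Negative cost."""
    return -cost

def r2(cost: float) -> float:
    """Negative quadratic cost."""
    return -cost ** 2

def r3(cost: float, eps: float = 1e-6) -> float:
    """Inverse cost (eps avoids division by zero; an assumption)."""
    return 1.0 / (cost + eps)

def r4(cost: float, eps: float = 1e-6) -> float:
    """Inverse quadratic cost."""
    return 1.0 / (cost ** 2 + eps)

if __name__ == "__main__":
    c = step_cost(fuel_cost=12.0, fc_degradation_cost=3.5, batt_degradation_cost=1.5)
    print({name: f(c) for name, f in [("r1", r1), ("r2", r2), ("r3", r3), ("r4", r4)]})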