Joint bidding and pricing for electricity retailers based on multi-task deep reinforcement learning

Hongsheng Xu*, Qiuwei Wu, Jinyu Wen, Zhihong Yang

*Corresponding author for this work

    Research output: Contribution to journal, Journal article, Research, peer-review


    Single-task deep reinforcement learning (STDRL)-based methods solve the joint bidding and pricing problem for an electricity retailer in a hierarchical electricity market by defining a bidding policy and a pricing policy separately, which may suffer from low learning efficiency, time-consuming training, and convergence to local optima. To deal with these issues, this paper proposes MDJBP, a novel multi-task deep reinforcement learning approach for joint bidding and pricing optimization. MDJBP handles the bidding and pricing tasks concurrently through a shared long short-term memory (LSTM) representation network that distills meaningful temporal characteristics from high-dimensional environment states. Furthermore, to implement MDJBP, we develop a deep neural network (DNN) structure consisting of a regression branch for the bidding task and a soft actor-critic (SAC) branch for the pricing task, with automatic entropy adjustment and adaptive loss weighting. The proposed multi-task deep reinforcement learning (MTDRL)-based method is tested on the IEEE 30-bus system. Numerical results show that the proposed methodology succeeds in giving an optimal joint bidding and pricing policy by fully exploiting the commonalities and differences between the bidding task and the pricing task, and thereby boosts profit, improves learning efficiency, reduces training time, and enhances stability.
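
    The two training mechanisms named in the abstract, adaptive loss weighting and automatic entropy adjustment, can be illustrated with a minimal sketch. The abstract does not say which weighting scheme the paper uses, so the log-variance (uncertainty-style) weighting below is an assumption chosen only for illustration. The temperature objective follows the standard SAC form, J(alpha) = E[-alpha * (log pi(a|s) + target_entropy)]. All function and parameter names are hypothetical and not taken from the paper.

```python
import math

def weighted_joint_loss(bidding_loss, pricing_loss, log_var_bid, log_var_price):
    """Combine the two task losses with learnable log-variance weights.

    ASSUMPTION: uncertainty-style weighting, used here only as one possible
    form of 'adaptive loss weighting'; the paper's scheme may differ.
    """
    return (math.exp(-log_var_bid) * bidding_loss + log_var_bid
            + math.exp(-log_var_price) * pricing_loss + log_var_price)

def temperature_loss(log_alpha, log_probs, target_entropy):
    """Standard SAC temperature objective, averaged over sampled actions."""
    alpha = math.exp(log_alpha)
    return sum(-alpha * (lp + target_entropy) for lp in log_probs) / len(log_probs)

def temperature_step(log_alpha, log_probs, target_entropy, lr=0.01):
    """One gradient-descent step on log_alpha for the objective above."""
    alpha = math.exp(log_alpha)
    mean_term = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    # d/d(log_alpha) of -exp(log_alpha) * mean_term
    grad = -alpha * mean_term
    return log_alpha - lr * grad

# Hypothetical numbers: regression loss for bidding, SAC loss for pricing.
joint = weighted_joint_loss(2.0, 3.0, log_var_bid=0.0, log_var_price=0.0)

# Policy entropy (0.5) is below the target (1.0), so the temperature rises.
new_log_alpha = temperature_step(0.0, log_probs=[-0.5, -0.5], target_entropy=1.0)
```

    In this sketch the temperature increases when the policy's entropy falls below the target and decreases when it exceeds it, which is how SAC balances exploration against exploitation for the pricing branch.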
    Original language: English
    Article number: 107897
    Journal: International Journal of Electrical Power and Energy Systems
    Number of pages: 17
    Publication status: Published - 2022


    • Deep reinforcement learning
    • Multi-task learning
    • Electricity market
    • Demand response
    • Bidding
    • Pricing

