We introduce GasRL, a simulator that couples a calibrated representation of the natural gas market with a model of storage-operator policies trained with deep reinforcement learning (RL). We use it to analyse how optimal stockpile management affects equilibrium prices and the dynamics of demand and supply. We test various RL algorithms and find that Soft Actor-Critic (SAC) exhibits superior performance in the GasRL environment: the multiple objectives of storage operators, including profitability, robust market clearing and price stabilisation, are all successfully achieved. Moreover, the equilibrium price dynamics induced by SAC-derived optimal policies have characteristics, such as volatility and seasonality, that closely match those of real-world prices. Remarkably, this adherence to the historical distribution of prices is obtained without explicitly calibrating the model to price data. We also show how the simulator can be used to assess the effects of EU-mandated minimum storage thresholds. We find that such thresholds improve market resilience against unanticipated shifts in the distribution of supply shocks. For example, under unusually large shocks, market disruptions are averted more often when a threshold is in place.