Recent Wave Energy Converters (WECs) are equipped with multiple legs and generators to maximize energy generation. Traditional controllers have shown limitations in capturing complex wave patterns, yet the controller must efficiently maximize energy capture. This paper introduces a Multi-Agent Reinforcement Learning (MARL) controller that outperforms the traditionally used spring-damper controller. Our initial studies show that the complex nature of the problem makes it hard for training to converge. Hence, we propose a novel skip-training approach that enables MARL training to overcome performance saturation and converge to better-performing controllers than default MARL training, boosting power generation. We also present another novel approach, hybrid training initialization (STHTI), in which the individual agents of the MARL controller are first trained individually against the baseline Spring Damper (SD) controller, and then trained one agent at a time or all together in subsequent iterations to accelerate convergence. Using the Asynchronous Advantage Actor-Critic (A3C) algorithm, the proposed MARL controllers achieve double-digit gains in energy efficiency over the baseline Spring Damper controller.
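The staged training schedule described above can be sketched as follows. This is a minimal illustrative outline, not the paper's implementation: the class names, the `train_step` interface, and the spring-damper gains are all assumptions made for the example.

```python
# Hypothetical sketch of the hybrid training initialization (STHTI)
# schedule: each agent first trains alone against the baseline
# spring-damper (SD) controller on the other legs, then agents are
# trained one at a time (or jointly) in later iterations.
# All names and signatures here are illustrative assumptions.

def spring_damper_force(velocity, position=0.0, stiffness=1.0, damping=0.5):
    """Baseline SD control law: F = -k*x - c*v (gains are placeholders)."""
    return -stiffness * position - damping * velocity

class LegAgent:
    """Placeholder for one A3C actor-critic agent controlling one WEC leg."""
    def __init__(self, leg_id):
        self.leg_id = leg_id

    def act(self, obs):
        # In a real controller this would be the policy network's output.
        return 0.0

    def train_step(self, env, partner_policies):
        # One training iteration while the other legs follow the given
        # policies (SD baselines in phase 1, co-agents in phase 2).
        pass

def sthti_schedule(agents, env, phase1_iters, phase2_iters):
    # Phase 1: each agent trains individually; the remaining legs run
    # the baseline SD controller.
    for agent in agents:
        partners = {a.leg_id: spring_damper_force
                    for a in agents if a is not agent}
        for _ in range(phase1_iters):
            agent.train_step(env, partners)

    # Phase 2: agents continue training one at a time, each against the
    # current policies of the other agents (they could equally be
    # updated all together here).
    for _ in range(phase2_iters):
        for agent in agents:
            partners = {a.leg_id: a.act for a in agents if a is not agent}
            agent.train_step(env, partners)
    return agents
```

The key design point is that phase 1 gives every agent a sensible starting policy relative to the known SD baseline before the harder joint multi-agent optimization begins, which is what the abstract credits for the accelerated convergence.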