Model-driven power allocation (PA) algorithms for wireless cellular networks with interfering multiple-access channels (IMAC) have been investigated for decades. Nowadays, data-driven, model-free machine-learning approaches are developing rapidly in this field, and among them deep reinforcement learning (DRL) has shown great promise. Unlike supervised learning, DRL balances exploration and exploitation to maximize an objective function under given constraints. In this paper, we propose a two-step training framework. First, through off-line learning in a simulated environment, a deep Q-network (DQN) is trained with a deep Q-learning (DQL) algorithm carefully designed to match this PA problem. Second, the DQN is further fine-tuned with real data in an on-line training procedure. Simulation results show that the proposed DQN achieves the highest average sum-rate compared with DQNs trained by existing DQL methods. Across different user densities, our DQN outperforms the benchmark algorithms, verifying its good generalization ability.
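To make the two-step idea concrete, below is a minimal sketch of a deep Q-learning setup for discrete power-level selection in PyTorch. The state dimension, number of power levels, network sizes, hyperparameters, and the stand-in reward are illustrative assumptions, not the paper's exact design; the same `dql_step` update would be reused in step two on transitions collected from real measurements.

```python
# Minimal DQL sketch (assumptions only, not the paper's exact design):
# a small Q-network maps local channel-state features to Q-values over
# discretized transmit-power levels, trained with epsilon-greedy deep
# Q-learning off-line, then fine-tuned on-line with real transitions.

import random
import torch
import torch.nn as nn

STATE_DIM = 8          # assumed local CSI / interference features
N_POWER_LEVELS = 10    # assumed discretized transmit-power actions
GAMMA = 0.9            # discount factor
EPSILON = 0.1          # exploration rate for epsilon-greedy

class DQN(nn.Module):
    """Fully-connected Q-network: state -> Q-value per power level."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_POWER_LEVELS),
        )
    def forward(self, s):
        return self.net(s)

def select_action(q_net, state):
    """Epsilon-greedy: explore a random power level or exploit argmax Q."""
    if random.random() < EPSILON:
        return random.randrange(N_POWER_LEVELS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def dql_step(q_net, target_net, optimizer, batch):
    """One deep Q-learning update on a batch of (s, a, r, s') transitions."""
    s, a, r, s_next = batch
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # bootstrapped target from a frozen target net
        target = r + GAMMA * target_net(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Step 1 (off-line): train on simulated IMAC transitions.
# Step 2 (on-line): keep calling dql_step on real-data transitions,
# typically with a smaller learning rate for fine-tuning.
q_net, target_net = DQN(), DQN()
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Toy batch standing in for simulator transitions (illustrative only).
s = torch.randn(32, STATE_DIM)
a = torch.randint(0, N_POWER_LEVELS, (32,))
r = torch.randn(32)              # stand-in for the sum-rate reward
s_next = torch.randn(32, STATE_DIM)
print("loss:", dql_step(q_net, target_net, optimizer, (s, a, r, s_next)))
```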