Inter-Cell Interference Coordination (ICIC) is a promising way to improve energy efficiency in wireless networks, especially where small base stations are densely deployed. However, traditional optimization-based ICIC schemes suffer severe performance degradation under complex interference patterns. To address this issue, we propose a Deep Reinforcement Learning with Deterministic Policy and Target (DRL-DPT) framework for ICIC in wireless networks. DRL-DPT overcomes the main obstacles to applying reinforcement learning and deep learning in wireless networks, namely continuous state spaces, continuous action spaces, and convergence. First, a Deep Neural Network (DNN) serves as the actor, producing deterministic power control actions in a continuous space. Then, to guarantee convergence, we present an online training process that uses a dedicated reward function as the target rule and a policy gradient descent algorithm to adjust the DNN weights. Experimental results show that the proposed DRL-DPT framework consistently outperforms existing schemes in energy efficiency and throughput under different wireless interference scenarios. More specifically, it improves energy efficiency by up to 15% and converges faster.
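As a rough illustration of the actor-plus-reward loop described above, the following minimal sketch maps a continuous state to deterministic transmit powers and nudges the network weights toward higher reward. It is not the DRL-DPT implementation. Everything specific in it is an assumption made for illustration: the state is the flattened channel-gain matrix, the reward is energy efficiency (sum rate over total consumed power), the actor is a tiny one-hidden-layer network without biases, and a finite-difference gradient ascent step stands in for the policy gradient update. All names and constants (`P_MAX`, `NOISE`, `P_CIRCUIT`, `actor`, `ascent_step`, `run_demo`) are hypothetical.

```python
import math
import random

P_MAX = 1.0      # assumed maximum transmit power per base station
NOISE = 0.1      # assumed receiver noise power
P_CIRCUIT = 0.1  # assumed static circuit power per base station


def energy_efficiency(powers, gains):
    """Assumed reward: sum rate (bit/s/Hz) divided by total consumed power."""
    n = len(powers)
    total_rate = 0.0
    for i in range(n):
        # Interference at cell i from every other cell j.
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        sinr = gains[i][i] * powers[i] / (NOISE + interference)
        total_rate += math.log2(1.0 + sinr)
    return total_rate / (sum(powers) + n * P_CIRCUIT)


def actor(theta, state, n_hidden, n_cells):
    """Deterministic policy: state -> tanh hidden layer -> sigmoid scaled to [0, P_MAX]."""
    n_in = len(state)
    # Unpack the flat parameter list into two weight matrices.
    w1 = [theta[h * n_in:(h + 1) * n_in] for h in range(n_hidden)]
    off = n_hidden * n_in
    w2 = [theta[off + c * n_hidden:off + (c + 1) * n_hidden] for c in range(n_cells)]
    hidden = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return [P_MAX / (1.0 + math.exp(-z)) for z in logits]


def ascent_step(theta, reward, lr=0.05, eps=1e-4):
    """One gradient ascent step on the reward, using forward finite differences.

    This is a stand-in for a policy gradient update, not the paper's algorithm.
    """
    base = reward(theta)
    grad = []
    for k in range(len(theta)):
        bumped = list(theta)
        bumped[k] += eps
        grad.append((reward(bumped) - base) / eps)
    return [t + lr * g for t, g in zip(theta, grad)]


def run_demo(steps=200, n_hidden=4, seed=0):
    """Train the toy actor online on one fixed two-cell scenario (assumed gains)."""
    gains = [[1.0, 0.2], [0.3, 1.0]]
    n_cells = len(gains)
    state = [g for row in gains for g in row]
    rng = random.Random(seed)
    n_params = n_hidden * len(state) + n_cells * n_hidden
    theta = [rng.uniform(-0.1, 0.1) for _ in range(n_params)]

    def reward(t):
        return energy_efficiency(actor(t, state, n_hidden, n_cells), gains)

    initial = reward(theta)
    for _ in range(steps):
        theta = ascent_step(theta, reward)
    return initial, reward(theta), actor(theta, state, n_hidden, n_cells)


if __name__ == "__main__":
    initial, final, powers = run_demo()
    print("reward before:", initial, "after:", final, "powers:", powers)
```

The sigmoid output keeps every action inside the feasible power range without clipping, which is one simple way to handle a continuous action space. A practical system would replace the finite-difference step with automatic differentiation and draw the state from measured channel conditions.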