In this paper, we investigate, for the first time, the random access problem for a delay-constrained heterogeneous wireless network. We begin with a simple two-device problem in which two devices deliver delay-constrained traffic to an access point (AP) over a common unreliable collision channel. Assuming that one device (Device 1) adopts ALOHA, we aim to optimize the random access scheme of the other device (Device 2). The most intriguing part of this problem is that Device 2 does not know Device 1's information, yet must maximize the system timely throughput. We first propose a Markov Decision Process (MDP) formulation to derive a model-based upper bound, which lets us quantify the performance gap of a given random access scheme. We then use reinforcement learning (RL) to design an R-learning-based random access scheme, called tiny state-space R-learning random access (TSRA), and subsequently extend it to the general multi-device problem. Extensive simulations show that TSRA simultaneously achieves higher timely throughput, lower computational complexity, and lower power consumption than the existing baseline, deep-reinforcement learning multiple access (DLMA). This indicates that TSRA is a promising means of efficient random access for massive numbers of mobile devices with limited computation and battery capabilities.
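The abstract does not give TSRA's state, action, or reward definitions, so the following is only a minimal sketch of generic tabular R-learning (the average-reward method that TSRA is said to build on) applied to a toy two-device slotted collision channel. Everything specific here is an assumption made for illustration: the state is Device 2's remaining lead time, each device buffers at most one packet, the reward counts a timely delivery by either device (which presumes AP feedback that reveals Device 1's successes), and the names `r_learning_update`, `simulate`, and all parameter values are hypothetical.

```python
import random


def r_learning_update(q, rho, s, a, r, s_next, alpha=0.1, beta=0.01):
    """Apply one tabular R-learning step in place on q; return the updated rho."""
    best_next = max(q[s_next])
    # Action-value update relative to the average-reward estimate rho.
    q[s][a] += alpha * (r - rho + best_next - q[s][a])
    # Update rho only when the action taken is greedy in state s.
    if q[s][a] == max(q[s]):
        rho += beta * (r - rho + best_next - q[s][a])
    return rho


def simulate(slots=20000, deadline=2, p_arrival=0.5, p_aloha=0.5,
             p_success=0.9, epsilon=0.1, alpha=0.1, beta=0.01, seed=0):
    """Toy model (assumed, not from the paper): Device 1 runs ALOHA,
    Device 2 learns with R-learning. Returns (timely throughput, rho)."""
    rng = random.Random(seed)

    def arrive(lead):
        # A new packet with a full deadline arrives only into an empty buffer.
        if lead == 0 and rng.random() < p_arrival:
            return deadline
        return lead

    # State = Device 2's remaining slots before expiry (0 means no packet).
    # Actions: 0 = wait, 1 = transmit.
    q = [[0.0, 0.0] for _ in range(deadline + 1)]
    rho = 0.0
    delivered = 0.0
    lead1, lead2 = arrive(0), arrive(0)

    for _ in range(slots):
        s = lead2
        # Epsilon-greedy action selection for Device 2.
        if rng.random() < epsilon:
            a = rng.randrange(2)
        else:
            a = 1 if q[s][1] > q[s][0] else 0

        tx1 = lead1 > 0 and rng.random() < p_aloha  # ALOHA device
        tx2 = a == 1 and lead2 > 0                  # learning device

        r = 0.0
        # Success needs exactly one transmitter and a good channel realization.
        if tx1 != tx2 and rng.random() < p_success:
            r = 1.0
            if tx1:
                lead1 = 0
            else:
                lead2 = 0

        # Age undelivered packets; a lead time reaching 0 means expiry.
        lead1 = max(lead1 - 1, 0)
        lead2 = max(lead2 - 1, 0)
        lead1, lead2 = arrive(lead1), arrive(lead2)

        rho = r_learning_update(q, rho, s, a, r, lead2, alpha, beta)
        delivered += r

    return delivered / slots, rho


if __name__ == "__main__":
    throughput, rho_hat = simulate()
    print(f"timely throughput: {throughput:.3f}, learned rho: {rho_hat:.3f}")
```

The table has only `deadline + 1` states and two actions, which illustrates why a small state space keeps per-slot computation and memory low; it says nothing about how TSRA itself compares with DLMA.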