The integration of satellite and terrestrial networks is crucial for enhancing the performance of next-generation communication systems. However, such networks are hindered by long-distance path loss and security risks in dense urban environments. In this work, we propose a satellite-terrestrial covert communication system assisted by an aerial active simultaneous transmitting and reflecting reconfigurable intelligent surface (AASTAR-RIS) to improve channel capacity while ensuring transmission covertness. Specifically, we first derive the minimum detection error probability (DEP) under the worst-case condition in which the Warden has perfect channel state information (CSI). Then, we formulate an AASTAR-RIS-assisted satellite-terrestrial covert communication optimization problem (ASCCOP) that maximizes the sum of the fair channel capacity over all ground users subject to a strict covertness constraint, by jointly optimizing the trajectory and active beamforming of the AASTAR-RIS. Because the state-action space is complex and high-dimensional, and dynamic environments demand efficient exploration, we propose a generative deterministic policy gradient (GDPG) algorithm, a generative deep reinforcement learning (DRL) method, to solve the ASCCOP. Concretely, a generative diffusion model (GDM) serves as the policy representation and enhances exploration by generating diverse, high-quality samples through a series of denoising steps. Moreover, we incorporate an action gradient mechanism for policy improvement, which refines state-action pairs through gradient ascent. Simulation results demonstrate that the proposed approach significantly outperforms the benchmark schemes.
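The action gradient mechanism mentioned above can be illustrated with a minimal sketch: an action is nudged along the gradient of a critic Q(s, a) with respect to the action, so that the refined state-action pair scores higher. Everything concrete here is an assumption made for illustration and is not taken from the paper: the quadratic `toy_q` critic stands in for a learned Q-network, and the step size, number of steps, and action bounds are arbitrary.

```python
import numpy as np

def toy_q(state, action):
    """Hypothetical critic that peaks when action == state (stand-in for a learned Q-network)."""
    return -np.sum((action - state) ** 2)

def toy_q_grad_action(state, action):
    """Analytic gradient of toy_q with respect to the action."""
    return -2.0 * (action - state)

def refine_action(state, action, step_size=0.1, num_steps=20, low=-1.0, high=1.0):
    """Gradient ascent on the critic in action space, clipped to the action bounds."""
    refined = np.array(action, dtype=float)
    for _ in range(num_steps):
        # Step uphill on Q(s, a) with respect to a, holding the state fixed.
        refined = refined + step_size * toy_q_grad_action(state, refined)
        # Keep the action inside the (assumed) feasible box.
        refined = np.clip(refined, low, high)
    return refined

state = np.array([0.5, -0.2])
action = np.array([-0.8, 0.9])   # e.g. a sample drawn from the policy
refined = refine_action(state, action)
```

In a full DRL pipeline the gradient would typically come from automatic differentiation through a trained critic rather than from a closed-form expression, and the refined pairs would then be used as targets for updating the policy.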