Efficient transmission of 3D point cloud data is critical for advanced perception in centralized and decentralized multi-agent robotic systems, particularly given the growing reliance on edge and cloud-based processing. However, point clouds are large and structurally complex, which makes them difficult to transmit under bandwidth constraints and intermittent connectivity and often degrades system performance. We propose a deep compression framework based on semantic scene graphs. The method decomposes point clouds into semantically coherent patches and encodes them into compact latent representations using semantic-aware encoders conditioned via Feature-wise Linear Modulation (FiLM). A folding-based decoder, guided by latent features and graph node attributes, enables structurally accurate reconstruction. Experiments on the SemanticKITTI and nuScenes datasets show that the framework achieves state-of-the-art compression rates, reducing data size by up to 98% while preserving both structural and semantic fidelity. In addition, it supports downstream applications such as multi-robot pose graph optimization and map merging, achieving trajectory accuracy and map alignment comparable to those obtained with raw LiDAR scans.
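To make the FiLM conditioning mentioned above concrete, the following is a minimal sketch of the general technique, not the framework's actual encoder. FiLM scales and shifts each feature channel with parameters predicted from a conditioning input, i.e. `gamma(c) * x + beta(c)`. Here the conditioning input is assumed to be a one-hot semantic class vector, and the names `linear`, `film`, `w_gamma`, `b_gamma`, `w_beta`, and `b_beta` are hypothetical, with hand-picked toy weights standing in for learned ones. Plain Python lists are used only to keep the example self-contained.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Affine map: out[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def film(
    features: Matrix,
    condition: Vector,
    w_gamma: Matrix,
    b_gamma: Vector,
    w_beta: Matrix,
    b_beta: Vector,
) -> Matrix:
    """Apply FiLM: per-channel scale (gamma) and shift (beta) from `condition`.

    features:  one row per point, one column per feature channel.
    condition: conditioning vector (assumed here: one-hot semantic class).
    """
    gamma = linear(condition, w_gamma, b_gamma)  # one scale per channel
    beta = linear(condition, w_beta, b_beta)     # one shift per channel
    # The same gamma/beta pair is shared by every point in the patch.
    return [
        [g * f + b for f, g, b in zip(row, gamma, beta)]
        for row in features
    ]


# Toy patch: 2 points, 2 feature channels, 2 semantic classes (all hypothetical).
features = [[1.0, 2.0], [3.0, 4.0]]
condition = [0.0, 1.0]                  # one-hot: class index 1
w_gamma = [[0.5, 0.5], [2.0, 0.0]]
b_gamma = [0.0, 1.0]
w_beta = [[0.0, 0.0], [1.0, -1.0]]
b_beta = [0.0, 0.0]

modulated = film(features, condition, w_gamma, b_gamma, w_beta, b_beta)
print(modulated)  # [[3.0, 1.0], [7.0, 3.0]]
```

With class index 1 selected, the predicted parameters are `gamma = [2.0, 1.0]` and `beta = [1.0, -1.0]`, so the same features would be modulated differently for a patch of another semantic class. In a learned model the affine maps would be trained jointly with the encoder.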