Modeling dynamic 3D environments from LiDAR sequences is central to building reliable 4D worlds for autonomous driving and embodied AI. Existing generative frameworks, however, often treat all spatial regions uniformly, overlooking the varying uncertainty across real-world scenes. This uniform generation leads to artifacts in complex or ambiguous regions, limiting realism and temporal stability. In this work, we present U4D, an uncertainty-aware framework for 4D LiDAR world modeling. Our approach first estimates spatial uncertainty maps from a pretrained segmentation model to localize semantically challenging regions. It then performs generation in a "hard-to-easy" manner through two sequential stages: (1) uncertainty-region modeling, which reconstructs high-entropy regions with fine geometric fidelity, and (2) uncertainty-conditioned completion, which synthesizes the remaining areas under learned structural priors. To further ensure temporal coherence, U4D incorporates a mixture of spatio-temporal (MoST) block that adaptively fuses spatial and temporal representations during diffusion. Extensive experiments show that U4D produces geometrically faithful and temporally consistent LiDAR sequences, advancing the reliability of 4D world modeling for autonomous perception and simulation.
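The uncertainty-estimation step can be illustrated with a minimal sketch. The paper does not specify the exact uncertainty measure, so this example assumes per-point predictive entropy over the segmentation model's class logits; the function names (`uncertainty_map`, `split_regions`) and the threshold `tau` are hypothetical, introduced only to show how high-entropy regions could be localized and handed to the hard-to-easy generation stages.

```python
import numpy as np

def uncertainty_map(logits):
    """Per-point predictive entropy from segmentation logits.

    logits: (N, C) array of class scores for N LiDAR points.
    Returns an (N,) map in [0, 1]; high values mark ambiguous regions.
    """
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    # Shannon entropy, normalized by log(C) so the maximum is 1.
    eps = 1e-12
    h = -(p * np.log(p + eps)).sum(axis=1)
    return h / np.log(logits.shape[1])

def split_regions(points, entropy, tau=0.5):
    """Partition points into 'hard' (high-entropy) and 'easy' sets,
    mirroring the hard-to-easy generation order (hypothetical split)."""
    hard = entropy >= tau
    return points[hard], points[~hard]
```

In this reading, the "hard" subset would feed the uncertainty-region modeling stage, and the "easy" remainder the uncertainty-conditioned completion stage.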