Event cameras offer higher temporal resolution than traditional frame-based cameras, which makes them well suited to motion and structure estimation. However, it has remained unclear how event-based 3D Gaussian Splatting (3DGS) approaches can leverage the fine-grained temporal information of sparse events. This work proposes a framework that addresses the trade-off between accuracy and temporal resolution in event-based 3DGS. Our key idea is to decouple rendering into two branches, event-by-event geometry (depth) rendering and snapshot-based radiance (intensity) rendering, using ray tracing and the image of warped events. Extensive evaluation shows that our method achieves state-of-the-art performance on real-world datasets and competitive performance on the synthetic dataset. Moreover, the proposed method works without prior information (e.g., pretrained image-reconstruction models) or COLMAP-based initialization, is more flexible in the number of selected events, and achieves sharp reconstruction of scene edges with fast training. We hope this work deepens the understanding of the sparse nature of events for 3D reconstruction. The code will be released.
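To make the two-branch decoupling concrete, the following minimal Python sketch illustrates the general idea under simplifying assumptions: branch 1 traces one ray per event through a set of 3D Gaussians and returns a response-weighted depth, and branch 2 accumulates events warped to a common reference time into an image of warped events (IWE). All names here (Gaussian, trace_depth, build_iwe) are hypothetical, and the weighted-mean depth stands in for full alpha compositing; this is an illustration, not the authors' implementation.

```python
import numpy as np

class Gaussian:
    """An isotropic 3D Gaussian: center mu, scale sigma, opacity alpha (hypothetical)."""
    def __init__(self, mu, sigma, alpha):
        self.mu = np.asarray(mu, dtype=float)
        self.sigma = float(sigma)
        self.alpha = float(alpha)

def trace_depth(origin, direction, gaussians):
    """Branch 1 (sketch): expected depth along a single event's ray.

    Each Gaussian's response is evaluated at the ray's closest approach to
    its center; depths are blended by a normalized response-weighted mean
    (a simplification of alpha compositing).
    """
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    weights, depths = [], []
    for g in gaussians:
        t = np.dot(g.mu - o, d)                # depth of closest approach
        if t <= 0:
            continue                           # Gaussian lies behind the ray
        r2 = np.sum((o + t * d - g.mu) ** 2)   # squared distance to center
        w = g.alpha * np.exp(-0.5 * r2 / g.sigma ** 2)
        weights.append(w)
        depths.append(t)
    if not weights:
        return None
    w = np.asarray(weights)
    return float(np.dot(w, depths) / w.sum())

def build_iwe(events, flow, t_ref, shape):
    """Branch 2 (sketch): image of warped events.

    events: (N, 4) rows of (x, y, t, polarity); flow: (N, 2) in px/s.
    Each event is warped along its flow to the reference time t_ref and
    accumulated with its polarity into a snapshot image.
    """
    iwe = np.zeros(shape)
    for (x, y, t, p), (fx, fy) in zip(events, flow):
        xw = int(round(x + fx * (t_ref - t)))  # warp along the flow
        yw = int(round(y + fy * (t_ref - t)))
        if 0 <= yw < shape[0] and 0 <= xw < shape[1]:
            iwe[yw, xw] += p                   # signed accumulation
    return iwe
```

The point of the split is that trace_depth consumes each event at its own timestamp (preserving temporal resolution for geometry), while build_iwe aggregates many events into one snapshot (recovering a dense signal for radiance supervision).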