We introduce GS-Light, an efficient, textual position-aware pipeline for text-guided relighting of 3D scenes represented via Gaussian Splatting (3DGS). GS-Light extends a single-input diffusion model to handle multi-view inputs without additional training. Given a user prompt that may specify lighting direction, color, intensity, or reference objects, we employ a large vision-language model (LVLM) to parse the prompt into lighting priors. Using off-the-shelf estimators for geometry and semantics (depth, surface normals, and semantic segmentation), we fuse these lighting priors with view-geometry constraints to compute illumination maps and generate initial latent codes for each view. These carefully derived initial latents guide the diffusion model toward relighting outputs that more accurately reflect user expectations, especially with respect to lighting direction. Feeding the multi-view rendered images, together with the initial latents, into our multi-view relighting model yields high-fidelity, artistically relit images. Finally, we fine-tune the 3DGS scene on the relit appearance to obtain a fully relit 3D scene. We evaluate GS-Light on both indoor and outdoor scenes against state-of-the-art baselines, including per-view relighting, video relighting, and scene-editing methods. Under quantitative metrics (multi-view consistency, imaging quality, aesthetic score, and semantic similarity) and qualitative assessment (user studies), GS-Light demonstrates consistent improvements over the baselines. Code and assets will be made available upon publication.
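The prompt-to-illumination step described above can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: a toy keyword parser substitutes for the LVLM, and a simple Lambertian dot product over estimated surface normals substitutes for the fused illumination map; the names `parse_lighting_prompt` and `illumination_map` are hypothetical.

```python
import numpy as np

def parse_lighting_prompt(prompt: str) -> dict:
    """Toy stand-in for the LVLM: map direction keywords to a unit light vector."""
    directions = {
        "left":  np.array([-1.0, 0.0, 0.0]),
        "right": np.array([ 1.0, 0.0, 0.0]),
        "above": np.array([ 0.0, 1.0, 0.0]),
    }
    light_dir = next((v for k, v in directions.items() if k in prompt.lower()),
                     np.array([0.0, 0.0, 1.0]))  # default: frontal light
    return {"direction": light_dir / np.linalg.norm(light_dir)}

def illumination_map(normals: np.ndarray, prior: dict) -> np.ndarray:
    """Per-pixel Lambertian shading from estimated normals and the light prior.

    normals: (H, W, 3) unit surface normals from an off-the-shelf estimator.
    Returns an (H, W) map in [0, 1]; in the full pipeline this would seed
    the per-view initial latent codes.
    """
    shade = normals @ prior["direction"]  # (H, W) dot products
    return np.clip(shade, 0.0, 1.0)      # keep only the lit hemisphere

# Usage: a 2x2 "image" whose normals all face +x is fully lit from the right.
normals = np.tile(np.array([1.0, 0.0, 0.0]), (2, 2, 1))
prior = parse_lighting_prompt("warm light from the right")
print(illumination_map(normals, prior))  # → all ones
```

In the actual method, the illumination map is further conditioned on depth and semantic segmentation before being encoded into the diffusion model's latent space; the sketch only shows the geometric core of the directional prior.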