Generalization remains the central challenge for interactive 3D scene generation. Existing learning-based approaches ground spatial understanding in limited scene datasets, restricting generalization to new layouts. We instead reprogram a pre-trained 3D instance generator to act as a scene-level learner, replacing dataset-bounded supervision with model-centric spatial supervision. This reprogramming unlocks the generator's transferable spatial knowledge, enabling generalization to unseen layouts and novel object compositions. Remarkably, spatial reasoning still emerges even when the training scenes are composed of randomly placed objects. This demonstrates that the generator's transferable scene prior provides a rich learning signal for inferring proximity, support, and symmetry from purely geometric cues. Replacing the widely used canonical space, we instantiate this insight with a view-centric formulation of the scene space, yielding a fully feed-forward, generalizable scene generator that learns spatial relations directly from the instance model. Quantitative and qualitative results show that a 3D instance generator is an implicit spatial learner and reasoner, pointing toward foundation models for interactive 3D scene understanding and generation. Project page: https://luling06.github.io/I-Scene-project/