Generating or editing images directly from neural signals has immense potential at the intersection of neuroscience, computer vision, and brain-computer interaction. In this paper, we present Uni-Neur2Img, a unified framework for neural-signal-driven image generation and editing. The framework introduces a parameter-efficient, LoRA-based neural signal injection module that processes each conditioning signal independently as a pluggable component, enabling flexible multi-modal conditioning without altering the base model's parameters. In addition, we employ a causal attention mechanism to accommodate the long-sequence modeling demands of conditional generation. Existing neural-driven generation research predominantly uses textual modalities as conditions or intermediate representations, leaving visual modalities underexplored as direct conditioning signals. To bridge this gap, we introduce EEG-Style, a self-collected dataset for EEG-driven style transfer. We conduct comprehensive evaluations across public benchmarks and our own neural signal datasets: (1) EEG-driven image generation on the public CVPR40 dataset; (2) neural-signal-guided image editing on the public Loongx dataset for semantic-aware local modifications; and (3) EEG-driven style transfer on our EEG-Style dataset. Extensive experiments demonstrate significant improvements in generation fidelity, editing consistency, and style-transfer quality, while maintaining low computational overhead and strong scalability to additional modalities. Uni-Neur2Img thus offers a unified, efficient, and extensible solution for bridging neural signals and visual content generation.
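To make the injection mechanism concrete, the following is a minimal PyTorch sketch of how a pluggable, LoRA-based conditioning adapter of this kind could look. The class name `LoRAInjection`, the pooled EEG-embedding interface, and the additive injection point are illustrative assumptions, not the paper's exact design; the sketch only shows the general pattern of freezing the base projection and training a low-rank update plus a small conditioning projection.

```python
import torch
import torch.nn as nn

class LoRAInjection(nn.Module):
    """Hypothetical sketch of a pluggable LoRA-based conditioning adapter.

    A frozen base projection is augmented with a low-rank update driven by
    an encoded neural signal; only the low-rank matrices and the conditioning
    projection are trained, so the base model parameters stay untouched.
    """

    def __init__(self, dim: int, cond_dim: int, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = nn.Linear(dim, dim)                # frozen base-model projection
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)
        self.down = nn.Linear(dim, rank, bias=False)   # LoRA "A": dim -> rank
        self.up = nn.Linear(rank, dim, bias=False)     # LoRA "B": rank -> dim
        nn.init.zeros_(self.up.weight)                 # start as a no-op update
        self.cond_proj = nn.Linear(cond_dim, dim)      # maps EEG features into the stream
        self.scale = scale

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x:    (batch, seq, dim)   hidden states of the generative backbone
        # cond: (batch, cond_dim)   pooled neural-signal embedding
        h = x + self.cond_proj(cond).unsqueeze(1)      # inject the conditioning signal
        return self.base(x) + self.scale * self.up(self.down(h))

# Usage with assumed dimensions: a 77-token hidden sequence and a 128-dim EEG embedding.
module = LoRAInjection(dim=768, cond_dim=128)
out = module(torch.randn(2, 77, 768), torch.randn(2, 128))  # -> (2, 77, 768)
```

Because the low-rank branch starts at zero and the base weights are frozen, each such adapter can be attached or detached per conditioning modality, which is the "pluggable component" property the abstract describes.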
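The causal attention the abstract mentions can likewise be illustrated with a standard lower-triangular mask. This is a minimal sketch of one common realization and makes no claim about the paper's specific attention variant or how it is wired into the conditioning stream.

```python
import torch

def causal_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Minimal causal self-attention over long sequences.

    q, k, v: (batch, heads, seq, head_dim)
    Position i may attend only to positions <= i, enforced by a
    lower-triangular boolean mask over the attention scores.
    """
    seq = q.size(-2)
    mask = torch.tril(torch.ones(seq, seq, dtype=torch.bool, device=q.device))
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5   # scaled dot-product scores
    scores = scores.masked_fill(~mask, float("-inf"))      # block attention to the future
    return torch.softmax(scores, dim=-1) @ v
```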