State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks. However, current approaches are limited to small closed vocabularies, which are far from sufficient for natural communication. Moreover, most high-performing approaches require data from invasive devices (e.g., ECoG). In this paper, we extend the problem to open vocabulary Electroencephalography (EEG)-To-Text Sequence-To-Sequence decoding and zero-shot sentence sentiment classification on natural reading tasks. We hypothesize that the human brain functions as a special text encoder and propose a novel framework leveraging pre-trained language models (e.g., BART). Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, significantly outperforming supervised baselines. Furthermore, we show that our proposed model can handle data from various subjects and sources, showing great potential for a high-performance open vocabulary brain-to-text system once sufficient data is available.
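To make the "brain as a special text encoder" idea concrete, the following is a minimal sketch (not the authors' released implementation) of one way to couple word-level EEG features to a pre-trained BART model: a learned linear projection maps EEG feature vectors into BART's embedding space, and the sequence is fed to the seq2seq model via `inputs_embeds`. The feature dimension (840, e.g., concatenated frequency-band features per word) and all module names are illustrative assumptions.

```python
import torch.nn as nn
from transformers import BartForConditionalGeneration

# Assumed dimensions: 840-dim word-level EEG features projected into
# BART-large's 1024-dim embedding space. Both values are illustrative.
EEG_FEATURE_DIM = 840
BART_HIDDEN_DIM = 1024

class EEGToTextSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Learned projection from EEG feature space into BART's input
        # embedding space, so EEG sequences can stand in for token embeddings.
        self.projection = nn.Linear(EEG_FEATURE_DIM, BART_HIDDEN_DIM)
        self.bart = BartForConditionalGeneration.from_pretrained(
            "facebook/bart-large"
        )

    def forward(self, eeg_features, eeg_mask, target_ids):
        # eeg_features: (batch, seq_len, EEG_FEATURE_DIM) word-level EEG vectors
        # eeg_mask:     (batch, seq_len) attention mask over valid positions
        # target_ids:   (batch, tgt_len) gold sentence token ids
        inputs_embeds = self.projection(eeg_features)
        return self.bart(
            inputs_embeds=inputs_embeds,
            attention_mask=eeg_mask,
            labels=target_ids,  # computes cross-entropy loss for training
        )
```

At inference time, decoding would proceed with the usual `generate` call on the underlying BART model, again passing the projected EEG sequence through `inputs_embeds`, which is how such a framework can remain open vocabulary: the output space is BART's full subword vocabulary rather than a closed label set.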