Advances in natural language processing have improved performance on many tasks. One possible cause of these gains is the introduction of increasingly sophisticated text representations. While many recent word embedding techniques can be shown to capture particular notions of sentiment or associative structure, we explore the ability of two different word embeddings to uncover or capture the notion of logical shape in text. To this end, we present a novel framework that we call Topological Word Embeddings, which leverages mathematical techniques from dynamical systems analysis and data-driven shape extraction (i.e., topological data analysis). In this preliminary work, we show that a topological delay embedding lets us capture and extract a different, shape-based notion of logic, aimed at answering the question "Can we find a circle in a circular argument?"
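To make the idea of a delay embedding concrete, the following is a minimal sketch of a generic time-delay (Takens-style) embedding in Python with NumPy. It is not the authors' implementation: the abstract does not say how text is turned into a numeric series, so a sine wave stands in for that signal, and the function name `delay_embed` and the parameters `dim` and `tau` are hypothetical choices made for illustration.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Return the time-delay embedding of a 1-D series as an (n, dim) array.

    Hypothetical helper for illustration; `dim` is the embedding dimension
    and `tau` is the delay measured in samples.
    """
    x = np.asarray(series, dtype=float)
    n = x.size - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for the requested dim and tau")
    # Row i is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    return np.stack([x[k * tau : k * tau + n] for k in range(dim)], axis=1)

# Assumption: a periodic toy signal stands in for a scalar summary of a text.
# 200 samples over two periods gives 100 samples per period.
t = np.linspace(0.0, 4.0 * np.pi, 200, endpoint=False)
signal = np.sin(t)

# A delay of 25 samples is a quarter period, so each embedded point is
# (sin t, cos t) and the point cloud lies on a circle.
cloud = delay_embed(signal, dim=2, tau=25)
radii = np.linalg.norm(cloud, axis=1)
```

In this toy case the periodic signal produces a loop in the embedded point cloud, which is the kind of shape that topological data analysis tools such as persistent homology are designed to detect.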