Distributed word embeddings such as Word2Vec and GloVe have been widely adopted in industrial settings. Major technical applications of GloVe include recommender systems and natural language processing. The theory behind GloVe relies on the choice of a weighting function in its weighted least squares formulation: the ratio of a word occurrence count to the maximum word count in the corpus, raised to a power. However, the original formulation of GloVe is not theoretically sound in two respects: both the choice of the weighting function and the choice of its power exponent are ad hoc. In this paper, we use the theory of extreme value analysis to propose a theoretically accurate version of GloVe, obtained by reformulating the weighted least squares loss as an expected loss and choosing the power exponent accurately. We demonstrate the competitiveness of our algorithm and show that the original formulation of GloVe with its suggested optimal parameter can be viewed as a special case of our paradigm.
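For readers unfamiliar with the weighting function the abstract refers to, the following is a minimal sketch of the original GloVe weighting and a single term of its weighted least squares loss. It illustrates only the baseline being criticized, not the reformulation proposed here. The function names are hypothetical, and the defaults `x_max=100` and `alpha=0.75` are the values suggested in the original GloVe paper, not values taken from this abstract.

```python
import math


def glove_weight(count, x_max=100.0, alpha=0.75):
    """Original GloVe weighting: (count / x_max) ** alpha, capped at 1.

    x_max and alpha are assumed defaults from the original GloVe paper.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count >= x_max:
        return 1.0
    return (count / x_max) ** alpha


def weighted_squared_error(w_i, w_j, b_i, b_j, count, x_max=100.0, alpha=0.75):
    """One term of the weighted least squares loss for a single word pair.

    w_i, w_j are embedding vectors (sequences of floats), b_i, b_j are
    scalar biases, and count is the positive co-occurrence count of the pair.
    """
    prediction = sum(a * b for a, b in zip(w_i, w_j)) + b_i + b_j
    residual = prediction - math.log(count)
    return glove_weight(count, x_max, alpha) * residual ** 2
```

Both the shape of `glove_weight` and the exponent `alpha` are free choices in this baseline, which is the ad hoc aspect the abstract sets out to replace.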