We test the hypothesis that how much information one obtains on a given topic through Wikipedia depends on the language in which it is consulted. Controlling for the size factor, we investigate this hypothesis for 25 subject areas. Since Wikipedia is a central part of the web-based information landscape, such a dependence would indicate a language-related bias. The article therefore addresses the question of whether Wikipedia exhibits this kind of linguistic relativity. From the perspective of educational science, the article develops a computational model of the information landscape from which multiple texts are drawn as typical input of web-based reading. For this purpose, it develops a hybrid model of intra- and intertextual similarity of different parts of the information landscape and tests this model on 35 languages and their corresponding Wikipedias. In this way, the article builds a bridge between reading research, educational science, Wikipedia research, and computational linguistics.