Average Is Not Enough: Caveats of Multilingual Evaluation - 专知 Papers

January 3, 2023

Average Is Not Enough: Caveats of Multilingual Evaluation

Matúš Pikuliak, Marián Šimko

From arXiv; The 2022 Workshop on Multilingual Representation Learning

This position paper discusses the problem of multilingual evaluation. Using simple statistics, such as average language performance, might inject linguistic biases in favor of dominant language families into evaluation methodology. We argue that a qualitative analysis informed by comparative linguistics is needed for multilingual results to detect this kind of bias. We show in our case study that results in published works can indeed be linguistically biased, and we demonstrate that visualization based on the URIEL typological database can detect it.
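The averaging concern in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method (the paper relies on qualitative analysis and URIEL-based visualization); it only contrasts a plain per-language average with an average taken over language families. All scores, language codes, family labels, and function names below are hypothetical and were invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-language scores, invented for illustration only.
SCORES = {
    "en": 0.90, "de": 0.88, "es": 0.89, "fr": 0.87,  # one well-represented family
    "fi": 0.70,
    "tr": 0.66,
}

# Hypothetical family assignment used to group the scores above.
FAMILY = {
    "en": "Indo-European", "de": "Indo-European",
    "es": "Indo-European", "fr": "Indo-European",
    "fi": "Uralic",
    "tr": "Turkic",
}


def language_average(scores):
    """Plain mean over languages: every language counts once."""
    return mean(scores.values())


def family_averages(scores, family):
    """Mean score within each language family."""
    by_family = defaultdict(list)
    for lang, score in scores.items():
        by_family[family[lang]].append(score)
    return {fam: mean(vals) for fam, vals in by_family.items()}


def family_balanced_average(scores, family):
    """Mean of the per-family means: every family counts once."""
    return mean(family_averages(scores, family).values())


if __name__ == "__main__":
    print("per-language average:", round(language_average(SCORES), 3))
    print("per-family averages:", family_averages(SCORES, FAMILY))
    print("family-balanced average:",
          round(family_balanced_average(SCORES, FAMILY), 3))
```

With these made-up numbers the per-language average is pulled upward by the family that contributes four of the six languages, while the family-balanced figure is lower. Grouping by family is only one coarse proxy for linguistic similarity; it does not replace the typology-informed analysis the paper argues for.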
