InDEX: Indonesian Idiom and Expression Dataset for Cloze Test


November 24, 2022


Xinying Qiu, Guofeng Shi

From arXiv. Accepted to the 2022 International Conference on Asian Language Processing (IALP).

We propose InDEX, an Indonesian Idiom and Expression dataset for cloze tests. The dataset contains 10,438 unique sentences covering 289 idioms and expressions, for which we generate 15 different types of distractors, resulting in a large cloze-style corpus. Many baseline models for cloze-test reading comprehension apply BERT with random initialization to learn embedding representations. Idioms and fixed expressions differ, however, in that the literal meaning of a phrase may or may not be consistent with its contextual meaning. We therefore explore different ways to combine static and contextual representations to build a stronger baseline model. Experiments show that combining definitions with random initialization better supports cloze-test model performance for idioms, whether they are evaluated independently or mixed with fixed expressions. For fixed expressions with no special meaning, static embeddings with random initialization are sufficient for the cloze-test model.
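As a rough illustration of what a cloze-style item with distractors might look like, the sketch below blanks out a target expression in a sentence and shuffles it in with distractor candidates. It is not the authors' pipeline: the `ClozeItem` class, the `make_cloze_item` function, the example sentence, and the distractor choices are all hypothetical and are not taken from the InDEX dataset, whose format and 15 distractor types are not described in the abstract.

```python
import random
from dataclasses import dataclass

BLANK = "____"


@dataclass
class ClozeItem:
    text: str          # sentence with the target expression blanked out
    options: list      # shuffled candidates: the answer plus distractors
    answer_index: int  # position of the correct expression in `options`


def make_cloze_item(sentence, target, distractors, seed=0):
    """Blank out `target` in `sentence` and mix it with the distractors.

    Hypothetical helper for illustration; the real InDEX construction
    procedure may differ.
    """
    if target not in sentence:
        raise ValueError("target expression does not occur in the sentence")
    text = sentence.replace(target, BLANK, 1)  # blank the first occurrence only
    options = [target] + list(distractors)
    random.Random(seed).shuffle(options)       # seeded for reproducibility
    return ClozeItem(text, options, options.index(target))


# Illustrative example: "buah tangan" literally reads "fruit hand"
# but is used idiomatically to mean a souvenir or gift.
item = make_cloze_item(
    "Ayah membawa buah tangan dari Bali.",
    "buah tangan",
    ["kambing hitam", "kepala batu", "angkat tangan"],
)
print(item.text)
print(item.options, item.answer_index)
```

A model for this task would score each entry in `options` against the blanked `text` and be judged on whether it selects `answer_index`.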

