培训前诱导结构 (Structure Inducing Pre-Training) - 专知论文

会员服务 ·

0

可理解性 · Analysis · 归纳偏好 · 语言模型化 · INFORMS ·

2022 年 8 月 4 日

Structure Inducing Pre-Training

翻译：培训前诱导结构

Matthew B. A. McDermott,Brendan Yap,Peter Szolovits,Marinka Zitnik

Language model pre-training and derived methods are incredibly impactful in machine learning. However, there remains considerable uncertainty on exactly why pre-training helps improve performance for fine-tuning tasks. This is especially true when attempting to adapt language-model pre-training to domains outside of natural language. Here, we analyze this problem by exploring how existing pre-training methods impose relational structure in their induced per-sample latent spaces -- i.e., what constraints do pre-training methods impose on the distance or geometry between the pre-trained embeddings of two samples $\vec x_i$ and $\vec x_j$. Through a comprehensive review of existing pre-training methods, we find that this question remains open. This is true despite theoretical analyses demonstrating the importance of understanding this form of induced structure. Based on this review, we introduce a descriptive framework for pre-training that allows for a granular, comprehensive understanding of how relational structure can be induced. We present a theoretical analysis of this framework from first principles and establish a connection between the relational inductive bias of pre-training and fine-tuning performance. We also show how to use the framework to define new pre-training methods. We build upon these findings with empirical studies on benchmarks spanning 3 data modalities and ten fine-tuning tasks. These experiments validate our theoretical analyses, inform the design of novel pre-training methods, and establish consistent improvements over a compelling suite of baseline methods.

翻译：语言模式培训前和派生方法对机器学习的影响极大。然而,对于为什么培训前有助于改进改进微调任务的业绩,还存在着相当大的不确定性。在试图将语言模式培训前培训调整到自然语言以外的领域时,这一点尤其正确。在这里,我们通过探讨现有的培训前方法和派生的人均潜在空间如何将关系结构强加给现有的培训前方法和派生方法 -- -- 即培训前方法对预先培训的两种样本的嵌入之间的距离或几何限制是什么。我们从最初的原则出发对这一框架进行理论分析,并通过对现有培训前方法进行全面审查,发现这一问题仍然未解决。尽管理论分析表明理解这种诱导结构形式的重要性,这是确实存在的。根据这次审查,我们为培训前的每个潜在空间引入了一个描述性框架,以便能够对如何引导出关系结构有一个简单和全面的理解。我们从最初的原则出发对这一框架进行理论分析,并在培训前和调整前业绩的关系偏差之间建立起联系。我们还发现,我们如何利用这些实验性基准来制定新的标准。我们还展示了如何利用这些实验性框架来界定新的标准。

0

相关内容

可理解性

多模态认知计算

多模态认知计算

专知会员服务

180+阅读 · 2022年9月16日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

胆固醇转运子ABCA1调控CD4+T细胞免疫应答抑制动脉粥样硬化的新机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

瞬态光场在亚周期尺度的操控研究

国家自然科学基金

0+阅读 · 2012年12月31日

胰安肽（Aglycin）治疗2型糖尿病的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Brugada综合征的临床电生理及分子遗传机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

新癌基因E3连接酶HECTD3表达调节机制的研究

国家自然科学基金

1+阅读 · 2012年12月31日

新型抗生素Bagremycins生物合成基因簇的鉴定与解析

国家自然科学基金

0+阅读 · 2012年12月31日

缺氧诱导NHE1参与calpain介导ABCA1降解及胆固醇逆转运障碍

国家自然科学基金

0+阅读 · 2012年12月31日

Dicer和Drosha基因遗传变异与膀胱癌易感性及其机制的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Soft Robots Modeling: a Structured Overview

Arxiv

1+阅读 · 2022年10月4日

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Arxiv

0+阅读 · 2022年10月4日

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks

Arxiv

0+阅读 · 2022年10月2日

Causal Inference in Recommender Systems: A Survey and Future Directions

Arxiv

16+阅读 · 2022年8月26日

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Arxiv

11+阅读 · 2020年10月20日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

A Survey on Causal Inference

Arxiv

112+阅读 · 2020年2月5日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

From Knowledge Graph Embedding to Ontology Embedding: Region Based Representations of Relational Structures

Arxiv

10+阅读 · 2018年5月26日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

多模态认知计算

多模态认知计算

专知会员服务

180+阅读 · 2022年9月16日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型中的检索与结构化增强生成综述

《实现多层防御多轮交战机制的扩展型随机齐射模型》2025年最新83页

【CMU博士论文】交互驱动的人体动作估计与生成

如何避免生成式人工智能在作战中失控失效

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

相关论文

Soft Robots Modeling: a Structured Overview

Arxiv

1+阅读 · 2022年10月4日

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Arxiv

0+阅读 · 2022年10月4日

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks

Arxiv

0+阅读 · 2022年10月2日

Causal Inference in Recommender Systems: A Survey and Future Directions

Arxiv

16+阅读 · 2022年8月26日

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Arxiv

11+阅读 · 2020年10月20日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

A Survey on Causal Inference

Arxiv

112+阅读 · 2020年2月5日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

From Knowledge Graph Embedding to Ontology Embedding: Region Based Representations of Relational Structures

Arxiv

10+阅读 · 2018年5月26日

相关基金

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

胆固醇转运子ABCA1调控CD4+T细胞免疫应答抑制动脉粥样硬化的新机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

瞬态光场在亚周期尺度的操控研究

国家自然科学基金

0+阅读 · 2012年12月31日

胰安肽（Aglycin）治疗2型糖尿病的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Brugada综合征的临床电生理及分子遗传机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

新癌基因E3连接酶HECTD3表达调节机制的研究

国家自然科学基金

1+阅读 · 2012年12月31日

新型抗生素Bagremycins生物合成基因簇的鉴定与解析

国家自然科学基金

0+阅读 · 2012年12月31日

缺氧诱导NHE1参与calpain介导ABCA1降解及胆固醇逆转运障碍

国家自然科学基金

0+阅读 · 2012年12月31日

Dicer和Drosha基因遗传变异与膀胱癌易感性及其机制的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员