Learnings from Data Integration for Augmented Language Models - 专知 (Zhuanzhi) Papers


Integration · Language models · Datasets · Integration research · Integration systems

April 10, 2023

Learnings from Data Integration for Augmented Language Models


Alon Halevy, Jane Dwivedi-Yu

One of the limitations of large language models is that they do not have access to up-to-date, proprietary or personal data. As a result, there are multiple efforts to extend language models with techniques for accessing external data. In that sense, LLMs share the vision of data integration systems whose goal is to provide seamless access to a large collection of heterogeneous data sources. While the details and the techniques of LLMs differ greatly from those of data integration, this paper shows that some of the lessons learned from research on data integration can elucidate the research path we are conducting today on language models.


