Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media

From arXiv: This book has been accepted by Springer Nature and will be published as an open access monograph. https://link.springer.com/book/9783031231896. It is licensed under the CC BY-NC-SA license (https://creativecommons.org/licenses/by-nc-sa/4.0/), except for the material included from other authors, which may have different licenses.

This open access book provides a comprehensive overview of the state of the art in research and applications of Foundation Models and is intended for readers familiar with basic Natural Language Processing (NLP) concepts. In recent years, a revolutionary new paradigm has been developed for training models for NLP. These models are first pre-trained on large collections of text documents to acquire general syntactic knowledge and semantic information. Then, they are fine-tuned for specific tasks, which they can often solve with superhuman accuracy. When the models are large enough, they can be instructed by prompts to solve new tasks without any fine-tuning. Moreover, they can be applied to a wide range of different media and problem domains, ranging from image and video processing to robot control learning. Because they provide a blueprint for solving many tasks in artificial intelligence, they have been called Foundation Models. After a brief introduction to basic NLP models, the main pre-trained language models (BERT, GPT, and the sequence-to-sequence transformer) are described, as well as the concepts of self-attention and context-sensitive embedding. Then, different approaches to improving these models are discussed, such as expanding the pre-training criteria, increasing the length of input texts, or including extra knowledge. An overview of the best-performing models for about twenty application areas is then presented, including question answering, translation, story generation, dialog systems, and generating images from text. For each application area, the strengths and weaknesses of current models are discussed, and an outlook on further developments is given. In addition, links are provided to freely available program code. A concluding chapter summarizes the economic opportunities, mitigation of risks, and potential developments of AI.

Related content

MoDELS

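The 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM-SIGSOFT and IEEE-TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. MODELS attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition gives the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website: http://www.modelsconference.org/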