ColBERT: Using BERT Sentence Embedding in Parallel Neural Networks for Computational Humor

Automating humor detection and rating has interesting use cases in modern technologies such as humanoid robots, chatbots, and virtual assistants. In this paper, we propose a novel approach for detecting and rating humor in short texts, based on a popular linguistic theory of humor. The proposed method begins by separating the sentences of the given text and using the BERT model to generate an embedding for each one. The embeddings are fed to separate lines of hidden layers in a neural network (one line per sentence) to extract latent features. Finally, the parallel lines are concatenated to determine the congruity and other relationships between the sentences and to predict the target value. We accompany the paper with a novel dataset for humor detection consisting of 200,000 formal short texts. In addition to evaluating our work on the novel dataset, we participated in a live machine learning competition focused on rating humor in Spanish tweets. The proposed model obtained F1 scores of 0.982 and 0.869 in the humor detection experiments, outperforming general and state-of-the-art models. The evaluation, performed in two contrasting settings, confirms the strength and robustness of the model and suggests two important factors in achieving high accuracy in the current task: 1) the use of sentence embeddings and 2) the use of the linguistic structure of humor in designing the proposed model.
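The pipeline described in the abstract (split the text into sentences, embed each sentence, pass each embedding through its own line of hidden layers, then concatenate the lines and predict) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The number of parallel lines, the layer sizes, the regex sentence splitter, and the random weights are all assumptions made for illustration, and `fake_embed` is a hypothetical stand-in that returns a pseudo-random vector where a real system would call BERT.

```python
import re
import numpy as np

EMBED_DIM = 768     # hidden size of BERT-base; assumed here
MAX_SENTENCES = 3   # hypothetical number of parallel lines
HIDDEN_DIM = 16     # hypothetical hidden width per line


def split_sentences(text, max_sentences=MAX_SENTENCES):
    """Naive splitter on ., ! and ?; truncates or pads to a fixed count."""
    parts = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    parts = parts[:max_sentences]
    return parts + [""] * (max_sentences - len(parts))


def fake_embed(sentence, dim=EMBED_DIM):
    """Hypothetical stand-in for a BERT sentence embedding (NOT real BERT)."""
    if not sentence:
        return np.zeros(dim)  # padding slot for a missing sentence
    seed = sum(ord(c) for c in sentence) % (2 ** 32)
    return np.random.default_rng(seed).standard_normal(dim)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ParallelLinesModel:
    """One dense hidden layer per sentence line, concatenated into one output unit."""

    def __init__(self, n_lines=MAX_SENTENCES, embed_dim=EMBED_DIM,
                 hidden_dim=HIDDEN_DIM, seed=0):
        rng = np.random.default_rng(seed)
        # Untrained random weights: each line has its own parameters.
        self.line_weights = [rng.standard_normal((embed_dim, hidden_dim)) * 0.01
                             for _ in range(n_lines)]
        self.line_biases = [np.zeros(hidden_dim) for _ in range(n_lines)]
        self.out_weights = rng.standard_normal(n_lines * hidden_dim) * 0.01
        self.out_bias = 0.0

    def forward(self, embeddings):
        # Each sentence embedding passes through its own hidden layer.
        hidden = [relu(e @ w + b) for e, w, b in
                  zip(embeddings, self.line_weights, self.line_biases)]
        # Concatenating the lines lets the output layer relate the sentences.
        merged = np.concatenate(hidden)
        return float(sigmoid(merged @ self.out_weights + self.out_bias))


def humor_score(text, model):
    """Return a value in (0, 1); meaningless until the model is trained."""
    embeddings = [fake_embed(s) for s in split_sentences(text)]
    return model.forward(embeddings)


model = ParallelLinesModel()
score = humor_score("Why did the chicken cross the road? To get to the other side.", model)
```

Because the weights are random and the embeddings are placeholders, `score` carries no meaning here. The sketch only shows the data flow: separate per-sentence lines that meet at a single concatenation point before the prediction.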

