Sarcasm is the use of words to mock, irritate, or amuse someone, and it is common on social media. Its metaphorical and creative nature poses a significant challenge for sentiment analysis systems based on affective computing. This paper presents the methodology and results of our team, UTNLP, in SemEval-2022 shared task 6 on sarcasm detection. We test different models and data augmentation approaches and report which work best. Our experiments begin with traditional machine learning models and progress to transformer-based and attention-based models. For data augmentation, we use both data mutation and data generation. Using RoBERTa and mutation-based data augmentation, our best approach achieved an F1-sarcastic of 0.38 in the competition's evaluation phase. After the competition, we fixed our model's flaws and achieved an F1-sarcastic of 0.414.
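The reported scores use F1-sarcastic, which we take here to mean the F1 score computed on the sarcastic class only. The following is a minimal sketch of that computation under this assumption; the function name `f1_sarcastic`, the label encoding (1 = sarcastic, 0 = not sarcastic), and the toy labels are illustrative and not taken from the task data or the official scorer.

```python
def f1_sarcastic(y_true, y_pred, positive=1):
    """F1 score for the positive (assumed: sarcastic) class only."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives
    # with respect to the sarcastic class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct sarcastic predictions: precision or recall is zero
        # (or undefined), so the score is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels): 2 true positives, 1 false positive,
# 1 false negative.
gold = [1, 0, 1, 1, 0]
pred = [1, 1, 0, 1, 0]
score = f1_sarcastic(gold, pred)
```

Because only the sarcastic class contributes, a model that labels everything as non-sarcastic scores 0 under this metric regardless of its overall accuracy, which makes the metric sensitive to performance on the minority class.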