Hate speech is regarded as one of the crucial issues plaguing online social media. The current literature on hate speech detection primarily leverages textual content to find hateful posts and subsequently identify hateful users. However, this methodology disregards the social connections between users. In this paper, we explore the problem space in detail and investigate an array of models, ranging from purely textual to graph-based to, finally, semi-supervised techniques using Graph Neural Networks (GNNs) that utilize both textual and graph-based features. We run exhaustive experiments on two datasets: Gab, which is loosely moderated, and Twitter, which is strictly moderated. Overall, the AGNN model achieves a macro F1-score of 0.791 on the Gab dataset and 0.780 on the Twitter dataset using only 5% of the labeled instances, considerably outperforming all the other models, including the fully supervised ones. We perform a detailed error analysis on the best-performing text-based and graph-based models and observe that hateful users have unique network neighborhood signatures and that the AGNN model benefits from paying attention to these signatures. This property, as we observe, also allows the model to generalize well across platforms in a zero-shot setting. Lastly, we use the best-performing GNN model to analyze the evolution of hateful users and their targets over time on Gab.