An Empirical Analysis of SMS Scam Detection Systems

The short message service (SMS) was introduced to mobile phone users a generation ago. It forms the world's oldest large-scale network, with billions of users, and therefore attracts a lot of fraud. Due to the convergence of mobile networks with the internet, SMS-based scams can potentially compromise the security of internet services as well. In this study, we present a new SMS scam dataset consisting of 153,551 SMSes. This dataset, which we will release publicly for research purposes, represents the largest publicly available SMS scam dataset. We evaluate and compare the performance achieved by several established machine learning methods on the new dataset, ranging from shallow machine learning approaches to deep neural networks to syntactic and semantic feature models. We then study the existing models from an adversarial viewpoint by assessing their robustness against different levels of adversarial manipulation. This perspective consolidates the current state of the art in SMS spam filtering and highlights the limitations of existing approaches and the opportunities to improve them.

