Traffic classification on programmable data planes holds great promise for line-rate processing, with methods evolving from per-packet to flow-level analysis for higher accuracy. However, a trade-off between accuracy and efficiency persists. Statistical feature-based methods fit hardware constraints but often exhibit limited accuracy, while online deep learning methods that use packet sequential features achieve superior accuracy but require substantial computational resources. This paper presents Synecdoche, the first traffic classification framework that successfully deploys packet sequential features on a programmable data plane via pattern matching, achieving both high accuracy and efficiency. Our key insight is that discriminative information concentrates in short sub-sequences, termed Key Segments, that serve as compact traffic features for efficient data plane matching. Synecdoche employs an "offline discovery, online matching" paradigm: deep learning models automatically discover Key Segment patterns offline, and these patterns are then compiled into optimized table entries for direct data plane matching. Extensive experiments demonstrate Synecdoche's superior accuracy, improving F1-scores by up to 26.4% over statistical methods and 18.3% over online deep learning methods, while reducing latency by 13.0% and SRAM usage by 79.2%. The source code of Synecdoche is publicly available to facilitate reproducibility and further research.
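The "offline discovery, online matching" idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature encoding or table layout, so the signed packet lengths (sign as direction), the 64-byte bucketing, the exact-match dictionary standing in for a data plane table, and all names (`quantize`, `compile_table`, `classify`) are assumptions made for illustration only.

```python
from typing import Dict, Optional, Sequence, Tuple

Segment = Tuple[int, ...]


def quantize(length: int, bucket: int = 64) -> int:
    """Map a signed packet length to a coarse bucket (assumed encoding).

    The sign stands for packet direction; bucketing keeps the key space small.
    """
    sign = -1 if length < 0 else 1
    return sign * (abs(length) // bucket)


def compile_table(
    key_segments: Dict[str, Sequence[Sequence[int]]], bucket: int = 64
) -> Dict[Segment, str]:
    """Offline step: turn discovered segments into exact-match entries.

    Here the segments are given by hand; in the paper they come from a
    deep learning model.
    """
    table: Dict[Segment, str] = {}
    for label, segments in key_segments.items():
        for seg in segments:
            table[tuple(quantize(x, bucket) for x in seg)] = label
    return table


def classify(
    flow: Sequence[int], table: Dict[Segment, str], seg_len: int, bucket: int = 64
) -> Optional[str]:
    """Online step: slide a fixed-size window over the flow and look it up."""
    window = []
    for length in flow:
        window.append(quantize(length, bucket))
        if len(window) > seg_len:
            window.pop(0)  # keep only the most recent seg_len packets
        if len(window) == seg_len:
            label = table.get(tuple(window))
            if label is not None:
                return label  # first matching segment decides the class
    return None  # no segment matched


# Hypothetical segments for two made-up classes.
table = compile_table({
    "video": [[1500, -64, 1500]],
    "chat": [[120, -200, 90]],
})
result = classify([60, -60, 1480, -70, 1500, -64], table, seg_len=3)
```

In this sketch a flow is classified as soon as any window of its packet sequence hits a table entry, which mirrors the abstract's claim that a short sub-sequence can stand in for the whole flow. A real programmable switch would express the lookup as match-action table entries rather than a Python dictionary.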