Network Intrusion Detection Systems (NIDS) are essential for securing networks: they identify and help mitigate unauthorized activity that indicates a cyberattack. As cyber threats grow more sophisticated, NIDS must evolve to detect both emerging threats and deviations from normal behavior. This study explores the use of machine learning (ML) methods to improve NIDS accuracy by analyzing complex patterns in feature-rich network traffic records. Using the 1999 KDD CUP intrusion dataset as a benchmark, we evaluate and optimize several ML algorithms: Support Vector Machines (SVM), Naïve Bayes variants (MNB, BNB), Random Forest (RF), k-Nearest Neighbors (k-NN), Decision Trees (DT), AdaBoost, XGBoost, Logistic Regression (LR), Ridge Classifier, Passive-Aggressive (PA) Classifier, Rocchio Classifier, Artificial Neural Networks (ANN), and Perceptron (PPN). Initial evaluations without hyper-parameter optimization showed suboptimal performance, highlighting the importance of tuning for classification accuracy. After hyper-parameter optimization with grid search and random search, the SVM classifier achieved 99.12% accuracy with a False Alarm Rate (FAR) of 0.0091, outperforming both its default configuration (98.08% accuracy, 0.0123 FAR) and all other evaluated classifiers. We validated all classifiers using tenfold cross-validation and applied Recursive Feature Elimination (RFE) for feature selection to improve the classifiers' accuracy and efficiency. Our results indicate that ML classifiers are both adaptable and reliable, and that they can improve the accuracy of network intrusion detection systems.
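The two reported metrics, accuracy and FAR, can be illustrated with a minimal sketch. It assumes the common definition FAR = FP / (FP + TN), that is, the fraction of normal traffic wrongly flagged as an attack; the abstract does not state its formula, so this definition is an assumption. The names `ConfusionCounts`, `confusion_counts`, `accuracy`, and `false_alarm_rate`, as well as the toy labels, are hypothetical and not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # attacks correctly flagged as attacks
    fp: int  # normal records wrongly flagged as attacks (false alarms)
    tn: int  # normal records correctly passed as normal
    fn: int  # attacks missed by the classifier


def confusion_counts(y_true, y_pred, attack_label=1):
    """Count binary outcomes, treating `attack_label` as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        is_attack = truth == attack_label
        flagged = pred == attack_label
        if is_attack and flagged:
            tp += 1
        elif not is_attack and flagged:
            fp += 1
        elif not is_attack and not flagged:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def accuracy(c):
    """Fraction of all records classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("no records to evaluate")
    return (c.tp + c.tn) / total


def false_alarm_rate(c):
    """Assumed definition: FP / (FP + TN), the share of normal traffic flagged."""
    normal = c.fp + c.tn
    if normal == 0:
        raise ValueError("no normal records, FAR is undefined")
    return c.fp / normal


# Toy labels for illustration only: 1 = attack, 0 = normal.
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(accuracy(counts))          # 0.75
print(false_alarm_rate(counts))  # 0.2
```

Under this definition, a FAR of 0.0091 would correspond to roughly 9 false alarms per 1,000 normal records.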