Cybersecurity has become one of the earliest adopters of agentic AI, as security operations centers (SOCs) increasingly rely on multi-step reasoning, tool-driven analysis, and rapid decision-making under pressure. While individual large language models can summarize alerts or interpret unstructured reports, they fall short in real SOC environments that require grounded data access, reproducibility, and accountable workflows. In response, the field has seen a rapid architectural evolution from single-model helpers toward tool-augmented agents, distributed multi-agent systems, schema-bound tool ecosystems, and early explorations of semi-autonomous investigative pipelines. This survey presents a five-generation taxonomy of agentic AI in cybersecurity. It traces how capabilities and risks change as systems advance from text-only LLM reasoners to multi-agent collaboration frameworks and constrained-autonomy pipelines. We compare these generations across core dimensions: reasoning depth, tool use, memory, reproducibility, and safety. We also synthesize emerging benchmarks used to evaluate cyber-oriented agents. Finally, we outline the unresolved challenges that accompany this evolution, including response validation, tool-use correctness, multi-agent coordination, long-horizon reasoning, and safeguards for high-impact actions. Collectively, this work provides a structured perspective on how agentic AI is taking shape within cybersecurity and on what is required to ensure its safe and reliable deployment.