Industrial Systems-on-Chips (SoCs) often comprise hundreds of thousands to millions of nets and millions to tens of millions of connectivity edges, making empirical evaluation of hardware-Trojan (HT) detectors on realistic designs both necessary and difficult. Public benchmarks remain significantly smaller and hand-crafted, while releasing truly malicious RTL raises ethical and operational risks. This work presents an automated and scalable methodology for generating HT-like patterns in industry-scale netlists whose purpose is to stress-test detection tools without altering user-visible functionality. The pipeline (i) parses large gate-level designs into connectivity graphs, (ii) explores rare regions using SCOAP testability metrics, and (iii) applies parameterized, function-preserving graph transformations to synthesize trigger-payload pairs that mimic the statistical footprint of stealthy HTs. When evaluated on the benchmarks generated in this work, representative state-of-the-art graph-learning models fail to detect Trojans. The framework closes the evaluation gap between academic circuits and modern SoCs by providing reproducible challenge instances that advance security research without sharing step-by-step attack instructions.
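Steps (i) and (ii) of the pipeline can be illustrated with a minimal sketch. It assumes a purely combinational toy netlist and the textbook SCOAP controllability rules for AND, OR, and NOT gates. The netlist, the names `NETLIST`, `scoap_controllability`, and `hard_to_control`, and the threshold are hypothetical. The abstract does not specify the parser, the exact metrics, or the selection criteria used, so this is not the authors' implementation.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical toy netlist: output net -> (gate type, fan-in nets).
# Nets that never appear as a key (a, b, c, d) are primary inputs.
NETLIST = {
    "n1": ("AND", ["a", "b"]),
    "n2": ("AND", ["n1", "c"]),
    "n3": ("OR", ["n2", "d"]),
    "n4": ("NOT", ["n3"]),
}


def scoap_controllability(netlist):
    """Return {net: (CC0, CC1)} using textbook combinational SCOAP rules."""
    # Connectivity graph: each gate output depends on its fan-in nets.
    deps = {out: set(ins) for out, (_, ins) in netlist.items()}
    cc = {}
    # Visit nets so that every fan-in is scored before the gate it drives.
    for net in TopologicalSorter(deps).static_order():
        if net not in netlist:
            cc[net] = (1, 1)  # primary input: cost 1 to set either value
            continue
        gate, ins = netlist[net]
        zeros = [cc[i][0] for i in ins]
        ones = [cc[i][1] for i in ins]
        if gate == "AND":
            # 0 needs any one input at 0; 1 needs every input at 1.
            cc[net] = (min(zeros) + 1, sum(ones) + 1)
        elif gate == "OR":
            # 0 needs every input at 0; 1 needs any one input at 1.
            cc[net] = (sum(zeros) + 1, min(ones) + 1)
        elif gate == "NOT":
            cc[net] = (ones[0] + 1, zeros[0] + 1)
        else:
            raise ValueError(f"unsupported gate type: {gate}")
    return cc


def hard_to_control(cc, threshold):
    """Nets whose worse controllability value reaches the (assumed) threshold."""
    return sorted(net for net, pair in cc.items() if max(pair) >= threshold)


if __name__ == "__main__":
    scores = scoap_controllability(NETLIST)
    for net in sorted(scores):
        print(net, scores[net])
    print("hard to control:", hard_to_control(scores, threshold=5))
```

Higher controllability values indicate nets that are harder to drive to a given logic value. Such nets are the kind of rarely exercised region that a testability-guided search would examine first. A production flow would also need sequential SCOAP measures, observability, and a real gate-level parser, none of which are shown here.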