Recent advances in LLM watermarking, such as SynthID-Text by Google DeepMind, offer promising solutions for tracing the provenance of AI-generated text. However, our robustness assessment reveals that SynthID-Text is vulnerable to meaning-preserving attacks, such as paraphrasing, copy-paste modifications, and back-translation, which can significantly degrade watermark detectability. To address these limitations, we propose SynGuard, a hybrid framework that combines the semantic alignment strength of Semantic Information Retrieval (SIR) with the probabilistic watermarking mechanism of SynthID-Text. Our approach jointly embeds watermarks at both lexical and semantic levels, enabling robust provenance tracking while preserving the original meaning. Experimental results across multiple attack scenarios show that SynGuard improves watermark recovery by an average of 11.1% in F1 score compared to SynthID-Text. These findings demonstrate the effectiveness of semantic-aware watermarking in resisting real-world tampering. All code, datasets, and evaluation scripts are publicly available at: https://github.com/githshine/SynGuard.