Large Language Models (LLMs) exhibit strong general reasoning but struggle in molecular science because standard string representations lack explicit chemical priors. Current solutions face a fundamental dilemma. Training-based methods inject priors into model parameters, but this static coupling hinders rapid knowledge updates and often degrades the model's general reasoning capabilities. Conversely, existing training-free methods avoid these issues but rely on surface-level prompting and fail to provide the fine-grained, atom-level priors essential for precise chemical reasoning. To resolve this dilemma, we introduce ChemATP, a framework that decouples chemical knowledge from the reasoning engine. By constructing the first atom-level textual knowledge base, ChemATP enables frozen LLMs to explicitly retrieve this information and reason over it dynamically. This architecture ensures interpretability and adaptability while preserving the LLM's intrinsic general intelligence. Experiments show that ChemATP significantly outperforms training-free baselines and rivals state-of-the-art training-based models, demonstrating that explicit prior injection is a competitive alternative to implicit parameter updates.
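As a rough illustration of the retrieve-and-reason idea described above (not the paper's actual pipeline), the sketch below injects atom-level textual priors into the prompt of a frozen, unmodified LLM. The knowledge-base contents (ATOM_KB) and the helper names retrieve_atom_priors and build_prompt are hypothetical; RDKit is used only to enumerate atoms from a SMILES string.

```python
# Minimal sketch, assuming a dictionary-style atom-level textual knowledge base.
# This is illustrative only; ChemATP's real retrieval and reasoning components
# are not specified here.
from rdkit import Chem  # pip install rdkit

# Hypothetical atom-level textual knowledge base: element symbol -> chemical prior.
ATOM_KB = {
    "C": "carbon: backbone atom; aromatic carbons indicate ring systems",
    "N": "nitrogen: common H-bond acceptor; often basic or part of amides",
    "O": "oxygen: strong H-bond acceptor; hydroxyl oxygens can donate H-bonds",
}

def retrieve_atom_priors(smiles: str) -> list[str]:
    """Enumerate atoms and attach retrieved textual priors (dynamic, no training)."""
    mol = Chem.MolFromSmiles(smiles)
    priors = []
    for atom in mol.GetAtoms():
        note = ATOM_KB.get(atom.GetSymbol(), "no prior available")
        priors.append(
            f"atom {atom.GetIdx()} ({atom.GetSymbol()}, "
            f"aromatic={atom.GetIsAromatic()}): {note}"
        )
    return priors

def build_prompt(smiles: str, question: str) -> str:
    """Compose a prompt that exposes retrieved priors to a frozen LLM."""
    context = "\n".join(retrieve_atom_priors(smiles))
    return f"Molecule: {smiles}\nAtom-level priors:\n{context}\nQuestion: {question}"

print(build_prompt("c1ccccc1O", "Is this molecule a hydrogen-bond donor?"))
```

Because the priors live in an external, editable knowledge base rather than in model weights, updating chemical knowledge amounts to editing text entries, which is the adaptability argument made in the abstract.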