Causal discovery from observational data remains fundamentally limited by identifiability constraints. Recent work has explored leveraging Large Language Models (LLMs) as sources of prior causal knowledge, but existing approaches rely on heuristic integration that lacks theoretical grounding. We introduce HOLOGRAPH, a framework that formalizes LLM-guided causal discovery through sheaf theory, representing local causal beliefs as sections of a presheaf over variable subsets. Our key insight is that a coherent global causal structure corresponds to the existence of a global section, while topological obstructions manifest as non-vanishing sheaf cohomology. We propose the Algebraic Latent Projection to handle hidden confounders and Natural Gradient Descent on the belief manifold for principled optimization. Experiments on synthetic and real-world benchmarks demonstrate that HOLOGRAPH provides rigorous mathematical foundations while achieving competitive performance on causal discovery tasks with 50-100 variables. Our sheaf-theoretic analysis reveals that while the Identity, Transitivity, and Gluing axioms are satisfied to numerical precision (<10^{-6}), the Locality axiom fails for larger graphs, suggesting fundamental non-local coupling in latent variable projections. Code is available at [https://github.com/hyunjun1121/holograph](https://github.com/hyunjun1121/holograph).
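To make the presheaf picture concrete, the following is a minimal sketch of what "local sections that glue into a global section" could look like for weighted edge matrices. It is not the HOLOGRAPH implementation: the function names `restrict`, `gluing_residual`, and `glue` are hypothetical, and restriction is assumed here to be plain submatrix extraction, which ignores the latent projection the framework uses for hidden confounders.

```python
import itertools
import numpy as np


def restrict(matrix, from_vars, to_vars):
    """Restrict a local edge-weight matrix over from_vars to the subset to_vars.

    Assumption: restriction is plain submatrix extraction.
    """
    idx = [from_vars.index(v) for v in to_vars]
    return matrix[np.ix_(idx, idx)]


def gluing_residual(sections):
    """Largest disagreement between any two local sections on their overlap."""
    worst = 0.0
    for (u, a), (v, b) in itertools.combinations(sections.items(), 2):
        overlap = tuple(sorted(set(u) & set(v)))
        if len(overlap) < 2:
            continue  # fewer than two shared variables means no shared edge
        diff = restrict(a, u, overlap) - restrict(b, v, overlap)
        worst = max(worst, float(np.abs(diff).max()))
    return worst


def glue(sections, n_vars):
    """Assemble a candidate global section by averaging overlapping entries."""
    total = np.zeros((n_vars, n_vars))
    count = np.zeros((n_vars, n_vars))
    for vars_, mat in sections.items():
        idx = np.ix_(vars_, vars_)
        total[idx] += mat
        count[idx] += 1
    # Entries covered by no local section are left at zero.
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)


# Toy example: a chain 0 -> 1 -> 2 -> 3 covered by two overlapping subsets.
W = np.array([
    [0.0, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.3],
    [0.0, 0.0, 0.0, 0.0],
])
all_vars = (0, 1, 2, 3)
cover = [(0, 1, 2), (1, 2, 3)]

# Consistent local beliefs: each is a restriction of the same global matrix.
consistent = {u: restrict(W, all_vars, u) for u in cover}
consistent_residual = gluing_residual(consistent)
glued = glue(consistent, n_vars=4)

# Inconsistent local beliefs: the second section disagrees on the edge 1 -> 2.
inconsistent = {u: m.copy() for u, m in consistent.items()}
inconsistent[(1, 2, 3)][0, 1] += 0.2
inconsistent_residual = gluing_residual(inconsistent)

print("consistent residual:", consistent_residual)
print("inconsistent residual:", inconsistent_residual)
```

In this toy setting a zero residual means the local beliefs agree on every overlap and can be assembled into one global matrix, while a positive residual is the simplest kind of obstruction to a global section. The averaging in `glue` is one illustrative choice for combining overlapping entries, not a method taken from the paper.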