Decentralized inference is an appealing paradigm for serving large language models (LLMs), offering strong security, high efficiency, and low operating costs. Yet the permissionless setting admits no a priori trust in participating nodes, making output verifiability a prerequisite for secure deployment. We present VeriLLM, a publicly verifiable protocol for decentralized LLM inference that (i) achieves security under a one-honest-verifier assumption, (ii) attains near-negligible verification cost (about 1% of the underlying inference) via a lightweight verification algorithm designed explicitly for LLMs, and (iii) enforces honest checking through a peer-prediction mechanism that mitigates lazy verification in naive voting. We further introduce an isomorphic inference-verification network that multiplexes both roles on the same set of GPU workers. This architecture (i) increases GPU utilization and thereby improves end-to-end throughput for both inference and verification, (ii) expands the effective pool of available verifiers, strengthening robustness and security, and (iii) enforces task indistinguishability at the worker boundary to prevent job-type-conditioned behavior. Finally, we provide a formal game-theoretic analysis and prove that, under our incentives, honest inference and verification constitute a Nash equilibrium, ensuring incentive compatibility against rational adversaries. To our knowledge, this is the first verification protocol for decentralized inference with an end-to-end game-theoretic security proof.
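To make the Nash-equilibrium claim concrete, the following is a minimal toy sketch (not the paper's mechanism) of the best-response condition behind it: a single verifier chooses between honest checking and lazy voting while all other participants behave honestly. All payoff values (reward, penalty, effort cost, guess-error probability) are hypothetical placeholders; only the roughly 1% verification-cost figure echoes the abstract.

```python
# Toy incentive check: is honest verification a best response when everyone else is honest?
# All numeric parameters are hypothetical and chosen only for illustration.

REWARD = 1.0         # hypothetical reward for submitting a verification report
EFFORT_COST = 0.01   # hypothetical cost of honest checking (~1% of inference, per the abstract)
PENALTY = 5.0        # hypothetical slashing when a report contradicts the honest outcome
P_WRONG_GUESS = 0.5  # hypothetical chance that a lazy, no-effort guess is contradicted


def expected_payoff(strategy: str) -> float:
    """Expected payoff of one verifier, holding all other verifiers honest."""
    if strategy == "honest":
        return REWARD - EFFORT_COST
    if strategy == "lazy":
        return REWARD - P_WRONG_GUESS * PENALTY
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    honest = expected_payoff("honest")
    lazy = expected_payoff("lazy")
    print(f"honest: {honest:.2f}  lazy: {lazy:.2f}")
    # Honest verification is a best response (a necessary condition for the claimed
    # equilibrium) whenever the expected penalty for lazy guessing exceeds the effort cost.
    assert honest > lazy
```

Under these placeholder numbers, deviating to lazy verification lowers expected payoff, which is the kind of deviation-is-unprofitable condition a full equilibrium proof must establish for every participant and strategy.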