Modern vision-language models (VLMs) are increasingly used to interpret and generate educational content, yet their semantic outputs remain challenging to verify, reproduce, and audit over time. Inconsistencies across model families, inference settings, and computing environments undermine the reliability of AI-generated instructional material, particularly in high-stakes and quantitative STEM domains. This work introduces SlideChain, a blockchain-backed provenance framework designed to provide verifiable integrity for multimodal semantic extraction at scale. Using the SlideChain Slides Dataset, a curated corpus of 1,117 medical imaging lecture slides from a university course, we extract concepts and relational triples with four state-of-the-art VLMs and construct a structured provenance record for every slide. SlideChain anchors cryptographic hashes of these records on a local Ethereum Virtual Machine (EVM)-compatible blockchain, providing tamper-evident auditability and persistent semantic baselines. Through the first systematic analysis of semantic disagreement, cross-model similarity, and lecture-level variability in multimodal educational content, we reveal pronounced cross-model discrepancies, including low concept overlap and near-zero agreement on relational triples for many slides. We further evaluate gas usage, throughput, and scalability under simulated deployment conditions, and demonstrate perfect tamper detection along with deterministic reproducibility across independent extraction runs. Together, these results show that SlideChain is a practical and scalable step toward trustworthy, verifiable multimodal educational pipelines, supporting long-term auditability, reproducibility, and integrity for AI-assisted instructional systems.
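The hash-anchoring idea described above can be illustrated with a minimal sketch. It is not SlideChain's implementation: the record fields (`slide_id`, `model`, `concepts`, `triples`), the function names, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration, and an EVM deployment would more likely use Keccak-256 and store the digest in a contract. The sketch only shows why a stored digest makes later modification of a provenance record detectable.

```python
import copy
import hashlib
import json


def record_digest(record: dict) -> str:
    """Hash a provenance record deterministically.

    Keys are sorted and whitespace is fixed so that the same logical
    record always serializes to the same bytes (assumed canonical form).
    """
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_untampered(record: dict, anchored_digest: str) -> bool:
    """Compare a record's current digest with the digest anchored earlier."""
    return record_digest(record) == anchored_digest


# Hypothetical provenance record for one slide and one model.
record = {
    "slide_id": "lecture01_slide003",
    "model": "example-vlm",
    "concepts": ["attenuation", "x-ray"],
    "triples": [["x-ray", "undergoes", "attenuation"]],
}

# In a real system this digest would be written to the blockchain.
anchored = record_digest(record)

# Simulate a later modification of the stored record.
tampered = copy.deepcopy(record)
tampered["concepts"].append("ultrasound")

print(is_untampered(record, anchored))
print(is_untampered(tampered, anchored))
```

Canonical serialization matters here: without sorted keys and fixed separators, two semantically identical records could hash differently, which would undermine reproducibility checks across independent extraction runs.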