Identifying whether two product listings refer to the same Stock Keeping Unit (SKU) is a persistent challenge in e-commerce, especially when explicit identifiers are missing and product names vary widely across platforms. Rule-based heuristics and keyword similarity often misclassify products by overlooking subtle distinctions in brand, specification, or bundle configuration. To overcome these limitations, we propose Question to Knowledge (Q2K), a multi-agent framework that leverages Large Language Models (LLMs) for reliable SKU mapping. Q2K integrates: (1) a Reasoning Agent that generates targeted disambiguation questions, (2) a Knowledge Agent that resolves them via focused web searches, and (3) a Deduplication Agent that reuses validated reasoning traces to reduce redundancy and ensure consistency. A human-in-the-loop mechanism further refines uncertain cases. Experiments on real-world consumer goods datasets show that Q2K surpasses strong baselines, achieving higher accuracy and robustness in difficult scenarios such as bundle identification and brand origin disambiguation. By reusing retrieved reasoning instead of issuing repeated searches, Q2K balances accuracy with efficiency, offering a scalable and interpretable solution for product integration.
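The three-agent pipeline described above can be sketched in outline. This is a minimal illustration, not the paper's implementation: the agent functions here are hypothetical stand-ins (the real Reasoning and Knowledge Agents would call an LLM and a web-search API), and `TraceStore` is an assumed name for the Deduplication Agent's cache of validated reasoning traces.

```python
from dataclasses import dataclass, field


@dataclass
class TraceStore:
    """Deduplication Agent's cache: validated reasoning traces keyed by question."""
    traces: dict = field(default_factory=dict)

    def lookup(self, question: str):
        return self.traces.get(question.lower().strip())

    def store(self, question: str, answer: str) -> None:
        self.traces[question.lower().strip()] = answer


def reasoning_agent(listing_a: str, listing_b: str) -> list:
    """Generate targeted disambiguation questions (stand-in for an LLM call)."""
    return [
        f"Do '{listing_a}' and '{listing_b}' share the same brand?",
        f"Do '{listing_a}' and '{listing_b}' have the same bundle configuration?",
    ]


def knowledge_agent(question: str) -> str:
    """Resolve one question via a focused web search (stubbed out here)."""
    return f"evidence for: {question}"


def q2k_match(listing_a: str, listing_b: str, store: TraceStore) -> list:
    """Q2K flow: reason -> reuse cached traces -> search only on cache misses."""
    answers = []
    for question in reasoning_agent(listing_a, listing_b):
        cached = store.lookup(question)        # reuse a validated trace if present
        if cached is None:
            cached = knowledge_agent(question)  # cache miss: issue a focused search
            store.store(question, cached)
        answers.append(cached)
    return answers
```

Mapping a second, near-identical listing pair then hits the trace cache instead of issuing repeated searches, which is the efficiency mechanism the abstract highlights.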