Effective human-AI collaboration hinges on dynamically integrating the complementary strengths of human experts and AI models across diverse decision contexts. A promising technique is the context-aware weighted combination of human and AI outputs, in which the combination weights are optimized according to each decision agent's capability on the given task. However, existing approaches treat humans and AI as isolated entities and lack a unified representation for modeling the heterogeneous capabilities of multiple decision agents. To address this gap, we propose a novel capability-aware architecture that models both human and AI decision-makers using learnable capability vectors. These vectors encode task-relevant competencies in a shared latent space and are used by a transformer-based weight generation module to produce instance-specific aggregation weights. Our framework flexibly integrates either confidence scores or one-hot decisions from a variable number of agents. We further introduce a learning-free baseline for human-AI collaboration that uses optimized global weights. Extensive experiments on image classification and hate speech detection show that our approach outperforms state-of-the-art methods under various collaboration settings, with both simulated and real human labels. The results highlight the robustness, scalability, and accuracy of our method and underscore its potential for real-world applications.
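To make the weighting idea concrete, the following Python sketch shows one way instance-specific aggregation weights could be derived from capability vectors and applied to agent outputs. It is an illustration under stated assumptions, not the proposed architecture: a dot product followed by a softmax stands in for the transformer-based weight generation module, the capability vectors are fixed by hand rather than learned, and every name and value (`instance_weights`, `aggregate`, `predict`, the embeddings) is hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def instance_weights(instance_emb: Sequence[float],
                     capability_vecs: Sequence[Sequence[float]]) -> List[float]:
    """Simplified stand-in (assumption) for a weight generation module.

    Each agent is scored by the dot product of its capability vector with
    the instance embedding; a softmax across agents turns the scores into
    aggregation weights that sum to 1.
    """
    scores = [sum(c * x for c, x in zip(cap, instance_emb))
              for cap in capability_vecs]
    return softmax(scores)


def aggregate(agent_outputs: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Weighted sum of per-agent class vectors.

    Each output may be a confidence distribution or a one-hot decision;
    the number of agents is simply the length of the input.
    """
    num_classes = len(agent_outputs[0])
    combined = [0.0] * num_classes
    for out, w in zip(agent_outputs, weights):
        for k in range(num_classes):
            combined[k] += w * out[k]
    return combined


def predict(agent_outputs: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Index of the class with the highest combined score."""
    combined = aggregate(agent_outputs, weights)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical example: a human gives a one-hot decision, an AI gives confidences.
human_output = [0.0, 1.0, 0.0]
ai_output = [0.6, 0.3, 0.1]
capabilities = [[1.0, 0.0], [0.0, 1.0]]  # hand-set, one vector per agent

# An instance that aligns with the human's capability vector favors the human.
w_human_favored = instance_weights([2.0, 0.0], capabilities)
label_a = predict([human_output, ai_output], w_human_favored)

# An instance that aligns with the AI's capability vector favors the AI.
w_ai_favored = instance_weights([0.0, 2.0], capabilities)
label_b = predict([human_output, ai_output], w_ai_favored)
```

Passing one fixed weight list to `predict` for every instance corresponds to the global-weight setting; how such weights would be optimized is not specified here and is left out of the sketch.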