Multiple-choice visual question answering (VQA) has recently drawn increasing attention from researchers and end-users. As the demand for automatically constructing large-scale multiple-choice VQA data grows, we introduce a novel task, textual Distractor Generation for VQA (DG-VQA), which focuses on generating challenging yet meaningful distractors given the context image, the question, and the correct answer. DG-VQA aims to generate distractors without ground-truth training samples, since such resources are rarely available. To tackle DG-VQA in an unsupervised manner, we propose Gobbet, a reinforcement learning (RL) based framework that uses pre-trained VQA models as an alternative knowledge base to guide the distractor generation process. In Gobbet, a pre-trained VQA model serves as the environment in the RL setting, providing feedback on the input multi-modal query, while a neural distractor generator serves as the agent that takes actions accordingly. We propose using the performance degradation of existing VQA models as an indicator of the quality of the generated distractors. We further demonstrate the utility of the generated distractors through data augmentation experiments, since robustness becomes increasingly important as AI models are deployed in unpredictable open-domain scenarios and security-sensitive applications. Finally, we conduct a manual case study of why the distractors generated by Gobbet can fool existing models.
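The core loop can be illustrated with a minimal sketch: a generator (the agent) proposes candidate distractors, a pre-trained VQA model (the environment) scores the candidate answer set, and the reward is the drop in the model's probability of picking the correct answer. All function names and the toy scoring rule below are hypothetical stand-ins; in the actual framework both components are neural networks.

```python
# Hypothetical sketch of a Gobbet-style RL loop: reward = how much a
# candidate distractor degrades a (stubbed) pre-trained VQA model.

def vqa_answer_prob(image, question, answer, distractors):
    """Stub 'environment': probability the VQA model still picks the
    correct answer given the candidate set. Toy rule: distractors that
    share words with the question are assumed to be more confusing."""
    q_words = set(question.lower().split())
    confusion = sum(len(q_words & set(d.lower().split())) for d in distractors)
    return max(0.1, 1.0 - 0.2 * confusion)

def reward(image, question, answer, distractors):
    """Reward signal: degradation of the VQA model's confidence in the
    correct answer once the generated distractors are added."""
    before = vqa_answer_prob(image, question, answer, [])
    after = vqa_answer_prob(image, question, answer, distractors)
    return before - after

# Toy 'agent' step: greedily pick the candidate with the highest reward
# (a real agent would be a trained neural distractor generator).
candidates = ["red ball", "a dog playing", "blue umbrella"]
question = "What is the dog playing with?"
best = max(candidates, key=lambda d: reward(None, question, "a ball", [d]))
```

Under this toy scoring rule, "a dog playing" earns the highest reward because it overlaps most with the question's wording, which mirrors the intuition that good distractors stay topically close to the query.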