Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces

Recent research has proposed neural architectures for solving combinatorial problems in structured output spaces. In many such problems, a given input may have multiple solutions; for example, a partially filled Sudoku puzzle may have many completions that satisfy all constraints. Further, we are often interested in finding any one of the possible solutions, with no preference among them. Existing approaches completely ignore this solution multiplicity. In this paper, we argue that being oblivious to the presence of multiple solutions can severely hamper the training of these models. Our contribution is twofold. First, we formally define the task of learning one-of-many solutions for combinatorial problems in structured output spaces, which applies to several problems of interest such as N-Queens and Sudoku. Second, we present a generic learning framework that adapts an existing prediction network for a combinatorial problem to handle solution multiplicity. Our framework uses a selection module whose goal is to dynamically determine, for every input, the solution that is most effective for training the network parameters in any given learning iteration. We propose an RL-based approach to jointly train the selection module with the prediction network. Experiments on three different domains, using two different prediction networks, demonstrate that our framework significantly improves accuracy in our setting, with gains of up to 21 points over the baselines.
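To make the idea of choosing one training target per input concrete, the following is a minimal sketch, not the RL-based selection module described above. It assumes a greedy stand-in rule: among all known valid solutions for an input, train against the one with the lowest loss under the current prediction. The function names `cross_entropy` and `select_min_loss` and the toy probabilities are hypothetical, introduced only for illustration.

```python
import math

def cross_entropy(probs, target):
    """Sum of negative log-probabilities of the target label at each output position."""
    return -sum(math.log(p[t]) for p, t in zip(probs, target))

def select_min_loss(probs, solutions):
    """Greedy stand-in for a selection module (an assumption, not the paper's method):
    return the index and loss of the valid solution closest to the current prediction."""
    losses = [cross_entropy(probs, s) for s in solutions]
    best = min(range(len(solutions)), key=losses.__getitem__)
    return best, losses[best]

# Toy input with two output variables (two classes each) and two valid solutions.
probs = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical network output
solutions = [(0, 1), (1, 0)]       # both are assumed to satisfy all constraints

best, loss = select_min_loss(probs, solutions)

# Training against every solution at once penalizes the network even though
# its prediction already matches a valid solution.
summed_loss = sum(cross_entropy(probs, s) for s in solutions)
```

In this toy case the network already favors the first solution, so the selected loss is small, while the loss summed over both solutions stays large because the two targets contradict each other. A learned selection module would replace the fixed greedy rule with a policy trained jointly with the prediction network.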
