2018 Top Conference Paper Collection

For full-text PDFs, see: A Complete Roundup of Top Conference Papers of 2018

CVPR 2018

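Conference dates: June 18-22

Location: Salt Lake City, Utah

The Conference on Computer Vision and Pattern Recognition (CVPR) is an annual IEEE academic conference focused on computer vision and pattern recognition. CVPR is one of the world's top computer vision conferences; in recent years it has drawn about 1,000 attendees annually and typically accepts around 300 papers. Each year's conference has fixed workshop topics, and companies sponsor the conference every year in exchange for the opportunity to exhibit at the venue.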

Best Paper

《Taskonomy: Disentangling Task Transfer Learning》

Amir Zamir, Alexander Sax, William Shen, Leonidas Guibas, Jitendra Malik, Silvio Savarese

【Abstract】Do visual tasks have a relationship, or are they unrelated? For instance, could having surface normals simplify estimating the depth of an image? Intuition answers these questions positively, implying existence of a structure among visual tasks. Knowing this structure has notable values; it is the concept underlying transfer learning and provides a principled way for identifying redundancies across tasks, in order to, for instance, seamlessly reuse supervision among related tasks or solve many tasks in one system without piling up the complexity.

We propose a fully computational approach for modeling the structure of the space of visual tasks. This is done via finding (first and higher-order) transfer learning dependencies across a dictionary of twenty-six 2D, 2.5D, 3D, and semantic tasks in a latent space. The product is a computational taxonomic map for task transfer learning. We study the consequences of this structure, e.g. emerged relationships, and exploit them to reduce the demand for labeled data. For example, we show that the total number of labeled data points needed for solving a set of 10 tasks can be reduced by roughly 2/3 (compared to training independently) while keeping the performance nearly the same. We provide a set of tools for computing and probing this taxonomical structure including a solver that users can employ to devise efficient supervision policies for their use cases.

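【Summary】Are visual tasks related to one another, or are they unrelated? For example, can surface normals simplify estimating the depth of an image? Intuition answers these questions in the affirmative, which implies that a structure exists among visual tasks. Knowing this structure is of notable value: it is the concept underlying transfer learning, and it provides a principled way to identify redundancies across tasks, for instance to seamlessly reuse supervision among related tasks or to solve many tasks in one system without piling up complexity. We propose a fully computational approach for modeling the structure of the space of visual tasks. This is done by finding (first- and higher-order) transfer learning dependencies across a dictionary of 26 2D, 2.5D, 3D, and semantic tasks in a latent space. The product is a computational taxonomic map for task transfer learning. We study the consequences of this structure, such as nontrivial relationships, and exploit them to reduce the demand for labeled data. For example, we show that the total number of labeled data points needed to solve a set of 10 tasks can be reduced by roughly 2/3 (compared with training independently) while keeping performance nearly the same.

Best Paper Honorable Mentions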

《Deep Learning of Graph Matching》

Andrei Zanfir, Cristian Sminchisescu

【Abstract】The problem of graph matching under node and pairwise constraints is fundamental in areas as diverse as combinatorial optimization, machine learning or computer vision, where representing both the relations between nodes and their neighborhood structure is essential. We present an end-to-end model that makes it possible to learn all parameters of the graph matching process, including the unary and pairwise node neighborhoods, represented as deep feature extraction hierarchies. The challenge is in the formulation of the different matrix computation layers of the model in a way that enables the consistent, efficient propagation of gradients in the complete pipeline from the loss function, through the combinatorial optimization layer solving the matching problem, and the feature extraction hierarchy. Our computer vision experiments and ablation studies on challenging datasets like PASCAL VOC keypoints, Sintel and CUB show that matching models refined end-to-end are superior to counterparts based on feature hierarchies trained for other problems.

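【Summary】Graph matching under node and pairwise constraints is a fundamental problem in areas as diverse as combinatorial optimization, machine learning, and computer vision, where representing both the relations between nodes and their neighborhood structure is essential. This paper presents an end-to-end model that makes it possible to learn all parameters of the graph matching process, including the unary and pairwise node neighborhoods, represented as deep feature extraction hierarchies. The challenge lies in formulating the model's matrix computation layers so that gradients propagate consistently and efficiently through the complete pipeline, from the loss function through the combinatorial optimization layer that solves the matching problem and on to the feature extraction hierarchy. The authors' computer vision experiments and ablation studies on challenging datasets such as PASCAL VOC keypoints, Sintel, and CUB show that matching models refined end-to-end outperform counterparts based on feature hierarchies trained for other problems.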

《SPLATNet: Sparse Lattice Networks for Point Cloud Processing》

Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz

【Abstract】We present a network architecture for processing point clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional lattice. Naïvely applying convolutions on this lattice scales poorly, both in terms of memory and computational cost, as the size of the lattice increases. Instead, our network uses sparse bilateral convolutional layers as building blocks. These layers maintain efficiency by using indexing structures to apply convolutions only on occupied parts of the lattice, and allow flexible specifications of the lattice structure enabling hierarchical and spatially-aware feature learning, as well as joint 2D-3D reasoning. Both point-based and image-based representations can be easily incorporated in a network with such layers and the resulting model can be trained in an end-to-end manner. We present results on 3D segmentation tasks where our approach outperforms existing state-of-the-art techniques.

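【Summary】This paper presents a network architecture for processing point clouds that operates directly on a collection of points represented as a sparse set of samples in a high-dimensional lattice. Naively applying convolutions on this lattice scales poorly in both memory and computational cost as the lattice grows. Instead, the network uses sparse bilateral convolutional layers as building blocks. These layers maintain efficiency by using indexing structures to apply convolutions only on occupied parts of the lattice, and they allow flexible specification of the lattice structure, enabling hierarchical and spatially aware feature learning as well as joint 2D-3D reasoning. Both point-based and image-based representations can be easily incorporated into a network with such layers, and the resulting model can be trained end-to-end. Results on 3D segmentation tasks show that the approach outperforms existing state-of-the-art techniques.

《CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM》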

Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison

【Abstract】The representation of geometry in real-time 3D perception systems continues to be a critical research issue. Dense maps capture complete surface shape and can be augmented with semantic labels, but their high dimensionality makes them computationally costly to store and process, and unsuitable for rigorous probabilistic inference. Sparse feature-based representations avoid these problems, but capture only partial scene information and are mainly useful for localisation only.

We present a new compact but dense representation of scene geometry which is conditioned on the intensity data from a single image and generated from a code consisting of a small number of parameters. We are inspired by work both on learned depth from images, and auto-encoders. Our approach is suitable for use in a keyframe-based monocular dense SLAM system: While each keyframe with a code can produce a depth map, the code can be optimised efficiently jointly with pose variables and together with the codes of overlapping keyframes to attain global consistency. Conditioning the depth map on the image allows the code to only represent aspects of the local geometry which cannot directly be predicted from the image. We explain how to learn our code representation, and demonstrate its advantageous properties in monocular SLAM.

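【Summary】The representation of geometry in real-time 3D perception systems remains a key research question. Dense maps capture complete surface shape and can be augmented with semantic labels, but their high dimensionality makes them computationally expensive to store and process and unsuitable for rigorous probabilistic inference. Sparse feature-based representations avoid these problems, but they capture only partial scene information and are mainly useful for localization. This paper presents a new compact but dense representation of scene geometry that is conditioned on the intensity data from a single image and generated from a code consisting of a small number of parameters. The approach is inspired by work on both learning depth from images and auto-encoders. It is suitable for use in a keyframe-based monocular dense SLAM system: while each keyframe with a code can produce a depth map, the code can be optimized efficiently together with pose variables and the codes of overlapping keyframes to achieve global consistency. Conditioning the depth map on the image allows the code to represent only those aspects of the local geometry that cannot be predicted directly from the image. The paper also explains how to learn the code representation and demonstrates its advantages in monocular SLAM.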

《Efficient Optimization for Rank-based Loss Functions》

Pritish Mohapatra, Michal Rolínek, C.V. Jawahar, Vladimir Kolmogorov, M. Pawan Kumar

【Abstract】The accuracy of information retrieval systems is often measured using complex loss functions such as the average precision (AP) or the normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, the non-differentiability and non-decomposability of these loss functions does not allow for simple gradient based optimization algorithms. This issue is generally circumvented by either optimizing a structured hinge-loss upper bound to the loss function or by using asymptotic methods like the direct-loss minimization framework. Yet, the high computational complexity of loss-augmented inference, which is necessary for both the frameworks, prohibits its use in large training data sets. To alleviate this deficiency, we present a novel quicksort flavored algorithm for a large class of non-decomposable loss functions. We provide a complete characterization of the loss functions that are amenable to our algorithm, and show that it includes both AP and NDCG based loss functions. Furthermore, we prove that no comparison based algorithm can improve upon the computational complexity of our approach asymptotically. We demonstrate the effectiveness of our approach in the context of optimizing the structured hinge loss upper bound of AP and NDCG loss for learning models for a variety of vision tasks. We show that our approach provides significantly better results than simpler decomposable loss functions, while requiring a comparable training time.

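【Summary】The accuracy of information retrieval systems is often measured with complex loss functions such as average precision (AP) or normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, because these loss functions are non-differentiable and non-decomposable, simple gradient-based optimization algorithms cannot be used. This issue is usually circumvented either by optimizing a structured hinge-loss upper bound on the loss function or by using asymptotic methods such as the direct-loss minimization framework. Yet the high computational complexity of loss-augmented inference, which both frameworks require, limits their use on large training datasets. To overcome this shortcoming, we present a quicksort-flavored algorithm for a large class of non-decomposable loss functions. We characterize the loss functions that are amenable to the algorithm and show that they include both AP-based and NDCG-based loss functions. Furthermore, we prove that no comparison-based algorithm can asymptotically improve on the computational complexity of our approach. We demonstrate the effectiveness of the approach by optimizing the structured hinge-loss upper bound of the AP and NDCG losses when learning models for a variety of vision tasks. We show that the approach gives better results than simpler decomposable loss functions while requiring comparable training time.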
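To make the two metrics named above concrete, the following is a minimal sketch of how AP and NDCG are commonly computed for a single ranked list. It is not the paper's quicksort-flavored algorithm; the function names and the log2 discount are assumptions chosen for illustration.

```python
import math


def average_precision(labels_by_rank):
    """AP of a ranked list of binary relevance labels (1 = positive)."""
    hits, precision_sum = 0, 0.0
    for rank, label in enumerate(labels_by_rank, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else 0.0


def ndcg(gains_by_rank):
    """NDCG of a ranked list of graded relevances, using a log2 discount."""
    def dcg(gains):
        return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

    ideal = dcg(sorted(gains_by_rank, reverse=True))
    return dcg(gains_by_rank) / ideal if ideal > 0 else 0.0


def rank_by_score(scores, labels):
    """Order the labels by descending model score (illustrative helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]
```

Both values depend on the scores only through the ordering that `rank_by_score` produces, so they stay constant under small score changes and jump when two items swap places. That is one way to see why plain gradient-based optimization does not apply directly.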

ECCV 2018

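Conference dates: September 8-14

Location: Munich, Germany

The European Conference on Computer Vision (ECCV) is held every two years and is one of the three major computer vision conferences (the other two being ICCV and CVPR). Each edition accepts around 300 papers worldwide, mostly from leading laboratories and research institutes in the United States and Europe; papers from mainland China typically number between 10 and 20. The acceptance rate for ECCV 2010 was 27%.

This year's conference received 2,439 submissions and accepted 776 (31.8%), comprising 59 oral papers and 717 poster papers. ECCV 2018 also hosted 43 workshops and 11 tutorials.

Best Paper Award (one paper)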

《Implicit 3D Orientation Learning for 6D Object Detection from RGB Images》

Martin Sundermeyer, Zoltan-Csaba Marton, Maximilian Durner, Manuel Brucker, Rudolph Triebel

【Abstract】We propose a real-time RGB-based pipeline for object detection and 6D pose estimation. Our novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization.

This so-called Augmented Autoencoder has several advantages over existing methods: It does not require real, pose-annotated training data, generalizes to various test sensors and inherently handles object and view symmetries. Instead of learning an explicit mapping from input images to object poses, it provides an implicit representation of object orientations defined by samples in a latent space. Experiments on the T-LESS and LineMOD datasets show that our method outperforms similar model-based approaches and competes with state-of-the-art approaches that require real pose-annotated images.

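【Summary】This paper proposes a real-time, RGB-based pipeline for object detection and 6D pose estimation. Its novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization. This so-called Augmented Autoencoder (AAE) has several advantages over existing methods: it does not require real, pose-annotated training data, it generalizes to various test sensors, and it inherently handles object and view symmetries. Instead of learning an explicit mapping from input images to object poses, it provides an implicit representation of object orientations defined by samples in a latent space. Experiments on the T-LESS and LineMOD datasets show that the method outperforms similar model-based approaches and is competitive with state-of-the-art approaches that require real pose-annotated images.

Best Paper Award, Honorable Mention (two papers)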

《Group Normalization》

【Abstract】Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems: BN’s error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN’s usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN’s computation is independent of batch sizes, and its accuracy is stable in a wide range of batch sizes. On ResNet-50 trained in ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented by a few lines of code in modern libraries.

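【Summary】Batch Normalization (BN) is a milestone technique in the development of deep learning that enables a wide range of networks to train. However, normalizing along the batch dimension introduces problems: when the batch size becomes small, inaccurate estimation of the batch statistics causes BN's error to rise rapidly. This limits the use of BN for training larger models and for transferring features to computer vision tasks such as detection, segmentation, and video, where memory consumption restricts training to small batches. In this paper, the authors propose Group Normalization (GN) as an alternative to BN. GN divides the channels into groups and computes the mean and variance within each group for normalization. GN's computation is independent of batch size, and its accuracy is stable across a wide range of batch sizes. On ResNet-50 trained on ImageNet, GN has 10.6% lower error than BN at a batch size of 2. At typical batch sizes, GN is comparable to BN and outperforms other normalization variants. Moreover, GN transfers naturally from pre-training to fine-tuning. On object detection and segmentation in COCO and on video classification in Kinetics, GN matches or outperforms its BN-based counterparts, which shows that GN can effectively replace BN in a variety of tasks. In modern deep learning libraries, GN can be implemented in a few lines of code.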
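The grouping step described above can be sketched in plain Python for a single sample. This is an illustrative sketch rather than the authors' code: the function name and the list-of-lists layout are assumptions, and the learnable per-channel scale and shift that a full layer would apply are omitted.

```python
import math


def group_norm(x, num_groups, eps=1e-5):
    """Group-normalize one sample.

    x is a list of C channels, each a flat list of H*W activations.
    Statistics are computed per group of channels, never across samples.
    """
    num_channels = len(x)
    if num_channels % num_groups != 0:
        raise ValueError("num_groups must divide the number of channels")
    per_group = num_channels // num_groups

    out = []
    for g in range(num_groups):
        channels = x[g * per_group:(g + 1) * per_group]
        values = [v for channel in channels for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        # Normalize every activation in the group with the shared statistics.
        out.extend([[(v - mean) * inv_std for v in channel] for channel in channels])
    return out
```

Because the function only ever sees one sample, its output cannot depend on how many other samples share the batch, which is the property the abstract emphasizes.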

《GANimation: Anatomically-aware Facial Animation from a Single Image》

【Abstract】Recent advances in Generative Adversarial Networks (GANs) have shown impressive results for the task of facial expression synthesis. The most successful architecture is StarGAN [4], that conditions GANs’ generation process with images of a specific domain, namely a set of images of persons sharing the same expression. While effective, this approach can only generate a discrete number of expressions, determined by the content of the dataset. To address this limitation, in this paper, we introduce a novel GAN conditioning scheme based on Action Units (AU) annotations, which describes in a continuous manifold the anatomical facial movements defining a human expression. Our approach allows controlling the magnitude of activation of each AU and combine several of them. Additionally, we propose a fully unsupervised strategy to train the model, that only requires images annotated with their activated AUs, and exploit attention mechanisms that make our network robust to changing backgrounds and lighting conditions. Extensive evaluation shows that our approach goes beyond competing conditional generators both in the capability to synthesize a much wider range of expressions ruled by anatomically feasible muscle movements, as in the capacity of dealing with images in the wild.

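【Summary】Generative Adversarial Networks (GANs) have recently achieved impressive results on facial expression synthesis. The most successful architecture is StarGAN, which conditions the GAN generation process on images of a specific domain, namely a set of images of different people sharing the same expression. Although effective, this approach can only generate a discrete number of expressions, determined by the content of the training data. To address this limitation, this paper introduces a new GAN conditioning scheme based on Action Unit (AU) annotations, which describe, in a continuous manifold, the anatomical facial movements that define a human expression. The approach makes it possible to control the activation magnitude of each AU and to combine several of them. In addition, the paper proposes a fully unsupervised strategy for training the model that requires only images annotated with their activated AUs, and it uses attention mechanisms to make the network robust to changes in background and lighting. Extensive evaluation shows that the approach goes beyond competing conditional generators, both in its ability to synthesize a much wider range of expressions governed by anatomically feasible muscle movements and in its ability to handle images in the wild.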

IJCAI-ECAI-2018

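Conference dates: July 13-19

Location: Stockholm, Sweden

The International Joint Conference on Artificial Intelligence (IJCAI) is one of the premier academic conferences in artificial intelligence. It was originally held in odd-numbered years and has been held annually since 2015. Participation by Chinese researchers in IJCAI has grown steadily in recent years; notably, Professor Zhi-Hua Zhou of Nanjing University will serve as program chair of IJCAI-21, becoming the first program chair of Chinese descent in IJCAI history.

The European Conference on Artificial Intelligence (ECAI) is the main artificial intelligence and machine learning conference held in Europe. It began in 1974 and is organized by the European Coordinating Committee for Artificial Intelligence. ECAI is often grouped with IJCAI and AAAI as the three top conferences in AI.

This year, IJCAI and ECAI are held jointly in Stockholm, the capital of Sweden, from July 13 to 19. In addition, rather than presenting best paper or best student paper awards, IJCAI this year announced seven distinguished papers. Research from Peking University, Wuhan University, Tsinghua University, and Beijing Institute of Technology made the list.

Distinguished Papers: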

《SentiGAN: Generating Sentimental Texts via Mixture Adversarial Networks》

Ke Wang, Xiaojun Wan

【Abstract】Generating texts of different sentiment labels is getting more and more attention in the area of natural language generation. Recently, Generative Adversarial Net (GAN) has shown promising results in text generation. However, the texts generated by GAN usually suffer from the problems of poor quality, lack of diversity and mode collapse. In this paper, we propose a novel framework SentiGAN, which has multiple generators and one multi-class discriminator, to address the above problems. In our framework, multiple generators are trained simultaneously, aiming at generating texts of different sentiment labels without supervision. We propose a penalty based objective in the generators to force each of them to generate diversified examples of a specific sentiment label. Moreover, the use of multiple generators and one multi-class discriminator can make each generator focus on generating its own examples of a specific sentiment label accurately. Experimental results on four datasets demonstrate that our model consistently outperforms several state-of-the-art text generation methods in the sentiment accuracy and quality of generated texts.

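【Summary】Generating texts with different sentiment labels is receiving growing attention in natural language generation. Recently, Generative Adversarial Nets (GANs) have shown promising results in text generation. However, text generated by GANs usually suffers from poor quality, lack of diversity, and mode collapse. In this paper, we propose a new framework, SentiGAN, which has multiple generators and one multi-class discriminator, to address these problems. In our framework, multiple generators are trained simultaneously with the aim of generating texts of different sentiment labels without supervision. We propose a penalty-based objective for the generators that forces each of them to generate diverse examples of a specific sentiment label. Moreover, using multiple generators and one multi-class discriminator lets each generator focus on accurately generating examples of its own sentiment label. Experimental results on four datasets show that our model consistently outperforms several state-of-the-art text generation methods in sentiment accuracy and in the quality of the generated text.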

《Reasoning about Consensus when Opinions Diffuse through Majority Dynamics》

Vincenzo Auletta, Diodato Ferraioli, Gianluigi Greco

【Abstract】Opinion diffusion is studied on social graphs where agents hold binary opinions and where social pressure leads them to conform to the opinion manifested by the majority of their neighbors. Within this setting, questions related to whether a minority/majority can spread the opinion it supports to all the other agents are considered. It is shown that, no matter of the underlying graph, there is always a group formed by a half of the agents that can annihilate the opposite opinion. Instead, the influence power of minorities depends on certain features of the given graph, which are NP-hard to be identified. Deciding whether the two opinions can coexist in some stable configuration is NP-hard, too.

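【Summary】Opinion diffusion is studied on social graphs in which agents hold binary opinions and social pressure leads them to conform to the opinion expressed by the majority of their neighbors. In this setting, the paper considers whether a minority or a majority can spread the opinion it supports to all other agents. The results show that, whatever the underlying graph, there is always a group formed by half of the agents that can eliminate the opposite opinion. By contrast, the influence of minorities depends on certain features of the given graph, and identifying these features is NP-hard. Deciding whether the two opinions can coexist in some stable configuration is NP-hard as well.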
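The update rule described above can be simulated in a few lines. The sketch below makes assumptions the summary does not fix: updates are synchronous, an agent switches only on a strict majority of its neighbors, ties keep the current opinion, and the function names are illustrative. The paper's exact dynamics may differ.

```python
def majority_step(graph, opinions):
    """One synchronous round of majority dynamics.

    graph maps each agent to the set of its neighbors;
    opinions maps each agent to 0 or 1.
    """
    updated = {}
    for agent, neighbors in graph.items():
        ones = sum(opinions[n] for n in neighbors)
        zeros = len(neighbors) - ones
        if ones > zeros:
            updated[agent] = 1
        elif zeros > ones:
            updated[agent] = 0
        else:
            updated[agent] = opinions[agent]  # tie: keep the current opinion
    return updated


def run_dynamics(graph, opinions, max_rounds=100):
    """Iterate until a stable configuration or until max_rounds is reached."""
    for _ in range(max_rounds):
        updated = majority_step(graph, opinions)
        if updated == opinions:
            break
        opinions = updated
    return opinions


# Complete graph on four agents; three start with opinion 1.
k4 = {a: {b for b in range(4) if b != a} for a in range(4)}
final = run_dynamics(k4, {0: 1, 1: 1, 2: 1, 3: 0})
```

Synchronous updates can oscillate on some graphs instead of settling, which is why `run_dynamics` takes a round limit.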

《R-SVM+: Robust Learning with Privileged Information》

Xue Li, Bo Du, Chang Xu, Yipeng Zhang, Lefei Zhang, Dacheng Tao

【Abstract】In practice, the circumstance that training and test data are clean is not always satisfied. The performance of existing methods in the learning using privileged information (LUPI) paradigm may be seriously challenged, due to the lack of clear strategies to address potential noises in the data. This paper proposes a novel Robust SVM+ (R-SVM+) algorithm based on a rigorous theoretical analysis. Under the SVM+ framework in the LUPI paradigm, we study the lower bound of perturbations of both example feature data and privileged feature data, which will mislead the model to make wrong decisions. By maximizing the lower bound, tolerance of the learned model over perturbations will be increased. Accordingly, a novel regularization function is introduced to upgrade a variant form of SVM+. The objective function of R-SVM+ is transformed into a quadratic programming problem, which can be efficiently optimized using off-the-shelf solvers. Experiments on real-world datasets demonstrate the necessity of studying robust SVM+ and the effectiveness of the proposed algorithm.

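【Summary】In practice, training and test data are not always clean. Because clear strategies for handling potential noise in the data are lacking, the performance of existing methods in the learning using privileged information (LUPI) paradigm may be seriously challenged. Based on a rigorous theoretical analysis, this paper proposes a new Robust SVM+ (R-SVM+) algorithm. Under the SVM+ framework in the LUPI paradigm, we study the lower bound on the perturbations of both the example feature data and the privileged feature data that would mislead the model into making wrong decisions. Maximizing this lower bound increases the learned model's tolerance to perturbations. Accordingly, a new regularization function is introduced to upgrade a variant of SVM+. The objective function of R-SVM+ is transformed into a quadratic programming problem that can be optimized efficiently with off-the-shelf solvers. Experiments on real-world datasets demonstrate the need to study robust SVM+ and the effectiveness of the proposed algorithm.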

《From Conjunctive Queries to Instance Queries in Ontology-Mediated Querying》

Cristina Feier, Carsten Lutz, Frank Wolter

【Abstract】We consider ontology-mediated queries (OMQs) based on expressive description logics of the ALC family and (unions) of conjunctive queries, studying the rewritability into OMQs based on instance queries (IQs). Our results include exact characterizations of when such a rewriting is possible and tight complexity bounds for deciding rewritability. We also give a tight complexity bound for the related problem of deciding whether a given MMSNP sentence is equivalent to a CSP.

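【Summary】We consider ontology-mediated queries (OMQs) based on expressive description logics of the ALC family and (unions of) conjunctive queries, and study their rewritability into OMQs based on instance queries (IQs). Our results include exact characterizations of when such a rewriting is possible and tight complexity bounds for deciding rewritability. We also give a tight complexity bound for the related problem of deciding whether a given MMSNP sentence is equivalent to a CSP.

《What Game are We Playing? End-to-end Learning in Normal and Extensive Form Games》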

Chun Kai Ling, Fei Fang, J. Zico Kolter

【Abstract】Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents. This paper deals with the relatively under-explored but equally important “inverse” setting, where the parameters of the underlying game are not known to all agents, but must be learned through observations. We propose a differentiable, end-to-end learning framework for addressing this task. In particular, we consider a regularized version of the game, equivalent to a particular form of quantal response equilibrium, and develop 1) a primal-dual Newton method for finding such equilibrium points in both normal and extensive form games; and 2) a backpropagation method that lets us analytically compute gradients of all relevant game parameters through the solution itself. This ultimately lets us learn the game by training in an end-to-end fashion, effectively by integrating a “differentiable game solver” into the loop of larger deep network architectures. We demonstrate the effectiveness of the learning method in several settings including poker and security game tasks.

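【Summary】Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, most past work assumes that the parameters of the game itself are known to the agents. This paper addresses the relatively under-explored but equally important "inverse" setting, in which the parameters of the underlying game are not known to all agents and must be learned through observation. We propose a differentiable, end-to-end learning framework for this task. In particular, we consider a regularized version of the game, equivalent to a particular form of quantal response equilibrium, and develop 1) a primal-dual Newton method for finding such equilibrium points in both normal-form and extensive-form games, and 2) a backpropagation method that lets us analytically compute the gradients of all relevant game parameters through the solution itself. This ultimately lets us learn the game by training end-to-end, in effect by integrating a "differentiable game solver" into the loop of larger deep network architectures. We demonstrate the effectiveness of the learning method in several settings, including poker and security game tasks.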

《Commonsense Knowledge Aware Conversation Generation with Graph Attention》

Hao Zhou, Tom Young, Minlie Huang, Haizhou Zhao, Jingfang Xu, Xiaoyan Zhu

【Abstract】Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.

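【Summary】Commonsense knowledge is vital to many natural language processing tasks. This paper presents a new open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and encodes them with a static graph attention mechanism, which augments the semantic information of the post and so supports a better understanding of it. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph through a dynamic graph attention mechanism to support better generation. This is the first attempt to use large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the model generates more appropriate and informative responses than state-of-the-art baselines.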

《A Degeneracy Framework for Graph Similarity》

Giannis Nikolentzos, Polykarpos Meladianos, Stratis Limnios, Michalis Vazirgiannis

【Abstract】The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Most existing methods for graph similarity focus either on local or on global properties of graphs. However, even if graphs seem very similar from a local or a global perspective, they may exhibit different structure at different scales. In this paper, we present a general framework for graph similarity which takes into account structure at multiple different scales. The proposed framework capitalizes on the well-known k-core decomposition of graphs in order to build a hierarchy of nested subgraphs. We apply the framework to derive variants of four graph kernels, namely graphlet kernel, shortest-path kernel, Weisfeiler-Lehman subtree kernel, and pyramid match graph kernel. The framework is not limited to graph kernels, but can be applied to any graph comparison algorithm. The proposed framework is evaluated on several benchmark datasets for graph classification. In most cases, the core-based kernels achieve significant improvements in terms of classification accuracy over the base kernels, while their time complexity remains very attractive.

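【Summary】Accurately measuring the similarity between graphs is at the core of many applications across a variety of disciplines. Most existing methods for graph similarity focus either on local or on global properties of graphs. However, even when graphs look very similar from a local or a global perspective, they may exhibit different structure at different scales. This paper presents a general framework for graph similarity that takes structure at multiple scales into account. The framework uses the k-core decomposition of graphs to build a hierarchy of nested subgraphs. It is applied to derive variants of four graph kernels: the graphlet kernel, the shortest-path kernel, the Weisfeiler-Lehman subtree kernel, and the pyramid match graph kernel. The framework is not limited to graph kernels and can be applied to any graph comparison algorithm. It is evaluated on several benchmark datasets for graph classification. In most cases, the core-based kernels achieve significant improvements in classification accuracy over the base kernels, while their time complexity remains very attractive.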
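The nested hierarchy mentioned above comes from the standard k-core decomposition, which can be sketched as follows. This is a simple quadratic-time peeling version written for readability; the function names and the dict-of-sets graph format are assumptions, and the kernel computation on each core is not shown.

```python
def core_numbers(graph):
    """Core number of every vertex, found by peeling minimum-degree vertices.

    graph maps each vertex to the set of its neighbors (undirected).
    """
    degree = {v: len(neighbors) for v, neighbors in graph.items()}
    remaining = set(graph)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for u in graph[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def nested_cores(graph):
    """Vertex sets of the k-cores for k = 0..degeneracy; each contains the next."""
    core = core_numbers(graph)
    degeneracy = max(core.values(), default=0)
    return [{v for v, c in core.items() if c >= k} for k in range(degeneracy + 1)]


# A triangle a-b-c with a pendant vertex d attached to a.
example = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
cores = nested_cores(example)
```

A core-based kernel in the spirit of the framework would then compare two graphs core by core and combine the per-core kernel values; how they are combined is left to the paper.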

ICML 2018

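Conference dates: July 10-15

Location: Stockholm, Sweden

The International Conference on Machine Learning (ICML) has grown into a top annual international machine learning conference, organized by the International Machine Learning Society (IMLS).

Best Paper Awards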

《Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples》

Anish Athalye, Nicholas Carlini, David Wagner

【Abstract】We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining noncertified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.

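【Summary】We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. Although defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find that defenses relying on this effect can be circumvented. We describe the characteristic behaviors of defenses that exhibit the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study examining non-certified white-box-secure defenses at ICLR 2018, we find that obfuscated gradients are common: 7 of 9 defenses rely on them. In the original threat model each paper considers, our new attacks circumvent 6 of them completely and 1 partially.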

《Delayed Impact of Fair Machine Learning》

Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, Moritz Hardt

【Abstract】Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect.

We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not. We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably.

Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs.

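【Summary】Fairness in machine learning has mainly been studied in static classification settings, without regard to how decisions change the underlying population over time. Conventional wisdom holds that fairness criteria promote the long-term well-being of the groups they aim to protect.

We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We show that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not. We fully characterize the delayed impact of three standard criteria, contrasting the regimes in which they exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably.

Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, and they point to a range of new challenges and trade-offs.

Best Paper Runner-Up Awards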

《Near Optimal Frequent Directions for Sketching Dense and Sparse Matrices》

Zengfeng Huang

【Abstract】Given a large matrix $A \in \mathbb{R}^{n \times d}$, we consider the problem of computing a sketch matrix $B \in \mathbb{R}^{\ell \times d}$ which is significantly smaller than but still well approximates $A$. We are interested in minimizing the covariance error $\|A^T A - B^T B\|_2$. We consider the problems in the streaming model, where the algorithm can only make one pass over the input with limited working space. The popular Frequent Directions algorithm of (Liberty, 2013) and its variants achieve optimal space-error tradeoff. However, whether the running time can be improved remains an unanswered question. In this paper, we almost settle the time complexity of this problem. In particular, we provide new space-optimal algorithms with faster running times. Moreover, we also show that the running times of our algorithms are near-optimal unless the state-of-the-art running time of matrix multiplication can be improved significantly.

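【Summary】Given a large matrix $A \in \mathbb{R}^{n \times d}$, we consider the problem of computing a sketch matrix $B \in \mathbb{R}^{\ell \times d}$ that is significantly smaller than $A$ but still approximates it well. We want to minimize the covariance error $\|A^T A - B^T B\|_2$. We consider the problem in the streaming model, where the algorithm can make only one pass over the input with limited working space. The popular Frequent Directions algorithm (Liberty, 2013) and its variants achieve the optimal space-error tradeoff, but whether the running time can be improved remains an open question. In this paper, we almost settle the time complexity of this problem. In particular, we provide new space-optimal algorithms with faster running times. Moreover, we show that the running times of our algorithms are near-optimal unless the state-of-the-art running time of matrix multiplication can be improved significantly.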

《The Mechanics of n-Player Differentiable Games》

David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

【Abstract】The cornerstone underpinning deep learning is the guarantee that gradient descent on an objective converges to local minima. Unfortunately, this guarantee fails in settings, such as generative adversarial nets, where there are multiple interacting losses. The behavior of gradient-based methods in games is not well understood, and is becoming increasingly important as adversarial and multi-objective architectures proliferate. In this paper, we develop new techniques to understand and control the dynamics in general games. The key result is to decompose the second-order dynamics into two components. The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems. The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in general games. Basic experiments show SGA is competitive with recently proposed algorithms for finding stable fixed points in GANs, whilst at the same time being applicable to, and having guarantees in, much more general games.

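【Summary】The cornerstone of deep learning is the guarantee that gradient descent on an objective converges to local minima. Unfortunately, this guarantee fails in settings with multiple interacting losses, such as generative adversarial networks. The behavior of gradient-based methods in games is not well understood, and the question is becoming increasingly important as adversarial and multi-objective architectures proliferate. In this paper, we develop new techniques for understanding and controlling the dynamics in general games. The key result is a decomposition of the second-order dynamics into two components. The first relates to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law akin to the conservation laws of classical mechanical systems. The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in general games. Basic experiments show that SGA is competitive with recently proposed algorithms for finding stable fixed points in GANs, while also being applicable to, and having guarantees in, much more general games.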
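The effect of the adjustment can be illustrated on the smallest game where simultaneous gradient descent fails. The sketch below is a hand-derived toy, not the authors' implementation: the game (losses x*y and -x*y), the fixed positive coefficient `lam`, and the function name are assumptions, and the sign-selection rule a full SGA implementation would use is skipped.

```python
def simulate(lam, steps=200, lr=0.1, start=(1.0, 1.0)):
    """Run adjusted simultaneous gradient descent on a two-player toy game.

    Player 1 controls x and minimizes l1 = x * y; player 2 controls y and
    minimizes l2 = -x * y. The simultaneous gradient is xi = (y, -x).
    Its Jacobian J = [[0, 1], [-1, 0]] is purely antisymmetric, so
    A = (J - J^T) / 2 = J and A^T xi = (x, y) for this particular game.
    The update direction is xi + lam * A^T xi; lam = 0 is the plain method.
    """
    x, y = start
    for _ in range(steps):
        gx, gy = y, -x  # simultaneous gradient xi
        ax, ay = x, y   # A^T xi, derived by hand for this game
        x, y = x - lr * (gx + lam * ax), y - lr * (gy + lam * ay)
    return (x * x + y * y) ** 0.5  # distance from the fixed point (0, 0)


plain = simulate(lam=0.0)     # spirals away from the fixed point
adjusted = simulate(lam=1.0)  # spirals into the fixed point
```

This game is purely Hamiltonian in the paper's terminology, so the plain update only rotates and slowly expands, while the adjusted update adds a component that points toward the fixed point.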

《Fairness Without Demographics in Repeated Loss Minimization》

Tatsunori Hashimoto, Megha Srivastava, Hongseok Namkoong, Percy Liang

【Abstract】Machine learning models (e.g., speech recognizers) are usually trained to minimize average loss, which results in representation disparity: minority groups (e.g., non-native speakers) contribute less to the training objective and thus tend to suffer higher loss. Worse, as model accuracy affects user retention, a minority group can shrink over time. In this paper, we first show that the status quo of empirical risk minimization (ERM) amplifies representation disparity over time, which can even make initially fair models unfair. To mitigate this, we develop an approach based on distributionally robust optimization (DRO), which minimizes the worst case risk over all distributions close to the empirical distribution. We prove that this approach controls the risk of the minority group at each time step, in the spirit of Rawlsian distributive justice, while remaining oblivious to the identity of the groups. We demonstrate that DRO prevents disparity amplification on examples where ERM fails, and show improvements in minority group user satisfaction in a real-world text autocomplete task.

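【Summary】Machine learning models (such as speech recognizers) are usually trained to minimize average loss, which leads to representation disparity: minority groups (such as non-native speakers) contribute less to the training objective and therefore tend to suffer higher loss. Worse, because model accuracy affects user retention, a minority group can shrink over time. This paper first shows that the status quo of empirical risk minimization (ERM) amplifies representation disparity over time, which can make even initially fair models unfair. To mitigate this, we develop an approach based on distributionally robust optimization (DRO), which minimizes the worst-case risk over all distributions close to the empirical distribution. We prove that this approach controls the risk of the minority group at every time step, in the spirit of Rawlsian distributive justice, while remaining oblivious to the identity of the groups. We show that DRO prevents disparity amplification on examples where ERM fails, and that it improves minority-group user satisfaction on a real-world text autocomplete task.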
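To make the contrast between average loss and worst-case risk concrete, here is a minimal sketch. It assumes a dual surrogate commonly used for chi-square DRO, min over η of C·sqrt(mean(max(ℓ − η, 0)²)) + η. The constant `c`, the grid search over η, the search range, and the loss values are all hypothetical choices for illustration, not the paper's setup.

```python
import math

def erm_risk(losses):
    """Average loss, the quantity ERM minimizes."""
    return sum(losses) / len(losses)

def dro_risk(losses, c=2.0, grid=400):
    """Approximate a chi-square DRO surrogate by grid search over eta.

    Surrogate (assumed form): c * sqrt(mean(max(l - eta, 0)^2)) + eta.
    Examples with loss below eta get zero weight, so high-loss examples
    dominate regardless of which group they belong to.
    """
    lo, hi = min(losses), max(losses)
    start = lo - (hi - lo)  # hypothetical search range below the smallest loss
    best = float("inf")
    for i in range(grid + 1):
        eta = start + (hi - start) * i / grid
        excess_sq = sum(max(l - eta, 0.0) ** 2 for l in losses) / len(losses)
        best = min(best, c * math.sqrt(excess_sq) + eta)
    return best

# Two hypothetical models with the same average loss.
uneven = [0.2] * 9 + [2.0]   # one user is served badly
even = [0.38] * 10           # everyone is served equally

same_average = abs(erm_risk(uneven) - erm_risk(even)) < 1e-9
dro_prefers_even = dro_risk(even) < dro_risk(uneven)
```

ERM cannot tell the two models apart, while the robust surrogate penalizes the one that concentrates its loss on a single user, without any group labels.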

NIPS 2018

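Conference dates: December 3-8

Location: Montreal, Canada

The Conference and Workshop on Neural Information Processing Systems (NIPS) is an international conference on machine learning and computational neuroscience. It is held every December and is organized by the NIPS Foundation. NIPS is a top conference in machine learning. In the China Computer Federation ranking of international academic conferences, NIPS is a Class A conference in artificial intelligence.

Best Papers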

《Neural Ordinary Differential Equations》

Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud

【Abstract】We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.

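【Summary】This paper introduces a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state with a neural network. The output of the network is computed with a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can be trained by maximum likelihood without partitioning or ordering the data dimensions. For training, we show how to backpropagate scalably through any ODE solver without access to its internal operations, which allows end-to-end training of ODEs within larger models.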
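The forward pass described above, a hidden state whose derivative is given by a small network and whose output comes from an ODE solver, can be sketched in a few lines. This is a minimal illustration under assumptions: the one-unit tanh "network", its parameters, and the fixed-step RK4 solver are hypothetical stand-ins, and the paper's memory-efficient backward pass is not shown.

```python
import math

def dynamics(h, t, params):
    """dh/dt = f(h, t; theta): a tiny elementwise tanh layer (hypothetical)."""
    w1, b1, w2 = params
    return [w2 * math.tanh(w1 * hi + b1) for hi in h]

def odeint_rk4(f, h0, t0, t1, params, steps=100):
    """Integrate dh/dt = f(h, t, params) from t0 to t1 with classical RK4.

    The solver only calls f, so any dynamics function can be plugged in.
    Fewer steps means less compute and lower numerical precision.
    """
    h, t = list(h0), t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(h, t, params)
        k2 = f([a + 0.5 * dt * b for a, b in zip(h, k1)], t + 0.5 * dt, params)
        k3 = f([a + 0.5 * dt * b for a, b in zip(h, k2)], t + 0.5 * dt, params)
        k4 = f([a + dt * b for a, b in zip(h, k3)], t + dt, params)
        h = [a + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
             for a, p, q, r, s in zip(h, k1, k2, k3, k4)]
        t += dt
    return h

# The "output of the network" is the hidden state at the end time.
h1 = odeint_rk4(dynamics, [0.5, -0.5], 0.0, 1.0, (1.0, 0.0, -1.0))
```

Changing `steps` is the simplest form of the precision-for-speed trade mentioned in the abstract; adaptive solvers make that choice per input.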

《Non-delusional Q-learning and Value-iteration》

Tyler Lu, Dale Schuurmans, Craig Boutilier

【Abstract】We identify a fundamental source of error in Q-learning and other forms of dynamic programming with function approximation. Delusional bias arises when the approximation architecture limits the class of expressible greedy policies. Since standard Q-updates make globally uncoordinated action choices with respect to the expressible policy class, inconsistent or even conflicting Q-value estimates can result, leading to pathological behaviour such as over/under-estimation, instability and even divergence. To solve this problem, we introduce a new notion of policy consistency and define a local backup process that ensures global consistency through the use of information sets, sets that record constraints on policies consistent with backed-up Q-values. We prove that both the model-based and model-free algorithms using this backup remove delusional bias, yielding the first known algorithms that guarantee optimal results under general conditions. These algorithms furthermore only require polynomially many information sets (from a potentially exponential support). Finally, we suggest other practical heuristics for value-iteration and Q-learning that attempt to reduce delusional bias.

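【Summary】We identify a fundamental source of error in Q-learning and other forms of dynamic programming with function approximation. Delusional bias arises when the approximation architecture limits the class of expressible greedy policies. Because standard Q-updates make globally uncoordinated action choices with respect to the expressible policy class, they can produce inconsistent or even conflicting Q-value estimates, leading to pathological behavior such as over- or under-estimation, instability, and even divergence. To solve this problem, we introduce a new notion of policy consistency and define a local backup process that ensures global consistency through information sets, which record the constraints on policies consistent with the backed-up Q-values. We prove that both the model-based and the model-free algorithms using this backup remove delusional bias, yielding the first known algorithms that guarantee optimal results under general conditions. Moreover, these algorithms require only polynomially many information sets (out of a potentially exponential number). Finally, we suggest other practical heuristics for value iteration and Q-learning that attempt to reduce delusional bias.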

《Optimal Algorithms for Non-Smooth Distributed Optimization in Networks》

Kevin Scaman, Francis Bach, Sebastien Bubeck, Laurent Massoulié, Yin Tat Lee

【Abstract】In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in O(1/√t), the structure of the communication network only impacts a second-order term in O(1/t), where t is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a \(d^{1/4}\) multiplicative factor of the optimal convergence rate, where d is the underlying dimension.

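【Summary】We study the distributed optimization of non-smooth convex functions over a network of computing units. We investigate the problem under two regularity assumptions: (1) Lipschitz continuity of the global objective function, and (2) Lipschitz continuity of the local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm, multi-step primal-dual (MSPD), together with its optimal convergence rate. Notably, for non-smooth functions, the dominant term of the error is in O(1/√t), while the structure of the communication network affects only a second-order term in O(1/t), where t is time. In other words, the error caused by limited communication resources decreases quickly even for objective functions that are not strongly convex. Under the global regularity assumption, we provide a simple yet efficient algorithm, distributed randomized smoothing (DRS), based on a local smoothing of the objective function, and show that DRS is within a \(d^{1/4}\) multiplicative factor of the optimal convergence rate, where d is the underlying dimension.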
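The "local smoothing of the objective function" can be illustrated on a single node. The sketch below assumes Gaussian smoothing, f_γ(x) = E[f(x + γZ)] with Z standard normal, applied to the non-smooth function f(x) = |x|. The function names, γ, the sample count, and the seed are hypothetical, and the distributed and accelerated parts of DRS are not shown.

```python
import math
import random

def subgradient_abs(x):
    """A subgradient of f(x) = |x| (0 is a valid choice at the kink)."""
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

def smoothed_gradient(x, gamma=0.5, samples=2000, seed=0):
    """Monte Carlo gradient of the smoothed function E[|x + gamma * Z|].

    Averaging subgradients at randomly perturbed points gives a gradient
    of a smooth surrogate, even though |x| itself has a kink at 0.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += subgradient_abs(x + gamma * rng.gauss(0.0, 1.0))
    return total / samples

def exact_smoothed_gradient(x, gamma=0.5):
    """Closed form for this toy case: erf(x / (gamma * sqrt(2)))."""
    return math.erf(x / (gamma * math.sqrt(2.0)))

estimate = smoothed_gradient(0.5)
reference = exact_smoothed_gradient(0.5)
```

The smoothed gradient changes continuously from -1 to 1 around the kink, which is what lets fast methods for smooth functions be applied to the surrogate.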

《Nearly Tight Sample Complexity Bounds for Learning Mixtures of Gaussians via Sample Compression Schemes》

Hassan Ashtiani, Shai Ben-David, Nick Harvey, Christopher Liaw, Abbas Mehrabian, Yaniv Plan

【Abstract】We prove that \(\widetilde{\Theta}(kd^{2}/\varepsilon^{2})\) samples are necessary and sufficient for learning a mixture of k Gaussians in \(R^{d}\), up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that \(\widetilde{O}(kd/\varepsilon^{2})\) samples suffice, matching a known lower bound.

The upper bound is based on a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a sample compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in \(R^{d}\) has an efficient sample compression.

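【Summary】We prove that \(\widetilde{\Theta}(kd^{2}/\varepsilon^{2})\) samples are necessary and sufficient for learning a mixture of k Gaussians in \(R^{d}\) up to error ε in total variation distance. This improves both the known upper and lower bounds for the problem. For mixtures of axis-aligned Gaussians, we show that \(\widetilde{O}(kd/\varepsilon^{2})\) samples suffice, matching a known lower bound. The upper bound rests on a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that admits such a sample compression scheme can also be learned with few samples. The core of our main result is showing that the class of Gaussians in \(R^{d}\) has an efficient sample compression.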

AAAI 2018

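Conference dates: February 2-7

Location: New Orleans, USA

The American Association for Artificial Intelligence is one of the main academic organizations in the field of artificial intelligence. Its annual meeting (AAAI, The National Conference on Artificial Intelligence) is a major academic conference in the field.

This year's AAAI received 3,808 paper submissions, a 47% increase over last year, and accepted 938 of them.

Best Paper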

《Memory-Augmented Monte Carlo Tree Search》

Chenjun Xiao, Jincheng Mei and Martin Muller

【Abstract】This paper proposes and evaluates Memory-Augmented Monte Carlo Tree Search (M-MCTS), which provides a new approach to exploit generalization in online real-time search. The key idea of M-MCTS is to incorporate MCTS with a memory structure, where each entry contains information of a particular state. This memory is used to generate an approximate value estimation by combining the estimations of similar states. We show that the memory based value approximation is better than the vanilla Monte Carlo estimation with high probability under mild conditions. We evaluate M-MCTS in the game of Go. Experimental results show that M-MCTS outperforms the original MCTS with the same number of simulations.

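【Summary】This paper proposes and evaluates Memory-Augmented Monte Carlo Tree Search (M-MCTS), a new approach to exploiting generalization in online real-time search. The key idea of M-MCTS is to combine MCTS with a memory structure in which each entry holds information about a particular state. The memory is used to produce an approximate value estimate by combining the estimates of similar states. We show that, under mild conditions, the memory-based value approximation is better than the vanilla Monte Carlo estimate with high probability. We evaluate M-MCTS in the game of Go, where it outperforms the original MCTS with the same number of simulations.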
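The idea of combining the estimates of similar states can be sketched as a similarity-weighted average over memory entries. Everything specific here is an assumption made for illustration: the feature vectors, the cosine similarity, the softmax weighting with a temperature, and the function names are hypothetical and may differ from the paper's memory design.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def memory_value(query, memory, temperature=0.1):
    """Approximate a state's value from stored (features, value) entries.

    Entries more similar to the query get exponentially larger weight
    (a softmax over similarity), so nearby states dominate the estimate.
    """
    sims = [cosine_similarity(query, features) for features, _ in memory]
    top = max(sims)
    weights = [math.exp((s - top) / temperature) for s in sims]  # stable softmax
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, memory)) / total

# Hypothetical memory with two stored states and their value estimates.
memory = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
near_first = memory_value([1.0, 0.0], memory)   # close to the first entry's value
in_between = memory_value([1.0, 1.0], memory)   # equally similar to both entries
```

In a search, such an estimate would be blended with the node's own Monte Carlo statistics; how to do that blending is outside this sketch.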

Best Student Paper

《Counterfactual Multi-Agent Policy Gradients》

Jakob N. Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson

【Abstract】Many real-world problems, such as network packet routing and the coordination of autonomous vehicles, are naturally modelled as cooperative multi-agent systems. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents’ policies. In addition, to address the challenges of multi-agent credit assignment, it uses a counterfactual baseline that marginalises out a single agent’s action, while keeping the other agents’ actions fixed. COMA also uses a critic representation that allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA in the testbed of StarCraft unit micromanagement, using a decentralised variant with significant partial observability. COMA significantly improves average performance over other multi-agent actor-critic methods in this setting, and the best performing agents are competitive with state-of-the-art centralised controllers that get access to the full state.

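【Summary】Many real-world problems, such as network packet routing and the coordination of autonomous vehicles, are naturally modelled as cooperative multi-agent systems. Such problems call for new reinforcement learning methods that can efficiently learn decentralised policies. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies. To address multi-agent credit assignment, it also uses a counterfactual baseline that marginalises out a single agent's action while keeping the other agents' actions fixed. COMA's critic representation allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA on the StarCraft unit micromanagement testbed, using a decentralised variant with significant partial observability. In this setting COMA significantly improves average performance over other multi-agent actor-critic methods, and the best-performing agents are competitive with state-of-the-art centralised controllers that have access to the full state.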
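The counterfactual baseline can be written down directly: for one agent, average the critic's Q-values over that agent's own actions under its policy while the other agents' actions stay fixed, then subtract the result from the Q-value of the action actually taken. The sketch below assumes the critic has already produced one Q-value per candidate action of the agent; the function name and the numbers are hypothetical.

```python
def counterfactual_advantage(q_values, policy, taken_action):
    """Advantage of one agent's action against a counterfactual baseline.

    q_values[u]: critic estimate when this agent plays u and every other
                 agent keeps its actual action.
    policy[u]:   probability this agent's policy assigns to u.
    """
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[taken_action] - baseline

# Hypothetical critic outputs for an agent with three actions.
q_values = [1.0, 2.0, 4.0]
policy = [0.5, 0.25, 0.25]

advantage_good = counterfactual_advantage(q_values, policy, 2)  # 4.0 - 2.0
advantage_poor = counterfactual_advantage(q_values, policy, 0)  # 1.0 - 2.0
```

Because the baseline depends only on the other agents' actions and not on the agent's own choice, it isolates that agent's contribution to the shared reward. Having the critic output all of an agent's Q-values at once is what makes the single forward pass mentioned in the abstract possible.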

ACL 2018

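Conference dates: July 15-20

Location: Melbourne, Australia

The Annual Meeting of the Association for Computational Linguistics (ACL) is the association's yearly conference and the most important academic meeting in the field. The association was founded in 1962 as the Association for Machine Translation and Computational Linguistics (AMTCL) and was renamed ACL in 1968. Every summer, researchers from around the world gather to exchange theoretical developments and technical advances in natural language processing. In recent years, natural language processing has made great progress in many directions, including machine translation, language analysis, information extraction, question answering, and text summarization.

Both submissions and acceptances grew this year: the conference received 1,544 submissions and accepted 381, of which 256 were long papers (25.1% acceptance rate) and 125 were short papers (23.8% acceptance rate).

Best Papers: Long Papers (3)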

《Finding syntax in human encephalography with beam search》

John Hale, Chris Dyer, Adhiguna Kuncoro and Jonathan Brennan.

【Abstract】Recurrent neural network grammars (RNNGs) are generative models of (tree, string) pairs that rely on neural networks to evaluate derivational choices. Parsing with them using beam search yields a variety of incremental complexity metrics such as word surprisal and parser action count. When used as regressors against human electrophysiological responses to naturalistic text, they derive two amplitude effects: an early peak and a P600-like later peak. By contrast, a non-syntactic neural language model yields no reliable effects. Model comparisons attribute the early peak to syntactic composition within the RNNG. This pattern of results recommends the RNNG+beam search combination as a mechanistic model of the syntactic processing that occurs during normal human language comprehension.

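【Summary】Recurrent neural network grammars (RNNGs) are generative models of (tree, string) pairs that rely on neural networks to evaluate derivational choices. Parsing with them using beam search yields a variety of incremental complexity metrics, such as word surprisal and parser action count. When these metrics are used as regressors against human electrophysiological responses to naturalistic text, they yield two amplitude effects: an early peak and a later, P600-like peak. By contrast, a non-syntactic neural language model yields no reliable effects. Model comparisons attribute the early peak to syntactic composition within the RNNG. This pattern of results recommends the combination of RNNG and beam search as a mechanistic model of the syntactic processing that occurs during normal human language comprehension.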

《Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information》

Sudha Rao and Hal Daumé III.

【Abstract】Inquiry is fundamental to communication, and machines cannot effectively collaborate with humans unless they can ask questions. In this work, we build a neural network model for the task of ranking clarification questions. Our model is inspired by the idea of expected value of perfect information: a good question is one whose expected answer will be useful. We study this problem using data from StackExchange, a plentiful online resource in which people routinely ask clarifying questions to posts so that they can better offer assistance to the original poster. We create a dataset of clarification questions consisting of ∼77K posts paired with a clarification question (and answer) from three domains of StackExchange: askubuntu, unix and superuser. We evaluate our model on 500 samples of this dataset against expert human judgments and demonstrate significant improvements over controlled baselines.

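【Summary】Asking questions is fundamental to communication, and machines cannot collaborate effectively with humans unless they can ask questions. In this work, the authors build a neural network model for ranking clarification questions. The model is inspired by the idea of the expected value of perfect information: a good question is one whose expected answer will be useful. The authors study the problem using data from StackExchange, a rich online resource where people routinely ask clarifying questions on posts so that they can better help the original poster. They create a dataset of about 77K posts, each paired with a clarification question (and its answer), from three StackExchange domains: askubuntu, unix, and superuser. They evaluate the model on 500 samples of this dataset against expert human judgments and show significant improvements over controlled baselines.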
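The ranking principle, that a good question is one whose expected answer will be useful, amounts to an expected-utility calculation: for each candidate question, weight the usefulness of each possible answer by how likely that answer is. In the sketch below the probabilities and utilities are hard-coded hypothetical numbers; in the paper both quantities come from neural models, which are not reproduced here.

```python
def expected_utility(answer_probs, utilities):
    """Expected usefulness of asking a question.

    answer_probs[j]: probability of receiving answer j to this question.
    utilities[j]:    usefulness of the post once answer j is added to it.
    """
    return sum(p * u for p, u in zip(answer_probs, utilities))

def rank_questions(candidates):
    """Sort candidate questions from most to least useful in expectation."""
    scores = {q: expected_utility(p, u) for q, (p, u) in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical candidates for a post about a failing package install.
candidates = {
    "Which version of Ubuntu are you running?": ([0.5, 0.5], [0.9, 0.7]),
    "Why do you need this package?": ([0.9, 0.1], [0.1, 0.2]),
}
ranking = rank_questions(candidates)
```

The first question ranks higher because every likely answer to it makes the post easier to resolve, even though its individual answers are less predictable.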

《Let’s do it “again”: A First Computational Approach to Detecting Adverbial Presupposition Triggers》

Andre Cianflone, Yulan Feng, Jad Kabbara and Jackie Chi Kit Cheung.

【Abstract】We introduce the task of predicting adverbial presupposition triggers such as also and again. Solving such a task requires detecting recurring or similar events in the discourse context, and has applications in natural language generation tasks such as summarization and dialogue systems. We create two new datasets for the task, derived from the Penn Treebank and the Annotated English Gigaword corpora, as well as a novel attention mechanism tailored to this task. Our attention mechanism augments a baseline recurrent neural network without the need for additional trainable parameters, minimizing the added computational cost of our mechanism. We demonstrate that our model statistically outperforms a number of baselines, including an LSTM-based language model.

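【Summary】This paper introduces the task of predicting adverbial presupposition triggers such as also and again. Solving the task requires detecting recurring or similar events in the discourse context, and it has applications in natural language generation tasks such as summarization and dialogue systems. We create two new datasets for the task, derived from the Penn Treebank and the Annotated English Gigaword corpora, and design a novel attention mechanism tailored to it. The attention mechanism augments a baseline recurrent neural network without additional trainable parameters, which minimizes its added computational cost. Our model statistically outperforms a number of baselines, including an LSTM-based language model.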

Best Papers: Short Papers (2)

《Know What You Don’t Know: Unanswerable Questions for SQuAD》

Pranav Rajpurkar, Robin Jia, Percy Liang

【Abstract】Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuAD 2.0, the latest version of the Stanford Question Answering Dataset (SQuAD). SQuAD 2.0 combines existing SQuAD data with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD 2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD 2.0 is a challenging natural language understanding task for existing models: a strong neural system that gets 86% F1 on SQuAD 1.1 achieves only 66% F1 on SQuAD 2.0.

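【Summary】Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions whose correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuAD 2.0, the latest version of the Stanford Question Answering Dataset (SQuAD). SQuAD 2.0 combines the existing SQuAD data with more than 50,000 unanswerable questions that crowdworkers wrote adversarially to look similar to answerable ones. To do well on SQuAD 2.0, a system must not only answer questions when possible, but also determine when the paragraph supports no answer and abstain. SQuAD 2.0 is a challenging natural language understanding task for existing models: a strong neural system that reaches 86% F1 on SQuAD 1.1 achieves only 66% F1 on SQuAD 2.0.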

《‘Lighter’ Can Still Be Dark: Modeling Comparative Color Descriptions》

Olivia Winn, Smaranda Muresan

【Abstract】We propose a novel paradigm of grounding comparative adjectives within the realm of color descriptions. Given a reference RGB color and a comparative term (e.g., ‘lighter’, ‘darker’), our model learns to ground the comparative as a direction in the RGB space such that the colors along the vector, rooted at the reference color, satisfy the comparison. Our model generates grounded representations of comparative adjectives with an average accuracy of 0.65 cosine similarity to the desired direction of change. These vectors approach colors with Delta-E scores of under 7 compared to the target colors, indicating the differences are very small with respect to human perception. Our approach makes use of a newly created dataset for this task derived from existing labeled color data.

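【Summary】We propose a novel paradigm for grounding comparative adjectives in the domain of color descriptions. Given a reference RGB color and a comparative term (e.g., ‘lighter’, ‘darker’), our model learns to ground the comparative as a direction in RGB space, such that the colors along the vector rooted at the reference color satisfy the comparison. The model generates grounded representations of comparative adjectives with an average cosine similarity of 0.65 to the desired direction of change. These vectors approach colors with Delta-E scores below 7 relative to the target colors, indicating differences that are very small with respect to human perception. Our approach uses a dataset newly created for this task, derived from existing labeled color data.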
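The evaluation described above compares a predicted direction in RGB space with the direction from the reference color to the target color. A minimal sketch of that comparison follows; the colors, the "predicted" vector, and the function names are hypothetical, and the model that would produce the prediction is not shown.

```python
import math

def direction(reference_rgb, target_rgb):
    """Vector pointing from the reference color to the target color."""
    return [t - r for r, t in zip(reference_rgb, target_rgb)]

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def move_along(reference_rgb, step_direction, scale=1.0):
    """Color reached by moving from the reference along a direction, clipped to 0..255."""
    return [min(255.0, max(0.0, r + scale * d)) for r, d in zip(reference_rgb, step_direction)]

# Hypothetical example: "lighter" applied to a medium blue.
reference = [100.0, 100.0, 200.0]
target = [150.0, 150.0, 230.0]
predicted = [40.0, 45.0, 35.0]   # stand-in for a model's output direction

score = cosine_similarity(predicted, direction(reference, target))
lighter_color = move_along(reference, predicted)
```

A score near 1 means the predicted change points the same way as the desired change. How far to move along the direction is a separate question, which is where a perceptual distance such as Delta-E comes in.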

Edited on 2019-02-26 11:48
