Transformer is a machine-translation architecture based entirely on attention, proposed in Google's paper "Attention Is All You Need".
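Since the architecture is built entirely on attention, a minimal NumPy sketch of scaled dot-product attention, the operation at its core, may help make this concrete. The function name, shapes, and toy input below are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                         # 4 tokens, d_model = 8
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8): self-attention
```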

VIP Content

Abstract: In an ICCV 2021 Oral paper, researchers from Baidu's VIS team, Rutgers University, and other institutions cast neural painting as a set prediction problem and propose Paint Transformer, a novel Transformer-based framework that uses a feed-forward network to predict the parameters of a stroke set. In terms of results, the proposed model can generate a series of strokes in parallel and produce a 512×512 reconstructed painting in near real time.

More importantly, since no dataset is available for training Paint Transformer, the researchers devised a self-training pipeline that allows the model to be trained without any off-the-shelf dataset while still generalizing remarkably well. Experiments show that Paint Transformer achieves better performance than previous methods at lower training and inference cost.

The researchers treat neural painting as a progressive stroke prediction process: at each step, multiple strokes are predicted in parallel, minimizing the difference between the current canvas and the target image in a feed-forward fashion. Architecturally, Paint Transformer consists of two modules: a Stroke Predictor and a Stroke Renderer.
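As a rough illustration of that loop (not the authors' code), the sketch below wires together placeholder versions of the two modules: the predictor and renderer interfaces, the rectangle rasterizer, and the stroke parameterization (x, y, w, h, gray) are all assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def stroke_predictor(canvas, target, n_strokes=8):
    """Stand-in for the Stroke Predictor: a real one is a Transformer that
    looks at (canvas, target); here we emit random stroke parameters
    (x, y, w, h, gray) just to exercise the loop."""
    return rng.random((n_strokes, 5))

def stroke_renderer(strokes, canvas):
    """Stand-in for the Stroke Renderer: rasterize each stroke as a filled
    axis-aligned rectangle (the paper renders textured brush strokes)."""
    canvas = canvas.copy()
    h_img, w_img = canvas.shape
    for x, y, w, h, g in strokes:
        x0, y0 = int(x * w_img), int(y * h_img)
        x1 = min(w_img, x0 + max(1, int(w * w_img * 0.2)))
        y1 = min(h_img, y0 + max(1, int(h * h_img * 0.2)))
        canvas[y0:y1, x0:x1] = g
    return canvas

def paint(target, steps=4):
    """Progressive stroke prediction: at every step a whole set of strokes
    is predicted in parallel, then rendered onto the current canvas."""
    canvas = np.zeros_like(target)
    for _ in range(steps):
        strokes = stroke_predictor(canvas, target)
        canvas = stroke_renderer(strokes, canvas)
    return canvas

result = paint(rng.random((64, 64)))
print(result.shape)  # (64, 64)
```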

Latest Papers

Point cloud registration plays a critical role in a multitude of computer vision tasks, such as pose estimation and 3D localization. Recently, a plethora of deep learning methods were formulated that aim to tackle this problem. Most of these approaches find point or feature correspondences, from which the transformations are computed. We give a different perspective and frame the registration problem as a Markov Decision Process. Instead of directly searching for the transformation, the problem becomes one of finding a sequence of translation and rotation actions that is equivalent to this transformation. To this end, we propose an artificial agent trained end-to-end using deep supervised learning. In contrast to conventional reinforcement learning techniques, the observations are sampled i.i.d. and thus no experience replay buffer is required, resulting in a more streamlined training process. Experiments on ModelNet40 show results comparable or superior to the state of the art in the case of clean, noisy and partially visible datasets.
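As a toy illustration of the registration-as-MDP framing (not the paper's learned agent), the sketch below rolls out a sequence of small rotation and translation actions whose composition aligns the clouds, using a greedy Chamfer-distance policy as a stand-in for the trained network. The action set and step sizes are arbitrary assumptions.

```python
import numpy as np

def chamfer(a, b):
    """Symmetric nearest-neighbour distance between point clouds a and b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def rot_z(theta):
    """3x3 rotation about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Discrete action set: small rotations about z, small translations in x/y.
ACTIONS = [("rot", rot_z(a)) for a in (0.05, -0.05)] + [
    ("trans", np.array(t))
    for t in ([0.05, 0, 0], [-0.05, 0, 0], [0, 0.05, 0], [0, -0.05, 0])
]

def apply_action(points, action):
    kind, p = action
    return points @ p.T if kind == "rot" else points + p

def register(source, target, steps=50):
    """Roll out the MDP greedily: at each step take the action that most
    reduces the Chamfer distance; stop once no action improves it."""
    current = source
    for _ in range(steps):
        scored = [(chamfer(apply_action(current, a), target), a) for a in ACTIONS]
        best_d, best_a = min(scored, key=lambda s: s[0])
        if best_d >= chamfer(current, target):
            break
        current = apply_action(current, best_a)
    return current

rng = np.random.default_rng(0)
target = rng.normal(size=(64, 3))
source = target @ rot_z(0.3).T + np.array([0.2, -0.1, 0.0])  # misaligned copy
aligned = register(source, target)
print(f"{chamfer(source, target):.3f} -> {chamfer(aligned, target):.3f}")
```

In the paper the action choice comes from a network trained end-to-end with deep supervised learning rather than this greedy search; the rollout structure, a sequence of incremental transforms replacing one-shot regression, is the point of the sketch.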
