The Transformer is an architecture for machine translation proposed in Google's paper "Attention Is All You Need", built entirely on attention mechanisms, with no recurrence or convolution.
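A minimal NumPy sketch of the scaled dot-product attention at the core of that architecture (illustrative shapes only; variable names are not from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V, the core Transformer op."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # pairwise query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V                                 # attention-weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 query positions, d_k = 8
K = rng.standard_normal((6, 8))   # 6 key positions
V = rng.standard_normal((6, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per query
```

Every output position is a convex combination of the value vectors, which is what lets any position attend to any other in a single step.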


This work introduces the GANsformer, a novel and efficient Transformer variant applied to visual generative modeling. The network uses a bipartite structure that enables long-range interactions across the image while keeping computation linear in complexity, so it scales readily to high-resolution synthesis. It iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, letting each refine the other and encouraging the emergence of compositional representations of objects and scenes. In contrast to the classic Transformer architecture, it employs multiplicative integration, which allows flexible region-based modulation, and it can therefore be viewed as a generalization of the successful StyleGAN network. Careful evaluation over a range of datasets, from simulated multi-object environments to rich real-world indoor and outdoor scenes, demonstrates the model's strength and robustness: it achieves state-of-the-art results in image quality and diversity while enjoying fast learning and better data efficiency. Further qualitative and quantitative experiments give insight into the model's inner workings, revealing improved interpretability and stronger disentanglement, and illustrating the benefits and effectiveness of the approach.
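The bipartite latents-to-features propagation described above can be sketched as two alternating cross-attention passes. This is a simplified illustration, not the GANsformer's actual duplex attention (which adds multiplicative, StyleGAN-style modulation); all names and sizes here are assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(queries, keys_values):
    """One direction of the bipartite update: each query gathers
    information from keys_values via attention."""
    d = queries.shape[-1]
    w = softmax(queries @ keys_values.T / np.sqrt(d))
    return w @ keys_values

rng = np.random.default_rng(1)
latents = rng.standard_normal((16, 32))    # k latent variables
features = rng.standard_normal((64, 32))   # flattened image feature grid

# Iterative bipartite propagation: latents and features refine each other.
for _ in range(3):
    latents = latents + cross_attention(latents, features)    # latents <- image
    features = features + cross_attention(features, latents)  # image <- latents

print(latents.shape, features.shape)  # (16, 32) (64, 32)
```

Because attention only runs between the small latent set and the feature grid (never grid-to-grid), the cost per step is O(k·n), linear in the number of image positions for a fixed number of latents, which is the efficiency property the abstract refers to.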


Latest papers

End-to-end paradigms significantly improve the accuracy of various deep-learning-based computer vision models. To this end, tasks like object detection have been upgraded by replacing non-end-to-end components, such as removing non-maximum suppression by training with a set loss based on bipartite matching. However, such an upgrade is not applicable to instance segmentation, due to its significantly higher output dimensions compared to object detection. In this paper, we propose an instance segmentation Transformer, termed ISTR, which is the first end-to-end framework of its kind. ISTR predicts low-dimensional mask embeddings, and matches them with ground truth mask embeddings for the set loss. Besides, ISTR concurrently conducts detection and segmentation with a recurrent refinement strategy, which provides a new way to achieve instance segmentation compared to the existing top-down and bottom-up frameworks. Benefiting from the proposed end-to-end mechanism, ISTR demonstrates state-of-the-art performance even with approximation-based suboptimal embeddings. Specifically, ISTR obtains a 46.8/38.6 box/mask AP using ResNet50-FPN, and a 48.1/39.9 box/mask AP using ResNet101-FPN, on the MS COCO dataset. Quantitative and qualitative results reveal the promising potential of ISTR as a solid baseline for instance-level recognition. Code has been made available at: https://github.com/hujiecpp/ISTR.
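The bipartite-matching set loss the ISTR abstract mentions can be sketched with the Hungarian algorithm from SciPy. This is a minimal illustration of the matching idea under an assumed L2 cost on embeddings, not ISTR's actual loss:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def set_loss(pred_emb, gt_emb):
    """Bipartite-matching set loss: pair each predicted embedding with
    the ground-truth embedding minimizing total L2 cost (Hungarian
    algorithm), then average the matched distances."""
    # cost[i, j] = ||pred_i - gt_j||_2
    cost = np.linalg.norm(pred_emb[:, None, :] - gt_emb[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    matching = list(zip(rows.tolist(), cols.tolist()))
    return cost[rows, cols].mean(), matching

rng = np.random.default_rng(0)
gt = rng.standard_normal((3, 16))  # 3 objects, 16-dim mask embeddings
pred = gt[[2, 0, 1]] + 0.01 * rng.standard_normal((3, 16))  # shuffled, noisy predictions

loss, matching = set_loss(pred, gt)
print(matching)  # recovers the permutation: [(0, 2), (1, 0), (2, 1)]
```

Because the loss is invariant to prediction order, the model needs no hand-designed assignment step such as non-maximum suppression, which is what makes the pipeline end-to-end.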

