The attention mechanism was first proposed in the field of visual imaging; the basic idea dates back to the 1990s, but it only truly took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention on top of an RNN for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly in machine translation; their work is generally regarded as the first to apply attention to NLP. Similar attention-based RNN extensions then spread to a wide range of NLP tasks, and more recently, how to use attention within CNNs has become an active research topic. The figure below sketches the rough trajectory of attention research.

Title: A Transformer-based Embedding Model for Personalized Product Search

Abstract: Product search is an important way for people to browse and purchase items on e-commerce platforms. While customers tend to make choices based on their personal tastes and preferences, analysis of commercial product search logs has shown that personalization does not always improve product search quality. Nonetheless, most existing product search techniques apply personalization indiscriminately across search sessions: they either control the influence of personalization with a fixed coefficient, or use an attention mechanism that keeps personalization active at all times. The only notable exception is the recently proposed zero-attention model (ZAM), which lets the query attend to a zero vector and thereby adaptively adjusts the effect of personalization. Even so, in ZAM personalization can at most play a role as important as the query, and item representations are static across the whole collection regardless of which items co-occur in the user's purchase history. Given these limitations, we propose a Transformer-based embedding model for personalized product search (TEM), which dynamically controls the influence of personalization by encoding the sequence of the query and the user's purchase history with a Transformer architecture. Personalization can take a dominant role when necessary, and interactions between items can be taken into account when computing attention weights. Experimental results show that TEM significantly outperforms state-of-the-art personalized product retrieval models.
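The zero-attention idea referenced in the abstract can be illustrated with a minimal sketch: the query attends over the user's purchased items plus an extra zero vector, so when no history item matches the query, most attention mass falls on the zero entry and the personalization signal shrinks toward zero. This is an illustrative toy with dot-product scoring and made-up dimensions, not the paper's actual implementation; `zero_attention` and all numbers here are assumptions for demonstration.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-d score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def zero_attention(query, history_items):
    """Toy zero-attention: attend over purchased items plus a zero vector.

    With dot-product scoring, the zero vector always scores 0 inside the
    softmax. When every item score is far below 0, the attention mass
    concentrates on the zero entry and the returned personalization
    vector approaches zero -- personalization is adaptively switched off.
    (Illustrative sketch, not the ZAM/TEM implementation.)
    """
    scores = history_items @ query              # (n,) item relevance scores
    scores = np.append(scores, 0.0)             # score of the zero vector
    weights = softmax(scores)                   # (n+1,) attention weights
    items = np.vstack([history_items, np.zeros_like(query)])
    return weights @ items                      # personalized user embedding

# Toy example: 3 purchased items in a 4-d embedding space (random numbers)
rng = np.random.default_rng(0)
query = rng.normal(size=4)
history = rng.normal(size=(3, 4))
user_vec = zero_attention(query, history)
```

TEM goes further than this sketch: instead of a single attention step, it feeds the query and purchase history through a Transformer encoder, so item-item interactions also shape the attention weights.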

Latest Content

An integral function of fully autonomous robots and humans is the ability to focus attention on a few relevant percepts to reach a certain goal while disregarding irrelevant percepts. Humans and animals rely on the interactions between the Pre-Frontal Cortex (PFC) and the Basal Ganglia (BG) to achieve this focus, called Working Memory (WM). The Working Memory Toolkit (WMtk) was developed based on a computational neuroscience model of this phenomenon with Temporal Difference (TD) Learning for autonomous systems. Recent adaptations of the toolkit either utilize Abstract Task Representations (ATRs) to solve Non-Observable (NO) tasks or storage of past input features to solve Partially-Observable (PO) tasks, but not both. We propose a new model, PONOWMtk, which combines both approaches, ATRs and input storage, with a static or dynamic number of ATRs. The results of our experiments show that PONOWMtk performs effectively for tasks that exhibit PO, NO, or both properties.
