## Original | Attention Modeling for Targeted Sentiment

November 5, 2017 · Heilongjiang University Natural Language Processing Lab · Guangyao Zhao

## I. Overview

• How to model the relationship between a target entity and its context

• A comparison of three different attention models

• How to implement the paper in PyTorch

## II. Problem

Targeted sentiment analysis studies the problem of classifying the sentiment polarity expressed towards a target entity. From the target entity alone we cannot determine its polarity, so we must judge it from the target's context. The first problem to solve is therefore: how do we model the relationship between the target entity and its context?

*She began to love [miley ray cyrus] since 2013 :)*

In this example the bracketed span is the target entity; its positive polarity can only be inferred from the surrounding context ("love", the ":)"), not from the entity name itself.

## III. Attention Models

A BiLSTM runs over the sentence and produces a hidden vector $h_i$ for every word. The target representation $h_t$ is the average of the hidden vectors of the $m$ target words:

$$h_t = \frac{1}{m} \sum_{k=1}^{m} h_{t_k}$$

The left_context is simply the words to the left of the target, and the right_context the words to its right.
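As a concrete illustration, here is a minimal sketch of this averaging in PyTorch; the names `hidden`, `start`, and `end` are hypothetical:

```python
import torch

def target_representation(hidden, start, end):
    # hidden: BiLSTM output, shape (seq_len, hidden_size)
    # start, end: first and last token index of the target span
    return hidden[start:end + 1].mean(dim=0)  # h_t, shape (hidden_size,)
```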

### 1. Vanilla Attention Model (BILSTM-ATT)

The attention weight $\alpha_i$ of each word is a softmax over the scores $\beta_i$:

$$\alpha_i = \frac{\exp(\beta_i)}{\sum_{j=1}^{n} \exp(\beta_j)}$$

The score $\beta_i$ is determined jointly by the target and the context word. As the formula below shows, each word's hidden vector is concatenated with the target representation, passed through a linear layer with a tanh activation, and then multiplied by the vector $U$ to give a scalar:

$$\beta_i = U \tanh\left(W \left[h_i; h_t\right] + b\right)$$

The sentence representation $s = \sum_i \alpha_i h_i$ is then fed to a softmax layer for classification.
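A minimal PyTorch sketch of these two formulas; the parameter shapes (`W`: attention_size × 2·hidden_size, `b` and `u`: attention_size) are assumptions:

```python
import torch
import torch.nn.functional as F

def attention(h, h_t, W, b, u):
    # h: (n, d) context hidden vectors; h_t: (d,) target representation
    pairs = torch.cat([h, h_t.expand(h.size(0), -1)], dim=1)  # [h_i; h_t]
    beta = torch.tanh(pairs @ W.t() + b) @ u                  # one score per word
    alpha = F.softmax(beta, dim=0)                            # attention weights
    return alpha @ h                                          # s = sum_i alpha_i h_i
```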

### 2. Contextualized Attention Model (BILSTM-ATT-C)

The Contextualized Attention Model is derived from the Vanilla Model. It splits the context into two parts, left_context and right_context, and then applies the same operations as the Vanilla Attention Model to each part separately:

$$s^l = \sum_{i \in \text{left}} \alpha_i^l h_i, \qquad s^r = \sum_{i \in \text{right}} \alpha_i^r h_i$$

The two attended representations are then combined for the final softmax classification.
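A short sketch of this variant, reusing the `attention` function above; the split at the target span and the two output projections `linear_l` / `linear_r` are assumptions:

```python
# attend over each half of the sentence with the same target vector
# (whether the two sides share attention parameters is a design choice)
s_l = attention(h[:start], h_t, W_l, bias_l, u_l)    # left_context
s_r = attention(h[end + 1:], h_t, W_r, bias_r, u_r)  # right_context
# combine the two attended representations for the final prediction
y = F.softmax(linear_l(s_l) + linear_r(s_r), dim=0)
```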

## IV. Implementation

The Vanilla Model does not need to split the sentence into left_context and right_context, so it is easy to implement with batch_size > 1. The main layers are an embedding layer, a dropout layer, a bidirectional LSTM layer, and the output linear layer linear_2. The linear_1 layer and the parameter u exist to compute the attention scores β.

```python
import torch
import torch.nn as nn
from torch.nn import Parameter

class Vanilla(nn.Module):
    def __init__(self, embedding):
        super(Vanilla, self).__init__()
        # embed_num, embed_dim etc. are hyperparameters defined elsewhere
        self.embedding = nn.Embedding(embed_num, embed_dim)
        self.embedding.weight.data.copy_(embedding)
        self.dropout = nn.Dropout(dropout)
        # two directions of hidden_size // 2 concatenate to hidden_size
        self.bilstm = nn.LSTM(embed_dim, hidden_size // 2, bidirectional=True)
        # linear_1 and u score each word; the input is [h_i; h_t]
        self.linear_1 = nn.Linear(hidden_size * 2, attention_size, bias=True)
        self.u = Parameter(torch.randn(1, attention_size))
        self.linear_2 = nn.Linear(hidden_size, label_num, bias=True)

    def forward(self, words, target_start, target_end):
        pass
```
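The forward pass is elided above; here is a hedged sketch of what it might compute for a single sentence, following the formulas in Section III (the argument names and the absence of batching are assumptions):

```python
import torch
import torch.nn.functional as F

def forward(self, words, target_start, target_end):
    # embed the words and encode the sentence with the BiLSTM
    x = self.dropout(self.embedding(words))       # (n, embed_dim)
    h, _ = self.bilstm(x.unsqueeze(1))            # (n, 1, hidden_size)
    h = h.squeeze(1)
    # target representation: average over the target span
    h_t = h[target_start:target_end + 1].mean(dim=0)
    # beta_i = u . tanh(linear_1([h_i; h_t])), alpha = softmax(beta)
    pairs = torch.cat([h, h_t.expand(h.size(0), -1)], dim=1)
    beta = torch.tanh(self.linear_1(pairs)).matmul(self.u.t()).squeeze(1)
    s = F.softmax(beta, dim=0) @ h                # weighted sum of h_i
    return self.linear_2(s)                       # label scores
```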

The BILSTM-ATT-C and BILSTM-ATT-G models (the latter is the paper's gated variant) need to split the sentence into left_context and right_context. It is hard to design a good batch_size > 1 implementation for this, so batch_size = 1 is used instead. Both models are derived from the Vanilla Model and share code, so an Attention class was designed whose job is to compute s. The class is part of the computation graph: its inputs are a context and the target, and its output is s.

```python
class Attention(nn.Module):
    def __init__(self):
        super(Attention, self).__init__()
        # input is [context word; target], hence input_size * 2
        self.linear = nn.Linear(input_size * 2, output_size, bias=True)
        self.u = Parameter(torch.randn(output_size, 1))

    def forward(self, context, target):
        pass
```
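A hedged sketch of what this class's forward pass might do, mirroring the β/α computation above (the shapes are assumptions: context is (n, input_size), target is (input_size,)):

```python
import torch
import torch.nn.functional as F

def forward(self, context, target):
    # pair every context word with the target: [h_i; h_t]
    pairs = torch.cat([context, target.expand(context.size(0), -1)], dim=1)
    # one scalar score per word, then softmax to attention weights
    beta = torch.tanh(self.linear(pairs)).matmul(self.u).squeeze(1)
    alpha = F.softmax(beta, dim=0)
    # s: attention-weighted sum of the context vectors
    return alpha @ context
```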

The BILSTM-ATT-C model uses the Attention class once for each context. In the BILSTM-ATT-G model, the gating operation that weighs the full-sentence, left-context, and right-context representations is implemented with the softmax function:

```python
# stack the three gate score vectors side by side
z_all = torch.cat([z_all, z_l, z_r], 1)
# normalize across the three gates with a softmax
z_all = F.softmax(z_all, dim=1)
```
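How the normalized gates are then applied is not shown in this excerpt; a plausible use, going by the paper's description of gated combination (the names s_all, s_l, s_r for the three attended representations are assumptions):

```python
# element-wise convex combination of the three representations:
# the gates sum to 1 across the three columns for every dimension
g_all, g_l, g_r = z_all[:, 0], z_all[:, 1], z_all[:, 2]
s = g_all * s_all + g_l * s_l + g_r * s_r
```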

The complete code is available at https://github.com/vipzgy/AttentionTargetSentiment.

## V. Experiments and Conclusions

### 1. Data

The experiments use two datasets, referred to as the Z-Dataset and the T-Dataset.

### 2. Results

Results on the Z-Dataset:

Accuracy and macro-averaged F1 on the Z-Dataset (%)

F1 on the Z-Dataset (%)

Results on the T-Dataset:

Accuracy and macro-averaged F1 on the T-Dataset (%)

F1 on the T-Dataset (%)

## VI. References

Jiangming Liu and Yue Zhang. 2017. Attention modeling for targeted sentiment. In Proceedings of EACL 2017.

Meishan Zhang, Yue Zhang, and Duy-Tin Vo. 2015. Neural networks for open domain targeted sentiment. In Proceedings of EMNLP 2015.

Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. Hierarchical attention networks for document classification. In Proceedings of NAACL-HLT 2016.

Meishan Zhang, Yue Zhang, and Duy-Tin Vo. 2016. Gated neural networks for targeted sentiment analysis. In Proceedings of AAAI 2016, Phoenix, Arizona, USA.
