Sentiment analysis is a key component in various text mining applications. Numerous sentiment classification techniques, including conventional and deep learning-based methods, have been proposed in the literature. In most existing methods, a high-quality training set is assumed to be given. Nevertheless, constructing a high-quality training set that consists of highly accurate labels is challenging in real applications. This difficulty stems from the fact that text samples usually contain complex sentiment representations, and their annotation is subjective. We address this challenge in this study by leveraging a new labeling strategy and utilizing a two-level long short-term memory network to construct a sentiment classifier. Lexical cues are useful for sentiment analysis, and they have been utilized in conventional studies. For example, polar and privative words play important roles in sentiment analysis. A new encoding strategy, that is, $\rho$-hot encoding, is proposed to alleviate the drawbacks of one-hot encoding and thus effectively incorporate useful lexical cues. We compile three Chinese data sets on the basis of our label strategy and proposed methodology. Experiments on the three data sets demonstrate that the proposed method outperforms state-of-the-art algorithms.

### 更多

Sentiment analysis is a widely studied NLP task where the goal is to determine opinions, emotions, and evaluations of users towards a product, an entity or a service that they are reviewing. One of the biggest challenges for sentiment analysis is that it is highly language dependent. Word embeddings, sentiment lexicons, and even annotated data are language specific. Further, optimizing models for each language is very time consuming and labor intensive especially for recurrent neural network models. From a resource perspective, it is very challenging to collect data for different languages. In this paper, we look for an answer to the following research question: can a sentiment analysis model trained on a language be reused for sentiment analysis in other languages, Russian, Spanish, Turkish, and Dutch, where the data is more limited? Our goal is to build a single model in the language with the largest dataset available for the task, and reuse it for languages that have limited resources. For this purpose, we train a sentiment analysis model using recurrent neural networks with reviews in English. We then translate reviews in other languages and reuse this model to evaluate the sentiments. Experimental results show that our robust approach of single model trained on English reviews statistically significantly outperforms the baselines in several different languages.

We propose a novel approach to multimodal sentiment analysis using deep neural networks combining visual analysis and natural language processing. Our goal is different than the standard sentiment analysis goal of predicting whether a sentence expresses positive or negative sentiment; instead, we aim to infer the latent emotional state of the user. Thus, we focus on predicting the emotion word tags attached by users to their Tumblr posts, treating these as "self-reported emotions." We demonstrate that our multimodal model combining both text and image features outperforms separate models based solely on either images or text. Our model's results are interpretable, automatically yielding sensible word lists associated with emotions. We explore the structure of emotions implied by our model and compare it to what has been posited in the psychology literature, and validate our model on a set of images that have been used in psychology studies. Finally, our work also provides a useful tool for the growing academic study of images - both photographs and memes - on social networks.

Aspect based sentiment analysis (ABSA) can provide more detailed information than general sentiment analysis, because it aims to predict the sentiment polarities of the given aspects or entities in text. We summarize previous approaches into two subtasks: aspect-category sentiment analysis (ACSA) and aspect-term sentiment analysis (ATSA). Most previous approaches employ long short-term memory and attention mechanisms to predict the sentiment polarity of the concerned targets, which are often complicated and need more training time. We propose a model based on convolutional neural networks and gating mechanisms, which is more accurate and efficient. First, the novel Gated Tanh-ReLU Units can selectively output the sentiment features according to the given aspect or entity. The architecture is much simpler than attention layer used in the existing models. Second, the computations of our model could be easily parallelized during training, because convolutional layers do not have time dependency as in LSTM layers, and gating units also work independently. The experiments on SemEval datasets demonstrate the efficiency and effectiveness of our models.

Expressing in language is subjective. Everyone has a different style of reading and writing, apparently it all boil downs to the way their mind understands things (in a specific format). Language style transfer is a way to preserve the meaning of a text and change the way it is expressed. Progress in language style transfer is lagged behind other domains, such as computer vision, mainly because of the lack of parallel data, use cases, and reliable evaluation metrics. In response to the challenge of lacking parallel data, we explore learning style transfer from non-parallel data. We propose a model combining seq2seq, autoencoders, and adversarial loss to achieve this goal. The key idea behind the proposed models is to learn separate content representations and style representations using adversarial networks. Considering the problem of evaluating style transfer tasks, we frame the problem as sentiment transfer and evaluation using a sentiment classifier to calculate how many sentiments was the model able to transfer. We report our results on several kinds of models.

The complexities of Arabic language in morphology, orthography and dialects makes sentiment analysis for Arabic more challenging. Also, text feature extraction from short messages like tweets, in order to gauge the sentiment, makes this task even more difficult. In recent years, deep neural networks were often employed and showed very good results in sentiment classification and natural language processing applications. Word embedding, or word distributing approach, is a current and powerful tool to capture together the closest words from a contextual text. In this paper, we describe how we construct Word2Vec models from a large Arabic corpus obtained from ten newspapers in different Arab countries. By applying different machine learning algorithms and convolutional neural networks with different text feature selections, we report improved accuracy of sentiment classification (91%-95%) on our publicly available Arabic language health sentiment dataset [1]. Keywords - Arabic Sentiment Analysis, Machine Learning, Convolutional Neural Networks, Word Embedding, Word2Vec for Arabic, Lexicon.

Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with the success of deep learning in many other application domains, deep learning is also popularly used in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.

Sentiment Analysis (SA) is a major field of study in natural language processing, computational linguistics and information retrieval. Interest in SA has been constantly growing in both academia and industry over the recent years. Moreover, there is an increasing need for generating appropriate resources and datasets in particular for low resource languages including Persian. These datasets play an important role in designing and developing appropriate opinion mining platforms using supervised, semi-supervised or unsupervised methods. In this paper, we outline the entire process of developing a manually annotated sentiment corpus, SentiPers, which covers formal and informal written contemporary Persian. To the best of our knowledge, SentiPers is a unique sentiment corpus with such a rich annotation in three different levels including document-level, sentence-level, and entity/aspect-level for Persian. The corpus contains more than 26000 sentences of users opinions from digital product domain and benefits from special characteristics such as quantifying the positiveness or negativity of an opinion through assigning a number within a specific range to any given sentence. Furthermore, we present statistics on various components of our corpus as well as studying the inter-annotator agreement among the annotators. Finally, some of the challenges that we faced during the annotation process will be discussed as well.

In this work, we present our findings and experiments for stock-market prediction using various textual sentiment analysis tools, such as mood analysis and event extraction, as well as prediction models, such as LSTMs and specific convolutional architectures.

This project addresses the problem of sentiment analysis in twitter; that is classifying tweets according to the sentiment expressed in them: positive, negative or neutral. Twitter is an online micro-blogging and social-networking platform which allows users to write short status updates of maximum length 140 characters. It is a rapidly expanding service with over 200 million registered users - out of which 100 million are active users and half of them log on twitter on a daily basis - generating nearly 250 million tweets per day. Due to this large amount of usage we hope to achieve a reflection of public sentiment by analysing the sentiments expressed in the tweets. Analysing the public sentiment is important for many applications such as firms trying to find out the response of their products in the market, predicting political elections and predicting socioeconomic phenomena like stock exchange. The aim of this project is to develop a functional classifier for accurate and automatic sentiment classification of an unknown tweet stream.

Ethem F. Can,Aysu Ezen-Can,Fazli Can
11+阅读 · 2018年6月8日
Anthony Hu,Seth Flaxman
19+阅读 · 2018年5月25日
Wei Xue,Tao Li
12+阅读 · 2018年5月18日
Ayush Singh,Ritu Palod
4+阅读 · 2018年4月10日
Nicole Novielli,Daniela Girardi,Filippo Lanubile
3+阅读 · 2018年3月17日
Abdulaziz M. Alayba,Vasile Palade,Matthew England,Rahat Iqbal
3+阅读 · 2018年2月28日
Lei Zhang,Shuai Wang,Bing Liu
25+阅读 · 2018年1月24日
Pedram Hosseini,Ali Ahmadian Ramaki,Hassan Maleki,Mansoureh Anvari,Seyed Abolghasem Mirroshandel
5+阅读 · 2018年1月23日
Jordan Prosky,Xingyou Song,Andrew Tan,Michael Zhao
6+阅读 · 2018年1月18日
Afroze Ibrahim Baqapuri
3+阅读 · 2015年9月14日

31+阅读 · 2019年10月11日

44+阅读 · 2019年10月10日

47+阅读 · 2019年10月10日

8+阅读 · 2018年5月6日

7+阅读 · 2018年5月4日

54+阅读 · 2018年1月26日

10+阅读 · 2018年1月14日

16+阅读 · 2017年10月4日

34+阅读 · 2017年9月11日

24+阅读 · 2017年9月8日
Top