Machine Learning models become increasingly proficient in complex tasks. However, even for experts in the field, it can be difficult to understand what the model learned. This hampers trust and acceptance, and it obstructs the possibility to correct the model. There is therefore a need for transparency of machine learning models. The development of transparent classification models has received much attention, but there are few developments for achieving transparent Reinforcement Learning (RL) models. In this study we propose a method that enables a RL agent to explain its behavior in terms of the expected consequences of state transitions and outcomes. First, we define a translation of states and actions to a description that is easier to understand for human users. Second, we developed a procedure that enables the agent to obtain the consequences of a single action, as well as its entire policy. The method calculates contrasts between the consequences of a policy derived from a user query, and of the learned policy of the agent. Third, a format for generating explanations was constructed. A pilot survey study was conducted to explore preferences of users for different explanation properties. Results indicate that human users tend to favor explanations about policy rather than about single actions.
Multispectral imaging is an important technique for improving the readability of written or printed text where the letters have faded, either due to deliberate erasing or simply due to the ravages of time. Often the text can be read simply by looking at individual wavelengths, but in some cases the images need further enhancement to maximise the chances of reading the text. There are many possible enhancement techniques and this paper assesses and compares an extended set of dimensionality reduction methods for image processing. We assess 15 dimensionality reduction methods in two different manuscripts. This assessment was performed both subjectively by asking the opinions of scholars who were experts in the languages used in the manuscripts which of the techniques they preferred and also by using the Davies-Bouldin and Dunn indexes for assessing the quality of the resulted image clusters. We found that the Canonical Variates Analysis (CVA) method which was using a Matlab implementation and we have used previously to enhance multispectral images, it was indeed superior to all the other tested methods. However it is very likely that other approaches will be more suitable in specific circumstance so we would still recommend that a range of these techniques are tried. In particular, CVA is a supervised clustering technique so it requires considerably more user time and effort than a non-supervised technique such as the much more commonly used Principle Component Analysis Approach (PCA). If the results from PCA are adequate to allow a text to be read then the added effort required for CVA may not be justified. For the purposes of comparing the computational times and the image results, a CVA method is also implemented in C programming language and using the GNU (GNUs Not Unix) Scientific Library (GSL) and the OpenCV (OPEN source Computer Vision) computer vision programming library.
Sentiment analysis is a widely studied NLP task where the goal is to determine opinions, emotions, and evaluations of users towards a product, an entity or a service that they are reviewing. One of the biggest challenges for sentiment analysis is that it is highly language dependent. Word embeddings, sentiment lexicons, and even annotated data are language specific. Further, optimizing models for each language is very time consuming and labor intensive especially for recurrent neural network models. From a resource perspective, it is very challenging to collect data for different languages. In this paper, we look for an answer to the following research question: can a sentiment analysis model trained on a language be reused for sentiment analysis in other languages, Russian, Spanish, Turkish, and Dutch, where the data is more limited? Our goal is to build a single model in the language with the largest dataset available for the task, and reuse it for languages that have limited resources. For this purpose, we train a sentiment analysis model using recurrent neural networks with reviews in English. We then translate reviews in other languages and reuse this model to evaluate the sentiments. Experimental results show that our robust approach of single model trained on English reviews statistically significantly outperforms the baselines in several different languages.
We observe that end-to-end memory networks (MN) trained for task-oriented dialogue, such as for recommending restaurants to a user, suffer from an out-of-vocabulary (OOV) problem -- the entities returned by the Knowledge Base (KB) may not be seen by the network at training time, making it impossible for it to use them in dialogue. We propose a Hierarchical Pointer Memory Network (HyP-MN), in which the next word may be generated from the decode vocabulary or copied from a hierarchical memory maintaining KB results and previous utterances. Evaluating over the dialog bAbI tasks, we find that HyP-MN drastically outperforms MN obtaining 12% overall accuracy gains. Further analysis reveals that MN fails completely in recommending any relevant restaurant, whereas HyP-MN recommends the best next restaurant 80% of the time.
Policy gradient methods are widely used in reinforcement learning algorithms to search for better policies in the parameterized policy space. They do gradient search in the policy space and are known to converge very slowly. Nesterov developed an accelerated gradient search algorithm for convex optimization problems. This has been recently extended for non-convex and also stochastic optimization. We use Nesterov's acceleration for policy gradient search in the well-known actor-critic algorithm and show the convergence using ODE method. We tested this algorithm on a scheduling problem. Here an incoming job is scheduled into one of the four queues based on the queue lengths. We see from experimental results that algorithm using Nesterov's acceleration has significantly better performance compared to algorithm which do not use acceleration. To the best of our knowledge this is the first time Nesterov's acceleration has been used with actor-critic algorithm.