Feature importance is commonly used to explain machine predictions. While feature importance can be derived from a machine learning model with a variety of methods, the consistency of feature importance across different methods remains understudied. In this work, we systematically compare feature importance from built-in mechanisms in a model, such as attention values, and from post-hoc methods that approximate model behavior, such as LIME. Using text classification as a testbed, we find that 1) no matter which method we use, important features from traditional models such as SVM and XGBoost are more similar to each other than to those from deep learning models; 2) post-hoc methods tend to generate more similar important features for two models than built-in methods. We further demonstrate how such similarity varies across instances. Notably, important features do not always resemble each other more closely when two models agree on the predicted label than when they disagree.
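As a rough illustration of what comparing important features between two models can look like, the sketch below computes the overlap of the top-k features of two models on one instance. The Jaccard measure, the value of k, and the toy importance scores are all assumptions made for illustration; the abstract does not state which similarity measure is used.

```python
def top_k_features(importance, k):
    """Return the set of k features with the largest absolute importance."""
    ranked = sorted(importance, key=lambda f: abs(importance[f]), reverse=True)
    return set(ranked[:k])


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical per-token importance scores for one review (made-up numbers).
svm_importance = {"great": 0.9, "boring": -0.8, "plot": 0.1, "the": 0.01}
lstm_importance = {"great": 0.5, "boring": 0.1, "plot": 0.3, "the": 0.05}

similarity = jaccard_similarity(
    top_k_features(svm_importance, k=2),
    top_k_features(lstm_importance, k=2),
)
print(f"top-2 Jaccard similarity: {similarity:.3f}")
```

Averaging such a score over many instances and model pairs would give one way to compare built-in and post-hoc methods.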
In many machine learning applications, the most relevant items for a particular query should be efficiently extracted, while the relevance function is based on a highly nonlinear model, e.g., DNNs or GBDTs. Due to the high computational complexity of such models, exhaustive search is infeasible even for medium-scale problems. To address this issue, we introduce Relevance Proximity Graphs (RPG): an efficient non-exhaustive approach that provides a high-quality approximate solution for maximal relevance retrieval. Namely, we extend the recent similarity graphs framework to the setting where no similarity measure is defined on item pairs, which is a common practical use case. By design, our approach directly maximizes off-the-shelf relevance functions and does not require any proxy auxiliary models. Via extensive experiments, we show that the developed method provides excellent retrieval accuracy while requiring only a few model computations, outperforming indirect models. We open-source our implementation as well as two large-scale datasets to support further research on relevance retrieval.
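The general idea of non-exhaustive retrieval over a graph can be sketched as a greedy walk that repeatedly moves to the neighbor with the highest relevance score. This is a minimal sketch, not the RPG implementation: the hand-built graph, the toy relevance function, and the function name `greedy_graph_search` are assumptions, and the abstract does not describe how RPG constructs its graph.

```python
def greedy_graph_search(graph, relevance, entry):
    """Walk the graph from `entry`, always moving to the most relevant neighbor.

    Returns the best item found, its score, and how many items were scored.
    """
    current = entry
    current_score = relevance(current)
    evaluated = {entry: current_score}  # cache so each item is scored once
    while True:
        best, best_score = current, current_score
        for neighbor in graph[current]:
            if neighbor not in evaluated:
                evaluated[neighbor] = relevance(neighbor)
            if evaluated[neighbor] > best_score:
                best, best_score = neighbor, evaluated[neighbor]
        if best == current:  # local maximum: no neighbor is more relevant
            return current, current_score, len(evaluated)
        current, current_score = best, best_score


# Toy setup (hypothetical): 100 items, each linked to items 1 and 10 steps away.
n_items = 100
graph = {
    i: [j for j in (i - 10, i - 1, i + 1, i + 10) if 0 <= j < n_items]
    for i in range(n_items)
}


def toy_relevance(item):
    """Stand-in for an expensive model; the most relevant item is 73."""
    return -abs(item - 73)


best_item, best_score, n_evaluations = greedy_graph_search(graph, toy_relevance, entry=0)
print(best_item, best_score, n_evaluations)
```

On this toy graph the walk scores fewer items than an exhaustive scan would. A greedy walk can stop at a local maximum on less regular graphs, which is why graph quality matters in practice.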
One of the basic tasks of computational language documentation (CLD) is to identify word boundaries in an unsegmented phonemic stream. While several unsupervised monolingual word segmentation algorithms exist in the literature, they are challenged in real-world CLD settings by the small amount of available data. A possible remedy is to take advantage of glosses or translations in a foreign, well-resourced language, which often exist for such data. In this paper, we explore and compare ways to exploit neural machine translation models to perform unsupervised boundary detection with bilingual information, notably introducing a new loss function for jointly learning alignment and segmentation. We experiment with an actual under-resourced language, Mboshi, and show that these techniques can effectively control the output segmentation length.
Session-based recommender systems have attracted much attention recently. To capture sequential dependencies, previous sequential recommendation models resort either to data augmentation techniques or to a left-to-right autoregressive training approach. While effective, an obvious drawback is that future user behaviors are always missing during model training. In this paper, we argue that users' future action signals can be exploited to boost recommendation quality. We present GRec, a simple Gap-filling based encoder-decoder Recommendation framework for generative modelling using both past and future contexts. GRec encodes a partially complete item sequence with blank masks, and autoregressively reconstructs the missing item distributions. In contrast with the typical encoder-decoder paradigm used in the computer vision and NLP domains, GRec does not have the data leakage problem when jointly training the encoder and decoder conditioned on the same user action sequence. Experiments on real-world datasets with short-, medium- and long-range user sessions show that GRec largely exceeds the performance of its left-to-right counterparts. Empirical evidence confirms that training sequential recommendation models with future contexts is a promising way to offer better recommendation accuracy.
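The gap-filling idea of hiding some items in a session and asking the model to reconstruct them from both earlier and later items can be illustrated with a small data-preparation sketch. The mask token, the masking ratio, the random choice of positions, and the function name are assumptions for illustration only; the abstract does not specify how GRec selects or represents the blanks.

```python
import random

MASK = "__"  # hypothetical blank token


def make_gap_filling_example(session, mask_ratio=0.3, seed=0):
    """Blank out a fraction of a session and return (encoder_input, targets).

    `targets` maps each masked position to the original item at that position.
    """
    rng = random.Random(seed)
    n_mask = max(1, int(len(session) * mask_ratio))
    masked_positions = sorted(rng.sample(range(len(session)), n_mask))
    encoder_input = [
        MASK if i in masked_positions else item for i, item in enumerate(session)
    ]
    targets = {i: session[i] for i in masked_positions}
    return encoder_input, targets


# Hypothetical user session of item identifiers.
session = ["shoes", "socks", "laces", "polish", "insoles", "bag"]
encoder_input, targets = make_gap_filling_example(session)
print(encoder_input)
print(targets)
```

Unlike a left-to-right setup, each blank here is surrounded by observed items on both sides, which is the sense in which future context becomes available during training.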
Developing a recommendation system for fashion images is challenging due to the inherent ambiguity about which criterion a user has in mind. Suggesting multiple images, where each output image is similar to the query image on the basis of a different feature or part, is one way to mitigate the problem. Existing works on fashion recommendation have used Siamese or Triplet networks to learn features from a similar pair and a similar-dissimilar triplet, respectively. However, these methods do not provide basic information such as how two clothing images are similar, or which parts present in the two images make them similar. In this paper, we propose to recommend images by explicitly learning and exploiting part-based similarity. We propose a novel approach for learning discriminative features from weakly supervised data by using visual attention over the parts and a texture encoding network. We show that the learned features surpass the state of the art on the retrieval task on the DeepFashion dataset. We then use the proposed model to recommend fashion images having an explicit variation with respect to the similarity of any of the parts.
Knowledge graphs (KGs) have proven to be effective for high-quality recommendation. However, existing methods mainly investigate separate paths connecting user-item pairs from KGs, thus failing to fully capture the rich semantics and underlying topology of KGs. We therefore propose a novel attentive knowledge graph embedding (AKGE) framework to exploit the complex subgraphs of KGs linking user-item pairs to help better infer user preference. Specifically, AKGE first employs a distance-aware sampling strategy to automatically extract high-order subgraphs, which represent user-item relations with rich semantics. The subgraphs are then encoded by the proposed attentive graph neural network to help learn accurate user preference over items. Extensive validation shows that AKGE consistently outperforms state-of-the-art methods. It additionally provides potential explanations for recommendation results.
Online customer reviews on large-scale e-commerce websites represent a rich and varied source of opinion data, often providing subjective qualitative assessments of product usage that can help potential customers discover features that meet their personal needs and preferences. Thus they have the potential to automatically answer specific queries about products, and to address the problems of answer starvation and answer augmentation on associated consumer Q&A forums, by providing good answer alternatives. In this work, we explore several recently successful neural approaches to modeling sentence pairs that could better learn the relationship between questions and ground truth answers, and thus help infer reviews that can best answer a question or augment a given answer. In particular, we hypothesize that our neural domain adaptation-based approach, due to its ability to additionally learn domain-invariant features from a large number of unlabeled, unpaired question-review samples, would perform better than our proposed baselines at answering specific, subjective product-related queries using reviews. We validate this hypothesis using a small gold standard dataset of question-review pairs evaluated by human experts, significantly surpassing our chosen baselines. Moreover, our approach, using no labeled question-review sentence pair data for training, gives performance on par with another method utilizing labeled question-review samples for the same task.
Social media sites are becoming a key factor in politics. These platforms are easy to manipulate for the purpose of distorting the information space to confuse and distract voters. It is of paramount importance for social media platforms, users engaged in online political discussions, and government agencies to understand the dynamics on social media, and to identify malicious groups engaging in misinformation campaigns and thus polluting the general discourse around a topic of interest. Past works to identify such disruptive patterns are mostly focused on analyzing user-generated content such as tweets. In this study, we take a holistic approach and propose SGP to provide an informative bird's-eye view of all the activities on these social media sites around a broad topic and to detect coordinated groups suspected of engaging in misinformation campaigns. To show the effectiveness of SGP, we deploy it to provide a concise overview of polluting activity on Twitter around the upcoming 2019 Canadian Federal Elections, by analyzing over 60 thousand user accounts connected through 3.4 million connections and 1.3 million hashtags. Users in the polluting groups detected by SGP-flag are over 4x more likely to be suspended, while the majority of these highly suspicious users detected by SGP-flag escaped Twitter's suspension algorithm. Moreover, while few of the polluting hashtags detected are linked to misinformation campaigns, SGP-sig also flags others that have not been picked up on. More importantly, we also show that a large coordinated set of right-wing conservative groups based in the US are heavily engaged in Canadian politics.
The increasing availability of semantic data, which is commonly represented as entity-property-value triples, has enabled novel information retrieval applications. However, the magnitude of semantic data, in particular the large number of triples describing an entity, could overload users with excessive amounts of information. This has motivated fruitful research on automated generation of summaries for entity descriptions to satisfy users' information needs efficiently and effectively. We focus on this important topic of entity summarization, and present the first comprehensive survey of existing research. We review existing methods and evaluation efforts, and suggest directions for future work.