In information retrieval (IR) and related tasks, term weighting approaches typically consider the frequency of the term in the document and in the collection in order to compute a score reflecting the importance of the term for the document. In tasks characterized by the presence of training data (such as text classification) it seems logical that the term weighting function should take into account the distribution (as estimated from training data) of the term across the classes of interest. Although `supervised term weighting' approaches that use this intuition have been described before, they have failed to show consistent improvements. In this article we analyse the possible reasons for this failure, and call consolidated assumptions into question. Following this criticism we propose a novel supervised term weighting approach that, instead of relying on any predefined formula, learns a term weighting function optimised on the training set of interest; we dub this approach \emph{Learning to Weight} (LTW). The experiments that we run on several well-known benchmarks, and using different learning methods, show that our method outperforms previous term weighting approaches in text classification.

8
下载
关闭预览

相关内容

文本分类(Text Classification)任务是根据给定文档的内容或主题,自动分配预先定义的类别标签。

Natural Language Processing (NLP) and especially natural language text analysis have seen great advances in recent times. Usage of deep learning in text processing has revolutionized the techniques for text processing and achieved remarkable results. Different deep learning architectures like CNN, LSTM, and very recent Transformer have been used to achieve state of the art results variety on NLP tasks. In this work, we survey a host of deep learning architectures for text classification tasks. The work is specifically concerned with the classification of Hindi text. The research in the classification of morphologically rich and low resource Hindi language written in Devanagari script has been limited due to the absence of large labeled corpus. In this work, we used translated versions of English data-sets to evaluate models based on CNN, LSTM and Attention. Multilingual pre-trained sentence embeddings based on BERT and LASER are also compared to evaluate their effectiveness for the Hindi language. The paper also serves as a tutorial for popular text classification techniques.

0
4
下载
预览

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are further implemented for prediction. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.

0
16
下载
预览

Deep ConvNets have shown great performance for single-label image classification (e.g. ImageNet), but it is necessary to move beyond the single-label classification task because pictures of everyday life are inherently multi-label. Multi-label classification is a more difficult task than single-label classification because both the input images and output label spaces are more complex. Furthermore, collecting clean multi-label annotations is more difficult to scale-up than single-label annotations. To reduce the annotation cost, we propose to train a model with partial labels i.e. only some labels are known per image. We first empirically compare different labeling strategies to show the potential for using partial labels on multi-label datasets. Then to learn with partial labels, we introduce a new classification loss that exploits the proportion of known labels per example. Our approach allows the use of the same training settings as when learning with all the annotations. We further explore several curriculum learning based strategies to predict missing labels. Experiments are performed on three large-scale multi-label datasets: MS COCO, NUS-WIDE and Open Images.

0
5
下载
预览

Short text classification is one of important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous since they have not enough contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. For the purpose of measuring the importance of knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We utilize Concept towards Short Text (C- ST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of concepts from two aspects. And we classify a short text with the help of conceptual information. Unlike traditional approaches, our model acts like a human being who has intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.

0
7
下载
预览

Learning with limited data is a key challenge for visual recognition. Few-shot learning methods address this challenge by learning an instance embedding function from seen classes and apply the function to instances from unseen classes with limited labels. This style of transfer learning is task-agnostic: the embedding function is not learned optimally discriminative with respect to the unseen classes, where discerning among them is the target task. In this paper, we propose a novel approach to adapt the embedding model to the target classification task, yielding embeddings that are task-specific and are discriminative. To this end, we employ a type of self-attention mechanism called Transformer to transform the embeddings from task-agnostic to task-specific by focusing on relating instances from the test instances to the training instances in both seen and unseen classes. Our approach also extends to both transductive and generalized few-shot classification, two important settings that have essential use cases. We verify the effectiveness of our model on two standard benchmark few-shot classification datasets --- MiniImageNet and CUB, where our approach demonstrates state-of-the-art empirical performance.

0
8
下载
预览

Meta-learning has been proposed as a framework to address the challenging few-shot learning setting. The key idea is to leverage a large number of similar few-shot tasks in order to learn how to adapt a base-learner to a new task for which only a few labeled samples are available. As deep neural networks (DNNs) tend to overfit using a few samples only, meta-learning typically uses shallow neural networks (SNNs), thus limiting its effectiveness. In this paper we propose a novel few-shot learning method called meta-transfer learning (MTL) which learns to adapt a deep NN for few shot learning tasks. Specifically, "meta" refers to training multiple tasks, and "transfer" is achieved by learning scaling and shifting functions of DNN weights for each task. In addition, we introduce the hard task (HT) meta-batch scheme as an effective learning curriculum for MTL. We conduct experiments using (5-class, 1-shot) and (5-class, 5-shot) recognition tasks on two challenging few-shot learning benchmarks: miniImageNet and Fewshot-CIFAR100. Extensive comparisons to related works validate that our meta-transfer learning approach trained with the proposed HT meta-batch scheme achieves top performance. An ablation study also shows that both components contribute to fast convergence and high accuracy.

0
5
下载
预览

Convolutional neural networks (CNNs) have recently emerged as a popular building block for natural language processing (NLP). Despite their success, most existing CNN models employed in NLP share the same learned (and static) set of filters for all input sentences. In this paper, we consider an approach of using a small meta network to learn context-sensitive convolutional filters for text processing. The role of meta network is to abstract the contextual information of a sentence or document into a set of input-aware filters. We further generalize this framework to model sentence pairs, where a bidirectional filter generation mechanism is introduced to encapsulate co-dependent sentence representations. In our benchmarks on four different tasks, including ontology classification, sentiment analysis, answer sentence selection, and paraphrase identification, our proposed model, a modified CNN with context-sensitive filters, consistently outperforms the standard CNN and attention-based CNN baselines. By visualizing the learned context-sensitive filters, we further validate and rationalize the effectiveness of proposed framework.

0
5
下载
预览

Meta-learning is a powerful tool that builds on multi-task learning to learn how to quickly adapt a model to new tasks. In the context of reinforcement learning, meta-learning algorithms can acquire reinforcement learning procedures to solve new problems more efficiently by meta-learning prior tasks. The performance of meta-learning algorithms critically depends on the tasks available for meta-training: in the same way that supervised learning algorithms generalize best to test points drawn from the same distribution as the training points, meta-learning methods generalize best to tasks from the same distribution as the meta-training tasks. In effect, meta-reinforcement learning offloads the design burden from algorithm design to task design. If we can automate the process of task design as well, we can devise a meta-learning algorithm that is truly automated. In this work, we take a step in this direction, proposing a family of unsupervised meta-learning algorithms for reinforcement learning. We describe a general recipe for unsupervised meta-reinforcement learning, and describe an effective instantiation of this approach based on a recently proposed unsupervised exploration technique and model-agnostic meta-learning. We also discuss practical and conceptual considerations for developing unsupervised meta-learning methods. Our experimental results demonstrate that unsupervised meta-reinforcement learning effectively acquires accelerated reinforcement learning procedures without the need for manual task design, significantly exceeds the performance of learning from scratch, and even matches performance of meta-learning methods that use hand-specified task distributions.

0
6
下载
预览

Most existing sentiment analysis approaches heavily rely on a large amount of labeled data that usually involve time-consuming and error-prone manual annotations. The distribution of this labeled data is significantly imbalanced among languages, e.g., more English texts are labeled than texts in other languages, which presents a major challenge to cross-lingual sentiment analysis. There have been several cross-lingual representation learning techniques that transfer the knowledge learned from a language with abundant labeled examples to another language with much fewer labels. Their performance, however, is usually limited due to the imperfect quality of machine translation and the scarce signal that bridges two languages. In this paper, we employ emojis, a ubiquitous and emotional language, as a new bridge for sentiment analysis across languages. Specifically, we propose a semi-supervised representation learning approach through the task of emoji prediction to learn cross-lingual representations of text that can capture both semantic and sentiment information. The learned representations are then utilized to facilitate cross-lingual sentiment classification. We demonstrate the effectiveness and efficiency of our approach on a representative Amazon review data set that covers three languages and three domains.

0
6
下载
预览

Clustering and classification critically rely on distance metrics that provide meaningful comparisons between data points. We present mixed-integer optimization approaches to find optimal distance metrics that generalize the Mahalanobis metric extensively studied in the literature. Additionally, we generalize and improve upon leading methods by removing reliance on pre-designated "target neighbors," "triplets," and "similarity pairs." Another salient feature of our method is its ability to enable active learning by recommending precise regions to sample after an optimal metric is computed to improve classification performance. This targeted acquisition can significantly reduce computational burden by ensuring training data completeness, representativeness, and economy. We demonstrate classification and computational performance of the algorithms through several simple and intuitive examples, followed by results on real image and medical datasets.

0
8
下载
预览
小贴士
相关论文
Ramchandra Joshi,Purvi Goel,Raviraj Joshi
4+阅读 · 2020年1月19日
Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification
Jiawei Wu,Wenhan Xiong,William Yang Wang
16+阅读 · 2019年9月9日
Learning a Deep ConvNet for Multi-label Classification with Partial Labels
Thibaut Durand,Nazanin Mehrasa,Greg Mori
5+阅读 · 2019年2月26日
Jindong Chen,Yizhou Hu,Jingping Liu,Yanghua Xiao,Haiyun Jiang
7+阅读 · 2019年2月21日
Learning Embedding Adaptation for Few-Shot Learning
Han-Jia Ye,Hexiang Hu,De-Chuan Zhan,Fei Sha
8+阅读 · 2018年12月10日
Meta-Transfer Learning for Few-Shot Learning
Qianru Sun,Yaoyao Liu,Tat-Seng Chua,Bernt Schiele
5+阅读 · 2018年12月6日
Dinghan Shen,Martin Renqiang Min,Yitong Li,Lawrence Carin
5+阅读 · 2018年8月30日
Abhishek Gupta,Benjamin Eysenbach,Chelsea Finn,Sergey Levine
6+阅读 · 2018年6月12日
Zhenpeng Chen,Sheng Shen,Ziniu Hu,Xuan Lu,Qiaozhu Mei,Xuanzhe Liu
6+阅读 · 2018年6月7日
Krishnan Kumaran,Dimitri Papageorgiou,Yutong Chang,Minhan Li,Martin Takáč
8+阅读 · 2018年3月28日
相关资讯
强化学习三篇论文 避免遗忘等
CreateAMind
10+阅读 · 2019年5月24日
Hierarchically Structured Meta-learning
CreateAMind
9+阅读 · 2019年5月22日
Transferring Knowledge across Learning Processes
CreateAMind
6+阅读 · 2019年5月18日
深度自进化聚类:Deep Self-Evolution Clustering
我爱读PAMI
11+阅读 · 2019年4月13日
逆强化学习-学习人先验的动机
CreateAMind
5+阅读 · 2019年1月18日
强化学习的Unsupervised Meta-Learning
CreateAMind
6+阅读 · 2019年1月7日
Unsupervised Learning via Meta-Learning
CreateAMind
26+阅读 · 2019年1月3日
meta learning 17年:MAML SNAIL
CreateAMind
8+阅读 · 2019年1月2日
Hierarchical Imitation - Reinforcement Learning
CreateAMind
15+阅读 · 2018年5月25日
读书报告 | Deep Learning for Extreme Multi-label Text Classification
科技创新与创业
38+阅读 · 2018年1月10日
Top