Fine-tuning a deep network trained with the standard cross-entropy loss is a strong baseline for few-shot learning. When fine-tuned transductively, this outperforms the current state-of-the-art on standard datasets such as Mini-ImageNet, Tiered-ImageNet, CIFAR-FS and FC-100 with the same hyper-parameters. The simplicity of this approach enables us to demonstrate the first few-shot learning results on the ImageNet-21k dataset. We find that using a large number of meta-training classes results in high few-shot accuracies even for a large number of few-shot classes. We do not advocate our approach as the solution for few-shot learning, but simply use the results to highlight limitations of current benchmarks and few-shot protocols. We perform extensive studies on benchmark datasets to propose a metric that quantifies the "hardness" of a few-shot episode. This metric can be used to report the performance of few-shot algorithms in a more systematic way.

### 相关内容

Few-shot image classification aims to classify unseen classes with limited labeled samples. Recent works benefit from the meta-learning process with episodic tasks and can fast adapt to class from training to testing. Due to the limited number of samples for each task, the initial embedding network for meta learning becomes an essential component and can largely affects the performance in practice. To this end, many pre-trained methods have been proposed, and most of them are trained in supervised way with limited transfer ability for unseen classes. In this paper, we proposed to train a more generalized embedding network with self-supervised learning (SSL) which can provide slow and robust representation for downstream tasks by learning from the data itself. We evaluate our work by extensive comparisons with previous baseline methods on two few-shot classification datasets ({\em i.e.,} MiniImageNet and CUB). Based on the evaluation results, the proposed method achieves significantly better performance, i.e., improve 1-shot and 5-shot tasks by nearly \textbf{3\%} and \textbf{4\%} on MiniImageNet, by nearly \textbf{9\%} and \textbf{3\%} on CUB. Moreover, the proposed method can gain the improvement of (\textbf{15\%}, \textbf{13\%}) on MiniImageNet and (\textbf{15\%}, \textbf{8\%}) on CUB by pretraining using more unlabeled data. Our code will be available at \hyperref[https://github.com/phecy/SSL-FEW-SHOT.]{https://github.com/phecy/ssl-few-shot.}

Many meta-learning approaches for few-shot learning rely on simple base learners such as nearest-neighbor classifiers. However, even in the few-shot regime, discriminatively trained linear predictors can offer better generalization. We propose to use these predictors as base learners to learn representations for few-shot learning and show they offer better tradeoffs between feature size and performance across a range of few-shot recognition benchmarks. Our objective is to learn feature embeddings that generalize well under a linear classification rule for novel categories. To efficiently solve the objective, we exploit two properties of linear classifiers: implicit differentiation of the optimality conditions of the convex problem and the dual formulation of the optimization problem. This allows us to use high-dimensional embeddings with improved generalization at a modest increase in computational overhead. Our approach, named MetaOptNet, achieves state-of-the-art performance on miniImageNet, tieredImageNet, CIFAR-FS, and FC100 few-shot learning benchmarks. Our code is available at https://github.com/kjunelee/MetaOptNet.

In information retrieval (IR) and related tasks, term weighting approaches typically consider the frequency of the term in the document and in the collection in order to compute a score reflecting the importance of the term for the document. In tasks characterized by the presence of training data (such as text classification) it seems logical that the term weighting function should take into account the distribution (as estimated from training data) of the term across the classes of interest. Although supervised term weighting' approaches that use this intuition have been described before, they have failed to show consistent improvements. In this article we analyse the possible reasons for this failure, and call consolidated assumptions into question. Following this criticism we propose a novel supervised term weighting approach that, instead of relying on any predefined formula, learns a term weighting function optimised on the training set of interest; we dub this approach \emph{Learning to Weight} (LTW). The experiments that we run on several well-known benchmarks, and using different learning methods, show that our method outperforms previous term weighting approaches in text classification.

Few-shot Learning aims to learn classifiers for new classes with only a few training examples per class. Existing meta-learning or metric-learning based few-shot learning approaches are limited in handling diverse domains with various number of labels. The meta-learning approaches train a meta learner to predict weights of homogeneous-structured task-specific networks, requiring a uniform number of classes across tasks. The metric-learning approaches learn one task-invariant metric for all the tasks, and they fail if the tasks diverge. We propose to deal with these limitations with meta metric learning. Our meta metric learning approach consists of task-specific learners, that exploit metric learning to handle flexible labels, and a meta learner, that discovers good parameters and gradient decent to specify the metrics in task-specific learners. Thus the proposed model is able to handle unbalanced classes as well as to generate task-specific metrics. We test our approach in the $k$-shot $N$-way' few-shot learning setting used in previous work and new realistic few-shot setting with diverse multi-domain tasks and flexible label numbers. Experiments show that our approach attains superior performances in both settings.

Learning with limited data is a key challenge for visual recognition. Few-shot learning methods address this challenge by learning an instance embedding function from seen classes and apply the function to instances from unseen classes with limited labels. This style of transfer learning is task-agnostic: the embedding function is not learned optimally discriminative with respect to the unseen classes, where discerning among them is the target task. In this paper, we propose a novel approach to adapt the embedding model to the target classification task, yielding embeddings that are task-specific and are discriminative. To this end, we employ a type of self-attention mechanism called Transformer to transform the embeddings from task-agnostic to task-specific by focusing on relating instances from the test instances to the training instances in both seen and unseen classes. Our approach also extends to both transductive and generalized few-shot classification, two important settings that have essential use cases. We verify the effectiveness of our model on two standard benchmark few-shot classification datasets --- MiniImageNet and CUB, where our approach demonstrates state-of-the-art empirical performance.

Deep learning has made remarkable achievement in many fields. However, learning the parameters of neural networks usually demands a large amount of labeled data. The algorithms of deep learning, therefore, encounter difficulties when applied to supervised learning where only little data are available. This specific task is called few-shot learning. To address it, we propose a novel algorithm for few-shot learning using discrete geometry, in the sense that the samples in a class are modeled as a reduced simplex. The volume of the simplex is used for the measurement of class scatter. During testing, combined with the test sample and the points in the class, a new simplex is formed. Then the similarity between the test sample and the class can be quantized with the ratio of volumes of the new simplex to the original class simplex. Moreover, we present an approach to constructing simplices using local regions of feature maps yielded by convolutional neural networks. Experiments on Omniglot and miniImageNet verify the effectiveness of our simplex algorithm on few-shot learning.

The key issue of few-shot learning is learning to generalize. In this paper, we propose a large margin principle to improve the generalization capacity of metric based methods for few-shot learning. To realize it, we develop a unified framework to learn a more discriminative metric space by augmenting the softmax classification loss function with a large margin distance loss function for training. Extensive experiments on two state-of-the-art few-shot learning models, graph neural networks and prototypical networks, show that our method can improve the performance of existing models substantially with very little computational overhead, demonstrating the effectiveness of the large margin principle and the potential of our method.

Deep learning (DL) based semantic segmentation methods have been providing state-of-the-art performance in the last few years. More specifically, these techniques have been successfully applied to medical image classification, segmentation, and detection tasks. One deep learning technique, U-Net, has become one of the most popular for these applications. In this paper, we propose a Recurrent Convolutional Neural Network (RCNN) based on U-Net as well as a Recurrent Residual Convolutional Neural Network (RRCNN) based on U-Net models, which are named RU-Net and R2U-Net respectively. The proposed models utilize the power of U-Net, Residual Network, as well as RCNN. There are several advantages of these proposed architectures for segmentation tasks. First, a residual unit helps when training deep architecture. Second, feature accumulation with recurrent residual convolutional layers ensures better feature representation for segmentation tasks. Third, it allows us to design better U-Net architecture with same number of network parameters with better performance for medical image segmentation. The proposed models are tested on three benchmark datasets such as blood vessel segmentation in retina images, skin cancer segmentation, and lung lesion segmentation. The experimental results show superior performance on segmentation tasks compared to equivalent models including U-Net and residual U-Net (ResU-Net).

Da Chen,Yuefeng Chen,Yuhong Li,Feng Mao,Yuan He,Hui Xue
13+阅读 · 2019年11月14日
Kwonjoon Lee,Subhransu Maji,Avinash Ravichandran,Stefano Soatto
4+阅读 · 2019年4月23日
Alejandro Moreo Fernández,Andrea Esuli,Fabrizio Sebastiani
8+阅读 · 2019年3月28日
Yu Cheng,Mo Yu,Xiaoxiao Guo,Bowen Zhou
12+阅读 · 2019年1月26日
Han-Jia Ye,Hexiang Hu,De-Chuan Zhan,Fei Sha
11+阅读 · 2018年12月10日
Qianru Sun,Yaoyao Liu,Tat-Seng Chua,Bernt Schiele
6+阅读 · 2018年12月6日
Bowen Zhang,Xifan Zhang,Fan Cheng,Deli Zhao
4+阅读 · 2018年7月27日
Yong Wang,Xiao-Ming Wu,Qimai Li,Jiatao Gu,Wangmeng Xiang,Lei Zhang,Victor O. K. Li
9+阅读 · 2018年7月8日
Abhishek Gupta,Benjamin Eysenbach,Chelsea Finn,Sergey Levine
6+阅读 · 2018年6月12日
Md Zahangir Alom,Mahmudul Hasan,Chris Yakopcic,Tarek M. Taha,Vijayan K. Asari
12+阅读 · 2018年2月20日

CreateAMind
13+阅读 · 2019年5月22日
CreateAMind
6+阅读 · 2019年1月18日
CreateAMind
7+阅读 · 2019年1月7日
CreateAMind
22+阅读 · 2019年1月4日
CreateAMind
32+阅读 · 2019年1月3日
CreateAMind
9+阅读 · 2019年1月2日
CreateAMind
16+阅读 · 2018年5月25日
CreateAMind
4+阅读 · 2017年10月31日

9+阅读 · 2017年9月24日
Top