Deep neural networks are typically trained under a supervised learning framework where a model learns a single task using labeled data. Instead of relying solely on labeled data, practitioners can harness unlabeled or related data to improve model performance, which is often more accessible and ubiquitous. Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data. It involves first pre-training a model on a large amount of unlabeled data, then adapting the model to target tasks of interest. In this review, we survey self-supervised learning methods and their applications within the sequential transfer learning framework. We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains. Finally, we discuss recent trends and suggest areas for future investigation.
Highlight detection in sports videos has a broad viewership and huge commercial potential. It is thus imperative to detect highlight scenes more suitably for human interest with high temporal accuracy. Since people instinctively suppress blinks during attention-grabbing events and synchronously generate blinks at attention break points in videos, the instantaneous blink rate can be utilized as a highly accurate temporal indicator of human interest. Therefore, in this study, we propose a novel, automatic highlight detection method based on the blink rate. The method trains a one-dimensional convolution network (1D-CNN) to assess blink rates at each video frame from the spatio-temporal pose features of figure skating videos. Experiments show that the method successfully estimates the blink rate in 94% of the video clips and predicts the temporal change in the blink rate around a jump event with high accuracy. Moreover, the method detects not only the representative athletic action, but also the distinctive artistic expression of figure skating performance as key frames. This suggests that the blink-rate-based supervised learning approach enables high-accuracy highlight detection that more closely matches human sensibility.
Humans learn all their life long. They accumulate knowledge from a sequence of learning experiences and remember the essential concepts without forgetting what they have learned previously. Artificial neural networks struggle to learn similarly. They often rely on data rigorously preprocessed to learn solutions to specific problems such as classification or regression. In particular, they forget their past learning experiences if trained on new ones. Therefore, artificial neural networks are often inept to deal with real-life settings such as an autonomous-robot that has to learn on-line to adapt to new situations and overcome new problems without forgetting its past learning-experiences. Continual learning (CL) is a branch of machine learning addressing this type of problem. Continual algorithms are designed to accumulate and improve knowledge in a curriculum of learning-experiences without forgetting. In this thesis, we propose to explore continual algorithms with replay processes. Replay processes gather together rehearsal methods and generative replay methods. Generative Replay consists of regenerating past learning experiences with a generative model to remember them. Rehearsal consists of saving a core-set of samples from past learning experiences to rehearse them later. The replay processes make possible a compromise between optimizing the current learning objective and the past ones enabling learning without forgetting in sequences of tasks settings. We show that they are very promising methods for continual learning. Notably, they enable the re-evaluation of past data with new knowledge and the confrontation of data from different learning-experiences. We demonstrate their ability to learn continually through unsupervised learning, supervised learning and reinforcement learning tasks.
Robots need to be able to adapt to unexpected changes in the environment such that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if at all possible, making data-driven methods a promising alternative. In this paper we introduce a full framework for learning feedback models for reactive motion planning. Our pipeline starts by segmenting demonstrations of a complete task into motion primitives via a semi-automated segmentation algorithm. Then, given additional demonstrations of successful adaptation behaviors, we learn initial feedback models through learning from demonstrations. In the final phase, a sample-efficient reinforcement learning algorithm fine-tunes these feedback models for novel task settings through few real system interactions. We evaluate our approach on a real anthropomorphic robot in learning a tactile feedback task.
In power systems, an asset class is a group of power equipment that has the same function and shares similar electrical or mechanical characteristics. Predicting failures for different asset classes is critical for electric utilities towards developing cost-effective asset management strategies. Previously, physical age based Weibull distribution has been widely used to failure prediction. However, this mathematical model cannot incorporate asset condition data such as inspection or testing results. As a result, the prediction cannot be very specific and accurate for individual assets. To solve this important problem, this paper proposes a novel and comprehensive data-driven approach based on asset condition data: K-means clustering as an unsupervised learning method is used to analyze the inner structure of historical asset condition data and produce the asset conditional ages; logistic regression as a supervised learning method takes in both asset physical ages and conditional ages to classify and predict asset statuses. Furthermore, an index called average aging rate is defined to quantify, track and estimate the relationship between asset physical age and conditional age. This approach was applied to an urban distribution system in West Canada to predict medium-voltage cable failures. Case studies and comparison with standard Weibull distribution are provided. The proposed approach demonstrates superior performance and practicality for predicting asset class failures in power systems.
Anomaly detection algorithms are often thought to be limited because they don't facilitate the process of validating results performed by domain experts. In Contrast, deep learning algorithms for anomaly detection, such as autoencoders, point out the outliers, saving experts the time-consuming task of examining normal cases in order to find anomalies. Most outlier detection algorithms output a score for each instance in the database. The top-k most intense outliers are returned to the user for further inspection; however the manual validation of results becomes challenging without additional clues. An explanation of why an instance is anomalous enables the experts to focus their investigation on most important anomalies and may increase their trust in the algorithm. Recently, a game theory-based framework known as SHapley Additive exPlanations (SHAP) has been shown to be effective in explaining various supervised learning models. In this research, we extend SHAP to explain anomalies detected by an autoencoder, an unsupervised model. The proposed method extracts and visually depicts both the features that most contributed to the anomaly and those that offset it. A preliminary experimental study using real world data demonstrates the usefulness of the proposed method in assisting the domain experts to understand the anomaly and filtering out the uninteresting anomalies, aiming at minimizing the false positive rate of detected anomalies.
In this paper, we initiate the study of sample complexity of teaching, termed as "teaching dimension" (TDim) in the literature, for Q-learning. While the teaching dimension of supervised learning has been studied extensively, these results do not extend to reinforcement learning due to the temporal constraints posed by the underlying Markov Decision Process environment. We characterize the TDim of Q-learning under different teachers with varying control over the environment, and present matching optimal teaching algorithms. Our TDim results provide the minimum number of samples needed for reinforcement learning, thus complementing standard PAC-style RL sample complexity analysis. Our teaching algorithms have the potential to speed up RL agent learning in applications where a helpful teacher is available.
Data scientists seeking a good supervised learning model on a new dataset have many choices to make: they must preprocess the data, select features, possibly reduce the dimension, select an estimation algorithm, and choose hyperparameters for each of these pipeline components. With new pipeline components comes a combinatorial explosion in the number of choices! In this work, we design a new AutoML system to address this challenge: an automated system to design a supervised learning pipeline. Our system uses matrix and tensor factorization as surrogate models to model the combinatorial pipeline search space. Under these models, we develop greedy experiment design protocols to efficiently gather information about a new dataset. Experiments on large corpora of real-world classification problems demonstrate the effectiveness of our approach.