Learning reduced component representations of data using nonlinear transformations is a central problem in unsupervised learning with a rich history. Recently, a new family of algorithms based on maximum likelihood optimization with change of variables has demonstrated an impressive ability to model complex nonlinear data distributions. These algorithms learn to map from arbitrary random variables to independent components using invertible nonlinear function approximators. Despite the potential of this framework, the underlying optimization objective is ill-posed for a large class of variables, inhibiting accurate component estimates in many use cases. We present a new Tikhonov regularization technique for nonlinear independent component estimation that mediates the instability of the algorithm and facilitates robust component estimates. In addition, we provide a theoretically grounded procedure for feature extraction that produces PCA-like representations of nonlinear distributions using the learned model. We apply our technique to a handful of nonlinear data manifolds and show that the resulting representations possess important consistencies lacked by unregularized models.
Keyword extraction is used for summarizing the content of a document and supports efficient document retrieval, and is as such an indispensable part of modern text-based systems. We explore how load centrality, a graph-theoretic measure applied to graphs derived from a given text can be used to efficiently identify and rank keywords. Introducing meta vertices (aggregates of existing vertices) and systematic redundancy filters, the proposed method performs on par with state-of-the-art for the keyword extraction task on 14 diverse datasets. The proposed method is unsupervised, interpretable and can also be used for document visualization.
Given new pairs of source and target point sets, standard point set registration methods often repeatedly conduct the independent iterative search of desired geometric transformation to align the source point set with the target one. This limits their use in applications to handle the real-time point set registration with large volume dataset. This paper presents a novel method, named coherent point drift networks (CPD-Net), for the unsupervised learning of geometric transformation towards real-time non-rigid point set registration. In contrast to previous efforts (e.g. coherent point drift), CPD-Net can learn displacement field function to estimate geometric transformation from a training dataset, consequently, to predict the desired geometric transformation for the alignment of previously unseen pairs without any additional iterative optimization process. Furthermore, CPD-Net leverages the power of deep neural networks to fit an arbitrary function, that adaptively accommodates different levels of complexity of the desired geometric transformation. Particularly, CPD-Net is proved with a theoretical guarantee to learn a continuous displacement vector function that could further avoid imposing additional parametric smoothness constraint as in previous works. Our experiments verify the impressive performance of CPD-Net for non-rigid point set registration on various 2D/3D datasets, even in the presence of significant displacement noise, outliers, and missing points. Our code will be available at https://github.com/nyummvc/CPD-Net.
Experimental data is often affected by uncontrolled variables that make analysis and interpretation difficult. For spatiotemporal systems, this problem is further exacerbated by their intricate dynamics. Modern machine learning methods are well-suited for modeling complex datasets, but to be effective in science, the result needs to be interpretable. We demonstrate an unsupervised learning technique for extracting interpretable physical parameters from noisy spatiotemporal data and for building a transferable model of the system. In particular, we implement a physics-informed architecture based on variational autoencoders that is designed for analyzing systems governed by partial differential equations (PDEs). The architecture is trained end-to-end and extracts latent parameters that parameterize the dynamics of a learned predictive model for the system. To test our method, we train the architecture on simulated data from a variety of PDEs with varying dynamical parameters that act as uncontrolled variables. Specifically, we examine the Kuramoto-Sivashinsky equation with varying viscosity damping parameter, the nonlinear Schr\"odinger equation with varying nonlinearity coefficient, and the convection-diffusion equation with varying diffusion constant and drift velocity. Numerical experiments show that our method can accurately identify relevant parameters and extract them from raw and even noisy spatiotemporal data (tested with roughly 10% added noise). These extracted parameters correlate well (linearly with $R^2>0.95$) with the ground truth physical parameters used to generate the datasets. Our method for discovering interpretable latent parameters in spatiotemporal systems will allow us to better analyze and understand real-world phenomena and datasets, which often have uncontrolled variables that alter the system dynamics and cause varying behaviors that are difficult to disentangle.
Studies show that Deep Neural Network (DNN)-based image classification models are vulnerable to maliciously constructed adversarial examples. However, little effort has been made to investigate how DNN-based image retrieval models are affected by such attacks. In this paper, we introduce Unsupervised Adversarial Attacks with Generative Adversarial Networks (UAA-GAN) to attack deep feature-based image retrieval systems. UAA-GAN is an unsupervised learning model that requires only a small amount of unlabeled data for training. Once trained, it produces query-specific perturbations for query images to form adversarial queries. The core idea is to ensure that the attached perturbation is barely perceptible to human yet effective in pushing the query away from its original position in the deep feature space. UAA-GAN works with various application scenarios that are based on deep features, including image retrieval, person Re-ID and face search. Empirical results show that UAA-GAN cripples retrieval performance without significant visual changes in the query images. UAA-GAN generated adversarial examples are less distinguishable because they tend to incorporate subtle perturbations in textured or salient areas of the images, such as key body parts of human, dominant structural patterns/textures or edges, rather than in visually insignificant areas (e.g., background and sky). Such tendency indicates that the model indeed learned how to toy with both image retrieval systems and human eyes.
Previous literature on unsupervised learning focused on designing structural priors and optimization functions with the aim of learning meaningful features, but without considering the description length of the representations. Here we present Sparsely Activated Networks (SANs), which decompose their input as a sum of sparsely reoccurring patterns of varying amplitude, and combined with a newly proposed metric $\varphi$ they learn representations with minimal description lengths. SANs consist of kernels with shared weights that during encoding are convolved with the input and then passed through a ReLU and a sparse activation function. During decoding, the same weights are convolved with the sparse activation map and the individual reconstructions from each weight are summed to reconstruct the input. We also propose a metric $\varphi$ for model selection that favors models which combine high compression ratio and low reconstruction error and we justify its definition by exploring the hyperparameter space of SANs. We compare four sparse activation functions (Identity, Max-Activations, Max-Pool indices, Peaks) on a variety of datasets and show that SANs learn interpretable kernels that combined with $\varphi$, they minimize the description length of the representations.
An assumption-free automatic check of medical images for potentially overseen anomalies would be a valuable assistance for a radiologist. Deep learning and especially Variational Auto-Encoders (VAEs) have shown great potential in the unsupervised learning of data distributions. In principle, this allows for such a check and even the localization of parts in the image that are most suspicious. Currently, however, the reconstruction-based localization by design requires adjusting the model architecture to the specific problem looked at during evaluation. This contradicts the principle of building assumption-free models. We propose complementing the localization part with a term derived from the Kullback-Leibler (KL)-divergence. For validation, we perform a series of experiments on FashionMNIST as well as on a medical task including >1000 healthy and >250 brain tumor patients. Results show that the proposed formalism outperforms the state of the art VAE-based localization of anomalies across many hyperparameter settings and also shows a competitive max performance.
This paper introduces an incremental semantic mapping approach, with on-line unsupervised learning, based on Self-Organizing Maps (SOM) for robotic agents. The method includes a mapping module, which incrementally creates a topological map of the environment, enriched with objects recognized around each topological node, and a module of places categorization, endowed with an incremental unsupervised learning SOM with on-line training. The proposed approach was tested in experiments with real-world data, in which it demonstrates promising capabilities of incremental acquisition of topological maps enriched with semantic information, and for clustering together similar places based on this information. The approach was also able to continue learning from newly visited environments without degrading the information previously learned.
Human-centric applications such as virtual reality and immersive gaming will be central to the future wireless networks. Common features of such services include: a) their dependence on the human user's behavior and state, and b) their need for more network resources compared to conventional cellular applications. To successfully deploy such applications over wireless and cellular systems, the network must be made cognizant of not only the quality-of-service (QoS) needs of the applications, but also of the perceptions of the human users on this QoS. In this paper, by explicitly modeling the limitations of the human brain, a concrete measure for the delay perception of human users in a wireless network is introduced. Then, a novel learning method, called probability distribution identification, is proposed to find a probabilistic model for this delay perception based on the brain features of a human user. The proposed learning method uses both supervised and unsupervised learning techniques to build a Gaussian mixture model of the human brain features. Given a model for the delay perception of the human brain, a novel brain-aware resource management algorithm based on Lyapunov optimization is proposed for allocating radio resources to human users while minimizing the transmit power and taking into account the reliability of both machine type devices and human users. The proposed algorithm is shown to have a low complexity. Moreover, a closed-form relationship between the reliability measure and wireless physical layer metrics of the network is derived. Simulation results using real data from actual human users show that a brain-aware approach can yield savings of up to 78% in power compared to the system
Graph classification is practically important in many domains. To solve this problem, one usually calculates a low-dimensional representation for each node in the graph with supervised or unsupervised approaches. Most existing approaches consider all the edges between nodes while overlooking whether the edge will brings positive or negative influence to the node representation learning. In many real-world applications, however, some connections among the nodes can be noisy for graph convolution, and not all the edges deserve your attention. In this work, we distinguish the positive and negative impacts of the neighbors to the node in graph node classification, and propose to enhance the graph convolutional network by considering the labels between the neighbor edges. We present a novel GCN framework, called Label-aware Graph Convolutional Network (LAGCN), which incorporates the supervised and unsupervised learning by introducing the edge label predictor. As a general model, LAGCN can be easily adapted in various previous GCN and enhance their performance with some theoretical guarantees. Experimental results on multiple real-world datasets show that LAGCN is competitive against various state-of-the-art methods in graph classification.