How can we generate semantically meaningful and structurally sound adversarial examples? We propose to answer this question by restricting the search for adversaries to the true data manifold. To this end, we introduce a stochastic variational inference method to learn the data manifold, in the presence of continuous latent variables with intractable posterior distributions, without requiring an a priori form for the underlying data distribution. We then propose a manifold perturbation strategy that ensures the cases we perturb remain in the manifold of the original examples, and thereby generate the adversaries. We evaluate our approach on a number of image and text datasets. Our results show the effectiveness of our approach in producing coherent and realistic-looking adversaries that can evade strong defenses known to be resilient to traditional adversarial attacks.
In recent years, kernel-based sparse coding (K-SRC) has received particular attention due to its efficient representation of nonlinear data structures in the feature space. Nevertheless, the existing K-SRC methods suffer from a lack of consistency between their training and test optimization frameworks. In this work, we propose a novel confident K-SRC and dictionary learning algorithm (CKSC) which focuses on the discriminative reconstruction of the data based on its representation in the kernel space. CKSC focuses on reconstructing each data sample via weighted contributions which are confident in its corresponding class of data. We employ novel discriminative terms to apply this scheme to both the training and test frameworks in our algorithm. This specific design increases the consistency of these optimization frameworks and improves the discriminative performance in the recall phase. In addition, CKSC directly employs the supervised information in its dictionary learning framework to enhance the discriminative structure of the dictionary. For empirical evaluations, we implement our CKSC algorithm on multivariate time-series benchmarks such as DynTex++ and UTKinect. Our claims regarding the superior performance of the proposed algorithm are justified by comparing its classification results with those of the state-of-the-art K-SRC algorithms.
This paper presents a novel framework that jointly exploits a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) in the context of multi-label remote sensing (RS) image classification. The proposed framework consists of four main modules. The first module aims to extract preliminary local descriptors by considering that RS image bands can be associated with different spatial resolutions. To this end, we introduce a K-Branch CNN in which each branch aims at extracting descriptors of image bands that have the same spatial resolution. The second module aims to model the spatial relationships among local descriptors. To this end, we propose a Bidirectional RNN architecture in which Long Short-Term Memory nodes enrich local descriptors by considering spatial relationships of local areas (image patches). The third module aims to define multiple attention scores for local descriptors. To this end, we introduce a novel patch-based multi-attention mechanism that takes into account the joint occurrence of multiple land-cover classes and provides the attention-based local descriptors. The last module aims to employ these descriptors for multi-label RS image classification. Experimental results obtained on our large-scale Sentinel-2 benchmark archive (called BigEarthNet) show the effectiveness of the proposed framework compared to a state-of-the-art method.
Food image recognition is one of the promising applications of visual object recognition in computer vision. In this study, a small-scale dataset consisting of 5822 images in ten categories and a five-layer CNN were constructed to recognize these images. The bag-of-features (BoF) model coupled with a support vector machine (SVM) was first evaluated for image classification, resulting in an overall accuracy of 56%, while the CNN model performed much better with an overall accuracy of 74%. Data augmentation techniques based on geometric transformation were applied to increase the number of training images, which achieved a significantly improved accuracy of more than 90% while preventing the overfitting that occurred in the CNN trained on the raw training data. Further improvements can be expected by collecting more images and optimizing the network architecture and hyper-parameters.
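As a rough illustration of augmentation by geometric transformation, the following Python sketch produces flipped and rotated copies of an image. The abstract does not say which transformations were used, so the flips and 90-degree rotations here are assumptions, as are the names `flip_horizontal`, `flip_vertical`, `rotate90_clockwise`, and `augment`; images are modeled as nested lists of pixel values purely for simplicity.

```python
from typing import List

# Hypothetical image type: a grayscale image as a list of pixel rows.
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate90_clockwise(img: Image) -> Image:
    """Rotate by 90 degrees clockwise: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img: Image) -> List[Image]:
    """Return the original image plus two flips and three rotations."""
    variants = [img, flip_horizontal(img), flip_vertical(img)]
    rotated = img
    for _ in range(3):  # 90, 180, and 270 degrees
        rotated = rotate90_clockwise(rotated)
        variants.append(rotated)
    return variants


if __name__ == "__main__":
    sample = [[1, 2], [3, 4]]
    for variant in augment(sample):
        print(variant)
```

Each training image yields six variants under this scheme. Some variants can coincide for symmetric images, so a real pipeline might deduplicate them or add random crops and shifts instead.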
Bayesian Decision Trees are known for their probabilistic interpretability. However, their construction can sometimes be costly. In this article, we present a general Bayesian Decision Tree algorithm applicable to both regression and classification problems. The algorithm does not apply Markov Chain Monte Carlo and does not require a pruning step. While it is possible to construct a weighted probability tree space, we find that one particular tree, the greedy-modal tree (GMT), explains most of the information contained in the numerical examples. This approach seems to perform similarly to Random Forests.
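To make the idea of a greedy Bayesian split concrete, here is a minimal Python sketch for binary classification on a single feature. It is not the algorithm from the article. It assumes a Beta(1, 1) prior on each leaf's class probability, scores a candidate split by the sum of the log marginal likelihoods of the two resulting leaves, and ignores any prior over partitions. The names `leaf_log_evidence` and `best_split` are hypothetical.

```python
from math import lgamma
from typing import List, Optional, Tuple


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def leaf_log_evidence(labels: List[int], a: float = 1.0, b: float = 1.0) -> float:
    """Log marginal likelihood of 0/1 labels under a Beta(a, b) prior (assumed)."""
    k = sum(labels)   # number of positive labels
    n = len(labels)
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def best_split(xs: List[float], ys: List[int]) -> Tuple[Optional[float], float]:
    """Greedily pick the threshold with the highest evidence.

    Returns (threshold, log evidence). A threshold of None means that
    leaving the node unsplit has the highest evidence.
    """
    best_thr: Optional[float] = None
    best_score = leaf_log_evidence(ys)
    for thr in sorted(set(xs))[:-1]:  # the largest value cannot split the data
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        score = leaf_log_evidence(left) + leaf_log_evidence(right)
        if score > best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score


if __name__ == "__main__":
    print(best_split([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
    print(best_split([1, 2, 3, 4], [1, 1, 1, 1]))
```

In this sketch, a node with identical labels is left unsplit because splitting lowers the evidence. This gives one intuition for how a Bayesian construction can do without a separate pruning step.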
In this paper, we propose a novel Deep Micro-Dictionary Learning and Coding Network (DDLCN). DDLCN has most of the standard deep learning layers (pooling, fully connected, input/output, etc.), but the main difference is that the fundamental convolutional layers are replaced by novel compound dictionary learning and coding layers. The dictionary learning layer learns an over-complete dictionary for the input training data. At the deep coding layer, a locality constraint is added to guarantee that the activated dictionary bases are close to each other. Next, the activated dictionary atoms are assembled together and passed to the next compound dictionary learning and coding layers. In this way, the activated atoms in the first layer can be represented by the deeper atoms in the second dictionary. Intuitively, the second dictionary is designed to learn the fine-grained components which are shared among the input dictionary atoms. In this way, a more informative and discriminative low-level representation of the dictionary atoms can be obtained. We empirically compare the proposed DDLCN with several dictionary learning methods and deep learning architectures. The experimental results on four popular benchmark datasets demonstrate that the proposed DDLCN achieves competitive results compared with state-of-the-art approaches.
UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.
In the last decade, traditional dictionary learning methods have been successfully applied to various pattern classification tasks. Although these methods produce sparse representations of signals which are robust against distortions and missing data, such representations quite often turn out to be unsuitable if the final objective is signal classification. In order to overcome or at least to attenuate such a weakness, several new methods which incorporate discriminative information into sparsity-inducing models have emerged in recent years. In particular, methods for discriminative dictionary learning have been shown to be more accurate (in terms of signal classification) than the traditional ones, which focus only on minimizing the total representation error. In this work, we present both a novel multi-class discriminative measure and an innovative dictionary learning method. For a given dictionary, the new measure efficiently quantifies the degree of discriminability of each atom. It takes into account not only when a particular atom is used for representing signals coming from a certain class and the magnitude of its corresponding representation coefficient, but also the effect that such an atom has on the total representation error. The new dictionary construction method, in turn, yields dictionaries which are highly suitable for multi-class classification tasks. Our method was tested on a widely used database for handwritten digit recognition and compared with three state-of-the-art classification methods. The results show that our method significantly outperforms the other three, achieving good recognition rates and, additionally, reducing the computational cost of the classifier.
A discriminative structured analysis dictionary is proposed for the classification task. A union of subspaces (UoS) structure is integrated into conventional analysis dictionary learning to enhance the capability of discrimination. A simple classifier is also simultaneously included in the formulated functional to ensure a more consistent classification. The solution of the algorithm is efficiently obtained by the linearized alternating direction method of multipliers. Moreover, a distributed structured analysis dictionary learning method is also presented to address large-scale datasets. It can train the structured analysis dictionaries independently for each group (class) on different machines, cores, or threads, and therefore avoids a high computational cost. A consensus structured analysis dictionary and a global classifier are jointly learned in the distributed approach to safeguard the discriminative power and the efficiency of classification. Experiments demonstrate that our method achieves comparable or better performance than the state-of-the-art algorithms in a variety of visual classification tasks. In addition, the training and testing computational complexities are also greatly reduced.
In the era of big data, reducing data dimensionality is critical in many areas of science. The widely used Principal Component Analysis (PCA) addresses this problem by computing a low-dimensional data embedding that maximally explains the variance of the data. However, PCA has two major weaknesses. Firstly, it only considers linear correlations among variables (features), and secondly, it is not suitable for categorical data. We resolve these issues by proposing Maximally Correlated Principal Component Analysis (MCPCA). MCPCA computes transformations of variables whose covariance matrix has the largest Ky Fan norm. The variable transformations are unknown, can be nonlinear, and are computed in an optimization. MCPCA can also be viewed as a multivariate extension of Maximal Correlation. For jointly Gaussian variables, we show that the covariance matrix corresponding to the identity (or the negative of the identity) transformations majorizes covariance matrices of non-identity functions. Using this result, we characterize global MCPCA optimizers for nonlinear functions of jointly Gaussian variables for every rank constraint. For categorical variables, we characterize global MCPCA optimizers for the rank-one constraint based on the leading eigenvector of a matrix computed using pairwise joint distributions. For a general rank constraint, we propose a block coordinate descent algorithm and show its convergence to stationary points of the MCPCA optimization. We compare MCPCA with PCA and other state-of-the-art dimensionality reduction methods including Isomap, LLE, multilayer autoencoders (neural networks), kernel PCA, probabilistic PCA and diffusion maps on several synthetic and real datasets. We show that MCPCA consistently provides improved performance compared to other methods.
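The Ky Fan k-norm of a matrix is the sum of its k largest singular values. For a symmetric positive semidefinite matrix such as a covariance matrix, this equals the sum of its k largest eigenvalues. The Python sketch below computes this quantity with power iteration and deflation. It only illustrates the objective value and is not the MCPCA optimization itself. The function names, the fixed starting vector, and the iteration count are assumptions, and the method can fail if the starting vector happens to be orthogonal to a leading eigenvector.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def mat_vec(a: Matrix, v: List[float]) -> List[float]:
    """Multiply matrix a by vector v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def leading_eigenpair(a: Matrix, iters: int = 500) -> Tuple[float, List[float]]:
    """Power iteration for a symmetric positive semidefinite matrix."""
    n = len(a)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-uniform start
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iters):
        w = mat_vec(a, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm < 1e-12:  # the (deflated) matrix is numerically zero along v
            return 0.0, v
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    lam = sum(x * y for x, y in zip(v, mat_vec(a, v)))
    return lam, v


def ky_fan_norm(cov: Matrix, k: int) -> float:
    """Sum of the k largest eigenvalues of a symmetric PSD matrix."""
    a = [row[:] for row in cov]  # work on a copy
    n = len(a)
    total = 0.0
    for _ in range(k):
        lam, v = leading_eigenpair(a)
        total += lam
        # Deflation: remove the component that was just found.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return total


if __name__ == "__main__":
    cov = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues are 3 and 1
    print(ky_fan_norm(cov, 1), ky_fan_norm(cov, 2))
```

With k equal to the matrix dimension, the result is the trace. Ordinary PCA with q components captures variance equal to the Ky Fan q-norm of the covariance matrix of the untransformed variables.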