This work addresses a new problem of learning generative adversarial networks (GANs) from multiple data collections that are each i) owned separately and privately by different clients and ii) drawn from a non-identical distribution that comprises different classes. Given such multi-client and non-iid data as input, we aim to learn a distribution covering all the classes the input data can belong to, while keeping the data decentralized and private in each client's storage. Our key contribution to this end is a new decentralized approach for learning GANs from non-iid data called Forgiver-First Update (F2U), which a) asks clients to train an individual discriminator with their own data and b) updates a generator to fool the most 'forgiving' discriminators, i.e., those that deem generated samples the most real. Our theoretical analysis proves that this updating strategy indeed allows the decentralized GAN to learn a generator distribution covering all the input classes as its global optimum, based on f-divergence minimization. Moreover, we propose a relaxed version of F2U called Forgiver-First Aggregation (F2A), which adaptively aggregates the discriminators while emphasizing forgiving ones, and performs well in practice. Our empirical evaluations on image generation tasks demonstrated the effectiveness of our approach over state-of-the-art decentralized learning methods.
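As a concrete illustration, here is a minimal PyTorch sketch of the F2U generator step under assumed interfaces (a `generator` mapping noise to samples and a list of per-client `discriminators` returning realness logits); the per-sample max picks the most forgiving discriminator:

```python
# Minimal sketch of the F2U generator update (illustrative names).
import torch
import torch.nn.functional as nnf

def f2u_generator_step(generator, discriminators, opt_g, batch_size, z_dim):
    z = torch.randn(batch_size, z_dim)
    fake = generator(z)
    # Score the fake batch with every client discriminator and keep, per
    # sample, the most forgiving one (highest realness logit).
    logits = torch.stack([d(fake) for d in discriminators], dim=0)
    forgiving, _ = logits.max(dim=0)
    # Fool the forgiving discriminator (non-saturating GAN loss).
    loss = nnf.binary_cross_entropy_with_logits(
        forgiving, torch.ones_like(forgiving))
    opt_g.zero_grad(); loss.backward(); opt_g.step()
    return loss.item()
```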
This paper presents a novel data-driven crowd simulation method that can mimic the observed traffic of pedestrians in a given environment. Given a set of observed trajectories, we use a recent class of neural networks, Generative Adversarial Networks (GANs), to learn the properties of this set and generate new trajectories with similar properties. We define a way for simulated pedestrians (agents) to follow such a trajectory while handling local collision avoidance. As such, the system can generate a crowd that behaves similarly to observations, while still enabling real-time interactions between agents. Via experiments with real-world data, we show that our simulated trajectories preserve the statistical properties of their input. Our method simulates, in real time, crowds that resemble existing crowds, while also allowing insertion of extra agents, combination with other simulation methods, and user interaction.
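A much-simplified sketch of the trajectory-following part, with hypothetical names and a fixed walking speed, and omitting the local collision avoidance the system actually performs:

```python
# Simplified waypoint following for a simulated agent (collision avoidance
# omitted); `waypoints` would come from the trained trajectory generator.
import numpy as np

def follow_trajectory(agent_pos, waypoints, speed=1.4, dt=0.1):
    path = [agent_pos.copy()]
    for wp in waypoints:
        while np.linalg.norm(wp - agent_pos) > speed * dt:
            step = (wp - agent_pos) / np.linalg.norm(wp - agent_pos)
            agent_pos = agent_pos + speed * dt * step  # move toward waypoint
            path.append(agent_pos.copy())
    return np.array(path)
```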
Deep neural networks have been shown to perform well on many classical machine learning problems, especially image classification tasks. However, researchers have found that neural networks can be easily fooled and are surprisingly sensitive to small perturbations imperceptible to humans. Carefully crafted input images (adversarial examples) can force a well-trained neural network to provide arbitrary outputs. Including adversarial examples during training is a popular defense mechanism against adversarial attacks. In this paper, we propose a new defense mechanism under the generative adversarial network (GAN) framework. We model the adversarial noise using a generative network, trained jointly with a classification discriminative network as a minimax game. We show empirically that our adversarial network approach works well against black-box attacks, with performance on par with state-of-the-art methods such as ensemble adversarial training and adversarial training with projected gradient descent.
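The minimax game might be implemented along these lines; this is an illustrative sketch, not the paper's code, with `gen`, `clf`, and the epsilon bound as assumed names and values:

```python
# Illustrative joint minimax step: a noise generator `gen` vs. a classifier
# `clf`, with bounded perturbations via tanh scaling.
import torch
import torch.nn.functional as nnf

def joint_step(gen, clf, opt_g, opt_c, x, y, eps=8 / 255):
    delta = eps * torch.tanh(gen(x))            # bounded adversarial noise
    x_adv = (x + delta).clamp(0, 1)
    # Classifier step: stay correct on clean and perturbed inputs.
    loss_c = nnf.cross_entropy(clf(x), y) \
           + nnf.cross_entropy(clf(x_adv.detach()), y)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
    # Generator step: push the (updated) classifier toward misclassification.
    loss_g = -nnf.cross_entropy(clf(x_adv), y)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```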
Online reviews have become a vital source of information when purchasing a service or a product. Opinion spammers manipulate reviews, affecting the overall perception of the service. A key challenge in detecting opinion spam is obtaining ground truth: although a large set of reviews is available online, only a few of them have been labeled spam or non-spam. In this paper, we propose spamGAN, a generative adversarial network that relies on a limited set of labeled data as well as unlabeled data for opinion spam detection. spamGAN improves on state-of-the-art GAN-based techniques for text classification. Experiments on the TripAdvisor dataset show that spamGAN outperforms existing spam detection techniques when limited labeled data is used. Apart from detecting spam reviews, spamGAN can also generate reviews with reasonable perplexity.
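As one plausible reading of the semi-supervised objective (the exact spamGAN losses are not given here), a sketch using the standard K-class discriminator trick, where real evidence is the log-sum-exp of class logits:

```python
# Semi-supervised discriminator losses in the K-class log-sum-exp style;
# logits_* are (batch, K) class logits for labeled, unlabeled, generated text.
import torch
import torch.nn.functional as nnf

def d_losses(logits_lab, y_lab, logits_unlab, logits_fake):
    sup = nnf.cross_entropy(logits_lab, y_lab)       # labeled reviews
    real_ev = torch.logsumexp(logits_unlab, dim=1)   # evidence of being real
    fake_ev = torch.logsumexp(logits_fake, dim=1)
    # Push real evidence up on unlabeled data, down on generated reviews.
    unsup = nnf.softplus(-real_ev).mean() + nnf.softplus(fake_ev).mean()
    return sup, unsup
```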
Inferring model parameters from experimental data is a grand challenge in many sciences, including cosmology. It often relies critically on high-fidelity numerical simulations, which are prohibitively computationally expensive. The application of deep learning techniques to generative modeling is renewing interest in using high-dimensional density estimators as computationally inexpensive emulators of fully fledged simulations. These generative models have the potential to make a dramatic shift in the field of scientific simulations, but for that shift to happen we need to study the performance of such generators in the precision regime needed for science applications. To this end, in this work we apply Generative Adversarial Networks to the problem of generating weak lensing convergence maps. We show that our generator network produces maps described, with high statistical confidence, by the same summary statistics as the fully simulated maps.
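The kind of summary-statistic check described can be sketched as follows; this computes an azimuthally averaged power spectrum for a square convergence map, to be compared between generated and fully simulated maps:

```python
# Azimuthally averaged power spectrum of a square convergence map; compare
# the curves for generated vs. fully simulated maps.
import numpy as np

def radial_power_spectrum(kappa):
    power = np.abs(np.fft.fftshift(np.fft.fft2(kappa))) ** 2
    n = kappa.shape[0]
    y, x = np.indices(power.shape)
    r = np.hypot(x - n // 2, y - n // 2).astype(int)
    # Mean power in annular bins of integer radius r.
    return np.bincount(r.ravel(), power.ravel()) / np.bincount(r.ravel())
```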
Supervised machine learning applications in the health domain often face the problem of insufficient training data: the quantity of labelled data is small due to privacy concerns and the cost of data acquisition and labelling by a medical expert. Furthermore, it is quite common that collected data are unbalanced, and getting enough data to personalize models for individuals is very expensive or even infeasible. This paper addresses these problems by (1) designing a recurrent Generative Adversarial Network to generate realistic synthetic data and augment the original dataset, (2) enabling the generation of balanced datasets from a heavily unbalanced dataset, and (3) controlling the data generation such that the generated data resembles data from specific individuals. We apply these solutions to sleep apnea detection and evaluate the performance of four well-known techniques, i.e., K-Nearest Neighbour, Random Forest, Multi-Layer Perceptron, and Support Vector Machine. In the experiments, all classifiers exhibit a consistent increase in sensitivity and a kappa-statistic increase of between 0.007 and 0.182.
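A minimal sketch of what such a conditional recurrent generator could look like (the architecture and sizes are assumptions, not the paper's model); conditioning on class and subject embeddings is what enables balanced and person-specific generation:

```python
# Hypothetical conditional recurrent generator: class/subject conditioning
# is repeated at every time step of the generated signal.
import torch
import torch.nn as nn

class CondRecurrentGenerator(nn.Module):
    def __init__(self, z_dim=16, cond_dim=8, hidden=64, out_dim=1):
        super().__init__()
        self.rnn = nn.GRU(z_dim + cond_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, z, cond):
        # z: (batch, T, z_dim) noise; cond: (batch, cond_dim) label embedding.
        cond_seq = cond.unsqueeze(1).expand(-1, z.size(1), -1)
        h, _ = self.rnn(torch.cat([z, cond_seq], dim=-1))
        return self.head(h)   # synthetic signal, one value per time step
```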
In this work, we consider a challenging training-time attack that modifies training data with bounded perturbations, hoping to manipulate the behavior (both targeted and untargeted) of any corresponding trained classifier at test time when it faces clean samples. To achieve this, we propose using an auto-encoder-like network to generate the perturbation on the training data, paired with a differentiable system acting as an imaginary victim classifier. The perturbation generator learns to update its weights by watching the training procedure of the imaginary classifier, in order to produce the most harmful and imperceptible noise, which in turn leads to the lowest generalization power for the victim classifier. This can be formulated as a non-linear equality-constrained optimization problem. Unlike for GANs, solving such a problem is computationally challenging, so we propose a simple yet effective procedure that decouples the alternating updates of the two networks for stability. The proposed method can be easily extended to the label-specific setting, where the attacker manipulates the predictions of the victim classifier according to predefined rules rather than only inducing wrong predictions. Experiments on various datasets, including CIFAR-10 and a reduced version of ImageNet, confirm the effectiveness of the proposed method, and the empirical results show that such bounded perturbations transfer well regardless of which classifier the victim actually uses on image data.
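A much-simplified proxy for the decoupled alternating updates (the full procedure couples the generator to the victim's whole training trajectory; all names are illustrative):

```python
# Simplified alternating updates: the generator attacks a frozen snapshot of
# the imaginary victim, then the victim trains on the poisoned batch.
import copy
import torch
import torch.nn.functional as nnf

def alternate_step(gen, victim, opt_g, opt_v, x, y, eps=8 / 255):
    # 1) Generator step against a frozen victim snapshot.
    frozen = copy.deepcopy(victim)
    x_poison = (x + eps * torch.tanh(gen(x))).clamp(0, 1)
    loss_g = -nnf.cross_entropy(frozen(x_poison), y)   # be maximally harmful
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    # 2) Victim step on freshly poisoned data, simulating its training run.
    x_poison = (x + eps * torch.tanh(gen(x))).clamp(0, 1).detach()
    loss_v = nnf.cross_entropy(victim(x_poison), y)
    opt_v.zero_grad(); loss_v.backward(); opt_v.step()
```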
Generative adversarial network (GAN)-based image inpainting methods that utilize a coarse-to-fine network with a contextual attention module (CAM) have shown remarkable performance. However, they require substantial computational resources, in convolution operations and network parameters, due to the two stacked generative networks, which results in low speed. To address this problem, we propose a novel network structure called PEPSI: parallel extended-decoder path for semantic inpainting, which aims at both reducing hardware costs and improving inpainting performance. PEPSI consists of a single shared encoding network and parallel decoding networks with coarse and inpainting paths. The coarse path generates a preliminary inpainting result used to train the encoding network to predict features for the CAM. At the same time, the inpainting path produces higher-quality results from refined features reconstructed by the CAM. In addition, we propose Diet-PEPSI, which significantly reduces the network parameters while maintaining performance. In this method, we present the Diet-PEPSI unit (DPU), which effectively aggregates global contextual information with a small number of parameters. Extensive experiments and comparisons with state-of-the-art image inpainting methods demonstrate that both PEPSI and Diet-PEPSI achieve significant improvements in qualitative scores at reduced computational cost.
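Structurally, the shared encoder with parallel coarse and refinement decoders might look as follows; this sketch uses assumed layer sizes and omits the contextual attention module:

```python
# Shared encoder feeding two parallel decoding paths (CAM omitted).
import torch
import torch.nn as nn

class ParallelDecoderInpainting(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Sequential(   # masked RGB + mask = 4 channels
            nn.Conv2d(4, ch, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, 2, 1), nn.ReLU())
        self.coarse = nn.Sequential(    # trains the encoder's features
            nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Tanh())
        self.refine = nn.Sequential(    # would consume CAM-refined features
            nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Tanh())

    def forward(self, image, mask):
        feats = self.encoder(torch.cat([image * (1 - mask), mask], dim=1))
        return self.coarse(feats), self.refine(feats)
```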
Visual inspection of underwater structures by vehicles, e.g., remotely operated vehicles (ROVs), plays an important role in scientific, military, and commercial sectors. However, the automatic extraction of information using software tools is hindered by the characteristics of water, which degrade the quality of captured videos. As a contribution toward restoring the color of underwater images, the Underwater Denoising Autoencoder (UDAE) model is developed, using a denoising autoencoder with a U-Net architecture. The proposed network takes into consideration both accuracy and computational cost to enable real-time use in underwater visual tasks with an end-to-end autoencoder network. Underwater vehicle perception is improved by reconstructing captured frames, thereby obtaining better performance in underwater tasks. Related learning methods use generative adversarial networks (GANs) to generate color-corrected underwater images, and to our knowledge this paper is the first to show a single autoencoder capable of producing the same or better results. Moreover, image pairs are constructed for training the proposed network, since such a dataset is hard to obtain from underwater scenery. Finally, the proposed model is compared to a state-of-the-art method.
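A toy sketch of the U-Net-style denoising autoencoder idea (far smaller than any real UDAE network); the skip connection carries fine detail past the bottleneck:

```python
# Toy U-Net-style denoising autoencoder with one skip connection.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(3, ch, 3, 2, 1), nn.ReLU())
        self.mid = nn.Sequential(nn.Conv2d(ch, ch, 3, 1, 1), nn.ReLU())
        self.up = nn.ConvTranspose2d(ch, 3, 4, 2, 1)

    def forward(self, x):
        d = self.down(x)                  # encode the degraded frame
        u = self.up(self.mid(d) + d)      # skip connection around bottleneck
        return torch.sigmoid(u)           # color-corrected frame in [0, 1]
```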
Vehicle re-identification (Re-ID) has recently attracted enthusiastic attention due to its potential applications in smart cities and urban surveillance. However, it suffers from large intra-class variation caused by view and illumination changes, and from inter-class similarity, especially between different identities with similar appearance. To handle these issues, we propose a novel deep network architecture for vehicle Re-ID that is guided by meaningful attributes, including camera view, vehicle type, and color. In particular, our network is trained end-to-end and contains three subnetworks of deep features embedded by the corresponding attributes (i.e., camera view, vehicle type, and vehicle color). Moreover, to overcome the shortage of vehicle images from different views, we design a view-specified generative adversarial network to generate multi-view vehicle images. For network training, we annotate the view labels on the VeRi-776 dataset. Note that one can directly adopt the pre-trained view (as well as type and color) subnetworks on other datasets with only ID information, which demonstrates the generalization of our model. Extensive experiments on the benchmark datasets VeRi-776 and VehicleID show that the proposed approach achieves promising performance and sets a new state-of-the-art for vehicle Re-ID.
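The attribute-guided fusion could be sketched as below, with assumed feature and label dimensions: three subnetworks are supervised by view, type, and color heads, and their features are concatenated into the Re-ID descriptor:

```python
# Attribute-guided feature fusion with assumed dimensions and tiny backbones.
import torch
import torch.nn as nn

def tiny_backbone(feat):
    return nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(16, feat))

class AttributeGuidedReID(nn.Module):
    def __init__(self, feat=128, n_views=8, n_types=9, n_colors=10):
        super().__init__()
        self.nets = nn.ModuleList([tiny_backbone(feat) for _ in range(3)])
        self.heads = nn.ModuleList(
            [nn.Linear(feat, n) for n in (n_views, n_types, n_colors)])

    def forward(self, x):
        f = [net(x) for net in self.nets]                      # view/type/color
        attr_logits = [h(fi) for h, fi in zip(self.heads, f)]  # attribute losses
        return torch.cat(f, dim=1), attr_logits                # fused descriptor
```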
Point cloud data from 3D LiDAR sensors is one of the most crucial sensor modalities for versatile safety-critical applications such as self-driving vehicles. Since annotating point cloud data is an expensive and time-consuming process, the use of simulated environments and 3D LiDAR sensors for this task has recently gained popularity. With simulated sensors and environments, obtaining annotated synthetic point cloud data becomes much easier. However, the generated synthetic point cloud data still lack the artefacts that usually exist in data from real 3D LiDAR sensors. As a result, models trained on synthetic data for perception tasks degrade when tested on real point cloud data, due to the domain shift between simulated and real environments. Thus, in this work, we propose a domain adaptation framework for bridging this gap between synthetic and real point cloud data. Our framework is based on the deep cycle-consistent generative adversarial network (CycleGAN) architecture. We evaluate the framework on the task of vehicle detection from bird's-eye-view (BEV) point cloud images coming from real 3D LiDAR sensors. The framework shows competitive results, with an improvement of more than 7% in average precision over other baseline approaches when tested on real BEV point cloud images.
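The cycle-consistency term at the core of such a CycleGAN-based framework is standard; a sketch with the two generators assumed given:

```python
# Cycle-consistency loss between simulated and real BEV images, with the two
# generators g_s2r (synthetic-to-real) and g_r2s (real-to-synthetic) given.
import torch

def cycle_loss(g_s2r, g_r2s, sim_batch, real_batch, lam=10.0):
    rec_sim = g_r2s(g_s2r(sim_batch))     # synthetic -> "real" -> synthetic
    rec_real = g_s2r(g_r2s(real_batch))   # real -> "synthetic" -> real
    return lam * (torch.mean(torch.abs(rec_sim - sim_batch))
                  + torch.mean(torch.abs(rec_real - real_batch)))
```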
Image-to-image translation, which translates input images to a different domain with a learned one-to-one mapping, has achieved impressive success in recent years. This success mainly relies on network architectures that preserve the structural information while slightly modifying the appearance at the pixel level through adversarial training. Although these networks are able to learn the mapping, the translated images are deterministic without exception. It is more desirable to diversify image-to-image translation by introducing uncertainty, i.e., the generated images hold potential for variation in color and texture in addition to the general similarity to the input images, and this applies in both the target and source domains. To this end, we propose a novel generative adversarial network (GAN)-based model, InjectionGAN, to learn a many-to-many mapping. In this model, the input image is combined with latent variables comprising a domain-specific attribute and unspecific random variations. The domain-specific attribute indicates the target domain of the translation, while the unspecific random variations introduce uncertainty into the model. A unified framework is proposed to regroup these two parts and obtain diverse generations in each domain. Extensive experiments demonstrate that the diverse generations have high quality on challenging image-to-image translation tasks where no pairing information for the training dataset exists. Both quantitative and qualitative results show the superior performance of InjectionGAN over state-of-the-art approaches.
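The latent-injection step might be implemented roughly like this (names and shapes are assumptions): the domain attribute and random code are broadcast spatially and concatenated with the image channel-wise:

```python
# Broadcasting a domain attribute and a random code over the image plane
# before the translation network (shapes are assumptions).
import torch

def inject(image, domain_onehot, z):
    b, _, h, w = image.shape
    attr = domain_onehot.view(b, -1, 1, 1).expand(-1, -1, h, w)
    noise = z.view(b, -1, 1, 1).expand(-1, -1, h, w)
    return torch.cat([image, attr, noise], dim=1)   # generator input
```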
We propose a new attack model called AT-GAN that transfers generative adversarial nets (GANs) to adversarial attacks. Different from existing attacks that add small perturbations to input images, AT-GAN explores the distribution of adversarial instances so as to generate adversarial examples directly from any random noise; the generated adversarial examples are therefore not limited to any natural images. Moreover, compared with pioneering work that uses a GAN and iterates gradient descent to search the neighborhood of an original random noise for a noise whose corresponding GAN output is an adversarial example, our model is a generative model that learns the distribution of adversarial examples: once AT-GAN is trained, it can generate adversarial images whose outputs are not limited to the input noise. Experiments show that AT-GAN is very fast and can generate plenty of adversarial instances that look more realistic to human eyes, that it yields a higher attack success rate under various adversarial training defenses in both semi-whitebox and black-box attack settings, and that it can learn a distribution of adversarial examples that is very close to the distribution of the real data.
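Once trained, attack generation reduces to a forward pass, as the abstract notes; a sketch with assumed names:

```python
# Sampling adversarial images from a trained generator: one forward pass,
# no per-example gradient search (names assumed).
import torch

@torch.no_grad()
def sample_adversarial(gen, victim, n, z_dim=128):
    z = torch.randn(n, z_dim)
    x_adv = gen(z)                        # adversarial images from pure noise
    preds = victim(x_adv).argmax(dim=1)   # labels the victim assigns
    return x_adv, preds
```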
For a machine learning task, lacking sufficient samples means the trained model has low confidence in approaching the ground-truth function. Since generative adversarial networks (GANs) were proposed, small-sample data augmentation (DA) with realistic fake data has become feasible, and many works have validated the viability of GAN-based DA. Although most of these works reported higher accuracy with GAN-based DA, some researchers stressed that the fake data generated by a GAN carry an inherent bias. In this paper, we explore when this bias is low enough not to hurt performance. We set up experiments to characterize the bias in different GAN-based DA settings, and from the results we design a pipeline for inspecting whether a specific dataset can be efficiently augmented with GAN-based DA. Finally, based on our attempts to reduce the bias, we offer advice for mitigating bias in GAN-based DA applications.
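The inspection pipeline can be summarized by a simple comparison of validation accuracy with and without the GAN-generated data; everything below is an illustrative skeleton, not the paper's exact protocol:

```python
# Skeleton of the augmentability check: does adding GAN-generated samples
# actually help? `train_fn` and `eval_fn` are user-supplied callables, and
# the datasets are plain lists of examples.
def augmentation_gain(train_fn, eval_fn, real_data, fake_data, val_data):
    baseline = eval_fn(train_fn(real_data), val_data)
    augmented = eval_fn(train_fn(real_data + fake_data), val_data)
    return augmented - baseline   # negative gain suggests harmful GAN bias
```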
This work offers a new method for generating photo-realistic images from semantic label maps and simulator edge-map images. We do so in a conditional manner: we train a Generative Adversarial Network (GAN), given an image and its semantic label map, to output a photo-realistic version of that scene. Existing GAN architectures still lack photo-realism. We address this issue by embedding edge maps, presenting the generator with an edge-map image as a prior, which enables it to generate high-level details in the image. We also offer a model that uses this generator to create visually appealing videos when a sequence of images is given.
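The conditioning described amounts to stacking the semantic label map and the edge map channel-wise at the generator input; a trivial sketch with assumed shapes:

```python
# Stacking the conditioning inputs channel-wise for the generator.
import torch

def build_generator_input(label_map, edge_map):
    # label_map: (B, n_classes, H, W) one-hot; edge_map: (B, 1, H, W).
    return torch.cat([label_map, edge_map], dim=1)
```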