Machine Learning

The Robust Manifold Defense: Adversarial Training using Generative Models



  • arXiv

    Deep neural networks are demonstrating excellent performance on several classical vision problems. However, these networks are vulnerable to adversarial examples: minutely modified images that induce arbitrary attacker-chosen output from the network. We propose a mechanism to protect against these adversarial inputs based on a generative model of the data. We introduce a pre-processing step that uses gradient descent to project an input onto the range of a generative model before feeding it into a classifier. We show that this step provides the classifier with robustness against first-order, substitute model, and combined adversarial attacks. Using a min-max formulation, we show that adversarial examples may exist even in the range of the generator: natural-looking images extremely close to the decision boundary for which the classifier has unjustifiably high confidence. We show that adversarial training on the generative manifold can be used to make a classifier that is robust to these attacks. Finally, we show how our method can be applied even without a pre-trained generative model, using a recent method called the deep image prior. We evaluate our method on MNIST, CelebA and ImageNet and show robustness against current state-of-the-art attacks.
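The projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear "generator" G(z) = Wz standing in for a trained generative model, uses an analytic Jacobian where a real system would rely on automatic differentiation, and the names `project_onto_range`, `toy_generator`, and `toy_jacobian` are hypothetical. It only shows the idea of minimizing ||G(z) - x||^2 over the latent code z by gradient descent and passing G(z), rather than x, on to the classifier.

```python
import numpy as np

def project_onto_range(generator, jacobian, x, z_dim, steps=2000, lr=0.1, seed=0):
    """Approximately project x onto the range of `generator`.

    Runs plain gradient descent on the squared reconstruction error
    ||generator(z) - x||^2 with respect to the latent code z.
    `generator` and `jacobian` are caller-supplied callables (assumptions
    of this sketch, not an API from the paper).
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=z_dim)               # random initial latent code
    for _ in range(steps):
        residual = generator(z) - x          # reconstruction error in input space
        grad = 2.0 * jacobian(z).T @ residual  # chain rule: d/dz ||G(z) - x||^2
        z = z - lr * grad
    return generator(z), z

# Toy stand-in for a trained generator: a fixed linear map from R^2 to R^8.
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 2)) / np.sqrt(8)

def toy_generator(z):
    return W @ z

def toy_jacobian(z):
    return W  # the Jacobian of a linear map is the matrix itself

# A "clean" point in the range of the generator, plus a small perturbation.
z_true = np.array([1.0, -0.5])
x_clean = toy_generator(z_true)
x_perturbed = x_clean + 0.05 * rng.normal(size=8)

# The projected point is what would be fed to the classifier.
x_proj, z_hat = project_onto_range(toy_generator, toy_jacobian, x_perturbed, z_dim=2)
```

In this linear toy case the projection has a closed form (ordinary least squares), which makes the sketch easy to check. With a deep generator the objective is non-convex, so gradient descent may only find a local minimum and the result can depend on the initial latent code.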

    The Robust Manifold Defense: Adversarial Training using Generative Models
    by Andrew Ilyas, Ajil Jalal, Eirini Asteri, Constantinos Daskalakis, Alexandros G. Dimakis
    https://arxiv.org/pdf/1712.09196v1.pdf
