Adversarial examples in random neural networks with general activations

Andrea Montanari; Yuchen Wu

doi:10.4171/msl/41

Math. Stat. Learn., Vol. 6, No. 1/2, pp. 143–200

Andrea Montanari
Stanford University, USA
Yuchen Wu
Stanford University, USA

This article is published open access under our Subscribe to Open model.

Abstract

A substantial body of empirical work documents the lack of robustness of deep learning models to adversarial examples. Recent theoretical work proved that adversarial examples are ubiquitous in two-layer networks with sub-exponential width and ReLU or smooth activations, and in multi-layer ReLU networks with sub-exponential width. We present a result of the same type, with no restriction on width and for general locally Lipschitz continuous activations.

More precisely, given a neural network $f(\cdot; \theta)$ with random weights $\theta$, and feature vector $x$, we show that an adversarial example $x'$ can be found with high probability along the direction of the gradient $\nabla_{x} f(x; \theta)$. Our proof is based on a Gaussian conditioning technique. Instead of proving that $f$ is approximately linear in a neighborhood of $x$, we characterize the joint distribution of $f(x; \theta)$ and $f(x'; \theta)$ for $x' = x - s(x) \nabla_{x} f(x; \theta)$, where $s(x) = \operatorname{sign}(f(x; \theta)) \cdot s_{d}$ for some positive step size $s_{d}$.
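
As a purely numerical illustration of the update $x' = x - s(x) \nabla_{x} f(x; \theta)$, the following minimal sketch samples a small two-layer network and searches for a step size that flips the sign of the output. Everything specific here is an assumption made for illustration and is not taken from the article: the two-layer architecture, the tanh activation, the Gaussian weight scaling, the normalization $\|x\| = \sqrt{d}$, the dimensions, and the doubling search over the step size (the article works with a step size $s_{d}$ and a Gaussian conditioning argument, not a numerical search).

```python
import math
import random


def make_network(d, m, rng):
    """Sample f(x) = m^{-1/2} * sum_i a_i * tanh(<w_i, x>) (illustrative scaling)."""
    W = [[rng.gauss(0.0, 1.0 / math.sqrt(d)) for _ in range(d)] for _ in range(m)]
    a = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return W, a


def forward(W, a, x):
    """Evaluate the network output f(x; theta) with theta = (W, a)."""
    pre = [sum(wj * xj for wj, xj in zip(w, x)) for w in W]
    return sum(ai * math.tanh(z) for ai, z in zip(a, pre)) / math.sqrt(len(a))


def gradient(W, a, x):
    """Gradient of f with respect to the input x, using tanh'(z) = 1 - tanh(z)^2."""
    d, m = len(x), len(a)
    grad = [0.0] * d
    for w, ai in zip(W, a):
        z = sum(wj * xj for wj, xj in zip(w, x))
        coeff = ai * (1.0 - math.tanh(z) ** 2) / math.sqrt(m)
        for j in range(d):
            grad[j] += coeff * w[j]
    return grad


def norm(v):
    return math.sqrt(sum(vj * vj for vj in v))


def find_adversarial(W, a, x, s0=1e-3, max_doublings=30):
    """Try x' = x - sign(f(x)) * s * grad f(x) for s = s0, 2*s0, 4*s0, ...

    Returns (x_adv, f_adv, s) for the first step size that flips the sign
    of the output, or None if no tried step size does.
    """
    f_x = forward(W, a, x)
    g = gradient(W, a, x)
    sign = 1.0 if f_x >= 0 else -1.0
    s = s0
    for _ in range(max_doublings):
        x_adv = [xj - sign * s * gj for xj, gj in zip(x, g)]
        f_adv = forward(W, a, x_adv)
        if f_x * f_adv < 0:
            return x_adv, f_adv, s
        s *= 2.0
    return None


rng = random.Random(0)
d, m = 100, 200  # input dimension and hidden width (arbitrary choices)
W, a = make_network(d, m, rng)

# Random input rescaled so that ||x|| = sqrt(d).
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
scale = math.sqrt(d) / norm(x)
x = [scale * xj for xj in x]

f_x = forward(W, a, x)
result = find_adversarial(W, a, x)
if result is None:
    print("no sign flip found for the step sizes tried")
else:
    x_adv, f_adv, s = result
    rel = norm([u - v for u, v in zip(x_adv, x)]) / norm(x)
    print(f"f(x) = {f_x:.4f}, f(x') = {f_adv:.4f}, step = {s:.4g}, "
          f"||x' - x|| / ||x|| = {rel:.4f}")
```

When a sign flip is found, the printed ratio $\|x' - x\| / \|x\|$ gives a rough sense of how small the perturbation is relative to the input. The doubling search only locates a flipping step size to within a factor of two, so the reported perturbation is an upper estimate for this particular sample, not a sharp bound.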

Cite this article

Andrea Montanari, Yuchen Wu, Adversarial examples in random neural networks with general activations. Math. Stat. Learn. 6 (2023), no. 1/2, pp. 143–200