Wasserstein Distributional Robustness and Regularization in Statistical Learning

A central question in statistical learning is how to design algorithms that not only perform well on training data, but also generalize to new and unseen data. In this paper, we tackle this question by formulating a distributionally robust stochastic optimization (DRSO) problem, which seeks a solution that minimizes the worst-case expected loss over a family of distributions that are close to the empirical distribution in Wasserstein distance. We establish a connection between such Wasserstein DRSO and regularization. More precisely, we identify a broad class of loss functions for which the Wasserstein DRSO is asymptotically equivalent to a regularization problem with a gradient-norm penalty. This relation provides new interpretations for problems involving regularization, including a large number of statistical learning problems and discrete choice models (e.g., multinomial logit). The connection also suggests a principled way to regularize high-dimensional, non-convex problems. This is demonstrated through the training of Wasserstein generative adversarial networks in deep learning.
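To make the gradient-norm penalty concrete, here is a minimal sketch in Python, not the authors' implementation. It assumes a logistic loss, perturbations that affect only the features x (not the labels), and a penalty equal to the radius rho times the empirical root-mean-square of the Euclidean norm of the loss gradient with respect to x. The function names, the choice of norm, and the synthetic data are all illustrative assumptions; the abstract itself only states that the penalty is a gradient norm.

```python
import numpy as np

def logistic_loss(theta, X, y):
    """Per-sample logistic loss log(1 + exp(-y * <theta, x>)), labels in {-1, +1}."""
    margins = y * (X @ theta)
    return np.logaddexp(0.0, -margins)

def input_gradient_norms(theta, X, y):
    """Euclidean norm of the loss gradient with respect to each input x.

    For the logistic loss, grad_x = -y * s * theta with s = 1 / (1 + exp(margin)),
    so the norm is s * ||theta||.
    """
    margins = y * (X @ theta)
    s = 1.0 / (1.0 + np.exp(margins))
    return s * np.linalg.norm(theta)

def penalized_objective(theta, X, y, rho):
    """Empirical loss plus rho times the RMS input-gradient norm (assumed form)."""
    empirical_loss = logistic_loss(theta, X, y).mean()
    penalty = np.sqrt(np.mean(input_gradient_norms(theta, X, y) ** 2))
    return empirical_loss + rho * penalty

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.where(X @ np.array([1.0, -2.0, 0.5]) > 0, 1.0, -1.0)
theta = np.array([0.5, -1.0, 0.2])

plain = penalized_objective(theta, X, y, rho=0.0)
robust = penalized_objective(theta, X, y, rho=0.1)
print(f"empirical loss: {plain:.4f}, with gradient-norm penalty: {robust:.4f}")
```

With rho = 0 the objective reduces to the ordinary empirical loss. Increasing rho, which plays the role of the Wasserstein radius in this sketch, increasingly penalizes parameters whose loss is sensitive to small changes in the inputs.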
by Rui Gao, Xi Chen, Anton J. Kleywegt
https://arxiv.org/pdf/1712.06050v2.pdf