Entropy-SGD optimizes the prior of a PAC-Bayes bound: Data-dependent PAC-Bayes priors via differential privacy

We show that Entropy-SGD (Chaudhari et al., 2016), when viewed as a learning algorithm, optimizes a PAC-Bayes bound on the risk of a Gibbs (posterior) classifier, i.e., a randomized classifier obtained by a risk-sensitive perturbation of the weights of a learned classifier. Entropy-SGD works by optimizing the bound's prior, violating the hypothesis of the PAC-Bayes theorem that the prior is chosen independently of the data. Indeed, available implementations of Entropy-SGD rapidly obtain zero training error on random labels and the same holds of the Gibbs posterior. In order to obtain a valid generalization bound, we show that an $\epsilon$-differentially private prior yields a valid PAC-Bayes bound, a straightforward consequence of results connecting generalization with differential privacy. Using stochastic gradient Langevin dynamics (SGLD) to approximate the well-known exponential release mechanism, we observe that generalization error on MNIST (measured on held-out data) falls within the (empirically nonvacuous) bounds computed under the assumption that SGLD produces perfect samples. In particular, Entropy-SGLD can be configured to yield relatively tight generalization bounds and still fit real labels, although these same settings do not obtain state-of-the-art performance.
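To make the SGLD step mentioned in the abstract concrete, here is a minimal sketch of a generic Langevin sampler targeting a density proportional to exp(-beta * loss(theta)), which is the form the exponential mechanism takes under a flat base measure. It is not the authors' Entropy-SGLD implementation: the function name `sgld_sample`, its parameters, and the constant step size are assumptions made for illustration, and `grad_loss` may be a full or a minibatch gradient estimate.

```python
import numpy as np

def sgld_sample(grad_loss, theta0, beta, step, n_steps, rng):
    """Approximate one draw from the density proportional to exp(-beta * loss(theta)).

    Hypothetical helper for illustration only. SGLD yields approximate samples,
    so any privacy guarantee assumes the sampling is exact.
    """
    theta = np.array(theta0, dtype=float)
    for _ in range(n_steps):
        noise = rng.normal(size=theta.shape)
        # Langevin update: half a step down the scaled gradient, plus Gaussian
        # noise whose variance equals the step size.
        theta = theta - 0.5 * step * beta * grad_loss(theta) + np.sqrt(step) * noise
    return theta
```

For a quadratic loss 0.5 * ||theta||^2 (gradient `lambda t: t`), the target is a Gaussian with variance 1/beta per coordinate, which gives a simple sanity check on the sampler.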
by Gintare Karolina Dziugaite, Daniel M. Roy
https://arxiv.org/pdf/1712.09376v1.pdf