We propose local distributional smoothness (LDS), a new notion of smoothness for statistical models that can be used as a regularization term to promote the smoothness of the model distribution. We name the LDS-based regularization virtual adversarial training (VAT). The LDS of a model at an input datapoint is defined as the KL-divergence-based robustness of the model distribution against local perturbation around the datapoint. VAT resembles adversarial training, but distinguishes itself in that it determines the adversarial direction from the model distribution alone, without using label information, which makes it applicable to semi-supervised learning. The computational cost of VAT is relatively low: for neural networks, the approximate gradient of the LDS can be computed with no more than three pairs of forward and back propagations. When we applied our technique to supervised and semi-supervised learning on the MNIST dataset, it outperformed all training methods other than the current state-of-the-art method, which is based on a highly advanced generative model. We also applied our method to SVHN and NORB, and confirmed its superior performance over the current state-of-the-art semi-supervised method on these datasets.
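Concretely, writing $p(y|x,\theta)$ for the model distribution, the LDS at $x$ can be expressed as $\mathrm{LDS}(x,\theta) = -\max_{\|r\|_2 \le \epsilon} \mathrm{KL}\big[\,p(y|x,\theta)\,\|\,p(y|x+r,\theta)\,\big]$, where the maximizing $r$ is the virtual adversarial perturbation. The following is a minimal sketch of this computation, assuming PyTorch; the function name vat_loss and the parameters xi, eps, and n_power are illustrative choices of ours, not identifiers from the paper, and the power-iteration approximation of the adversarial direction is one standard way to realize the "three pairs of forward and back propagations" mentioned above.

import torch
import torch.nn.functional as F

def vat_loss(model, x, xi=1e-6, eps=1.0, n_power=1):
    # p(y|x, theta): the current model distribution, held fixed (no gradient).
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)

    # Start from a random unit perturbation d.
    d = torch.randn_like(x)
    d = d / d.flatten(1).norm(dim=1).view(-1, *([1] * (x.dim() - 1)))

    # Power iteration: each step uses one forward/backward pair to refine d
    # toward the direction in which the KL divergence is most sensitive.
    for _ in range(n_power):
        d.requires_grad_(True)
        log_q = F.log_softmax(model(x + xi * d), dim=1)
        kl = F.kl_div(log_q, p, reduction="batchmean")  # KL[p(y|x) || p(y|x+xi*d)]
        grad = torch.autograd.grad(kl, d)[0]
        d = grad / grad.flatten(1).norm(dim=1).view(-1, *([1] * (x.dim() - 1)))
        d = d.detach()

    # Negative LDS: KL divergence at the virtual adversarial perturbation eps*d.
    log_q = F.log_softmax(model(x + eps * d), dim=1)
    return F.kl_div(log_q, p, reduction="batchmean")

Because this penalty is computed from the model distribution alone, it can be evaluated on unlabeled datapoints and added to the ordinary supervised loss, which is what makes the regularizer applicable to semi-supervised learning.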