Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which generalizes deep kernel learning approaches to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training. Specifically, we apply additive base kernels to subsets of output features from deep neural architectures, and jointly learn the parameters of the base kernels and deep network through a Gaussian process marginal likelihood objective. Within this framework, we derive an efficient form of stochastic variational inference which leverages local kernel interpolation, inducing points, and structure-exploiting algebra. We show improved performance over stand-alone deep networks, SVMs, and state-of-the-art scalable Gaussian processes on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet.
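To make the joint objective concrete, the sketch below illustrates the deep kernel construction described in the abstract: a deep network maps inputs to features, an additive RBF base kernel acts on those features, and all parameters are trained jointly against the Gaussian process marginal likelihood. This is a hypothetical minimal sketch, not the authors' implementation: it assumes PyTorch, uses exact GP regression on a toy problem for clarity rather than the paper's stochastic variational inference with local kernel interpolation and inducing points, and all names (`FeatureExtractor`, `additive_rbf_kernel`, `neg_log_marginal_likelihood`) are illustrative.

```python
import math
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Deep network g(x; w) producing features for the base kernels."""
    def __init__(self, d_in, d_feat):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, 64), nn.ReLU(),
            nn.Linear(64, d_feat),
        )

    def forward(self, x):
        return self.net(x)

def additive_rbf_kernel(z1, z2, log_ls):
    """Additive base kernel: one RBF kernel per output feature, summed.
    (The paper applies base kernels to subsets of output features;
    one feature per kernel is the simplest such partition.)"""
    K = torch.zeros(z1.shape[0], z2.shape[0])
    for j in range(z1.shape[1]):
        d2 = (z1[:, j:j + 1] - z2[:, j:j + 1].T).pow(2)
        K = K + torch.exp(-0.5 * d2 / log_ls[j].exp().pow(2))
    return K

def neg_log_marginal_likelihood(x, y, g, log_ls, log_noise):
    """-log p(y | x): the joint objective for base kernels and network.
    Exact inference is shown only to make the objective concrete; the
    paper uses stochastic variational inference at scale."""
    z = g(x)                                   # deep features g(x; w)
    n = x.shape[0]
    K = additive_rbf_kernel(z, z, log_ls) + log_noise.exp() * torch.eye(n)
    L = torch.linalg.cholesky(K)
    alpha = torch.cholesky_solve(y.unsqueeze(-1), L)
    return (0.5 * y @ alpha.squeeze(-1)        # data-fit term
            + L.diagonal().log().sum()         # 0.5 * log det K
            + 0.5 * n * math.log(2 * math.pi))

# Joint gradient training of base-kernel and network parameters:
d_feat = 4
g = FeatureExtractor(d_in=10, d_feat=d_feat)
log_ls = torch.zeros(d_feat, requires_grad=True)
log_noise = torch.tensor(-2.0, requires_grad=True)
opt = torch.optim.Adam([*g.parameters(), log_ls, log_noise], lr=1e-2)

x, y = torch.randn(100, 10), torch.randn(100)  # toy regression data
for _ in range(100):
    opt.zero_grad()
    loss = neg_log_marginal_likelihood(x, y, g, log_ls, log_noise)
    loss.backward()
    opt.step()
```

Because the network and kernel hyperparameters sit in a single objective, one optimizer step updates both, which is the property the abstract's "jointly learn ... through a Gaussian process marginal likelihood objective" refers to.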