We present a framework for learning privacy-preserving encodings of images that inhibit inference of chosen private attributes while allowing recovery of other desirable information. Rather than simply inhibiting a given fixed pre-trained estimator, our goal is that an estimator be unable to learn to accurately predict the private attributes even with knowledge of the encoding function. We use a natural adversarial optimization-based formulation for this, training the encoding function against a classifier for the private attribute, with both modeled as deep neural networks. The key contribution of our work is a stable and convergent optimization approach that successfully learns an encoder with the desired properties: it maintains utility while inhibiting inference of private attributes, not just within the adversarial optimization, but also against classifiers trained after the encoder is fixed. We adopt a rigorous experimental protocol for verification, in which classifiers are trained exhaustively to saturation on the fixed encoders. We evaluate our approach on tasks of real-world complexity, learning high-dimensional encodings that inhibit detection of different scene categories, and find that it yields encoders that are resilient at maintaining privacy.
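To make the adversarial formulation concrete, the following is a minimal NumPy sketch of the general idea on a toy problem, not the paper's actual method or architecture: a linear encoder is trained jointly against a logistic "adversary" head that tries to predict a private attribute from the encoding, while a second logistic head preserves a utility attribute. All names, the data construction, and the hyperparameters (`lam`, `lr`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy data: coordinate 0 of x carries the "private" attribute p,
# coordinate 1 carries the "useful" attribute u.
n = 2000
X = rng.normal(size=(n, 2))
p = (X[:, 0] > 0).astype(float)   # private label (to be hidden)
u = (X[:, 1] > 0).astype(float)   # utility label (to be preserved)

# Linear encoder z = x @ W, with two logistic heads on z:
# an adversary predicting p and a utility classifier predicting u.
W = rng.normal(scale=0.1, size=(2, 2))   # encoder parameters
wp = np.zeros(2)                          # adversary weights
wu = np.zeros(2)                          # utility-head weights

lr, lam = 0.1, 2.0   # lam weights the privacy (adversarial) term
for step in range(500):
    Z = X @ W
    # Adversary step: fit wp to predict p from the current encoding.
    gp = sigmoid(Z @ wp) - p
    wp -= lr * (Z.T @ gp) / n
    # Utility step: fit wu to predict u from the current encoding.
    gu = sigmoid(Z @ wu) - u
    wu -= lr * (Z.T @ gu) / n
    # Encoder step: descend the utility loss, ascend the adversary loss.
    dZ = np.outer(gu, wu) / n - lam * np.outer(gp, wp) / n
    W -= lr * (X.T @ dZ)

Z = X @ W
acc_p = np.mean((sigmoid(Z @ wp) > 0.5) == (p > 0.5))  # goal: near chance
acc_u = np.mean((sigmoid(Z @ wu) > 0.5) == (u > 0.5))  # goal: high
print(f"adversary acc: {acc_p:.2f}, utility acc: {acc_u:.2f}")
```

Note that evaluating `acc_p` with the adversary from the last training step is exactly the weak protocol the abstract warns against; the paper's verification instead freezes the encoder and retrains fresh private-attribute classifiers to saturation before measuring their accuracy.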