Model inversion attacks aim to extract details of training data from a trained model, potentially revealing sensitive information about a person's identity. To comply with personal privacy protection requirements, it is important to understand the mechanisms that increase the privacy of training data. In this work, we systematically investigated the impact of the training data on a model's susceptibility to model inversion attacks for models trained on the task of handwritten digit recognition using the openly available MNIST dataset. Using an optimization-based inversion approach, we studied the impact of the quantity and diversity of the training data, and of the number and selection of classes, on a model's susceptibility to inversion. Our model inversion attack strategy was less successful against models trained with more data and with greater training data diversity. Moreover, atypical training records provided additional protection against model inversion. We found that not every class was equally susceptible to model inversion attacks and that the inversion results for a given class changed when the model was trained with a different selection of classes. However, we did not detect a clear relationship between the number of classes and a model's susceptibility to inversion. Our study shows that the inversion susceptibility of a model depends on the training data: not only the data used to train the class that is inverted, but also the data used to train the other classes.
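
For illustration, the following is a minimal sketch of an optimization-based model inversion attack against an MNIST digit classifier. The architecture, optimizer, and hyperparameters shown here are assumptions for the sketch, not the configuration used in this work: the idea is simply to gradient-descend on the input image so that the trained model assigns the target class maximal confidence.

```python
# Hedged sketch of optimization-based model inversion (PyTorch).
# SimpleMLP, the step count, and the learning rate are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMLP(nn.Module):
    """Stand-in MNIST classifier; any trained digit model could be substituted."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def invert_class(model: nn.Module, target_class: int,
                 steps: int = 500, lr: float = 0.1) -> torch.Tensor:
    """Optimize an input image so the model is maximally confident in the
    target class; the result approximates a class-representative image."""
    model.eval()
    x = torch.zeros(1, 1, 28, 28, requires_grad=True)  # start from a blank image
    optimizer = torch.optim.Adam([x], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), target)  # low loss = high target confidence
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            x.clamp_(0.0, 1.0)  # keep pixel intensities in the valid range
    return x.detach()

# Usage: reconstruct an image the model associates with the digit "3".
model = SimpleMLP()  # in practice, a trained model's weights would be loaded here
reconstruction = invert_class(model, target_class=3)
```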