Privacy-preserving machine learning with tensor networks
ORAL
Abstract
Vast amounts of data are routinely processed in machine learning pipelines, covering ever more aspects of our interactions with the world. When the resulting models are made public, is the safety of the data used to train them guaranteed? This is a crucial question, especially when processing sensitive data such as medical records. The state-of-the-art protection techniques, despite being deployed commercially, consist of adding noise at some stage of the training process, and thus impose a tradeoff between privacy protection and performance.
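For concreteness, here is a minimal sketch of this noise-addition idea, in the style of differentially private gradient descent; the function name, clipping norm, and noise scale are illustrative assumptions, not the mechanism of any particular deployment:

    import numpy as np

    def noisy_gradient_step(params, grad, lr=0.1, clip_norm=1.0,
                            noise_scale=0.5, rng=np.random.default_rng(0)):
        # Clip the gradient so no single training record can dominate the update
        grad = grad * min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
        # Add Gaussian noise calibrated to the clipping norm (illustrative scale)
        grad = grad + rng.normal(0.0, noise_scale * clip_norm, size=grad.shape)
        # One plain gradient-descent update on the noisy gradient
        return params - lr * grad

The noise_scale knob is exactly the tradeoff mentioned above: larger noise hides individual records better but degrades the trained model.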
In this talk I will argue, and practically illustrate, that insights from the tensor-network representations of quantum many-body states can help devise better privacy-preserving machine learning algorithms. First, I will show that standard neural networks are vulnerable to a type of privacy leak that, notably, is a priori resistant to the standard protection mechanisms. Then, I will show that tensor networks, when used as machine learning architectures, are invulnerable to this leak. The proof of this resilience is based on the existence of canonical forms for such architectures. Given that tensor networks have recently been shown to match, and in certain cases surpass, traditional machine learning architectures, these results imply that one need not choose between predictive accuracy and the privacy of the processed information when applying machine learning to sensitive data.
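To make the canonical-form argument concrete, the following is an illustrative sketch, under my own assumptions rather than code from the paper, of bringing a matrix product state (the simplest tensor-network architecture) into left-canonical form via successive QR decompositions:

    import numpy as np

    def left_canonicalize(tensors):
        """Bring an MPS (list of (Dl, d, Dr) arrays, boundary dims 1)
        into left-canonical form without changing the encoded function."""
        carry = np.ones((1, 1))  # gauge matrix pushed from left to right
        canon = []
        for A in tensors:
            Dl, d, Dr = A.shape
            # Absorb the gauge from the previous site, then QR-decompose
            M = np.tensordot(carry, A, axes=(1, 0)).reshape(-1, Dr)
            Q, carry = np.linalg.qr(M)
            canon.append(Q.reshape(-1, d, Q.shape[1]))
        canon[-1] = canon[-1] * carry.item()  # last gauge is a 1x1 scalar
        return canon

Gauge-equivalent parameterizations, which encode identical functions, collapse onto essentially the same canonical tensors, so the canonical form exposes only the function the network computes and strips the spurious parameter degrees of freedom that could otherwise carry information about the training data.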
Publication: arXiv:2202.12319
Presenters
- Alejandro Pozas-Kerstjens, Institute of Mathematical Sciences
Authors
- Alejandro Pozas-Kerstjens, Institute of Mathematical Sciences