Scalable and robust reinforcement learning decoding of surface codes
ORAL
Abstract
A key component of fault-tolerant quantum computing is error decoding. On most practical quantum computing platforms, the decoder must operate under severe time and space constraints. Neural network decoders for quantum error-correction codes have constant runtimes, but robustness guarantees for them are difficult to derive.
We study the scalability and robustness against adversarial attacks of deep reinforcement learning decoders for the surface code. Our new Stim-based training environment evaluates the decoder under circuit-level Pauli noise while significantly speeding up training.
We developed and tested what is, to the best of our knowledge, the first robustness test for reinforcement learning decoders. The test quantifies the error-correction strength by sampling low-weight syndrome patterns that the agent is pessimistic about decoding correctly, such as irregular arrangements. We experiment with injecting the obtained patterns into training to increase the agent's robustness.
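The idea of searching for low-weight syndrome patterns that the agent is pessimistic about can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the work itself: the square grid of detector cells, the function names, and in particular `toy_value`, a hypothetical stand-in for a trained agent's estimate of its own decoding success that simply scores spread-out defect pairs lower.

```python
from itertools import combinations
from typing import Callable, Iterator, List, Tuple

# A syndrome is represented by the grid coordinates of its flipped detectors.
Cell = Tuple[int, int]
Syndrome = Tuple[Cell, ...]


def low_weight_syndromes(size: int, max_weight: int) -> Iterator[Syndrome]:
    """Enumerate every syndrome with 1..max_weight defects on a size x size grid."""
    cells = [(row, col) for row in range(size) for col in range(size)]
    for weight in range(1, max_weight + 1):
        yield from combinations(cells, weight)


def toy_value(syndrome: Syndrome) -> float:
    """Hypothetical value estimate (assumption): widely spread defects score lower."""
    spread = sum(
        abs(a[0] - b[0]) + abs(a[1] - b[1])
        for a, b in combinations(syndrome, 2)
    )
    return 1.0 / (1.0 + spread)


def most_pessimistic(
    value_fn: Callable[[Syndrome], float],
    size: int,
    max_weight: int,
    k: int,
) -> List[Syndrome]:
    """Return the k low-weight syndromes with the lowest value estimate."""
    ranked = sorted(low_weight_syndromes(size, max_weight), key=value_fn)
    return ranked[:k]


# Candidate patterns that could be injected into further training episodes.
hard_cases = most_pessimistic(toy_value, size=3, max_weight=2, k=2)
```

In an actual setup, `toy_value` would be replaced by the trained network's value or Q-estimate, and exhaustive enumeration would give way to sampling once the code distance makes the pattern space too large to list.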
Moreover, we improved the scalability of training after analyzing how the hyperparameters of the reinforcement learning algorithm influence it. We thereby obtain distance-13 decoders in less than 24 hours, improving on the previous state of the art, which was a distance-7 decoder.
Presenters
- Alexandru Paler, Aalto University
Authors
- Alexandru Paler, Aalto University
- Jerome Lenssen, Aalto University