Realizing a deep reinforcement learning agent discovering real-time feedback control strategies for a quantum system
ORAL
Abstract
To realize the full potential of quantum technologies, finding good strategies to control quantum information processing devices in real time becomes increasingly important. Usually these strategies require a precise understanding of the device itself, which is generally not available. Model-free reinforcement learning circumvents this need by discovering control strategies from scratch without relying on an accurate description of the quantum system. Furthermore, important tasks like state preparation, gate teleportation and error correction need feedback at time scales much shorter than the coherence time, which for superconducting circuits is in the microsecond range. Developing and training a deep reinforcement learning agent able to operate in this real-time feedback regime has been an open challenge.
Here, we have implemented such an agent in the form of a latency-optimized deep neural network on an FPGA. We demonstrate its use to efficiently initialize a superconducting qubit into a target state. To train the agent, we use model-free reinforcement learning that is based solely on measurement data. We study the agent's performance for high-fidelity, low-fidelity and three-level readout, and compare with simple strategies based on thresholding. This demonstration motivates further research towards adoption of reinforcement learning for real-time feedback control of quantum devices and more generally any physical system requiring learnable low-latency feedback control.
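The abstract does not spell out the training procedure; as an illustration only, the sketch below shows a minimal REINFORCE-style policy-gradient loop for feedback-based qubit initialization, trained purely from measurement outcomes. The environment interface (QubitInitEnv), action set, network size, and reward convention are hypothetical placeholders, not the implementation used in the experiment described above.

# Illustrative sketch only: minimal policy-gradient training of a
# feedback policy for qubit initialization. QubitInitEnv, the action set
# (terminate / idle / pi-pulse), and all hyperparameters are assumptions.
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Maps a (possibly weak) readout record to action probabilities."""
    def __init__(self, n_inputs=2, n_actions=3, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return torch.distributions.Categorical(logits=self.net(x))

def train_step(policy, optimizer, env, batch_size=256):
    """One model-free policy-gradient update from measurement data only."""
    log_probs, rewards = [], []
    for _ in range(batch_size):
        obs, done, lp = env.reset(), False, 0.0
        while not done:
            dist = policy(torch.as_tensor(obs, dtype=torch.float32))
            action = dist.sample()
            lp = lp + dist.log_prob(action)
            obs, reward, done = env.step(action.item())  # hypothetical interface
        log_probs.append(lp)
        rewards.append(reward)  # e.g. +1 if a final verification measurement finds the target state
    rewards = torch.tensor(rewards, dtype=torch.float32)
    baseline = rewards.mean()  # simple variance-reduction baseline
    loss = -(torch.stack(log_probs) * (rewards - baseline)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In this toy setting, a thresholding baseline of the kind mentioned above would simply compare the latest readout signal to a fixed threshold and apply a pi-pulse (or terminate) accordingly, with no learned parameters.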
Presenters
-
Jonas Landgraf
Max Planck Institute for the Science of Light
Authors
-
Jonas Landgraf
Max Planck Institute for the Science of Light
-
Kevin Reuer
ETH Zurich
-
Thomas Foesel
Max Planck Institute for the Science of Light
-
James O'Sullivan
ETH Zurich
-
Liberto Beltrán
ETH Zurich
-
Abdulkadir Akin
ETH Zurich
-
Graham J Norris
ETH Zurich
-
Ants Remm
ETH Zurich
-
Michael Kerschbaum
ETH Zurich
-
Jean-Claude Besse
ETH Zurich
-
Florian Marquardt
Max Planck Institute for the Science of Light, Friedrich-Alexander University Erlangen-Nürnberg
-
Andreas Wallraff
ETH Zurich
-
Christopher Eichler
ETH Zurich, FAU Erlangen-Nürnberg