Joint Seminar on Quantum Information and Technologies (Środowiskowe Seminarium z Informacji i Technologii Kwantowych)
join us / meeting
Matteo Rosati (Universitat Autònoma de Barcelona)
Real-time calibration of coherent-state receivers: learning by trial and error [ZOOM ID: 652 672 1604]
Optical communications technology uses light propagating in free space and optical fiber to transmit data for telecommunications and networking. When communication takes place over very long distances, the light signals are strongly attenuated and Heisenberg's uncertainty principle limits the ability to recover the message perfectly. In this regime, one has to consider that classical information is encoded in quantum states of light and transmitted over a bosonic channel. The ultimate information transmission rate is given by the Holevo capacity of these channels, and it can be attained by encoding the information in sequences of coherent states spanning several uses of the channel, or communication modes. Unfortunately, realizing an efficient receiver capable of distinguishing these quantum states with current technology is still an open problem, since it would require performing a joint measurement in a coherent-superposition basis. Known receiver structures for coherent states make use of simple Gaussian operations, photodetection and feedback. In this setting, we present several reinforcement learning methods that allow an automated agent to learn near-optimal receivers from scratch. Each agent is trained and tested in real time over several runs of independent discrimination experiments and has no knowledge of the energy of the states, the receiver setup, or the quantum-mechanical laws governing the experiments. Based exclusively on the observed photodetector outcomes, the agent adaptively chooses among a set of ~3×10^3 possible receiver setups and obtains a reward at the end of each experiment if its guess is correct. Importantly, the information gathered in each run is intrinsically stochastic and thus insufficient to evaluate exactly the performance of the chosen receiver.
Nevertheless, we present families of agents that: (i) discover a receiver beating the best Gaussian receiver after ~3×10^2 experiments; (ii) surpass the cumulative reward of the best Gaussian receiver after ~10^3 experiments; (iii) simultaneously discover a near-optimal receiver and attain its cumulative reward after ~10^5 experiments. Our results show that reinforcement learning techniques are suitable for online control of quantum receivers and can be employed for long-distance communications over potentially unknown channels.
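As a rough illustration of the trial-and-error idea described in the abstract, the following is a minimal sketch and not the method presented in the talk. It assumes a much simpler setting than the abstract: two equiprobable coherent states |+alpha> and |-alpha> with real amplitude, a single displacement beta followed by on/off photodetection, a fixed decision rule (no click means "-alpha"), and an epsilon-greedy agent that picks beta from a small grid using only the 0/1 reward. All function names, the grid, and the parameter values are hypothetical. The Helstrom and homodyne formulas are the standard textbook expressions for this binary problem and are included only as reference points.

```python
import math
import random


def success_probability(alpha, beta):
    """Exact success probability of a displacement + on/off detection receiver
    for equiprobable |+alpha>, |-alpha> (assumed real alpha > 0).
    Assumed rule: click -> guess +alpha, no click -> guess -alpha."""
    p_click_given_plus = 1.0 - math.exp(-(alpha + beta) ** 2)
    p_noclick_given_minus = math.exp(-(beta - alpha) ** 2)
    return 0.5 * (p_click_given_plus + p_noclick_given_minus)


def helstrom_bound(alpha):
    """Optimal (Helstrom) success probability for |+alpha> vs |-alpha>."""
    return 0.5 * (1.0 + math.sqrt(1.0 - math.exp(-4.0 * alpha ** 2)))


def homodyne_success(alpha):
    """Success probability of homodyne detection for the same two states."""
    return 0.5 * (1.0 + math.erf(math.sqrt(2.0) * alpha))


def run_experiment(alpha, beta, rng):
    """Simulate one discrimination experiment; return reward 1 (correct) or 0."""
    sign = rng.choice((+1, -1))                # state sent: +alpha or -alpha
    mean_photons = (sign * alpha + beta) ** 2  # energy after the displacement
    click = rng.random() < 1.0 - math.exp(-mean_photons)
    guess = +1 if click else -1
    return 1 if guess == sign else 0


def train_agent(alpha, betas, episodes, epsilon=0.1, seed=0):
    """Epsilon-greedy agent over a grid of displacements.
    The agent never sees alpha; it only observes the reward of each run."""
    rng = random.Random(seed)
    q = [0.0] * len(betas)    # running mean reward per displacement
    counts = [0] * len(betas)
    total_reward = 0
    for _ in range(episodes):
        if rng.random() < epsilon:
            k = rng.randrange(len(betas))                  # explore
        else:
            k = max(range(len(betas)), key=q.__getitem__)  # exploit
        reward = run_experiment(alpha, betas[k], rng)
        counts[k] += 1
        q[k] += (reward - q[k]) / counts[k]                # incremental mean
        total_reward += reward
    best = max(range(len(betas)), key=q.__getitem__)
    return betas[best], total_reward / episodes


if __name__ == "__main__":
    alpha = 0.4                                # hypothetical signal amplitude
    betas = [0.25 * k for k in range(9)]       # hypothetical grid: 0.0 .. 2.0
    best_beta, avg_reward = train_agent(alpha, betas, episodes=50000, seed=1)
    print("learned displacement:", best_beta)
    print("average reward during training:", round(avg_reward, 3))
    print("true success of learned receiver:",
          round(success_probability(alpha, best_beta), 3))
    print("homodyne:", round(homodyne_success(alpha), 3),
          "Helstrom:", round(helstrom_bound(alpha), 3))
```

The sketch omits the feedback between detection stages and the learned guessing rule that an adaptive receiver would need, so it only conveys how a reward signal alone can steer the choice of receiver setting despite the stochastic outcome of each run.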