ZTE Communications ›› 2023, Vol. 21 ›› Issue (2): 53-60. DOI: 10.12142/ZTECOM.202302008

• Special Topic •

Multi-User MmWave Beam Tracking via Multi-Agent Deep Q-Learning

MENG Fan1, HUANG Yongming2, LU Zhaohua3, XIAO Huahua3

  1. Purple Mountain Laboratories, Nanjing 211111, China
  2. School of Information Science and Engineering, Southeast University, Nanjing 210096, China
  3. State Key Laboratory of Mobile Network and Mobile Multimedia Technology, ZTE Corporation, Shenzhen 518057, China
  • Received: 2022-11-08  Online: 2023-06-13  Published: 2023-06-13
  • About author:
    MENG Fan (mengfan@pmlabs.com.cn) received his BS degree from the School of Electronic Engineering, University of Electronic Science and Technology of China in 2015 and his PhD degree from the School of Information Science and Engineering, Southeast University, China in 2020. He now works at Purple Mountain Laboratories. His research mainly focuses on applying machine learning techniques to wireless communication systems. His research interests include machine learning in the physical layer, model- and data-driven design, beamforming, beam alignment and tracking, and AI-enhanced positioning.
    HUANG Yongming received his BS and MS degrees from Nanjing University, China in 2000 and 2003, respectively, and his PhD degree in electrical engineering from Southeast University, China in 2007. Since March 2007, he has been with the School of Information Science and Engineering, Southeast University, where he is currently a full professor. During 2008–2009, he visited the Signal Processing Lab, Royal Institute of Technology, Sweden. He has authored or coauthored more than 200 peer-reviewed papers and holds more than 80 invention patents. He has submitted 20 technical contributions to IEEE standards. His research interests include intelligent 5G/6G mobile communications and millimeter wave wireless communications.
    LU Zhaohua received his PhD degree from Tianjin University, China in 2006. He is currently a senior wireless communication system research expert at ZTE Corporation and has long been engaged in wireless communication system design and key physical layer technologies. He has produced many technical contributions, papers, and patents on interference mitigation in the MIMO field.
    XIAO Huahua received his MS degree in computer software and theories from Sun Yat-Sen University, China. He is currently a senior engineer in the field of antenna algorithm pre-research with ZTE Corporation. He has applied for more than 150 patents in the multi-antenna field at home and abroad.

Abstract:

Beamforming is essential for millimeter wave (mmWave) multi-user massive multi-input multi-output systems. However, the overhead of acquiring channel state information and of beam training is considerable, especially in dynamic environments. To reduce this overhead, we propose a multi-user beam tracking algorithm using a distributed deep Q-learning method. By learning users’ moving trajectories online, the proposed algorithm learns to scan a beam subspace that maximizes the average effective sum rate. For practical implementation, we model the continuous beam tracking problem as a non-Markov decision process and accordingly develop a simplified deep Q-learning training scheme that reduces the training complexity. Furthermore, we propose a scalable state-action-reward design for scenarios with different numbers of users and antennas. Simulation results verify the effectiveness of the proposed method.

Key words: multi-agent deep Q-learning, centralized training and distributed execution, mmWave communication, beam tracking, scalability
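
The trade-off named in the abstract, scanning a beam subspace instead of the full codebook to raise the effective rate, can be illustrated with a minimal single-user sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the uniform linear array, the DFT codebook, the linear overhead model (each probed beam costs `probe_len` out of a slot of `slot_len`), and all function names. The learning agent itself is omitted; the sketch only shows the quantity such an agent could be rewarded with.

```python
import numpy as np

def dft_codebook(n_ant: int) -> np.ndarray:
    """Unit-norm DFT beams as columns (assumed half-wavelength uniform linear array)."""
    n = np.arange(n_ant)
    return np.exp(2j * np.pi * np.outer(n, n) / n_ant) / np.sqrt(n_ant)

def steering_vector(n_ant: int, angle_rad: float) -> np.ndarray:
    """Line-of-sight channel of the assumed array for a given angle."""
    return np.exp(1j * np.pi * np.arange(n_ant) * np.sin(angle_rad))

def beam_window(center: int, half_width: int, n_beams: int) -> list:
    """Beam subspace: indices around the previously best beam, wrapping around."""
    return [(center + d) % n_beams for d in range(-half_width, half_width + 1)]

def effective_rate(h, codebook, beam_indices, snr, slot_len=1.0, probe_len=0.01):
    """Scan only beam_indices, keep the strongest beam, and discount the rate
    by the fraction of the slot spent probing (hypothetical overhead model)."""
    gains = np.abs(np.conj(h) @ codebook[:, beam_indices]) ** 2
    best = int(np.argmax(gains))
    overhead = len(beam_indices) * probe_len / slot_len
    rate = max(0.0, 1.0 - overhead) * np.log2(1.0 + snr * gains[best])
    return rate, beam_indices[best]

n_ant = 16
codebook = dft_codebook(n_ant)
# User direction aligned with beam 3 of the assumed codebook.
h = steering_vector(n_ant, np.arcsin(2 * 3 / n_ant))

# Narrow subspace around the last known beam versus a full sweep.
rate_narrow, beam_narrow = effective_rate(h, codebook, beam_window(3, 1, n_ant), snr=1.0)
rate_full, beam_full = effective_rate(h, codebook, list(range(n_ant)), snr=1.0)
# Narrow subspace that misses the user after it has moved.
rate_miss, beam_miss = effective_rate(h, codebook, beam_window(8, 1, n_ant), snr=1.0)
```

Under these assumptions, the narrow window containing the true beam yields a higher effective rate than the full sweep, because both find the same beam and the window spends less of the slot probing. A window that misses the beam yields a much lower rate. Choosing the window from past observations is the decision a tracking agent would have to learn.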