ZTE Communications ›› 2022, Vol. 20 ›› Issue (4): 69-77. DOI: 10.12142/ZTECOM.202204009

• Research Paper •

Distributed Multi-Cell Multi-User MISO Downlink Beamforming via Deep Reinforcement Learning

JIA Haonan1, HE Zhenqing1, TAN Wanlong1, RUI Hua2,3, LIN Wei2,3

    1.National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu 611731, China
    2.ZTE Corporation, Shenzhen 518057, China
    3.State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen 518055, China
  • Received: 2022-02-24  Online: 2022-12-31  Published: 2022-12-30
  • About author:
    JIA Haonan received his BS and MS degrees in communication engineering from the University of Electronic Science and Technology of China in 2019 and 2022, respectively. His research interests focus on deep learning with application to wireless communications.
    HE Zhenqing (zhenqinghe@uestc.edu.cn) received his PhD degree in communication and information system from the University of Electronic Science and Technology of China (UESTC) in 2017. Since 2018, he has been with the National Key Laboratory of Science and Technology on Communications, UESTC, where he is currently an associate professor. His main research interests include statistical signal processing, wireless communications, and machine learning. He was a recipient of the IEEE Communications Society Heinrich Hertz Prize Paper Award in 2022.
    TAN Wanlong received his BS degree in communication engineering from Jilin University, China in 2020. He is currently pursuing his MS degree in communication engineering with the University of Electronic Science and Technology of China. His research interests include wireless communications and reconfigurable intelligent surfaces.
    RUI Hua received his BS, MS and PhD degrees from Nanjing University of Aeronautics and Astronautics, China in 1999, 2002, and 2005, respectively. He currently works as a senior pre-research expert and the head of the 6G Future Wireless Lab in ZTE Corporation. He has been engaged in wireless communication product and new technology pre-research, including 3G/4G/Wi-Fi/5G/6G network architecture and key technologies. His main research direction is the 6G wireless communication technology. He has published more than 20 invention patents and papers in related fields. He has been engaged in more than 10 industry technical standards and white papers, including 3GPP 3G/4G/5G series standards and IEEE 802.11 series standards.
    LIN Wei received her BS and MS degrees in communication and information system from Northwestern Polytechnical University, China in 2002 and 2005, respectively. At present, she works in ZTE Corporation as a senior algorithm engineer in the Algorithm Department. Her research interests include 6G wireless communication physical layer technology and wireless AI technology. She has applied for more than 20 invention patents in related fields.
  • Supported by:
    the joint research project with ZTE Corporation(HC-CN-2020120002)


The sum rate maximization beamforming problem for a multi-cell multi-user multiple-input single-output interference channel (MISO-IC) system is considered. Conventional centralized and distributed beamforming solutions for the MISO-IC system have high computational complexity and impose a heavy burden of channel state information exchange between base stations (BSs), which becomes even worse in large-scale antenna systems. To address this, we propose a distributed deep reinforcement learning (DRL) based approach with limited information exchange. Specifically, the original beamforming problem is decomposed into the subproblems of beam direction design and power allocation, and the cost of information exchange between BSs is significantly reduced. In particular, each BS is equipped with an independent deep deterministic policy gradient network that learns to choose the beam direction scheme and simultaneously allocate power to users. Simulation results show that the proposed DRL-based approach achieves sum rate performance comparable to that of conventional distributed beamforming solutions with much less information exchange.

Key words: deep reinforcement learning, downlink beamforming, multiple-input single-output interference channel
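
To make the objective in the abstract concrete, the following minimal Python sketch evaluates the sum rate of a MISO-IC downlink when each beamformer is split into a unit-norm beam direction and a transmit power. It uses the standard textbook SINR expression, not code or notation taken from the paper; the function names (`inner`, `normalize`, `sum_rate`), the data layout, and the toy channel values are all assumptions made for illustration.

```python
import math


def inner(h, w):
    """Return h^H w for two equal-length complex vectors."""
    return sum(hc.conjugate() * wc for hc, wc in zip(h, w))


def normalize(v):
    """Scale v to unit Euclidean norm so it can serve as a beam direction."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def sum_rate(channels, serving_bs, directions, powers, noise_power):
    """Sum rate in bit/s/Hz for a multi-cell MISO downlink.

    channels[b][u] : channel vector from BS b to user u (assumed layout)
    serving_bs[u]  : index of the BS that serves user u
    directions[u]  : unit-norm beam direction used for user u
    powers[u]      : transmit power allocated to user u
    """
    num_users = len(serving_bs)
    total = 0.0
    for u in range(num_users):
        signal = 0.0
        interference = 0.0
        for v in range(num_users):
            # The beam for user v is sent by BS serving_bs[v] and is
            # observed at user u through the channel from that BS.
            h = channels[serving_bs[v]][u]
            received = powers[v] * abs(inner(h, directions[v])) ** 2
            if v == u:
                signal = received
            else:
                interference += received
        total += math.log2(1.0 + signal / (noise_power + interference))
    return total


# Toy example (hypothetical values): two BSs with two antennas each,
# one user per cell, and BS 0 leaking interference to user 1.
channels = [
    [[1 + 0j, 0j], [1 + 0j, 0j]],   # from BS 0 to users 0 and 1
    [[0j, 0j], [0j, 2 + 0j]],       # from BS 1 to users 0 and 1
]
serving_bs = [0, 1]
# Beam directions aligned with each user's own channel (maximum ratio
# transmission), chosen only to keep the example simple.
directions = [normalize(channels[0][0]), normalize(channels[1][1])]
powers = [1.0, 1.0]

print(sum_rate(channels, serving_bs, directions, powers, noise_power=1.0))
```

In this decomposition a learning agent at each BS would only need to output a direction choice and a power level per user; the sketch covers the rate evaluation only and does not implement any reinforcement learning.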