Distributed Multi-Cell Multi-User MISO Downlink Beamforming via Deep Reinforcement Learning

doi:10.12142/ZTECOM.202204009

ZTE Communications, 2022, Vol. 20, Issue 4: 69-77. DOI: 10.12142/ZTECOM.202204009

Research Paper


JIA Haonan^1, HE Zhenqing^1, TAN Wanlong^1, RUI Hua^2,3, LIN Wei^2,3

^1. National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu 611731, China
^2. ZTE Corporation, Shenzhen 518057, China
^3. State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen 518055, China

Received: 2022-02-24 Online: 2022-12-31 Published: 2022-12-30
About author:
JIA Haonan received his BS and MS degrees in communication engineering from the University of Electronic Science and Technology of China, in 2019 and 2022, respectively. His research interests focus on deep learning with application to wireless communications.
HE Zhenqing (zhenqinghe@uestc.edu.cn) received his PhD degree in communication and information system from the University of Electronic Science and Technology of China (UESTC) in 2017. Since 2018, he has been with the National Key Laboratory of Science and Technology on Communications, UESTC, where he is currently an associate professor. His main research interests include statistical signal processing, wireless communications, and machine learning. He was a recipient of the IEEE Communications Society Heinrich Hertz Prize Paper Award in 2022.
TAN Wanlong received his BS degree in communication engineering from Jilin University, China in 2020. He is currently pursuing his MS degree in communication engineering with the University of Electronic Science and Technology of China. His research interests include wireless communications and reconfigurable intelligent surface.
RUI Hua received his BS, MS and PhD degrees from Nanjing University of Aeronautics and Astronautics, China in 1999, 2002, and 2005, respectively. He currently works as a senior pre-research expert and the head of the 6G Future Wireless Lab in ZTE Corporation. He has been engaged in wireless communication product and new technology pre-research, including 3G/4G/Wi-Fi/5G/6G network architecture and key technologies. His main research direction is the 6G wireless communication technology. He has published more than 20 invention patents and papers in related fields. He has been engaged in more than 10 industry technical standards and white papers including 3GPP 3G/4G/5G series standards and IEEE 802.11 series standards.
LIN Wei received her BS and MS degrees in communication and information system from Northwestern Polytechnical University, China in 2002 and 2005, respectively. At present, she works in ZTE Corporation as a senior algorithm engineer in the Algorithm Department. Her research interests include 6G wireless communication physical layer technology and wireless AI technology. She has applied for more than 20 invention patents in related fields.
Supported by:
the joint research project with ZTE Corporation (HC-CN-2020120002)

Abstract


This paper considers the sum-rate maximization beamforming problem for a multi-cell multi-user multiple-input single-output interference channel (MISO-IC) system. Conventional centralized and distributed beamforming solutions for the MISO-IC system have high computational complexity and impose a heavy burden of channel state information exchange between base stations (BSs), which becomes even worse in large-scale antenna systems. To address this, we propose a distributed deep reinforcement learning (DRL) based approach with limited information exchange. Specifically, the original beamforming problem is decomposed into two subproblems, beam direction design and power allocation, and the cost of information exchange between BSs is significantly reduced. Each BS is equipped with an independent deep deterministic policy gradient network that learns to choose the beam direction scheme and simultaneously allocate power to users. Simulation results show that the proposed DRL-based approach achieves a sum rate comparable to that of conventional distributed beamforming solutions with much less information exchange.
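As a minimal sketch of the decomposition described in the abstract, the following NumPy code writes each beamformer as a unit-norm beam direction scaled by the square root of its power and evaluates the resulting sum rate. The array layout (h[i, j, k] as the channel from BS i to user k in cell j), the noise power, the equal power split, and the use of maximum ratio transmission (MRT) directions are illustrative assumptions, not the paper's method; in the proposed scheme the DRL agents would choose the directions and powers instead.

```python
import numpy as np


def mrt_directions(h):
    """Unit-norm MRT beam directions (an illustrative baseline choice).

    h has shape (N, N, K, M); h[i, j, k] is the assumed channel from
    BS i to user k in cell j.
    """
    own = np.einsum("jjkm->jkm", h)  # own-cell channels h[j, j, k, :]
    return own / np.linalg.norm(own, axis=-1, keepdims=True)


def sum_rate(h, directions, powers, noise_power=1.0):
    """Sum rate in bits/s/Hz for beamformers w = sqrt(power) * direction.

    directions: shape (N, K, M), unit norm along the last axis
    powers:     shape (N, K), power BS j allocates to its user k
    """
    w = np.sqrt(powers)[..., None] * directions  # (N, K, M)
    # gain[i, l, j, k]: power received by user (j, k) from beam l of BS i
    gain = np.abs(np.einsum("ijkm,ilm->iljk", h.conj(), w)) ** 2
    total = gain.sum(axis=(0, 1))            # all received power per user
    desired = np.einsum("jkjk->jk", gain)    # the user's own beam
    sinr = desired / (total - desired + noise_power)
    return float(np.log2(1.0 + sinr).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, K, M = 3, 2, 8  # hypothetical numbers of cells, users per cell, antennas
    h = (rng.standard_normal((N, N, K, M))
         + 1j * rng.standard_normal((N, N, K, M))) / np.sqrt(2)
    directions = mrt_directions(h)
    powers = np.full((N, K), 1.0 / K)  # equal split of a unit power budget
    print(sum_rate(h, directions, powers))
```

Separating direction from power in this way means a learning agent only has to output a direction choice and a scalar power per user, rather than a full complex beamforming vector.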

Key words: deep reinforcement learning, downlink beamforming, multiple-input single-output interference channel

JIA Haonan, HE Zhenqing, TAN Wanlong, RUI Hua, LIN Wei. Distributed Multi-Cell Multi-User MISO Downlink Beamforming via Deep Reinforcement Learning[J]. ZTE Communications, 2022, 20(4): 69-77.

Figures/Tables 8

Figure 1 Framework of the designed downlink data transmission process

Figure 2 Illustration of information exchange process for the BS n

Figure 3 Illustration of the MADDPG-based scheme for multi-cell multi-user multiple-input single-output interference channel (MISO-IC) system

Figure 4 Structures of the action and Q-networks

Table 1 Comparison of the information exchange

Schemes	Required Information	Information Exchange
MADDPG	$\bar{H}(t), \bar{C}_n(t-1), R(t-1)$	$O(NK)$
FP^[3]	$h_{i,j,k}(t), \forall i, j, k$	$O(MNK)$
ZG^[6]	$h_{i,j,k}(t), \forall j, k$ for the BS $i$	$O(MNK)$
MRT/ZF^[5]	$h_{i,i,k}(t), \forall k$	0
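
To make the scaling in Table 1 concrete, the sketch below evaluates the order-of-growth terms for a given system size. It assumes N is the number of cells, K the number of users per cell, and M the number of BS antennas (the table itself does not define them), and it ignores constant factors, so the returned values are growth terms rather than exact message counts. The function name is hypothetical.

```python
def exchange_growth_term(scheme, N, K, M):
    """Order-of-growth term for information exchange, per Table 1.

    Constant factors are dropped; N, K, M are assumed to be the numbers
    of cells, users per cell, and BS antennas, respectively.
    """
    terms = {
        "MADDPG": N * K,      # O(NK)
        "FP": M * N * K,      # O(MNK)
        "ZG": M * N * K,      # O(MNK)
        "MRT/ZF": 0,          # no exchange between BSs
    }
    return terms[scheme]


if __name__ == "__main__":
    N, K, M = 3, 4, 64  # hypothetical system size
    for scheme in ("MADDPG", "FP", "ZG", "MRT/ZF"):
        print(scheme, exchange_growth_term(scheme, N, K, M))
```

Under these assumptions the gap between the O(NK) and O(MNK) schemes grows linearly with the antenna count M, which is why the saving matters most in large-scale antenna systems.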

Figure 5 Convergence behavior of the proposed multi-agent deep deterministic policy gradient (MADDPG) algorithm under different initialization of network weights

Figure 6 Average achievable rate of various schemes versus the number of time slots, where each point is a moving average over the previous 300 time slots with the UE distribution factor set to 0.3

Figure 7 Average achievable rate of various schemes versus the UE distribution factor

References 19

1	SOMEKH O, SIMEONE O, BAR-NESS Y, et al. Cooperative multicell zero-forcing beamforming in cellular downlink channels [J]. IEEE transactions on information theory, 2009, 55(7): 3206–3219. DOI: 10.1109/TIT.2009.2021371
2	HUANG Y M, ZHENG G, BENGTSSON M, et al. Distributed multicell beamforming with limited intercell coordination [J]. IEEE transactions on signal processing, 2011, 59(2): 728–738. DOI: 10.1109/TSP.2010.2089621
3	SHEN K M, YU W. Fractional programming for communication systems—part I: power control and beamforming [J]. IEEE transactions on signal processing, 2018, 66(10): 2616–2630. DOI: 10.1109/TSP.2018.2812733
4	ZHANG R, CUI S G. Cooperative interference management with MISO beamforming [J]. IEEE transactions on signal processing, 2010, 58(10): 5450–5458. DOI: 10.1109/TSP.2010.2056685
5	BJÖRNSON E, ZAKHOUR R, GESBERT D, et al. Cooperative multicell precoding: rate region characterization and distributed strategies with instantaneous and statistical CSI [J]. IEEE transactions on signal processing, 2010, 58(8): 4298–4310. DOI: 10.1109/TSP.2010.2049996
6	PARK S H, PARK H, LEE I. Distributed beamforming techniques for weighted sum-rate maximization in MISO interference channels [J]. IEEE communications letters, 2010, 14(12): 1131–1133. DOI: 10.1109/LCOMM.2010.12.101635
7	GE J G, LIANG Y C, JOUNG J, et al. Deep reinforcement learning for distributed dynamic MISO downlink-beamforming coordination [J]. IEEE transactions on communications, 2020, 68(10): 6070–6085. DOI: 10.1109/TCOMM.2020.3004524
8	KHAN A A, ADVE R S. Centralized and distributed deep reinforcement learning methods for downlink sum-rate optimization [J]. IEEE transactions on wireless communications, 2020, 19(12): 8410–8426. DOI: 10.1109/TWC.2020.3022705
9	INDYK P, MOTWANI R. Approximate nearest neighbors: towards removing the curse of dimensionality [C]//The Thirtieth Annual ACM Symposium on Theory of Computing. STOC, 1998: 604–613
10	YING D W, VOOK F W, THOMAS T A, et al. Kronecker product correlation model and limited feedback codebook design in a 3D channel model [C]//Proceedings of 2014 IEEE International Conference on Communications. IEEE, 2014: 5865–5870. DOI: 10.1109/ICC.2014.6884258
11	DONG M, TONG L, SADLER B M. Optimal insertion of pilot symbols for transmissions over time-varying flat fading channels [J]. IEEE transactions on signal processing, 2004, 52(5): 1403–1418. DOI: 10.1109/TSP.2004.826182
12	SCHUBERT M, BOCHE H. Solution of the multiuser downlink beamforming problem with individual SINR constraints [J]. IEEE transactions on vehicular technology, 2004, 53(1): 18–28. DOI: 10.1109/TVT.2003.819629
13	CHRISTENSEN S S, AGARWAL R, DE CARVALHO E, et al. Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design [J]. IEEE transactions on wireless communications, 2008, 7(12): 4792–4799. DOI: 10.1109/T-WC.2008.070851
14	JORSWIECK E A, LARSSON E G, DANEV D. Complete characterization of the Pareto boundary for the MISO interference channel [J]. IEEE transactions on signal processing, 2008, 56(10): 5292–5296. DOI: 10.1109/TSP.2008.928095
15	LIM Y G, CHAE C B, CAIRE G. Performance analysis of massive MIMO for cell-boundary users [J]. IEEE transactions on wireless communications, 2015, 14(12): 6827–6842. DOI: 10.1109/TWC.2015.2460751
16	MENG F, CHEN P, WU L N, et al. Power allocation in multi-user cellular networks: deep reinforcement learning approaches [J]. IEEE transactions on wireless communications, 2020, 19(10): 6255–6267. DOI: 10.1109/TWC.2020.3001736
17	HESTER T, VECERIK M, PIETQUIN O, et al. Deep Q-learning from demonstrations [EB/OL]. [2022-02-02]. DOI: 10.1609/aaai.v32i1.11757
18	DONG S K, CHEN J R, LIU Y, et al. Reinforcement learning from algorithm model to industry innovation: a foundation stone of future artificial intelligence [J]. ZTE communications, 2019, 17(3): 31–41. DOI: 10.12142/ZTECOM.201903006
19	SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms [C]//Proceeding of International Conference on Machine Learning. ICML, 2014: 387–395

