RIS-Assisted UAV-D2D Communications Exploiting Deep Reinforcement Learning

doi:10.12142/ZTECOM.202302009

ZTE Communications, 2023, Vol. 21, Issue (2): 61-69. DOI: 10.12142/ZTECOM.202302009

• Special Topic •

YOU Qian¹, XU Qian¹, YANG Xin¹, ZHANG Tao², CHEN Ming³

1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China
2. China Academy of Launch Vehicle Technology, Beijing 100076, China
3. Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou 310051, China

Received: 2023-02-21 Online: 2023-06-13 Published: 2023-06-13
About author:
YOU Qian received her BS degree from Yangzhou University, China in 2021. She is currently pursuing her MS degree at the School of Electronics and Information, Northwestern Polytechnical University, China. Her research interests include machine learning for communications, IRS-assisted communications, and UAV-assisted communications.
XU Qian (qianxu@nwpu.edu.cn) received her BS and PhD degrees from Xi’an Jiaotong University, China. She is currently an associate professor at the School of Electronics and Information, Northwestern Polytechnical University, China. Her current research interests include mobile wireless communications with emphasis on physical layer security and QoS provisioning. She has published more than 20 technical papers.
YANG Xin received his BS degree in communication engineering and MS degree in electronics and communication engineering from Xidian University, China in 2011 and 2014, respectively, and his PhD degree in information and communication engineering from Northwestern Polytechnical University, China in 2018. He is an associate professor with the School of Electronics and Information, Northwestern Polytechnical University. His research interests include wireless communications and ad hoc networks.
ZHANG Tao received his MS degree from Nankai University, China. He is currently a senior engineer at the China Academy of Launch Vehicle Technology. His current research interests mainly focus on wireless communications.
CHEN Ming received his BS degree from Xidian University, China. He is currently a senior engineer at Hangzhou Hikvision Digital Technology Co., Ltd. His current research interests mainly focus on intelligent signal processing.
Supported by: the National Natural Science Foundation of China (62201462)

Abstract

Device-to-device (D2D) communications underlaying cellular networks enabled by unmanned aerial vehicles (UAVs) have been regarded as a promising technique for next-generation communications. To mitigate the strong interference caused by the line-of-sight (LoS) air-to-ground channels, we deploy a reconfigurable intelligent surface (RIS) to rebuild the wireless channels. A joint optimization problem over the transmit power of the UAV, the transmit power of the D2D users, and the RIS phase configuration is investigated to maximize the achievable rate of the D2D users while satisfying the quality of service (QoS) requirement of the cellular users. Due to the high channel dynamics and the coupling among the cellular users, the RIS, and the D2D users, it is challenging to find a proper solution. Thus, a RIS softmax deep double deterministic (RIS-SD3) policy gradient method is proposed, which can smooth the optimization space as well as reduce the number of local optima. Specifically, the SD3 algorithm introduces a softmax operator into the value function and trains the agent to maximize it, thereby maximizing the reward of the agent. Simulation results show that the proposed RIS-SD3 algorithm can significantly improve the rate of the D2D users while controlling the interference to the cellular user. Moreover, the proposed RIS-SD3 algorithm is more robust than the twin delayed deep deterministic (TD3) policy gradient algorithm in a dynamic environment.
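To make the softmax operator mentioned in the abstract concrete, the following minimal Python sketch computes a Boltzmann softmax value over a finite set of Q-value samples. It illustrates the general operator only, not the RIS-SD3 implementation, whose details are not given on this page; the names `softmax_value` and `beta` are hypothetical and chosen for illustration.

```python
import math

def softmax_value(q_values, beta):
    """Boltzmann softmax operator over a finite set of Q-value samples.

    Returns sum_i w_i * q_i with weights w_i proportional to exp(beta * q_i).
    `beta` is an illustrative inverse-temperature parameter (an assumption).
    """
    q_max = max(q_values)
    # Shift by the maximum before exponentiating to avoid overflow.
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total

q_samples = [1.0, 2.0, 3.0]
print(softmax_value(q_samples, beta=0.0))   # equal weights: the plain mean
print(softmax_value(q_samples, beta=50.0))  # large beta: approaches the maximum
```

With `beta = 0` the operator reduces to the mean of the samples, and as `beta` grows it approaches the maximum, so the value estimate is a smooth interpolation between the two.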

Key words: device-to-device communications, reconfigurable intelligent surface, deep reinforcement learning, softmax deep double deterministic policy gradient

YOU Qian, XU Qian, YANG Xin, ZHANG Tao, CHEN Ming. RIS-Assisted UAV-D2D Communications Exploiting Deep Reinforcement Learning[J]. ZTE Communications, 2023, 21(2): 61-69.

Figures/Tables 8

Figure 1 System model of a practical RIS-assisted unmanned aerial vehicle (UAV)-D2D communication network

Figure 2 Workflow of the softmax deep double deterministic (SD3) policy gradient algorithm

Table 1 Parameters of the proposed system

Symbol	Parameter	Value
	Location of UAV	From (0, 0, 1) m to (0, 60, 1) m
	Location of RIS	(0, 10, 2) m
	Location of CUE	(20, 0, 1) m
	Location of DT1	(20, 60, 1) m
	Distance of D2D	5 m
	Size area of D2D	10 m
$\mathrm{SINR}_{thr}$	Minimum SINR of CUE	12 dB
$Rd_{thr}$	Minimum achievable rate of D2D	2 dB
$I_T$	Maximum interference of CUE	-30 dB
$P_{\max}$	Max transmit power of UAV	30 W
$P_t$	Max transmit power of DT	10 W, 20 W, 30 W
$\beta$	Path loss coefficient	-30 dB
$\alpha_0$	Path loss exponent over the user-UAV link	3
$\nu$	Path loss exponent	2.5
$\rho$	The path loss at the reference distance	0.01
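As a rough illustration of how the Table 1 geometry and path loss parameters might combine, the sketch below evaluates the UAV-to-CUE link at both ends of the UAV trajectory. The log-distance gain model (linear gain = reference path loss × distance raised to the negative exponent, with a 1 m reference distance) is an assumption made here for illustration; the page does not state the channel model used in the paper. The helper names `distance`, `path_gain`, and `to_db` are hypothetical.

```python
import math

# Values copied from Table 1; the gain model itself is an assumption.
RHO = 0.01                    # path loss at the reference distance (assumed 1 m)
ALPHA_0 = 3.0                 # path loss exponent over the user-UAV link
CUE = (20.0, 0.0, 1.0)        # cellular user location in metres
UAV_START = (0.0, 0.0, 1.0)   # first point of the UAV trajectory
UAV_END = (0.0, 60.0, 1.0)    # last point of the UAV trajectory

def distance(p, q):
    """Euclidean distance in metres between two 3-D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def path_gain(d, rho, exponent):
    """Assumed log-distance model: linear gain rho * d^(-exponent)."""
    return rho * d ** (-exponent)

def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)

for name, uav in (("start", UAV_START), ("end", UAV_END)):
    d = distance(uav, CUE)
    gain_db = to_db(path_gain(d, RHO, ALPHA_0))
    print(f"{name}: d = {d:.2f} m, gain = {gain_db:.2f} dB")
```

Under this assumed model the UAV-to-CUE gain falls as the UAV moves away from the CUE along its trajectory.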

Figure 3 Effect of the learning rate

Figure 4 Effect of batch size on the training model

Figure 5 RIS-SD3 in comparison with other baseline schemes

Figure 6 Sum rate under different Pt

Figure 7 Sum rate under different Pmax
