ZTE Communications ›› 2023, Vol. 21 ›› Issue (2): 70-79. DOI: 10.12142/ZTECOM.202302010
• Special Topic •
LIU Chenyao1, GUO Jiejie2, ZHANG Yimeng1, XU Wenjun1,3, LIU Yiming1
Received: 2023-02-11
Online: 2023-06-13
Published: 2023-06-13
About author:
LIU Chenyao received her BE degree from the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT), China in 2022. She is currently pursuing her PhD degree at the School of Artificial Intelligence, BUPT. Her research interests include semantic communication, video coding, and machine learning.
GUO Jiejie is currently pursuing her BE degree at the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, China. Her research interests include semantic communication, video coding, and artificial intelligence.
ZHANG Yimeng received her BE degree from the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT), China in 2018. She is currently pursuing her PhD degree at the School of Artificial Intelligence, BUPT. Her research interests include semantic communication and intelligent resource allocation in emerging wireless applications. She is a Graduate Student Member of IEEE.
XU Wenjun (Supported by:
LIU Chenyao, GUO Jiejie, ZHANG Yimeng, XU Wenjun, LIU Yiming. SST-V: A Scalable Semantic Transmission Framework for Video[J]. ZTE Communications, 2023, 21(2): 70-79.
Number | Scheme | PSNR/dB | MS-SSIM/dB |
---|---|---|---|
1 | Without SIE, fixed Level 1 (L1) | 23.937 6 | 6.704 7 |
2 | Without SIE, fixed Level 2 (L2) | 27.307 8 | 8.743 6 |
3 | With SIE, fixed Level 1 (SIE-L1) | 26.099 3 | 7.454 1 |
4 | With SIE, fixed Level 2 (SIE-L2) | 28.900 34 | 9.754 9 |
5 | Scalable multilevel coding without SIE | 29.935 9 | 11.682 3 |
6 (SST-V) | Scalable multilevel coding with SIE | 31.190 78 | 14.349 9 |
Table 1 PSNR and MS-SSIM of different schemes
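The gains implied by Table 1 can be read off with a little arithmetic. The following is a minimal sketch, not the authors' evaluation code: the dictionary keys are shorthand labels introduced here, the spaced digits in the table are assumed to be digit grouping (so "28.900 34" is read as 28.90034), and the helper `ms_ssim_from_db` assumes the commonly used mapping MS-SSIM/dB = -10·log10(1 - MS-SSIM), which this page does not state.

```python
# Values copied from Table 1: scheme label -> (PSNR in dB, MS-SSIM in dB).
# The labels are shorthand chosen for this sketch.
TABLE_1 = {
    "L1": (23.9376, 6.7047),
    "L2": (27.3078, 8.7436),
    "SIE-L1": (26.0993, 7.4541),
    "SIE-L2": (28.90034, 9.7549),
    "Scalable, no SIE": (29.9359, 11.6823),
    "SST-V": (31.19078, 14.3499),
}


def gain_db(scheme, baseline):
    """Return (PSNR gain, MS-SSIM gain) in dB of `scheme` over `baseline`."""
    psnr_s, msssim_s = TABLE_1[scheme]
    psnr_b, msssim_b = TABLE_1[baseline]
    return round(psnr_s - psnr_b, 4), round(msssim_s - msssim_b, 4)


def ms_ssim_from_db(value_db):
    """Invert the ASSUMED mapping dB = -10 * log10(1 - MS-SSIM)."""
    return 1.0 - 10.0 ** (-value_db / 10.0)


if __name__ == "__main__":
    # Effect of adding SIE at each coding configuration.
    pairs = [("SIE-L1", "L1"), ("SIE-L2", "L2"), ("SST-V", "Scalable, no SIE")]
    for scheme, baseline in pairs:
        print(scheme, "vs", baseline, gain_db(scheme, baseline))
    # Linear-scale MS-SSIM of SST-V under the assumed mapping.
    print("SST-V MS-SSIM (linear, assumed mapping):",
          round(ms_ssim_from_db(TABLE_1["SST-V"][1]), 4))
```

Under these readings, adding SIE raises PSNR by roughly 1.3 to 2.2 dB at each configuration, and SST-V has the highest PSNR and MS-SSIM of the six schemes.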