ZTE Communications ›› 2023, Vol. 21 ›› Issue (4): 29-37. DOI: 10.12142/ZTECOM.202304004

• Special Topic •

Lossy Point Cloud Attribute Compression with Subnode-Based Prediction

YIN Qian1, ZHANG Xinfeng2, HUANG Hongyue1, WANG Shanshe1, MA Siwei1

  1. School of Computer Science, Peking University, Beijing 100871, China
  2. School of Computer Science and Technology, University of the Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-10-07 Online: 2023-12-25 Published: 2023-12-07
  • About author: YIN Qian received her MS degree in signal and information processing from the University of Electronic Science and Technology of China in 2021. She is currently pursuing a PhD degree in computer science at Peking University, China. She actively participates in the research work of the Audio Video Coding Standard (AVS) Workgroup of China and the Moving Picture Experts Group (MPEG). Her research interests include video and point cloud compression.
    ZHANG Xinfeng received his BS degree in computer science from Hebei University of Technology, China in 2007 and PhD degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences in 2014. From 2014 to 2017, he was a research fellow with the Rapid-Rich Object Search Lab, Nanyang Technological University, Singapore. From October 2017 to October 2018, he was a postdoctoral fellow with the School of Electrical Engineering System, University of Southern California, Los Angeles, USA. From December 2018 to August 2019, he was a research fellow with the Department of Computer Science, City University of Hong Kong, China. He is currently an assistant professor with the School of Computer Science and Technology, University of Chinese Academy of Sciences. He has authored more than 150 refereed journal/conference papers. His research interests include video compression and processing, image/video quality assessment, and 3D point cloud processing.
    HUANG Hongyue received his BE degree in communication engineering from Beijing University of Posts and Telecommunications, China in 2015, MS degree in computer science from the Technical University of Berlin (TUB), Germany in 2018, and PhD degree in computer science from the Free University of Brussels (VUB), Belgium in 2021. He is currently a postdoctoral researcher with the National Engineering Research Center of Visual Technology, Peking University, China. His research interests include inventing and optimizing deep-learning-based compression methods for 2D images/videos and 3D visual content such as immersive videos, point clouds, and light field images.
    WANG Shanshe received his BS degree from the Department of Mathematics, Heilongjiang University, China in 2004, MS degree in computer software and theory from Northeast Petroleum University, China in 2010, and PhD degree in computer science from the Harbin Institute of Technology, China. He held a postdoctoral position with Peking University, China from 2016 to 2018. He is currently an associate researcher with the School of Electronics Engineering and Computer Science, Institute of Digital Media, Peking University. His current research interests include video compression and image and video quality assessment.
    MA Siwei (swma@stu.pku.edu.cn) received his BS degree from Shandong Normal University, China in 1999, and PhD degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences in 2005. He held a postdoctoral position with the University of Southern California, Los Angeles, USA from 2005 to 2007. He is currently a professor with the School of Electronics Engineering and Computer Science, Institute of Digital Media, Peking University, China. He has authored over 300 technical articles in refereed journals and proceedings in image and video coding, video processing, video streaming, and transmission. He served/serves as an associate editor for the IEEE Transactions on Circuits and Systems for Video Technology and Journal of Visual Communication and Image Representation.
  • Supported by:
    China Postdoctoral Science Foundation (2022M720234); the National Natural Science Foundation (62071449)

Abstract:

In recent years, 3D point cloud compression (PCC) has become a research hotspot in both academia and industry. In industry in particular, the Moving Picture Experts Group (MPEG) has actively initiated the development of PCC standards. One of the adopted frameworks, geometry-based PCC (G-PCC), codes geometry first and then attributes, and uses the region adaptive hierarchical transform (RAHT) for lossy attribute compression. The upsampled transform domain prediction in RAHT does not sufficiently exploit the attribute correlations between neighbor nodes and therefore leaves attribute redundancy between them. In this paper, we propose a subnode-based prediction method that takes the spatial position relationship between neighbor nodes into account to improve prediction precision. Specifically, we use already-encoded neighbor nodes to assist the upsampled transform domain prediction in RAHT through a weighted average strategy. Experimental results show that the proposed attribute compression method achieves better rate-distortion (R-D) performance than the latest MPEG G-PCC reference software (both TMC13-v22.0 and GeS-TM-v2.0).
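
The weighted average strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weights or the neighbor selection rule, so everything here is an assumption made for illustration: the function name `predict_subnode_attribute` is hypothetical, the weights are taken as inverse Euclidean distances between node centers, and attributes are single scalar values. It is not the method implemented in TMC13 or GeS-TM.

```python
from math import dist


def predict_subnode_attribute(subnode_center, neighbors, eps=1e-9):
    """Predict a subnode attribute from already-encoded neighbor nodes.

    ASSUMPTION: inverse-distance weights; the paper's actual weights
    are not stated in the abstract.

    subnode_center: (x, y, z) of the subnode to be predicted.
    neighbors: list of ((x, y, z), attribute) pairs for encoded nodes.
    eps: small constant that avoids division by zero for coincident centers.
    """
    if not neighbors:
        raise ValueError("at least one encoded neighbor is required")

    # A neighbor that is spatially closer receives a larger weight.
    weights = [1.0 / (dist(subnode_center, center) + eps)
               for center, _ in neighbors]
    total = sum(weights)

    # Normalized weighted average of the neighbor attributes.
    return sum(w * attr for w, (_, attr) in zip(weights, neighbors)) / total


# Hypothetical usage: two encoded neighbors at distances 1 and 2.
prediction = predict_subnode_attribute(
    (0.0, 0.0, 0.0),
    [((1.0, 0.0, 0.0), 10.0), ((2.0, 0.0, 0.0), 40.0)],
)
```

With these inputs the closer neighbor pulls the prediction toward its own value, so the result falls below the plain mean of the two attributes. This is the intuition behind taking spatial position into account rather than averaging uniformly.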

Key words: point cloud compression, MPEG G-PCC, RAHT, subnode-based prediction