ZTE Communications, 2023, Vol. 21, Issue (4): 38-46. DOI: 10.12142/ZTECOM.202304005
• Special Topic •
Point Cloud Processing Methods for 3D Point Cloud Detection Tasks
WANG Chongchong1, LI Yao2, WANG Beibei3, CAO Hong3, ZHANG Yanyong2
Received: 2023-10-07
Online: 2023-12-07
Published: 2023-12-07
About author:
WANG Chongchong received his BS degree in computer science and technology from Huazhong Agricultural University, China, in 2022. He is currently pursuing a master’s degree in computer science and technology at Anhui University, China.
URL: https://zte.magtechjournal.com/EN/10.12142/ZTECOM.202304005
| Method | Modality | Car Easy | Car Moderate | Car Hard | Pedestrian Easy | Pedestrian Moderate | Pedestrian Hard | Cyclist Easy | Cyclist Moderate | Cyclist Hard |
|---|---|---|---|---|---|---|---|---|---|---|
| VoxelNet [21] | LiDAR | 81.97 | 65.46 | 62.85 | 57.86 | 53.42 | 48.87 | 67.17 | 47.65 | 45.11 |
| SECOND [22] | LiDAR | 83.13 | 73.66 | 66.20 | 51.07 | 42.56 | 37.29 | 70.51 | 53.85 | 46.90 |
| PointPillars [24] | LiDAR | 79.05 | 74.99 | 68.30 | 52.08 | 43.53 | 41.49 | 75.78 | 59.07 | 52.92 |
| OcTr [29] | LiDAR | 88.43 | 78.57 | 77.16 | 61.49 | 57.17 | 52.35 | 85.29 | 70.44 | 66.17 |

Table 1 Performance (AP, %) of VoxelNet, SECOND, PointPillars and OcTr on the KITTI dataset [31]
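VoxelNet, SECOND and PointPillars in Table 1 share a common first step: quantizing the unordered point cloud into a regular grid of voxels (or pillars) before applying convolutional backbones. Below is a minimal NumPy sketch of that grouping step; the voxel size, point cap, and all names are illustrative rather than taken from any cited implementation.

```python
import numpy as np

def voxelize(points, voxel_size=(0.2, 0.2, 0.4), max_pts=32):
    """Group raw LiDAR points into voxels by quantizing xyz coordinates.

    points: (N, 4) array of x, y, z, intensity.
    Returns a dict mapping integer grid coordinates to up to max_pts points,
    mirroring (in simplified form) the grouping used by VoxelNet-style detectors.
    """
    coords = np.floor(points[:, :3] / np.asarray(voxel_size)).astype(np.int64)
    voxels = {}
    for pt, c in zip(points, map(tuple, coords)):
        bucket = voxels.setdefault(c, [])
        if len(bucket) < max_pts:  # cap the number of points kept per voxel
            bucket.append(pt)
    return voxels
```

Each non-empty voxel is then encoded into a fixed-length feature (e.g., by a small PointNet) and processed by 2D or sparse 3D convolutions; PointPillars collapses the z dimension so that each "pillar" spans the full height of the scene.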
| Method | Recall@4 096 | Recall@1 024 | Recall@512 |
|---|---|---|---|
| D-FPS | 99.7% | 65.9% | 51.8% |
| F-FPS, λ = 0.0 | 99.7% | 83.5% | 68.4% |
| F-FPS, λ = 0.5 | 99.7% | 84.9% | 74.9% |
| F-FPS, λ = 1.0 | 99.7% | 89.2% | 76.1% |
| F-FPS, λ = 2.0 | 99.7% | 86.3% | 73.7% |

Table 2 Point recall of different sampling strategies on the nuScenes dataset. “4 096”, “1 024” and “512” denote the number of representative points in the sampled subset; the first row uses D-FPS only.
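In Table 2, D-FPS runs farthest point sampling on xyz coordinates alone, while F-FPS fuses spatial and feature-space distances as λ·L_d + L_f, following 3DSSD [32, 36]. A minimal NumPy sketch of both strategies, assuming per-point coordinates and learned feature vectors as inputs (function and array names are illustrative):

```python
import numpy as np

def farthest_point_sampling(xyz, feats=None, n_samples=512, lam=1.0):
    """D-FPS when feats is None; otherwise F-FPS with the fused
    distance lam * L_d + L_f between candidate and selected points.

    xyz: (N, 3) coordinates; feats: optional (N, C) per-point features.
    Returns indices of the n_samples representative points.
    """
    n = xyz.shape[0]
    selected = [0]                 # seed with an arbitrary first point
    min_dist = np.full(n, np.inf)  # each point's distance to the selected set
    for _ in range(n_samples - 1):
        last = selected[-1]
        d_xyz = np.linalg.norm(xyz - xyz[last], axis=1)
        if feats is None:          # D-FPS: spatial distance only
            d = d_xyz
        else:                      # F-FPS: fused metric lam * L_d + L_f
            d = lam * d_xyz + np.linalg.norm(feats - feats[last], axis=1)
        min_dist = np.minimum(min_dist, d)
        selected.append(int(np.argmax(min_dist)))  # farthest from the set
    return np.asarray(selected)
```

With λ = 0 the criterion degenerates to pure feature-space sampling, which already recovers far more foreground points than D-FPS at small subset sizes, as the table shows.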
| Method | Car 3D Easy | Car 3D Mod. | Car 3D Hard | Car BEV Easy | Car BEV Mod. | Car BEV Hard | Cyclist 3D Easy | Cyclist 3D Mod. | Cyclist 3D Hard | Cyclist BEV Easy | Cyclist BEV Mod. | Cyclist BEV Hard |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SECOND [22] | 83.34 | 72.55 | 65.82 | 89.39 | 83.77 | 78.59 | 71.33 | 52.08 | 45.83 | 76.50 | 56.05 | 49.45 |
| Fast Point R-CNN [35] | 85.29 | 77.40 | 70.24 | 90.87 | 87.84 | 80.52 | - | - | - | - | - | - |
| STD [38] | 87.95 | 79.71 | 75.09 | 94.74 | 89.19 | 86.42 | 78.69 | 61.59 | 55.30 | 81.36 | 67.23 | 59.35 |
| PV-RCNN [39] | 90.25 | 81.43 | 76.82 | 94.98 | 90.65 | 86.14 | 78.60 | 63.71 | 57.65 | 82.49 | 68.89 | 62.41 |

Table 3 Performance on the KITTI test set, with mean average precision (%) as the evaluation metric; PV-RCNN and STD achieve the best results
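The AP numbers in Tables 1 and 3 follow KITTI's interpolated average-precision protocol [31]: the precision-recall curve is sampled at fixed recall positions (11 in the original benchmark; later revisions use 40) and the interpolated precisions are averaged. A minimal sketch of that computation, assuming precision and recall arrays derived from a detector's ranked outputs:

```python
import numpy as np

def interpolated_ap(recall, precision, n_positions=11):
    """Interpolated average precision over fixed recall positions.

    recall, precision: 1D arrays over a detector's ranked detections.
    n_positions = 11 matches the original KITTI protocol; 40 the revised one.
    """
    ap = 0.0
    for r in np.linspace(0.0, 1.0, n_positions):
        above = precision[recall >= r]           # precisions at recall >= r
        ap += (above.max() if above.size else 0.0) / n_positions
    return ap
```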
1 | REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(6): 1137–1149. DOI: 10.1109/TPAMI.2016.2577031 |
2 | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [EB/OL]. (2016-05-09)[2023-08-20]. |
3 | LAW H, DENG J. CornerNet: detecting objects as paired keypoints [J]. International journal of computer vision, 2020, 128(3): 642–656. DOI: 10.1007/s11263-019-01204-1 |
4 | SHI S S, WANG X G, LI H S. PointRCNN: 3D object proposal generation and detection from point cloud [EB/OL]. (2019-05-16)[2023-08-21]. |
5 | CHEN Y K, LIU J H, ZHANG X Y, et al. VoxelNeXt: fully sparse VoxelNet for 3D object detection and tracking [EB/OL]. (2023-03-20)[2023-08-21]. |
6 | YANG Z T, SUN Y N, LIU S, et al. STD: sparse-to-dense 3D object detector for point cloud [EB/OL]. (2019-07-22)[2023-08-21]. |
7 | PHILION J, FIDLER S. Lift, splat, shoot: encoding images from arbitrary camera rigs by implicitly unprojecting to 3D [C]//European Conference on Computer Vision. Springer, 2020: 194–210. DOI: 10.1007/978-3-030-58568-6_12 |
8 | YIN T W, ZHOU X Y, KRÄHENBÜHL P. Center-based 3D object detection and tracking [C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021: 11779–11788. DOI: 10.1109/CVPR46437.2021.01161 |
9 | KLASING K, WOLLHERR D, BUSS M. A clustering method for efficient segmentation of 3D laser data [C]//2008 IEEE International Conference on Robotics and Automation. IEEE, 2008: 4043–4048. DOI: 10.1109/ROBOT.2008.4543832 |
10 | KLASING K, WOLLHERR D, BUSS M. Realtime segmentation of range data using continuous nearest neighbors [C]//2009 IEEE International Conference on Robotics and Automation. IEEE, 2009: 2431–2436. DOI: 10.1109/ROBOT.2009.5152498 |
11 | CHARLES R Q, HAO S, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation [C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 77–85. DOI: 10.1109/CVPR.2017.16 |
12 | QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space [EB/OL]. (2017-06-07)[2023-08-21]. |
13 | HUANG J J, HUANG G, ZHU Z, et al. BEVDet: high-performance multi-camera 3D object detection in bird-eye-view [EB/OL]. (2022-06-16)[2023-08-21]. |
14 | LI Z Q, WANG W H, LI H Y, et al. BEVFormer: learning bird's-eye-view representation from multi-camera images via spatiotemporal transformers [EB/OL]. (2022-07-13)[2023-08-21]. |
15 | LI Y H, GE Z, YU G Y, et al. BEVDepth: acquisition of reliable depth for multi-view 3D object detection [EB/OL]. (2022-11-30)[2023-08-21]. |
16 | LIU Z J, TANG H T, AMINI A, et al. BEVFusion: multi-task multi-sensor fusion with unified bird's-eye view representation [EB/OL]. (2022-06-16)[2023-08-21]. |
17 | WANG R H, QIN J, LI K Y, et al. BEV-LaneDet: a simple and effective 3D lane detection baseline [EB/OL]. (2023-03-11)[2023-08-21]. |
18 | DONG Y P, KANG C X, ZHANG J L. Benchmarking robustness of 3D object detection to common corruptions in autonomous driving [EB/OL]. (2023-03-20)[2023-08-21]. |
19 | QI C R, LITANY O, HE K M, et al. Deep hough voting for 3D object detection in point clouds [EB/OL]. (2019-08-22)[2023-08-21]. |
20 | HE Y S, SUN W, HUANG H B, et al. PVN3D: a deep point-wise 3D keypoints voting network for 6DoF pose estimation [EB/OL]. (2020-03-24)[2023-08-21]. |
21 | ZHOU Y, TUZEL O. VoxelNet: end-to-end learning for point cloud based 3D object detection [C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 4490–4499. DOI: 10.1109/CVPR.2018.00472 |
22 | YAN Y, MAO Y X, LI B. SECOND: sparsely embedded convolutional detection [J]. Sensors, 2018, 18(10): 3337. DOI: 10.3390/s18103337 |
23 | SIMON M, AMENDE K, KRAUS A, et al. Complexer-YOLO: real-time 3D object detection and tracking on semantic point clouds [EB/OL]. (2019-04-16)[2023-08-21]. |
24 | LANG A H, VORA S, CAESAR H, et al. PointPillars: fast encoders for object detection from point clouds [EB/OL]. (2020-03-24)[2023-08-21]. |
25 | REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(6): 1137–1149. DOI: 10.1109/TPAMI.2016.2577031 |
26 | CAESAR H, BANKITI V, LANG A H, et al. nuScenes: a multimodal dataset for autonomous driving [EB/OL]. (2020-05-05)[2023-08-21]. |
27 | YIN T W, ZHOU X Y, KRÄHENBÜHL P. Center-based 3D object detection and tracking [EB/OL]. (2021-01-06)[2023-08-21]. |
28 | FAN L, WANG F, WANG N Y, et al. Fully sparse 3D object detection [EB/OL]. (2022-10-03)[2023-08-21]. |
29 | ZHOU C, ZHANG Y N, CHEN J X, et al. OcTr: octree-based transformer for 3D object detection [EB/OL]. (2023-03-22)[2023-08-21]. |
30 | SUN P, KRETZSCHMAR H, DOTIWALLA X, et al. Scalability in perception for autonomous driving: Waymo Open Dataset [EB/OL]. (2023-03-22)[2023-08-21]. |
31 | GEIGER A, LENZ P, URTASUN R, et al. Are we ready for autonomous driving? The KITTI vision benchmark suite [C]//2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012. DOI: 10.1109/CVPR.2012.6248074 |
32 | YANG Z T, SUN Y N, LIU S, et al. 3DSSD: point-based 3D single stage object detector [EB/OL]. (2020-02-24)[2023-08-21]. |
33 | SHI S S, WANG X G, LI H S. PointRCNN: 3D object proposal generation and detection from point cloud [C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019: 770–779. DOI: 10.1109/CVPR.2019.00086 |
34 | QI C R, LIU W, WU C X, et al. Frustum PointNets for 3D object detection from RGB-D data [EB/OL]. (2018-04-13)[2023-08-21]. |
35 | CHEN Y L, LIU S, SHEN X Y, et al. Fast point R-CNN [EB/OL]. (2019-08-16)[2023-08-21]. |
36 | YANG Z T, SUN Y N, LIU S, et al. 3DSSD: point-based 3D single stage object detector [C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020: 11037–11045. DOI: 10.1109/CVPR42600.2020.01105 |
37 | YANG Z T, SUN Y N, LIU S, et al. IPOD: intensive point-based object detector for point cloud [EB/OL]. (2018-12-13)[2023-08-21]. |
38 | YANG Z T, SUN Y N, LIU S, et al. STD: sparse-to-dense 3D object detector for point cloud [C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2019: 1951–1960. DOI: 10.1109/ICCV.2019.00204 |
39 | SHI S S, GUO C X, JIANG L, et al. PV-RCNN: point-voxel feature set abstraction for 3D object detection [EB/OL]. (2021-04-09)[2023-08-21]. |
40 | YOU Y R, WANG Y, CHAO W-L, et al. Pseudo-LiDAR++: accurate depth for 3D object detection in autonomous driving [EB/OL]. (2020-02-15)[2023-08-21]. |
41 | HE C H, ZENG H, HUANG J Q, et al. Structure aware single-stage 3D object detection from point cloud [C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020: 11870–11879. DOI: 10.1109/CVPR42600.2020.01189 |
42 | SHI W J. Point-GNN: graph neural network for 3D object detection in a point cloud [EB/OL]. (2020-03-02)[2023-08-21]. |
43 | SCARSELLI F, GORI M, TSOI A C, et al. The graph neural network model [J]. IEEE transactions on neural networks, 2009, 20(1): 61–80. DOI: 10.1109/TNN.2008.2005605 |
44 | KU J, MOZIFIAN M, LEE J, et al. Joint 3D proposal generation and object detection from view aggregation [C]//2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018: 1–8. DOI: 10.1109/IROS.2018.8594049 |
45 | LIANG M, YANG B, CHEN Y, et al. Multi-task multi-sensor fusion for 3D object detection [C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019: 7337–7345. DOI: 10.1109/CVPR.2019.00752 |