ZTE Communications ›› 2024, Vol. 22 ›› Issue (4): 67-77.DOI: 10.12142/ZTECOM.202404010

• Research Papers •

Unsupervised Motion Removal for Dynamic SLAM

CHEN Hao1,2, ZHANG Kaijiong1,2, CHEN Jun1,2, ZHANG Ziwen1,2, JIA Xia1,2

  1. ZTE Corporation, Shenzhen 518057, China
  2. State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen 518055, China
  • Received: 2024-02-29 Online: 2024-12-20 Published: 2024-12-03
  • About author: CHEN Hao (chen.hao16@zte.com.cn) received his BS and MS degrees in control theory and control engineering from Harbin Engineering University, China in 2018 and 2020, respectively. He has worked on deep learning technologies at ZTE Corporation since graduating. His research interests include digital humans, SLAM, and image recognition.
    ZHANG Kaijiong received his MS degree from Shanghai Jiao Tong University, China in 2020. He is currently an algorithm engineer with ZTE Corporation. His research interests include computer vision, image/video processing and artificial intelligence.
    CHEN Jun received his master’s degree in aerospace science and technology from Nanjing University of Aeronautics and Astronautics, China. He has worked on the R&D of computer graphics, computer vision, and cloud computing at ZTE Corporation for more than 10 years, and has extensive experience in solution design and engineering.
    ZHANG Ziwen received his bachelor’s degree in instrument science and technology and his master’s degree in instrument engineering from Harbin Institute of Technology, China in 2018 and 2020, respectively. After graduation, he joined ZTE Corporation as a computer vision algorithm engineer. He has long worked on algorithm research, design, improvement, and end-to-end deployment optimization in face detection and recognition, image matching, SLAM, digital human generation, and portrait style transfer, and has extensive experience in these fields.
    JIA Xia received her BS and MS degrees in control theory and control engineering from Taiyuan University of Technology, China, and Dalian University of Technology, China in 1995 and 2001, respectively. She joined ZTE Corporation in 2001 and worked in the State Key Laboratory of Mobile Network and Mobile Multimedia Technology. Her main research interests include deep learning techniques, face detection and recognition, Re-ID, and activity detection and recognition.

Abstract:

We propose UMR-SLAM, a deep learning-based dynamic RGBD simultaneous localization and mapping (SLAM) system with unsupervised motion removal. It is the first scheme to combine scene flow with deep learning-based SLAM to improve accuracy in dynamic scenes, where moving objects disturb pose estimation. The entire process requires no explicit object segmentation as supervisory information. In the backend optimization of the SLAM system, we also propose a loop detection scheme that combines optical flow and feature similarity to improve loop detection accuracy. UMR-SLAM is a rewrite based on the DROID-SLAM code architecture. Experiments on different datasets show that our scheme achieves higher pose accuracy in dynamic scenarios than current advanced SLAM algorithms.

Key words: dynamic RGBD SLAM, update module, motion estimation, scene flow
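
The abstract mentions a loop detection scheme that combines optical flow and feature similarity. The abstract does not specify how the two cues are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation: a frame pair is accepted as a loop candidate when the mean optical flow magnitude between the frames is small and the cosine similarity of their feature descriptors is high. The function names and the thresholds `max_flow` and `min_sim` are hypothetical.

```python
import numpy as np


def mean_flow_magnitude(flow):
    """Mean per-pixel magnitude of an (H, W, 2) optical flow field, in pixels."""
    return float(np.linalg.norm(flow, axis=-1).mean())


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two feature descriptors (flattened to 1-D)."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def is_loop_candidate(flow, feat_a, feat_b, max_flow=16.0, min_sim=0.8):
    """Accept a frame pair only when both cues agree.

    max_flow and min_sim are hypothetical thresholds chosen for illustration;
    they are not values taken from the paper.
    """
    small_motion = mean_flow_magnitude(flow) < max_flow
    similar_features = cosine_similarity(feat_a, feat_b) > min_sim
    return small_motion and similar_features
```

Requiring both cues to agree is one simple design choice: small flow alone can be fooled by repetitive texture, and feature similarity alone can match visually similar but distant places. A weighted score would be an equally reasonable alternative.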