Key Technologies in Mobile Visual Search and MPEG Standardization Activities

ZTE Communications ›› 2012, Vol. 10 ›› Issue (2): 57-66.



Ling-Yu Duan, Jie Chen, Chunyu Wang, Rongrong Ji, Tiejun Huang, and Wen Gao

Institute of Digital Media, Peking University, Beijing 100871, China

Received: 2012-03-08 Online: 2012-06-25 Published: 2012-06-25
About the authors: Ling-Yu Duan (lingyu@pku.edu.cn) received his MSc degree in automation from The University of Science and Technology, China, in 1999. He received his MSc degree in computer science from the National University of Singapore in 2002 and his PhD degree in information technology from The University of Newcastle, Australia, in 2007. From 2003 to 2008, he was a research scientist at the Institute for Infocomm Research, Singapore. Since 2008, he has been an associate professor at the School of Electrical Engineering and Computer Science at Peking University. Dr. Duan's research interests include visual search and reality augmentation, multimedia content analysis, and mobile media computing. He has authored more than 70 papers in these areas.

Jie Chen (cjie@pku.edu.cn) is a PhD candidate at the School of Electrical Engineering and Computer Science, Peking University. His research interests include mobile visual search, low bit-rate visual descriptors, and vector quantizers. He has published more than 10 journal and conference papers.

Chunyu Wang (wangchunyu@pku.edu.cn) is a PhD candidate at the School of Electrical Engineering and Computer Science, Peking University. His research interests include visual search, object recognition, and activity recognition.

Rongrong Ji (rrji@pku.edu.cn) received his PhD degree in computer science from Harbin Institute of Technology, China. He is currently a postdoctoral research fellow at Columbia University. His research interests include image and video search, content analysis and understanding, mobile visual search and recognition, and interactive human-computer interfaces. Dr. Ji received a Microsoft Fellowship in 2007 and the Best Paper Award at ACM Multimedia 2011.

Tiejun Huang (tjhuang@pku.edu.cn) received his BSc and MSc degrees in computer science from Wuhan University of Technology in 1992 and 1995, respectively. He received his PhD degree in pattern recognition and image analysis from Huazhong University of Science and Technology, China, in 1998. He is currently a professor in the School of Electrical Engineering and Computer Science, Peking University. He is also vice director of the National Engineering Laboratory for Video Technology of China. His research interests include video coding, image understanding, digital rights management (DRM), and digital libraries. He has published more than sixty peer-reviewed papers and has authored or co-authored three books. He is a member of the board of directors for the Digital Media Project, the advisory board for IEEE Computing Now, the editorial board of the Journal on 3D Research, and the board of the Chinese Institute of Electronics.

Wen Gao (wgao@pku.edu.cn) received his MSc degree in computer science from Harbin Institute of Technology, China, in 1985. He received his PhD degree in electronics engineering from the University of Tokyo in 1991. He is a professor in the School of Electronics Engineering and Computer Science, Peking University. He has led research efforts in video coding, face recognition, sign language recognition and synthesis, and multimedia retrieval. Professor Gao became an IEEE Fellow in 2010 for his contributions to video coding technology and was admitted as an Academician of the Chinese Academy of Engineering in 2011. He has been on the editorial boards of IEEE Trans. on Multimedia, IEEE Trans. Circuits Syst. for Video Tech., and several other top international academic journals. He was the chair of the IEEE Int. Conf. Multimedia & Expo (ICME) 2007 and the ACM Int. Conf. Multimedia (ACM-MM) 2009. He has authored four books and published more than 500 research papers on video coding, signal processing, computer vision, and pattern recognition.


Abstract

Abstract: Visual search has been a long-standing problem in applications such as location recognition and product search. Much research has been done on image representation, matching, indexing, and retrieval. Key component technologies for visual search have been developed, and numerous real-world applications are emerging. To ensure application interoperability, the Moving Picture Experts Group (MPEG) has begun standardizing visual search technologies and is developing the Compact Descriptors for Visual Search (CDVS) standard. MPEG seeks to develop a collaborative platform for evaluating existing visual search technologies. Peking University has participated in this standardization since the 94th MPEG meeting, and significant progress has been made with the various proposals. A test model (TM) has been selected to determine the basic pipeline and key components of visual search. However, the first version of the TM has high computational complexity and imperfect retrieval and matching performance. Core experiments have therefore been set up to improve the TM. In this article, we summarize key technologies for visual search and report the progress of MPEG CDVS. We also discuss Peking University’s efforts in CDVS and the issues that remain unresolved.

Key words: visual search, mobile, visual descriptors, low bit rate, compression


Ling-Yu Duan, Jie Chen, Chunyu Wang, Rongrong Ji, Tiejun Huang, and Wen Gao. Key Technologies in Mobile Visual Search and MPEG Standardization Activities[J]. ZTE Communications, 2012, 10(2): 57-66.

