ZTE Communications ›› 2014, Vol. 12 ›› Issue (4): 16-22. DOI: 10.3969/j.issn.1673-5188.2014.04.003

• Special Topic •

MBGM: A Graph-Mining Tool Based on MapReduce and BSP

Zhenjiang Dong1, Lixia Liu1, Bin Wu2, and Yang Liu2   

  1. ZTE Corporation, Nanjing 210012, China;
  2. Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2014-04-04 Online: 2014-12-25 Published: 2014-12-25
  • About author: Zhenjiang Dong (dong.zhengjiang@zte.com.cn) received his MS degree in telecommunication from Harbin Institute of Technology in 1996. He is the deputy head of the Service Institute of ZTE Corporation. His research interests include cloud computing and mobile internet.

    Lixia Liu (liu.lixia@zte.com.cn) is a senior engineer in the pre-research department of ZTE. She received her MS degree from Ocean University of China in 2008. Her research interests include natural language processing, text mining, data mining, machine learning, mathematical statistics and cloud computing.

    Bin Wu (wubin@bupt.edu.cn) received his PhD degree from the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, in 2002. He is a senior member of CCF. He is currently a professor at the School of Computer Science, Beijing University of Posts and Telecommunications, China. His research interests include data mining, complex networks, and cloud computing. He has published more than 100 papers in refereed journals and conferences.

    Yang Liu (liuyang1984@bupt.edu.cn) received his BS degree in computer science from Henan University of Technology in 2007. He is currently a PhD candidate at the School of Computer Science, Beijing University of Posts and Telecommunications, China. His research interests include social network analysis, data mining, and cloud computing.
  • Supported by:
    This work is supported by ZTE Industry-Academia-Research Cooperation Funds.

Abstract: This paper proposes an analytical mining tool for big graph data based on MapReduce and the bulk synchronous parallel (BSP) computing model. The tool is named the MapReduce- and BSP-based Graph-Mining tool (MBGM). The core of this mining system comprises four sets of parallel graph-mining algorithms programmed in the BSP parallel model and one set of data extraction-transformation-loading (ETL) algorithms implemented in MapReduce. To invoke these algorithm sets, we designed a workflow engine that is optimized for cloud computing. Finally, a well-designed data management function enables users to view, delete, and input data in the Hadoop distributed file system (HDFS). Experiments on artificial data show that the graph-mining algorithm components of MBGM are efficient.

Key words: cloud computing, parallel algorithms, graph data analysis, data mining, social network analysis
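
To make the BSP model named in the abstract concrete, the following is a minimal single-process sketch of a vertex-centric BSP computation. It is not taken from MBGM: the function name `bsp_connected_components` and the choice of connected-component labeling as the example algorithm are assumptions made purely for illustration. Each loop iteration plays the role of one superstep, in which vertices read the messages sent in the previous superstep, update local state, and send new messages; swapping the inbox for the outbox stands in for the synchronization barrier.

```python
from collections import defaultdict


def bsp_connected_components(edges):
    """Label every vertex with the smallest vertex id in its component.

    Hypothetical illustration of BSP supersteps; not MBGM code.
    """
    # Build an undirected adjacency list from the edge list.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Superstep 0: each vertex takes its own id as label and
    # sends that label to all of its neighbors.
    labels = {v: v for v in neighbors}
    inbox = defaultdict(list)
    for v in neighbors:
        for n in neighbors[v]:
            inbox[n].append(labels[v])

    # Later supersteps: run until no messages are in flight.
    while inbox:
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            smallest = min(messages)
            # Only vertices whose label improves stay active and
            # propagate the new label to their neighbors.
            if smallest < labels[v]:
                labels[v] = smallest
                for n in neighbors[v]:
                    outbox[n].append(smallest)
        # Barrier: messages sent in this superstep become visible
        # only in the next one.
        inbox = outbox

    return labels


if __name__ == "__main__":
    print(bsp_connected_components([(1, 2), (2, 3), (4, 5)]))
```

In a distributed BSP framework the vertices would be partitioned across workers and the barrier would be a real network synchronization; the sketch keeps everything in one process so that the superstep structure is easy to follow.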
