1. MBGM: A Graph-Mining Tool Based on MapReduce and BSP
Zhenjiang Dong, Lixia Liu, Bin Wu, and Yang Liu
ZTE Communications    2014, 12 (4): 16-22.   DOI: 10.3969/j.issn.1673-5188.2014.04.003
This paper proposes an analytical mining tool for big graph data based on MapReduce and the bulk synchronous parallel (BSP) computing model. The tool is named the MapReduce and BSP based Graph-mining tool (MBGM). The core of this mining system consists of four sets of parallel graph-mining algorithms programmed in the BSP parallel model and one set of data extraction-transformation-loading (ETL) algorithms implemented in MapReduce. To invoke these algorithm sets, we designed a workflow engine that is optimized for cloud computing. Finally, a well-designed data management function enables users to view, delete, and input data in the Hadoop distributed file system (HDFS). Experiments on artificial data show that the graph-mining algorithm components in MBGM are efficient.
2. A Parallel Platform for Web Text Mining
Ping Lu, Zhenjiang Dong, Shengmei Luo, Lixia Liu, Shanshan Guan, Shengyu Liu, and Qingcai Chen
ZTE Communications    2013, 11 (3): 56-61.   DOI: 10.3969/j.issn.1673-5188.2013.03.010
With user-generated content, anyone can be a content creator. This phenomenon has vastly increased the amount of information circulated online, and it is becoming harder to obtain required information efficiently. In this paper, we describe how natural language processing and text mining can be parallelized using Hadoop and the Message Passing Interface (MPI). We propose a parallel web text mining platform that processes massive amounts of data quickly and efficiently. Our web knowledge service platform is designed to collect information about the IT and telecommunications industries from the web and process this information using natural language processing and data-mining techniques.
3. Parallel Web Mining System Based on Cloud Platform
Shengmei Luo, Qing He, Lixia Liu, Xiang Ao, Ning Li, and Fuzhen Zhuang
ZTE Communications    2012, 10 (4): 45-53.  
Traditional machine-learning algorithms are struggling to handle the exceedingly large amount of data being generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms that can handle large-scale, high-dimensional text data. Cloud computing involves the delivery of computing and storage as a service to a heterogeneous community of recipients. Recently, it has aroused much interest in industry and academia. Most previous work on cloud platforms focuses only on parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes a parallel web crawler; parallel text extract, transform, and load (ETL) and modeling; and parallel text mining and application subsystems. The complete system enables a variety of real-world web-mining applications for massive data.