ZTE Communications ›› 2013, Vol. 11 ›› Issue (3): 56-61. DOI: 10.3969/j.issn.1673-5188.2013.03.010
Ping Lu1, Zhenjiang Dong1, Shengmei Luo1, Lixia Liu1, Shanshan Guan2, Shengyu Liu2, and Qingcai Chen2
Abstract: With user-generated content, anyone can be a content creator. This phenomenon has vastly increased the amount of information circulated online, making it harder to obtain required information efficiently. In this paper, we describe how natural language processing and text mining can be parallelized using Hadoop and the Message Passing Interface (MPI). We propose a parallel web text mining platform that processes massive amounts of data quickly and efficiently. Our web knowledge service platform is designed to collect information about the IT and telecommunications industries from the web and to process this information using natural language processing and data-mining techniques.