
ZTE Communications, 2014, Vol. 12, Issue (4): 1-2.

Guest Editorial: Improving Performance of Cloud Computing and Big Data Technologies and Applications

Zhenjiang Dong   

  1. ZTE Corporation, China
  • Online: 2014-12-25   Published: 2014-12-25
  • About the author: Zhenjiang Dong is the deputy head of the Cloud Computing and IT Research Institute of ZTE Corporation and a standing member and service team leader of the company’s Committee of Corporate Strategy and Planning Experts. He is also an executive director of the Chinese Association for Artificial Intelligence, a professor at Nanjing University of Science and Technology, and a service computing expert of the China Computer Federation. He has been responsible for more than 10 research projects supported by the National High-Tech R&D Program of China (“863” Program); the Core Electronic Devices, High-End Generic Chips, and Basic Software Products Program of China; and the National Science and Technology Major Project of China. His research interests include cloud computing, big data, and media analysis and processing.

Abstract: Cloud computing technology is changing how IT infrastructure and applications are developed and used. Virtualization, distributed systems, and unified management and scheduling have greatly improved computing and storage capabilities. Management has become easier, and OAM costs have been significantly reduced. Cloud desktop technology is developing rapidly. With this technology, users can use virtual machine resources flexibly and dynamically, companies can use and allocate resources far more efficiently, and information security is ensured. In most existing virtual cloud desktop solutions, however, computing and storage are bound together, and data is stored as image files. This limits the flexibility and expandability of such systems and is insufficient for meeting customers’ requirements in different scenarios.
In this era of big data, data in social networks, mobile communication, e-commerce, and the Internet of Things grows by more than 50% annually, and more than 80% of this data is unstructured. It is therefore imperative to develop effective methods for storing and managing big data and for querying and analyzing it in real time or quasi-real time. HBase is a distributed data storage system that runs in the Hadoop environment and provides a highly expandable platform for big data storage and management. However, it supports only primary-key (row-key) indexing, not indexing on non-primary-key attributes. As a result, query efficiency on such attributes is low, and the data cannot be queried in real time or quasi-real time. For HBase operating in Hadoop, the ability to query data by non-primary keys is therefore the most important and urgent requirement.
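To make this limitation concrete, the short sketch below contrasts the two access paths using the standard HBase 1.x Java client; the table name "users" and the column "info:city" are hypothetical examples, not from any of the papers. A row-key lookup is served directly, whereas a query on a non-indexed column degenerates into a filtered scan over the whole table.

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseLookupSketch {
        public static void main(String[] args) throws Exception {
            try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table users = conn.getTable(TableName.valueOf("users"))) {

                // Primary-key access: HBase locates the row directly from its row key.
                Result one = users.get(new Get(Bytes.toBytes("user#42")));

                // Non-primary-key access: there is no index on "info:city", so HBase
                // must scan every row and apply the filter, which is why such queries are slow.
                Scan scan = new Scan();
                scan.setFilter(new SingleColumnValueFilter(
                        Bytes.toBytes("info"), Bytes.toBytes("city"),
                        CompareFilter.CompareOp.EQUAL, Bytes.toBytes("Nanjing")));
                try (ResultScanner hits = users.getScanner(scan)) {
                    for (Result r : hits) {
                        System.out.println(Bytes.toString(r.getRow()));
                    }
                }
            }
        }
    }

The second access path is exactly the gap that the HMIBase paper in this issue addresses.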
The graph data structure suits most of the big data created in social networks. Graph data is more complex and harder to understand than traditional linked-list or tree data, so processing and understanding graph data quickly and easily is of great significance and has become a hot topic in the industry.
Big data also includes a high proportion of video and image data, but most of this data is not utilized, and creating value from it has been a research focus in the industry. For example, traditional face localization and identification techniques typically converge to local optima and leave considerable room for improvement in accuracy.
This special issue of ZTE Communications reflects the industry’s efforts to improve the performance of cloud computing and big data technologies and applications. We invited four peer-reviewed papers based on projects supported by the ZTE Industry-Academic-Research Cooperation Funds.
In “A New Virtual Disk Mapping Method for the Cloud Desktop Storage Client,” Hancong Duang et al. propose a disk mapping solution integrated with virtual desktop technology. The virtual disk driver provides a user-friendly mode for accessing desktop data and a flexible cache space management mechanism. The file system filter driver intelligently inspects the I/O requests of upper-layer applications and synchronizes file access requests to users’ cloud storage services. Experimental results show that the read-write performance of the proposed virtual disk mapping method with customizable local cache storage is almost the same as that of a local hard disk. A small sketch of the general caching idea follows below.
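The caching idea behind that result can be outlined independently of the actual driver code, which the paper does not reproduce here. The sketch below is illustrative only: the class, the CloudStore interface, and all names are hypothetical. It shows a read path that serves a file from a local cache directory when possible and falls back to the cloud storage service only on a miss, which is what allows near-local read performance.

    import java.io.IOException;
    import java.nio.file.*;

    public class CloudDesktopCacheSketch {
        /** Assumed minimal interface to the remote cloud storage service. */
        interface CloudStore { byte[] download(String path) throws IOException; }

        private final Path cacheDir;
        private final CloudStore cloud;

        CloudDesktopCacheSketch(Path cacheDir, CloudStore cloud) {
            this.cacheDir = cacheDir;
            this.cloud = cloud;
        }

        byte[] read(String virtualPath) throws IOException {
            Path cached = cacheDir.resolve(virtualPath);
            if (Files.exists(cached)) {
                return Files.readAllBytes(cached);      // cache hit: local-disk speed
            }
            byte[] data = cloud.download(virtualPath);  // cache miss: fetch from cloud storage
            Files.createDirectories(cached.getParent());
            Files.write(cached, data);                  // keep a local copy for later reads
            return data;
        }
    }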
“HMIBase: A Hierarchical Indexing System for Storing and Querying Big Data,” by Shengmei Luo et al., presents the design and implementation of a complete hierarchical indexing and query system called HMIBase. The system efficiently answers exact-value and range queries on non-primary-key attributes and has good expandability. Test results on 10 million to 1 billion data records show that, regardless of whether the number of query results is large or small, HMIBase responds to cold and hot queries one to four levels faster than standard HBase and five to twenty times faster than the open-source Hindex system.
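HMIBase’s internal index structures are described in the paper itself; as a minimal illustration of the general secondary-index idea that such systems build on, the sketch below answers a range query on a non-primary-key attribute with one short index scan plus a batch of point gets instead of a full table scan. The data table "records" and the index table "idx_age", whose row keys are assumed to be a zero-padded attribute value followed by '#' and the data row key, are hypothetical.

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SecondaryIndexRangeQuerySketch {
        /** Return all data rows whose indexed attribute lies in [low, high). */
        static Result[] rangeQuery(Connection conn, String low, String high) throws Exception {
            try (Table index = conn.getTable(TableName.valueOf("idx_age"));
                 Table data  = conn.getTable(TableName.valueOf("records"))) {

                // 1) Short scan over the index table: row keys are "<padded value>#<data row key>".
                Scan scan = new Scan(Bytes.toBytes(low), Bytes.toBytes(high));
                List<Get> gets = new ArrayList<>();
                try (ResultScanner hits = index.getScanner(scan)) {
                    for (Result hit : hits) {
                        String indexRow = Bytes.toString(hit.getRow());
                        String dataRowKey = indexRow.substring(indexRow.indexOf('#') + 1);
                        gets.add(new Get(Bytes.toBytes(dataRowKey)));
                    }
                }
                // 2) Batch point lookups on the data table for the matching rows only.
                return data.get(gets);
            }
        }
    }

With three-digit zero padding, a call such as rangeQuery(conn, "020", "030") would return only the records whose age is between 20 and 29, touching just the relevant slice of the index.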
In “MBGM: A Graph-Mining Tool Based on MapReduce and BSP,” Zhenjiang Dong et al. propose a MapReduce- and BSP-based graph mining (MBGM) tool. The tool combines BSP-based parallel graph mining algorithms with MapReduce-based extraction-transformation-loading (ETL) algorithms, and an optimized workflow engine for cloud computing was designed for it. Experiments show that the graph mining components of MBGM, including PageRank, K-means, InDegree Count, and Closeness Centrality, outperform the corresponding components of BC-PDM and BC-BSP.
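To show what the BSP execution model behind these components looks like, the self-contained Java sketch below runs PageRank in Pregel-style supersteps; it is generic illustration, not MBGM’s implementation, and the three-vertex graph is toy example data. In each superstep, every vertex folds in the messages sent during the previous superstep, updates its rank, and sends new messages along its outgoing edges; the barrier between supersteps is modelled by swapping message buffers.

    import java.util.*;

    public class BspPageRankSketch {
        public static void main(String[] args) {
            // Toy graph: vertex -> outgoing neighbours (illustrative data only).
            Map<Integer, int[]> edges = new HashMap<>();
            edges.put(0, new int[]{1, 2});
            edges.put(1, new int[]{2});
            edges.put(2, new int[]{0});
            int n = edges.size();
            double damping = 0.85;

            double[] rank = new double[n];
            Arrays.fill(rank, 1.0 / n);
            Map<Integer, Double> inbox = new HashMap<>();

            for (int superstep = 0; superstep < 30; superstep++) {
                Map<Integer, Double> nextInbox = new HashMap<>();
                for (int v = 0; v < n; v++) {
                    // "Compute" phase: fold the incoming messages into the local rank.
                    if (superstep > 0) {
                        double sum = inbox.getOrDefault(v, 0.0);
                        rank[v] = (1 - damping) / n + damping * sum;
                    }
                    // "Send" phase: distribute the rank evenly over outgoing edges.
                    int[] out = edges.get(v);
                    for (int w : out) {
                        nextInbox.merge(w, rank[v] / out.length, Double::sum);
                    }
                }
                inbox = nextInbox;   // barrier: messages become visible in the next superstep
            }
            System.out.println(Arrays.toString(rank));
        }
    }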
In “Facial Landmark Localization by Gibbs Sampling,” Bofei Wang et al. present an optimized solution for key-point-based face localization. Instead of the traditional gradient descent algorithm, the solution uses Gibbs sampling, which converges easily and can reach a global optimum for key-point-based face localization, thereby avoiding poor local optima. The posterior probability used by the Gibbs sampler is the product of a prior probability and a likelihood: the prior is assumed to be Gaussian and is learned from features after dimension reduction, and the likelihood is obtained with a local linear SVM. The system was tested on the LFW data set, and the results show high face localization accuracy.
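The sampling loop itself can be outlined independently of the learned shape prior and the SVM-based likelihood. The sketch below is an illustrative Gibbs scheme under assumed interfaces (the Scorer interface and all names are hypothetical, not the paper’s code): each landmark is resampled in turn from its conditional distribution, approximated over a discrete window of candidate pixels, with the conditional proportional to prior times likelihood.

    import java.util.Random;

    public class GibbsLandmarkSketch {
        interface Scorer {
            /** Unnormalised prior(candidate | other landmarks) * likelihood(patch at candidate). */
            double posterior(int landmark, int candX, int candY, int[][] landmarks);
        }

        static void gibbsSweeps(int[][] landmarks, Scorer scorer, int sweeps, int window, Random rnd) {
            for (int s = 0; s < sweeps; s++) {
                for (int k = 0; k < landmarks.length; k++) {
                    int cx = landmarks[k][0], cy = landmarks[k][1];
                    // Enumerate candidate positions in a (2*window+1)^2 neighbourhood.
                    int side = 2 * window + 1;
                    double[] weights = new double[side * side];
                    double total = 0;
                    for (int dx = -window; dx <= window; dx++)
                        for (int dy = -window; dy <= window; dy++) {
                            double w = scorer.posterior(k, cx + dx, cy + dy, landmarks);
                            weights[(dx + window) * side + (dy + window)] = w;
                            total += w;
                        }
                    if (total == 0) continue;   // no informative candidates; keep current position
                    // Sample one candidate proportional to its (unnormalised) posterior weight.
                    double u = rnd.nextDouble() * total, acc = 0;
                    for (int i = 0; i < weights.length; i++) {
                        acc += weights[i];
                        if (u <= acc) {
                            landmarks[k][0] = cx + i / side - window;
                            landmarks[k][1] = cy + i % side - window;
                            break;
                        }
                    }
                }
            }
        }
    }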
I would like to thank all the authors for their contributions and all the reviewers who helped improve the quality of the papers.

Key words: Cloud Computing, Big Data