ZTE Communications, 2018, Vol. 16, Issue 2: 55-66. DOI: 10.3969/j.issn.1673-5188.2018.02.009
Review
HU Baiqing, WANG Wenjie, Chi Harold Liu
Received: 2017-06-15
Online: 2018-06-25
Published: 2019-12-12
About the authors:
HU Baiqing (baibenny@foxmail.com) received his B.Eng. degree in software engineering from Wuhan Textile University, China in 2016. He is pursuing an M.Eng. degree at Beijing Institute of Technology, China, with a major in software engineering. His research interests include cloud computing, big data, and the Internet of Things.
WANG Wenjie (wangwj1203962899@gmail.com) received his B.Eng. degree in software engineering from Chongqing University, China in 2016. He is pursuing an M.Eng. degree at Beijing Institute of Technology, China, with a major in software engineering. His research interest is the security of big data.
Chi Harold Liu (chiliu@bit.edu.cn) received his B.Eng. degree in electronic and information engineering from Tsinghua University, China in 2006, and his Ph.D. degree in electrical engineering from Imperial College, UK. He is currently a full professor and Vice Dean of the School of Computer Science, Beijing Institute of Technology, China. His research interests include big data and the Internet of Things.
HU Baiqing, WANG Wenjie, Chi Harold Liu. Open Source Initiatives for Big Data Governance and Security: A Survey[J]. ZTE Communications, 2018, 16(2): 55-66.
Need | Feature |
---|---|
Unified management of data lifecycle | Centralized definition and management of pipelines for data ingest, process and export; Ensuring disaster preparedness and business continuity; Out-of-the-box policies for data replication and retention; End-to-end monitoring of data pipes |
Compliance and audit | Visualization of data pipeline lineage; Tracking the data pipeline audit log; Tagging data with business metadata |
Database replication and archival | Replication across on-premise and cloud-based storage targets: Microsoft Azure and Amazon S3; Data lineage with supporting documentation and examples; HDFS in heterogeneous tiered storage; Definition of data hot/cold storage layer within a cluster |
Table 1 Apache Falcon requirements and features [22]
Service | Resource | Authority |
---|---|---|
HDFS | Path | Read; Write; Execute |
YARN | Queue | Submit; Admin |
HBase | Table; Column Family; Column | Read; Write; Create; Admin |
Hive | Database; Table; Column | Select; Update; Create; Drop; Alter; Index; Lock |
Table 2 Ranger model entity enumeration values
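The enumeration in Table 2 can be read as a small data model: each service exposes certain resource types, and each policy grants a subset of that service's authorities on one resource. The following is a minimal sketch of that reading in Python. It is not Ranger's API; the names `RANGER_MODEL`, `Policy`, and `is_allowed`, the exact-match lookup, and the example user and table names are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Resource types and authorities per service, transcribed from Table 2.
RANGER_MODEL = {
    "HDFS": {"resources": {"Path"},
             "authorities": {"Read", "Write", "Execute"}},
    "YARN": {"resources": {"Queue"},
             "authorities": {"Submit", "Admin"}},
    "HBase": {"resources": {"Table", "Column Family", "Column"},
              "authorities": {"Read", "Write", "Create", "Admin"}},
    "Hive": {"resources": {"Database", "Table", "Column"},
             "authorities": {"Select", "Update", "Create", "Drop",
                             "Alter", "Index", "Lock"}},
}


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record: one user, one resource, a set of authorities."""
    service: str
    resource_type: str
    resource_name: str
    user: str
    authorities: frozenset

    def __post_init__(self):
        # Reject any policy that does not fit the enumeration in Table 2.
        if self.service not in RANGER_MODEL:
            raise ValueError(f"unknown service: {self.service}")
        model = RANGER_MODEL[self.service]
        if self.resource_type not in model["resources"]:
            raise ValueError(f"{self.service} has no resource type {self.resource_type}")
        unknown = self.authorities - model["authorities"]
        if unknown:
            raise ValueError(f"{self.service} has no authority {sorted(unknown)}")


def is_allowed(policies, user, service, resource_type, resource_name, authority):
    """Return True if some policy grants `authority` on exactly this resource.

    Simplification: exact-name matching only, with no wildcards, groups,
    or deny rules.
    """
    return any(
        p.user == user
        and p.service == service
        and p.resource_type == resource_type
        and p.resource_name == resource_name
        and authority in p.authorities
        for p in policies
    )


# Usage: a hypothetical user "alice" may only run Select on one Hive table.
policies = [Policy("Hive", "Table", "sales.orders", "alice", frozenset({"Select"}))]
can_select = is_allowed(policies, "alice", "Hive", "Table", "sales.orders", "Select")
can_drop = is_allowed(policies, "alice", "Hive", "Table", "sales.orders", "Drop")
```

Validating each policy against the enumeration at construction time keeps malformed grants, such as a Read authority on a YARN queue, out of the policy set entirely.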
Services and Features | Ranger | Sentry | Kerberos |
---|---|---|---|
Audit log service | √ | √ | √ |
Fine grained authorization service | √ | √ | |
Authentication service | | | √ |
Unified authorization policy | √ | √ | |
Ticket granting service | | | √ |
Role-based management | √ | √ | |
Supported components | 9 | 5 | 12 |
Table 3 Comparisons between the three security frameworks
[1] Z. J. Dong, “Improving performance of cloud computing and big data technologies and applications,” ZTE Communications, vol. 12, no. 4, pp. 1-2, Dec. 2014.
[2] L. Douglas. (2001, Feb. 6). 3D data management: controlling data volume, velocity and variety [Online]. Available: https://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf
[3] S. Iwata, “Big data era,” Journal of Information Processing and Management, vol. 55, no. 8, pp. 543-551, Jan. 2012. doi: 10.1241/johokanri.55.543.
[4] K. Sarvakar, “Big data security and privacy using data transformation with role based access control,” International Journal of Computer Science & Communication (IJCSC), vol. 7, no. 2, pp. 90-94, Jul. 2016. doi: 10.090592/IJCSC.2016.115.
[5] S. B. Scruggs, K. Watson, and A. I. Su, “Harnessing the heart of big data,” Circulation Research, vol. 116, no. 7, pp. 1115-1119, Mar. 2015. doi: 10.1161/CIRCRESAHA.115.306013.
[6] M. Jensen, “Challenges of privacy protection in big data analytics,” in IEEE BigData Congress, Santa Clara, USA, Jul. 2013, pp. 235-238. doi: 10.1109/BigData.Congress.2013.39.
[7] F. Cang, M. Zhang, and Y. Wu, “Preventing data leakage in a cloud environment,” ZTE Communications, vol. 11, no. 4, pp. 27-31, Dec. 2013. doi: 10.3969/j.issn.1673-5188.2013.04.004.
[8] K. Setty and R. Bakhshi, “What is big data and what does it have to do with IT audit?,” ISACA Journal, vol. 3, no. 14, pp. 1-3, 2013.
[9] M. Anup, R. Nimje, V. T. Gaikwad, and H. N. Datir, “A review of various trust management models for cloud computing storage systems,” International Journal of Engineering and Computer Science, vol. 3, no. 2, pp. 3924-3928, Feb. 2014.
[10] M. Al-Ruithe, E. Benkhelifa, and K. Hameed, “Key dimensions for cloud data governance,” in IEEE 4th International Conference on Future Internet of Things and Cloud (FiCloud), Vienna, Austria, Sept. 2016, pp. 379-386. doi: 10.1109/FiCloud.2016.60.
[11] M. Felici, T. Koulouris, and S. Pearson, “Accountability for data governance in cloud ecosystems,” in IEEE 5th International Conference on Cloud Computing Technology and Science, Bristol, UK, Dec. 2013, pp. 327-332. doi: 10.1109/CloudCom.2013.157.
[12] A. Corradi, L. Foschini, A. Zanni, et al., “A federation model to support semantic SPARQL queries for enterprise data governance,” in Eleventh International Conference on Digital Information Management (ICDIM), Porto, Portugal, Sept. 2016, pp. 96-100. doi: 10.1109/ICDIM.2016.7829778.
[13] T. Priebe and S. Markus, “Business information modeling: a methodology for data-intensive projects, data science and big data governance,” in IEEE International Conference on Big Data, Santa Clara, USA, Dec. 2015, pp. 2056-2065. doi: 10.1109/BigData.2015.7363987.
[14] M. Al-Ruithe, S. Mthunzi, and E. Benkhelifa, “Data governance for security in IoT & cloud converged environments,” in IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA), Agadir, Morocco, Dec. 2016, pp. 1-8. doi: 10.1109/AICCSA.2016.7945737.
[15] R. J. DeStefano, L. Tao, and K. Gai, “Improving data governance in large organizations through ontology and linked data,” in IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud), Beijing, China, Jun. 2016, pp. 279-284. doi: 10.1109/CSCloud.2016.47.
[16] A. Yulfitri, “Modeling operational model of data governance in government: case study: government agency X in Jakarta,” in International Conference on Information Technology Systems and Innovation (ICITSI), Bandung, Indonesia, Oct. 2016, pp. 1-5. doi: 10.1109/ICITSI.2016.7858207.
[17] R. Thiel, K. A. Stroetmann, and P. D. Singleton, “Clinical data governance: legal and ethical challenges,” in IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), Valencia, Spain, Jun. 2014, pp. 597-600. doi: 10.1109/BHI.2014.6864435.
[18] L. Xu, C. Jiang, J. Wang, J. Yuan, and Y. Ren, “Information security in big data: privacy and data mining,” IEEE Access, vol. 2, pp. 1149-1176, Oct. 2014. doi: 10.1109/ACCESS.2014.2362522.
[19] H. Ye, X. Cheng, M. Yuan, et al., “A survey of security and privacy in big data,” in 16th International Symposium on Communications and Information Technologies (ISCIT), Qingdao, China, Sept. 2016, pp. 268-272. doi: 10.1109/ISCIT.2016.7751634.
[20] S. Tan, D. De, W. Z. Song, J. Yang, and S. K. Das, “Survey of security advances in smart grid: a data driven approach,” IEEE Communications Surveys & Tutorials, vol. 19, no. 1, pp. 397-422, Oct. 2016. doi: 10.1109/COMST.2016.2616442.
[21] Apache. (2016, Sept. 20). What Falcon does [Online]. Available: https://falcon.apache.org/FalconDocumentation.html
[22] Hortonworks. (2016, Sept. 20). Falcon [Online]. Available: https://zh.hortonworks.com/apache/falcon
[23] Apache. (2016, Oct. 16). Data governance and metadata framework for Hadoop [Online]. Available: http://atlas.apache.org
[24] Apache. (2017, Mar. 16). Atlas architecture [Online]. Available: http://atlas.apache.org/Architecture.html
[25] M. Rouse. (2015, May 23). Data-lake [Online]. Available: http://searchaws.techtarget.com/definition/data-lake
[26] Hortonworks. (2015, Dec. 1). How Ranger works [Online]. Available: https://hortonworks.com/apache/ranger
[27] Cloudera. (2016, May 23). Sentry [Online]. Available: http://www.cloudera.com/content/cloudera/en/products-adn-services/cdh/sentry.html
[28] J. Kohl and C. Neuman, “The Kerberos network authentication service,” Internet RFC 1510, Sept. 1993.
[29] Apache. (2016, Mar. 25). Apache Sentry [Online]. Available: https://blogs.apache.org/sentry/entry/sentry_graduates_to_a_top?platform=hootsuite