ZTE Communications, 2018, Vol. 16, Issue (2): 55-66. DOI: 10.3969/j.issn.1673-5188.2018.02.009
HU Baiqing, WANG Wenjie, Chi Harold Liu
Received: 2017-06-15
Online: 2018-06-25
Published: 2019-12-12
About the authors:
HU Baiqing (baibenny@foxmail.com) received his B.Eng. degree in software engineering from Wuhan Textile University, China in 2016. He is pursuing an M.Eng. degree at Beijing Institute of Technology, China, with a major in software engineering. His research interests include cloud computing, big data, and the Internet of Things.
WANG Wenjie (wangwj1203962899@gmail.com) received his B.Eng. degree in software engineering from Chongqing University, China in 2016. He is pursuing an M.Eng. degree at Beijing Institute of Technology, China, with a major in software engineering. His research interest is the security of big data.
Chi Harold Liu (chiliu@bit.edu.cn) received his B.Eng. degree in electronic and information engineering from Tsinghua University, China in 2006, and Ph.D. degree in electrical engineering from Imperial College, UK. He is currently a full professor and Vice Dean of School of Computer Science, Beijing Institute of Technology, China. His research interests include big data and the Internet of Things.
HU Baiqing, WANG Wenjie, Chi Harold Liu. Open Source Initiatives for Big Data Governance and Security: A Survey[J]. ZTE Communications, 2018, 16(2): 55-66.
Need | Feature |
---|---|
Unified management of data lifecycle | Centralized definition and management of pipelines for data ingest, processing and export; Ensuring disaster preparedness and business continuity; Out-of-the-box policies for data replication and retention; End-to-end monitoring of data pipelines |
Compliance and audit | Visualization of data pipeline lineage; Tracking the data pipeline audit log; Tagging data with business metadata |
Database replication and archival | Replication across on-premise and cloud-based storage targets: Microsoft Azure and Amazon S3; Data lineage with supporting documentation and examples; HDFS in heterogeneous tiered storage; Definition of data hot/cold storage layer within a cluster |
Table 1 Apache Falcon requirements and features [22]
Service | Resource | Authority |
---|---|---|
HDFS | Path | Read; Write; Execute |
YARN | Queue | Submit; Admin |
HBase | Table; Column Family; Column | Read; Write; Create; Admin |
Hive | Database; Table; Column | Select; Update; Create; Drop; Alter; Index; Lock |
Table 2 Ranger model entity enumeration values
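
The enumeration in Table 2 can be read as a small lookup structure: each service exposes certain resource types and accepts certain authorities. The following is a minimal sketch of that reading, not Ranger's actual API or policy format. The names `RANGER_MODEL`, `PolicyEntry`, and `is_valid` are hypothetical and introduced only for illustration; the values are transcribed from Table 2.

```python
from dataclasses import dataclass

# Enumeration values transcribed from Table 2.
# The dictionary layout is an assumption made for this sketch,
# not the way Ranger stores its service definitions.
RANGER_MODEL = {
    "HDFS": {
        "resources": {"Path"},
        "authorities": {"Read", "Write", "Execute"},
    },
    "YARN": {
        "resources": {"Queue"},
        "authorities": {"Submit", "Admin"},
    },
    "HBase": {
        "resources": {"Table", "Column Family", "Column"},
        "authorities": {"Read", "Write", "Create", "Admin"},
    },
    "Hive": {
        "resources": {"Database", "Table", "Column"},
        "authorities": {"Select", "Update", "Create", "Drop",
                        "Alter", "Index", "Lock"},
    },
}


@dataclass(frozen=True)
class PolicyEntry:
    """Hypothetical record: one (service, resource type, authority) triple."""
    service: str
    resource: str
    authority: str


def is_valid(entry: PolicyEntry) -> bool:
    """Return True if the triple uses only values enumerated in Table 2."""
    model = RANGER_MODEL.get(entry.service)
    if model is None:
        return False
    return (entry.resource in model["resources"]
            and entry.authority in model["authorities"])
```

For example, `is_valid(PolicyEntry("Hive", "Column", "Select"))` holds, whereas `PolicyEntry("YARN", "Queue", "Read")` is rejected because Table 2 lists only Submit and Admin for YARN queues.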
Services and Features | Ranger | Sentry | Kerberos |
---|---|---|---|
Audit log service | √ | √ | √ |
Fine-grained authorization service | √ | √ | |
Authentication service | | | √ |
Unified authorization policy | √ | √ | |
Ticket granting service | | | √ |
Role-based management | √ | √ | |
Supported components | 9 | 5 | 12 |
Table 3 Comparisons between the three security frameworks
[1] | Z. J. Dong, “Improving performance of cloud computing and big data technologies and applications,” ZTE Communications, vol. 12, no. 4, pp. 1-2, Dec. 2014. |
[2] | D. Laney. (2001, Feb. 6). 3D data management: controlling data volume, velocity and variety [Online]. Available: https://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf |
[3] | S. Iwata, “Big data era,” Journal of Information Processing and Management, vol. 55, no. 8, pp. 543-551, Jan. 2012. doi: 10.1241/johokanri.55.543. |
[4] | K. Sarvakar, “Big data security and privacy using data transformation with role based access control,” International Journal of Computer Science & Communication (IJCSC), vol. 7, no. 2, pp. 90-94, Jul. 2016. doi: 10.090592/IJCSC.2016.115. |
[5] | S. B. Scruggs, K. Watson, A. I. Su, “Harnessing the heart of big data,” Circulation Research, vol. 116, no. 7, pp. 1115-1119, Mar. 2015. doi: 10.1161/CIRCRESAHA.115.306013. |
[6] | M. Jensen, “Challenges of privacy protection in big data analytics,” in IEEE BigData Congress, Santa Clara, USA, Jul. 2013, pp. 235-238. doi: 10.1109/BigData.Congress.2013.39. |
[7] | F. Cang, M. Zhang, Y. Wu, “Preventing data leakage in a cloud environment,” ZTE Communications, vol. 11, no. 4, pp. 27-31, Dec. 2013. doi: 10.3969/j.issn.1673-5188.2013.04.004. |
[8] | K. Setty and R. Bakhshi, “What is big data and what does it have to do with IT audit?,” ISACA Journal, vol. 3, no. 14, pp. 1-3, 2013. |
[9] | M. Anup, R. Nimje, V. T. Gaikwad, H. N. Datir, “A review of various trust management models for cloud computing storage systems,” International Journal of Engineering and Computer Science, vol. 3, no. 2, pp. 3924-3928, Feb. 2014. |
[10] | M. Al-Ruithe, E. Benkhelifa, K. Hameed, “Key dimensions for cloud data governance,” in IEEE 4th International Conference on Future Internet of Things and Cloud (FiCloud), Vienna, Austria, Sept. 2016, pp. 379-386. doi: 10.1109/FiCloud.2016.60. |
[11] | M. Felici, T. Koulouris, S. Pearson, “Accountability for data governance in cloud ecosystems,” in IEEE 5th International Conference on Cloud Computing Technology and Science, Bristol, UK, Dec. 2013, pp. 327-332. doi: 10.1109/CloudCom.2013.157. |
[12] | A. Corradi, L. Foschini, A. Zanni, et al., “A federation model to support semantic SPARQL queries for enterprise data governance,” in Eleventh International Conference on Digital Information Management (ICDIM), Porto, Portugal, Sept. 2016, pp. 96-100. doi: 10.1109/ICDIM.2016.7829778. |
[13] | T. Priebe and S. Markus, “Business information modeling: a methodology for data-intensive projects, data science and big data governance,” in IEEE International Conference on Big Data, Santa Clara, USA, Dec. 2015, pp. 2056-2065. doi: 10.1109/BigData.2015.7363987. |
[14] | M. Al-Ruithe, S. Mthunzi, E. Benkhelifa, “Data governance for security in IoT & cloud converged environments,” in IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA), Agadir, Morocco, Dec. 2016, pp. 1-8. doi: 10.1109/AICCSA.2016.7945737. |
[15] | R. J. DeStefano, L. Tao, and K. Gai, “Improving data governance in large organizations through ontology and linked data,” in IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud), Beijing, China, Jun. 2016, pp. 279-284. doi: 10.1109/CSCloud.2016.47. |
[16] | A. Yulfitri, “Modeling operational model of data governance in government: case study: government agency X in Jakarta,” in International Conference on Information Technology Systems and Innovation (ICITSI), Bandung, Indonesia, Oct. 2016, pp. 1-5. doi: 10.1109/ICITSI.2016.7858207. |
[17] | R. Thiel, K. A. Stroetmann, P. D. Singleton, “Clinical data governance: legal and ethical challenges,” in IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), Valencia, Spain, Jun. 2014, pp. 597-600. doi: 10.1109/BHI.2014.6864435. |
[18] | L. Xu, C. Jiang, J. Wang, J. Yuan, Y. Ren, “Information security in big data: privacy and data mining,” IEEE Access, vol. 2, pp. 1149-1176, Oct. 2014. doi: 10.1109/ACCESS.2014.2362522. |
[19] | H. Ye, X. Cheng, M. Yuan, et al., “A survey of security and privacy in big data,” in 16th International Symposium on Communications and Information Technologies (ISCIT), Qingdao, China, Sept. 2016, pp. 268-272. doi: 10.1109/ISCIT.2016.7751634. |
[20] | S. Tan, D. De, W. Z. Song, J. Yang, S. K. Das, “Survey of security advances in smart grid: a data driven approach,” IEEE Communications Surveys & Tutorials, vol. 19, no. 1, pp. 397-422, Oct. 2016. doi: 10.1109/COMST.2016.2616442. |
[21] | Apache. (2016, Sept. 20). What Falcon does [Online]. Available: https://falcon.apache.org/FalconDocumentation.html |
[22] | Hortonworks. (2016, Sept. 20). Falcon [Online]. Available: https://zh.hortonworks.com/apache/falcon |
[23] | Apache. (2016, Oct. 16). Data governance and metadata framework for Hadoop [Online]. Available: http://atlas.apache.org |
[24] | Apache. (2017, Mar. 16). Atlas architecture [Online]. Available: http://atlas.apache.org/Architecture.html |
[25] | M. Rouse. (2015, May 23). Data-lake [Online]. Available: http://searchaws.techtarget.com/definition/data-lake |
[26] | Hortonworks. (2015, Dec. 1). How Ranger works [Online]. Available: https://hortonworks.com/apache/ranger |
[27] | Cloudera. (2016, May 23). Sentry [Online]. Available: http://www.cloudera.com/content/cloudera/en/products-adn-services/cdh/sentry.html |
[28] | J. Kohl and C. Neuman, “The Kerberos network authentication service,” Internet RFC 1510, Sept. 1993. |
[29] | Apache. (2016, Mar. 25). Apache Sentry [Online]. Available: https://blogs.apache.org/sentry/entry/sentry_graduates_to_a_top?platform=hootsuite |