International Transaction of Electrical and Computer Engineers System
ISSN (Print): 2373-1273 ISSN (Online): 2373-1281 Website: http://www.sciepub.com/journal/iteces Editor-in-chief: Dr. Pushpendra Singh, Dr. Rajkumar Rajasekaran
Open Access
International Transaction of Electrical and Computer Engineers System. 2017, 4(1), 26-38
DOI: 10.12691/iteces-4-1-4
Open Access Article

A Perusal of Big Data Classification and Hadoop Technology

Nikhat Akhtar1, Firoj Parwej2 and Yusuf Perwej3

1Department of Computer Science & Engineering, Babu Banarasi Das University, Lucknow, India

2Department of Computer Science & Engineering, Singhania University, Distt. Jhunjhunu, Rajasthan, India

3Department of Information Technology, Al Baha University, Al Baha, Kingdom of Saudi Arabia (KSA)

Pub. Date: May 15, 2017

Cite this paper:
Nikhat Akhtar, Firoj Parwej and Yusuf Perwej. A Perusal of Big Data Classification and Hadoop Technology. International Transaction of Electrical and Computer Engineers System. 2017; 4(1):26-38. doi: 10.12691/iteces-4-1-4

Abstract

Big data introduces novel technologies, skills, and processes to an information architecture and to the people who design, operate, and use it. Big data describes a holistic information management approach that includes and integrates many new types of data and data management alongside conventional data. Hadoop is an open-source software framework from the Apache Software Foundation, designed to support data-intensive applications running on large grids and clusters and to offer scalable, reliable, distributed computing. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. In this paper, we discuss a taxonomy of big data and the components of Hadoop. Ultimately, big data technologies are necessary for providing more accurate analysis, which may lead to more concrete decision-making, resulting in greater operational capacity, cost reduction, and risk detection for the business.

Keywords:
Big Data, storage infrastructure, Hama, Avro, visualization, data domains, Hadoop, JobTracker, YARN

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
