Journal of Computer Sciences and Applications
ISSN (Print): 2328-7268 ISSN (Online): 2328-725X Website: https://www.sciepub.com/journal/jcsa Editor-in-chief: Minhua Ma, Patricia Goncalves
Journal of Computer Sciences and Applications. 2015, 3(6), 177-180
DOI: 10.12691/jcsa-3-6-13

Big Data Processing: Big Challenges and Opportunities

Neelamani Samal1 and Nilamadhab Mishra1

1Department of Computer Science & Engineering, Gandhi Institute for Education and Technology, Bhubaneswar, Odisha, India

Pub. Date: December 31, 2015

Cite this paper:
Neelamani Samal and Nilamadhab Mishra. Big Data Processing: Big Challenges and Opportunities. Journal of Computer Sciences and Applications. 2015; 3(6):177-180. doi: 10.12691/jcsa-3-6-13

Abstract

With the rapid growth of emerging applications such as social networks, the semantic web, sensor networks and location-based services (LBS), the variety of data to be processed continues to increase quickly. Effective management and processing of large-scale data poses an interesting but critical challenge. Recently, big data has attracted considerable attention from academia, industry and government. This paper introduces several big data processing techniques from the system and application perspectives. First, from the viewpoint of cloud data management and big data processing mechanisms, we present the key issues of big data processing, including the definition of big data, big data management platforms, big data service models, distributed file systems, data storage, data virtualization platforms and distributed applications. Next, following the MapReduce parallel processing framework, we introduce some MapReduce optimization strategies reported in the literature. Finally, we discuss open issues and challenges, and explore future research directions for big data processing in cloud computing environments.
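
As an illustration of the MapReduce processing model mentioned in the abstract, the following is a minimal single-process sketch in Python. It is not taken from the paper and does not use Hadoop or any other real framework; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions chosen only for this example. It shows the three conceptual stages: mapping records to key-value pairs, grouping the pairs by key, and reducing each group.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group intermediate values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce sequentially over a list of strings."""
    intermediate = chain.from_iterable(map_phase(d) for d in documents)
    grouped = shuffle_phase(intermediate)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    docs = ["big data big challenges", "big opportunities"]
    print(word_count(docs))
```

In a distributed setting the map and reduce calls would run in parallel on different nodes and the grouping step would move data across the network; the sketch keeps everything in one process to make the data flow visible.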

Keywords:
big data, cloud computing, data management, distributed processing

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
