Automatic Control and Information Sciences
ISSN (Print): 2375-1649; ISSN (Online): 2375-1630; Website: http://www.sciepub.com/journal/acis
Automatic Control and Information Sciences. 2017, 3(1), 8-15
DOI: 10.12691/acis-3-1-3
Open Access Review Article

Heterogeneous Data and Big Data Analytics

Lidong Wang

Department of Engineering Technology, Mississippi Valley State University, Itta Bena, MS, USA

Pub. Date: June 15, 2017

Cite this paper:
Lidong Wang. Heterogeneous Data and Big Data Analytics. Automatic Control and Information Sciences. 2017; 3(1):8-15. doi: 10.12691/acis-3-1-3

Abstract

Heterogeneity is one of the major features of Big Data, and heterogeneous data cause problems in data integration and Big Data analytics. This paper introduces data processing methods for heterogeneous data and Big Data analytics, Big Data tools, and some traditional data mining (DM) and machine learning (ML) methods. Deep learning and its potential in Big Data analytics are analysed. The benefits of the confluence of Big Data analytics, deep learning, high performance computing (HPC), and heterogeneous computing are presented. Challenges of dealing with heterogeneous data and Big Data analytics are also discussed.

Keywords:
Big Data, Big Data analytics, heterogeneous data, deep learning, data mining, machine learning, heterogeneous computing, computational intelligence, artificial intelligence

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
