Journal of Computer Sciences and Applications
ISSN (Print): 2328-7268 ISSN (Online): 2328-725X Editor-in-chief: Minhua Ma, Patricia Goncalves
Open Access
Journal of Computer Sciences and Applications. 2017, 5(2), 42-49
DOI: 10.12691/jcsa-5-2-1

Characterisation of Academic Journal Publications Using Text Mining Techniques

Adebola K. Ojo¹ and Adesesan B. Adeyemo¹

¹Computer Science Department, University of Ibadan, Ibadan, Nigeria

Pub. Date: June 19, 2017

Cite this paper:
Adebola K. Ojo and Adesesan B. Adeyemo. Characterisation of Academic Journal Publications Using Text Mining Techniques. Journal of Computer Sciences and Applications. 2017; 5(2):42-49. doi: 10.12691/jcsa-5-2-1


The ever-growing volume of published academic journals, and the implicit knowledge that can be derived from them, has not fully enhanced knowledge development; instead, it has resulted in information and cognitive overload. Publication data are textual, unstructured and anomalous. Analysing such high-dimensional data manually is time-consuming, and this has limited the ability to make projections and to identify trends from the patterns hidden in various publications. This study was designed to develop and use intelligent text mining techniques to characterise academic journal publications. The journal scoring criteria of nineteen rankers from 2001 to 2013, as listed in the 50th edition of the Journal Quality List (JQL), were used as the criteria for selecting highly rated journals. The text-miner software developed was used to crawl and download the abstracts and bibliometric information of articles selected from these journals. The datasets were transformed into structured data and cleaned using filtering and stemming algorithms. Thereafter, the data were grouped into a series of word features based on the bag-of-words document representation. The highly rated journals were then clustered using the Self-Organising Maps (SOM) method, with attribute weights in each cluster.
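The processing steps named in the abstract (filtering, stemming, bag-of-words representation and SOM clustering) can be illustrated with a minimal Python sketch. This is not the authors' text-miner software: the stop-word list, the crude suffix-stripping stemmer, the one-dimensional map, and every function name and parameter below are assumptions chosen only to keep the example short and self-contained. The weight vector learned for each map unit can be read as a set of per-term weights for that cluster, which is one possible interpretation of "attribute weights in each cluster".

```python
import math
import random
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not the one used in the study.
STOP_WORDS = {"the", "of", "and", "a", "an", "in", "to", "is", "are", "for", "on", "with"}

def naive_stem(word):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lower-case, tokenise, filter stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

def bag_of_words(docs):
    """Return (vocabulary, term-count matrix) for a list of raw documents."""
    token_lists = [preprocess(d) for d in docs]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    counts = [Counter(tokens) for tokens in token_lists]
    return vocab, [[float(c[t]) for t in vocab] for c in counts]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def best_matching_unit(weights, vector):
    """Index of the map unit whose weight vector is closest to `vector`."""
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i], vector))

def train_som(vectors, n_units=2, epochs=50, lr0=0.5, seed=0):
    """Train a one-dimensional SOM; returns one weight vector per unit."""
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                          # decaying learning rate
        radius = max(n_units / 2.0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for vec in vectors:
            bmu = best_matching_unit(weights, vec)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood centred on the best-matching unit.
                influence = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * influence * (vec[k] - w[k])
    return weights

# Hypothetical abstracts, used only to exercise the pipeline.
abstracts = [
    "Clustering of documents using text mining",
    "Mining text documents with clustering methods",
    "Neural networks for image recognition",
]
vocab, matrix = bag_of_words(abstracts)
som = train_som(matrix)
clusters = [best_matching_unit(som, row) for row in matrix]
print(vocab)
print(clusters)
```

A production pipeline would more likely use an established stemmer (for example the Porter algorithm), term weighting such as TF-IDF, and a two-dimensional map; the sketch only shows how the stages connect.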

Keywords: highly rated journals, text mining, self-organising maps, filtering and stemming algorithms

This work is licensed under a Creative Commons Attribution 4.0 International License.

