International Journal of Data Envelopment Analysis and Operations Research
ISSN (Print): Pending
ISSN (Online): Pending
Website: https://www.sciepub.com/journal/ijdeaor
Editor-in-chief: Ehsan Zanboori
Open Access
International Journal of Data Envelopment Analysis and Operations Research. 2023, 4(1), 1-32
DOI: 10.12691/ijdeaor-4-1-1
Open Access Article

Predicting Stock Investments Based on Sentiment and Historical Price Data

I. O. Olawale¹, J. Iworiso¹, and I. A. Amaunam²

¹School of Computing and Digital Media, London Metropolitan University

²School of Computer Science and Electronic Engineering, University of Essex

Pub. Date: October 10, 2023

Cite this paper:
I. O. Olawale, J. Iworiso and I. A. Amaunam. Predicting Stock Investments Based on Sentiment and Historical Price Data. International Journal of Data Envelopment Analysis and Operations Research. 2023; 4(1):1-32. doi: 10.12691/ijdeaor-4-1-1

Abstract

This paper examines whether integrating sentiment data from Twitter with historical stock prices improves stock market prediction accuracy. The study compares machine learning and deep learning algorithms, namely Support Vector Machine (SVM), Recurrent Neural Network (RNN), and Bidirectional Encoder Representations from Transformers (BERT), on a single stock. Among these algorithms, BERT achieved the highest predictive accuracy, at 93.21%. BERT's outperformance of the other techniques in this study suggests that deep learning classification algorithms are superior to conventional sentiment analysis for stock market prediction, a finding that contributes to the empirical literature. All computations and graphics in this study were produced using Python. As an extension of this core analysis, the study simulates two distinct investment strategies using aggregated data from ten different stocks: a passive long-term investment strategy and an active, sentiment-based bot trading strategy. These strategies were evaluated using separate machine learning algorithms, Random Forest and XGBoost classifiers, to inform real-time investment decisions. The results indicate that both versions of the bot trading strategy, regardless of the machine learning or deep learning model employed, consistently outperform the passive long-term investment strategy. The findings corroborate the utility of incorporating social media sentiment into traditional stock prediction frameworks, thereby providing valuable insights for investors and financial institutions. This study underscores the potential of advanced machine learning algorithms and sentiment analysis to reduce investment risk and improve decision-making.
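
The comparison between a passive long-term investment and a signal-driven trading strategy can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the price series, the signal values, and the function names `buy_and_hold_return` and `signal_strategy_return` are all hypothetical, the signals stand in for the output of a classifier such as Random Forest or XGBoost, and transaction costs are ignored.

```python
from typing import Sequence


def buy_and_hold_return(prices: Sequence[float]) -> float:
    """Total return from buying on the first day and selling on the last."""
    return prices[-1] / prices[0] - 1.0


def signal_strategy_return(prices: Sequence[float], signals: Sequence[int]) -> float:
    """Total return when the stock is held only on days flagged by a signal.

    signals[t] == 1 means "hold the stock from day t to day t + 1";
    signals[t] == 0 means "stay in cash" (assumed here to earn nothing).
    Transaction costs are ignored in this sketch.
    """
    if len(signals) != len(prices) - 1:
        raise ValueError("need exactly one signal per day-to-day price change")
    wealth = 1.0
    for t, signal in enumerate(signals):
        if signal == 1:
            # Compound the day's price change only while invested.
            wealth *= prices[t + 1] / prices[t]
    return wealth - 1.0


# Hypothetical closing prices and classifier outputs, for illustration only.
prices = [100.0, 110.0, 99.0, 108.9]
signals = [1, 0, 1]  # predicted "up", "down", "up"

passive = buy_and_hold_return(prices)
active = signal_strategy_return(prices, signals)
print(f"buy and hold: {passive:.2%}, signal-based: {active:.2%}")
```

In this toy series the signal-based strategy avoids the one falling day and therefore ends ahead of buy and hold. With imperfect signals or realistic trading costs the ordering could differ, which is why the choice of classifier matters for the active strategy.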

Keywords:
sentiment analysis, supervised learning, classifiers, ML, DL, accuracy

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
