Detecting Malicious DNS over HTTPS Traffic in Domain Name System using Machine Learning Classifiers

Yaser M. Banadaki

Journal of Computer Sciences and Applications. 2020, 8(2), 46-55
DOI: 10.12691/jcsa-8-2-2


Yaser M. Banadaki¹

¹Department of Computer Science, Southern University and A&M College, Baton Rouge, LA, 70813, USA

Pub. Date: August 20, 2020


Cite this paper:
Yaser M. Banadaki. Detecting Malicious DNS over HTTPS Traffic in Domain Name System using Machine Learning Classifiers. Journal of Computer Sciences and Applications. 2020; 8(2):46-55. doi: 10.12691/jcsa-8-2-2

Abstract

This paper presents a systematic two-layer approach for detecting DNS over HTTPS (DoH) traffic and distinguishing Benign-DoH traffic from Malicious-DoH traffic using six machine learning algorithms. The classifiers are evaluated on accuracy, precision, recall, and F-score, as well as on confusion matrices, ROC curves, and feature importance. The results show that the LGBM and XGBoost algorithms outperform the other algorithms in almost all classification metrics, reaching a maximum accuracy of 100% in the classification tasks of layers 1 and 2. LGBM misclassified only one DoH test sample as non-DoH out of 4000 test samples. It is also found that, of the 34 features extracted from the CIRA-CIC-DoHBrw-2020 dataset, SourceIP is the most important feature for separating DoH traffic from non-DoH traffic in layer 1, followed by DestinationIP. However, only DestinationIP is an important feature for the LGBM and gradient boosting algorithms when separating Benign-DoH from Malicious-DoH traffic in layer 2.
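The two-layer routing and the scalar metrics named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `Flow` record type, and the callables standing in for trained classifiers are hypothetical, and the 2000/2000 class split used for the counts is an assumption made only to have concrete numbers (the abstract states just the 4000 test samples and the single DoH sample misclassified as non-DoH).

```python
from typing import Callable, Dict

Flow = Dict[str, float]  # hypothetical per-flow feature record


def classify_flow(flow: Flow,
                  layer1_is_doh: Callable[[Flow], bool],
                  layer2_is_malicious: Callable[[Flow], bool]) -> str:
    """Route one flow through two layers, as outlined in the abstract."""
    # Layer 1: separate DoH traffic from non-DoH traffic.
    if not layer1_is_doh(flow):
        return "non-DoH"
    # Layer 2: only flows flagged as DoH are split into benign vs. malicious.
    return "Malicious-DoH" if layer2_is_malicious(flow) else "Benign-DoH"


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F-score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


# Hypothetical layer-1 counts with DoH as the positive class:
# an ASSUMED 2000 DoH / 2000 non-DoH split, one DoH sample predicted non-DoH.
layer1 = binary_metrics(tp=1999, fp=0, fn=1, tn=2000)
print(layer1)
```

If that single misclassification is the only error, accuracy is 3999/4000 = 0.99975 regardless of the class split, which is consistent with a reported accuracy of 100% after rounding; precision, recall, and F-score do depend on the assumed split.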

Keywords:
machine learning classifiers, DNS over HTTPS traffic, domain name system

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

