| [1] | http://www.incapsula.com/blog/what-google-doesnt-show-you-31-of-website-traffic-can-harm-your-business.html. |
| |
| [2] | P. N. Tan and V. Kumar, “Discovery of web robot sessions based on their navigational patterns,” Data Mining and Knowledge Discovery, vol. 6, pp. 9-35, 2002. |
| |
| [3] | D. Doran and S. S. Gokhale. Web Robot Detection Techniques: Overview and Limitations. Data Mining and Knowledge Discovery, 22(1-2):183-210, 2011. |
| |
| [4] | M. F. Arlitt and C. L. Williamson, “Web server workload characterization: The search for invariants,” ACM SIGMETRICS Performance Evaluation Review, pp. 126-137, 1996. |
| |
| [5] | Mark E. Crovella and Azer Bestavros. Self-similarity in World Wide Web traffic: Evidence and possible causes. IEEE/ACM Transactions on Networking, 5(6):835-846, December 1997. |
| |
| [6] | J. X. Yu, Y. Ou, C. Zhang, and S. Zhang, “Identifying interesting customers through web log classification,” IEEE Intelligent Systems, vol. 20, no. 3, pp. 55-59, 2005. |
| |
| [7] | F. Li, K. Goseva-Popstojanova, and A. Ross, “Discovering web workload characteristics through cluster analysis,” in Proc. IEEE International Symposium on Network Computing and Applications, 2007, pp. 61-68. |
| |
| [8] | M. Spiliopoulou, “Web usage mining for web site evaluation,” Communications of the ACM, vol. 43, no. 8, 2000. |
| |
| [9] | M.-L. Shyu, C. Haruechaiyasak, and S.-C. Chen, “Mining user access patterns with traversal constraint for predicting web page requests,” Knowl. Inf. Syst., vol. 10, no. 4, pp. 515-528, 2006. |
| |
| [10] | Almeida, V., Menascé, D., Riedi, R., Peligrinelli, F., Fonseca, R., & Meira Jr, W. (2001, June). Analyzing Web robots and their impact on caching. In Proc. Sixth Workshop on Web Caching and Content Distribution (pp. 20-22). |
| |
| [11] | R. White and S. Drucker, “Investigating behavioral variability in web search,” in Proc. of the 16th Intl. conference on World Wide Web. ACM, 2007, pp. 21-30. |
| |
| [12] | X. Lin, L. Quan, and H. Wu, “An automatic scheme to categorize user sessions in modern HTTP traffic,” in Proc. of IEEE Global Telecommunications Conference (GLOBECOM 08), New Orleans, LA, November 2008, pp. 1-6. |
| |
| [13] | M. D. Dikaiakos, A. Stassopoulou, and L. Papageorgiou. An Investigation of Web Crawler Behavior: Characterization and Metrics. Computer Networks, 28:880-897, 2005. |
| |
| [14] | Lee, Junsup, Sungdeok Cha, Dongkun Lee, and Hyungkyu Lee. “Classification of web robots: An empirical study based on over one billion requests.” Computers & Security 28, no. 8 (2009): 795-802. |
| |
| [15] | P. Huntington, D. Nicholas, and H. R. Jamali, “Web robot detection in the scholarly information environment,” Journal of Information Science, vol. 34, no. 5, pp. 726-741, 2008. |
| |
| [16] | G. Jacob, E. Kirda, C. Kruegel, and G. Vigna, “PUBCRAWL: protecting users and businesses from crawlers,” in Proceedings of the 21st USENIX conference on Security symposium. USENIX Association, 2012. |
| |
| [17] | Doran, Derek, Kevin Morillo, and Swapna S. Gokhale. “A comparison of web robot and human requests.” In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 1374-1380. ACM, 2013. |
| |
| [18] | Sisodia, Dilip Singh, and Shrish Verma. “Web usage pattern analysis through web logs: A review.” In Computer Science and Software Engineering (JCSSE), 2012 International Joint Conference on, pp. 49-53. IEEE, 2012. |
| |
| [19] | AWStats - free log file analyzer for advanced statistics (GNU GPL), http://awstats.sourceforge.net/ (accessed in February 2014). |
| |
| [20] | User agents database http://www.user-agents.org/index.shtml (accessed in February 2014). |
| |
| [21] | Well known robots database http://www.robotstxt.org/db.html (accessed in February 2014). |
| |
| [22] | Berendt, B., Mobasher, B., Spiliopoulou, M., and Nakagawa, M. “A Framework for the Evaluation of Session Reconstruction Heuristics in Web Usage Analysis,” INFORMS Journal on Computing, Special Issue on Mining Web-Based Data for E-Business Applications, Vol. 15, No. 2, 2003. |
| |