Detection of malicious and non-malicious website visitors using unsupervised neural network learning

Cited by: 78
Authors
Stevanovic, Dusan [1]
Vlajic, Natalija [1]
An, Aijun [1]
Affiliations
[1] York Univ, Dept Comp Sci & Engn, Toronto, ON M3J 1P3, Canada
Keywords
Web crawler detection; Neural networks; Web server access logs; Machine learning; Clustering; Denial of service
DOI
10.1016/j.asoc.2012.08.028
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Distributed denial-of-service (DDoS) attacks are recognized as among the most damaging threats to Internet security today. Recently, malicious web crawlers have been used to execute automated DDoS attacks on web sites across the WWW. In this study, we examine the use of two unsupervised neural network (NN) learning algorithms for the purpose of web-log analysis: the Self-Organizing Map (SOM) and Modified Adaptive Resonance Theory 2 (Modified ART2). In particular, through the use of SOM and Modified ART2, our work aims to obtain better insight into the types and distribution of visitors to a public web site based on their browsing behavior, as well as to investigate the relative differences and/or similarities between malicious web crawlers and other non-malicious visitor groups. The results of our study show that, even though there is a fairly clear separation between malicious web crawlers and other visitor groups, 52% of malicious crawlers exhibit very 'human-like' browsing behavior and as such pose a particular challenge for future web-site security systems. We also show that some of the feature values of malicious crawlers that exhibit very 'human-like' browsing behavior are not significantly different from the feature values of human visitors. Additionally, we show that Google, MSN and Yahoo crawlers exhibit distinct crawling behavior. © 2012 Elsevier B.V. All rights reserved.
Pages: 698-708
Number of pages: 11