Social Fingerprinting: Detection of Spambot Groups Through DNA-Inspired Behavioral Modeling

Cited by: 110
Authors
Cresci, Stefano [1]
Di Pietro, Roberto [1, 2, 3]
Petrocchi, Marinella
Spognardi, Angelo [1, 4]
Tesconi, Maurizio [1]
Affiliations
[1] CNR, Inst Informat & Telemat IIT, I-56124 Pisa, Italy
[2] Nokia Bell Labs, F-91620 Paris, France
[3] Univ Padua, Math Dept, I-35122 Padua, Italy
[4] Tech Univ Denmark, DTU Compute, DK-2800 Lyngby, Denmark
Keywords
Spambot detection; social bots; online social networks; Twitter; behavioral modeling; digital DNA; prediction
DOI
10.1109/TDSC.2017.2681672
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Spambot detection in online social networks is a long-lasting challenge involving the study and design of detection techniques capable of efficiently identifying ever-evolving spammers. Recently, a new wave of social spambots has emerged, with advanced human-like characteristics that allow them to go undetected even by current state-of-the-art algorithms. In this paper, we show that efficient spambot detection can be achieved via an in-depth analysis of their collective behaviors, exploiting the digital DNA technique for modeling the behaviors of social network users. Inspired by its biological counterpart, the digital DNA representation encodes the behavioral lifetime of a digital account as a sequence of characters. We then define a similarity measure for such digital DNA sequences. We build upon digital DNA and the similarity between groups of users to characterize both genuine accounts and spambots. Leveraging this characterization, we design the Social Fingerprinting technique, which can discriminate between spambots and genuine accounts in both a supervised and an unsupervised fashion. We also evaluate the effectiveness of Social Fingerprinting and compare it with three state-of-the-art detection algorithms, showing the superiority of our solution. Finally, among the peculiarities of our approach are the possibility to apply off-the-shelf DNA analysis techniques to study online users' behaviors and the ability to rely efficiently on a limited number of lightweight account characteristics.
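The encoding and similarity steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the alphabet or the similarity measure, so everything below is an assumption made for illustration: the action-to-character mapping (`ACTION_TO_BASE`), the function names, and the use of the longest common substring, normalized by the shorter sequence length, as the similarity signal.

```python
# Hypothetical alphabet: one character per action type (an assumption,
# not taken from the abstract).
ACTION_TO_BASE = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode_timeline(actions):
    """Encode an account's chronological list of actions as a character string."""
    return "".join(ACTION_TO_BASE[action] for action in actions)


def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b.

    Classic dynamic programming, O(len(a) * len(b)) time, keeping only
    the previous row of the table.
    """
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match in a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def similarity(a, b):
    """Assumed similarity: shared-substring length over the shorter sequence length."""
    if not a or not b:
        return 0.0
    return len(longest_common_substring(a, b)) / min(len(a), len(b))


if __name__ == "__main__":
    bot_1 = encode_timeline(["tweet", "reply", "retweet", "retweet", "tweet"])
    bot_2 = encode_timeline(["reply", "retweet", "retweet", "tweet", "tweet"])
    print(bot_1, bot_2, similarity(bot_1, bot_2))
```

Under this assumption, accounts driven by the same automation would tend to share long identical stretches of behavior, while genuine accounts would share only short ones. Comparing a whole group of accounts rather than a pair would need a generalization of this pairwise routine.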
Pages: 561-576
Number of pages: 16