Inferring social ties from geographic coincidences

Cited by: 253
Authors
Crandall, David J. [2]
Backstrom, Lars [1]
Cosley, Dan [3]
Suri, Siddharth [1]
Huttenlocher, Daniel [1]
Kleinberg, Jon [1]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
[2] Indiana Univ, Sch Informat & Comp, Bloomington, IN 47403 USA
[3] Cornell Univ, Dept Informat Sci, Ithaca, NY 14853 USA
Funding
US National Science Foundation
Keywords
computer science; privacy; probabilistic models; social networks; anonymity
DOI
10.1073/pnas.1006155107
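Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract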
We investigate the extent to which social ties between people can be inferred from co-occurrence in time and space: Given that two people have been in approximately the same geographic locale at approximately the same time, on multiple occasions, how likely are they to know each other? Furthermore, how does this likelihood depend on the spatial and temporal proximity of the co-occurrences? Such issues arise in data originating in both online and offline domains as well as settings that capture interfaces between online and offline behavior. Here we develop a framework for quantifying the answers to such questions, and we apply this framework to publicly available data from a social media site, finding that even a very small number of co-occurrences can result in a high empirical likelihood of a social tie. We then present probabilistic models showing how such large probabilities can arise from a natural model of proximity and co-occurrence in the presence of social ties. In addition to providing a method for establishing some of the first quantifiable estimates of these measures, our findings have potential privacy implications, particularly for the ways in which social structures can be inferred from public online records that capture individuals' physical locations over time.
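The question posed in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' implementation: the record layout `(user, lat, lon, t_days)`, the function names `count_cooccurrences` and `tie_probability_by_count`, the grid-cell discretization of space, and the choice to count distinct cells per pair are all hypothetical. The sketch counts how often two users appear in the same cell within a time window, then estimates the empirical probability of a tie for each co-occurrence count.

```python
from collections import defaultdict
from itertools import combinations
from math import floor


def count_cooccurrences(records, cell_deg=1.0, window_days=1.0):
    """Count, per user pair, the distinct grid cells in which both users
    were observed within `window_days` of each other.

    `records` is an iterable of (user, lat, lon, t_days) tuples; this
    layout and the square lat/lon grid are illustrative assumptions.
    """
    # Group observations by the grid cell they fall into.
    by_cell = defaultdict(list)
    for user, lat, lon, t in records:
        cell = (floor(lat / cell_deg), floor(lon / cell_deg))
        by_cell[cell].append((user, t))

    # For every pair of observations in a cell, record the cell if the
    # two users are different and the timestamps are close enough.
    shared_cells = defaultdict(set)
    for cell, observations in by_cell.items():
        for (u1, t1), (u2, t2) in combinations(observations, 2):
            if u1 != u2 and abs(t1 - t2) <= window_days:
                shared_cells[frozenset((u1, u2))].add(cell)

    return {pair: len(cells) for pair, cells in shared_cells.items()}


def tie_probability_by_count(cooccurrences, friends):
    """Empirical fraction of pairs with a known tie, for each count k.

    `friends` is a set of frozenset({u, v}) pairs (hypothetical ground truth).
    """
    totals = defaultdict(int)
    ties = defaultdict(int)
    for pair, k in cooccurrences.items():
        totals[k] += 1
        if pair in friends:
            ties[k] += 1
    return {k: ties[k] / totals[k] for k in totals}


# Toy data: users "a" and "b" coincide in two different cells; "c" visits
# one of those cells much later and so coincides with nobody.
records = [
    ("a", 40.1, -74.2, 0.0),
    ("b", 40.5, -74.9, 0.5),
    ("a", 10.2, 20.3, 5.0),
    ("b", 10.9, 20.1, 5.8),
    ("c", 10.5, 20.5, 30.0),
]
cooc = count_cooccurrences(records)
probs = tie_probability_by_count(cooc, friends={frozenset(("a", "b"))})
```

Varying `cell_deg` and `window_days` in such a sketch corresponds to the abstract's second question, namely how the likelihood of a tie depends on the spatial and temporal proximity of the co-occurrences.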
Pages: 22436-22441
Page count: 6
Related papers
25 in total
[1]   Predicting Social Security numbers from public data [J].
Acquisti, Alessandro ;
Gross, Ralph .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (27) :10975-10980
[2]  
[Anonymous], IEEE T VISUAL COMPUT, DOI [10.1109/TVCG.2010.44, DOI 10.1109/TVCG.2010.44]
[3]  
[Anonymous], 1980, MARKOV RANDOM FIELDS, DOI DOI 10.1090/CONM/001
[4]  
[Anonymous], 1949, U CHICAGO LAW REV, V17, P148
[5]  
[Anonymous], P 16 INT WORLD WID W
[6]  
[Anonymous], 2005, P 2005 ACM WORKSH PR, DOI DOI 10.1145/1102199.1102214
[7]   Fast approximate energy minimization via graph cuts [J].
Boykov, Y ;
Veksler, O ;
Zabih, R .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2001, 23 (11) :1222-1239
[8]   The scaling laws of human travel [J].
Brockmann, D ;
Hufnagel, L ;
Geisel, T .
NATURE, 2006, 439 (7075) :462-465
[9]   METHODS FOR STUDYING COINCIDENCES [J].
DIACONIS, P ;
MOSTELLER, F .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1989, 84 (408) :853-861
[10]   Inferring friendship network structure by using mobile phone data [J].
Eagle, Nathan ;
Pentland, Alex ;
Lazer, David .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (36) :15274-15278