Correlated network data publication via differential privacy

被引:161
作者
Chen, Rui [1 ]
Fung, Benjamin C. M. [2 ]
Yu, Philip S. [3 ]
Desai, Bipin C. [4 ]
机构
[1] Hong Kong Baptist Univ, Kowloon, Hong Kong, Peoples R China
[2] McGill Univ, Montreal, PQ, Canada
[3] Univ Illinois, Chicago, IL USA
[4] Concordia Univ, Montreal, PQ, Canada
基金
加拿大自然科学与工程研究理事会;
关键词
Network data; Differential privacy; Data correlation; Non-interactive publication;
D O I
10.1007/s00778-013-0344-8
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
With the increasing prevalence of information networks, research on privacy-preserving network data publishing has received substantial attention recently. There are two streams of relevant research, targeting different privacy requirements. A large body of existing works focus on preventing node re-identification against adversaries with structural background knowledge, while some other studies aim to thwart edge disclosure. In general, the line of research on preventing edge disclosure is less fruitful, largely due to lack of a formal privacy model. The recent emergence of differential privacy has shown great promise for rigorous prevention of edge disclosure. Yet recent research indicates that differential privacy is vulnerable to data correlation, which hinders its application to network data that may be inherently correlated. In this paper, we show that differential privacy could be tuned to provide provable privacy guarantees even in the correlated setting by introducing an extra parameter, which measures the extent of correlation. We subsequently provide a holistic solution for non-interactive network data publication. First, we generate a private vertex labeling for a given network dataset to make the corresponding adjacency matrix form dense clusters. Next, we adaptively identify dense regions of the adjacency matrix by a data-dependent partitioning process. Finally, we reconstruct a noisy adjacency matrix by a novel use of the exponential mechanism. To our best knowledge, this is the first work providing a practical solution for publishing real-life network data via differential privacy. Extensive experiments demonstrate that our approach performs well on different types of real-life network datasets.
引用
收藏
页码:653 / 676
页数:24
相关论文
共 40 条
[1]  
[Anonymous], 2012, Proc. ACM workshop on online social networks (WOSN)
[2]  
[Anonymous], 2009, Proceedings of the SIAM International Conference on Data Mining, SDM'09
[3]  
[Anonymous], 2007, P 16 INT C WORLD WID
[4]  
[Anonymous], 2008, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
[5]   Chains of affection: The structure of adolescent romantic and sexual networks [J].
Bearman, PS ;
Moody, J ;
Stovel, K .
AMERICAN JOURNAL OF SOCIOLOGY, 2004, 110 (01) :44-91
[6]   MULTIDIMENSIONAL BINARY SEARCH TREES USED FOR ASSOCIATIVE SEARCHING [J].
BENTLEY, JL .
COMMUNICATIONS OF THE ACM, 1975, 18 (09) :509-517
[7]  
Bhagat S., 2009, P VLDB ENDOW, V2, P766, DOI DOI 10.14778/1687627.1687714
[8]  
Chen R, 2011, PROC VLDB ENDOW, V4, P1087
[9]  
Cheng J., 2010, P 2010 ACM SIGMOD IN, P459, DOI DOI 10.1145/1807167.1807218
[10]  
Cormen TH., 2009, Introduction to Algorithms, V3