A Framework for Efficient Data Anonymization under Privacy and Accuracy Constraints

被引:47
作者
Ghinita, Gabriel [1 ]
Karras, Panagiotis [1 ]
Kalnis, Panos [1 ]
Mamoulis, Nikos [2 ]
机构
[1] Natl Univ Singapore, Singapore 117417, Singapore
[2] Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2009年 / 34卷 / 02期
关键词
Design; Experimentation; Security; Privacy; anonymity;
D O I
10.1145/1538909.1538911
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Recent research studied the problem of publishing microdata without revealing sensitive information, leading to the privacy-preserving paradigms of k-anonymity and l-diversity. k-anonymity protects against the identification of an individual's record. l-diversity, in addition, safeguards against the association of an individual with specific sensitive information. However, existing approaches suffer from at least one of the following drawbacks: (i) l-diversification is solved by techniques developed for the simpler k-anonymization problem, causing unnecessary information loss. (ii) The anonymization process is inefficient in terms of computational and I/O cost. (iii) Previous research focused exclusively on the privacy-constrained problem and ignored the equally important accuracy-constrained (or dual) anonymization problem. In this article, we propose a framework for efficient anonymization of microdata that addresses these deficiencies. First, we focus on one-dimensional (i.e., single-attribute) quasi-identifiers, and study the properties of optimal solutions under the k-anonymity and l-diversity models for the privacy-constrained (i.e., direct) and the accuracy-constrained (i.e., dual) anonymization problems. Guided by these properties, we develop efficient heuristics to solve the one-dimensional problems in linear time. Finally, we generalize our solutions to multidimensional quasi-identifiers using space-mapping techniques. Extensive experimental evaluation shows that our techniques clearly outperform the existing approaches in terms of execution time and information loss.
引用
收藏
页数:47
相关论文
共 28 条
[1]  
AGGARWAL G, 2006, P 25 ACM SIGMOD SIGA, P153, DOI DOI 10.1145/1142351.1142374
[2]  
Aggarwal G., 2005, J. Priv. Technol, V2005112001, P400
[3]  
[Anonymous], 2006, P 32 INT C VER LARG
[4]  
[Anonymous], 2006, P INT C DAT ENG ICDE
[5]  
Bayardo RJ, 2005, PROC INT CONF DATA, P217
[6]  
Byun JW, 2007, LECT NOTES COMPUT SC, V4443, P188
[7]   The death of privacy? [J].
Froomkin, AM .
STANFORD LAW REVIEW, 2000, 52 (05) :1461-1543
[8]  
Ghinita G., 2007, P 33 INT C VER LARG, P758
[9]  
HARINARAYAN V, 1996, P 1996 ACM SIGMOD IN, P205
[10]  
Iwuchukwu T., 2007, VLDB 07, P746