Protecting respondents' identities in microdata release

Cited by: 1211
Authors
Samarati, P [1]
Affiliations
[1] Univ Milan, Dipartimento Tecnol Informaz, I-26013 Crema, Italy
Funding
National Science Foundation (USA)
Keywords
privacy; data anonymity; disclosure control; microdata release; inference; record linkage; security; information protection;
DOI
10.1109/69.971193
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Today's globally networked society places great demand on the dissemination and sharing of information. While in the past released information was mostly in tabular and statistical form, many situations call today for the release of specific data (microdata). In order to protect the anonymity of the entities (called respondents) to which information refers, data holders often remove or encrypt explicit identifiers such as names, addresses, and phone numbers. Deidentifying data, however, provides no guarantee of anonymity. Released information often contains other data, such as race, birth date, sex, and ZIP code, that can be linked to publicly available information to reidentify respondents and infer information that was not intended for disclosure. In this paper, we address the problem of releasing microdata while safeguarding the anonymity of the respondents to which the data refer. The approach is based on the definition of k-anonymity. A table provides k-anonymity if attempts to link explicitly identifying information to its content map the information to at least k entities. We illustrate how k-anonymity can be provided without compromising the integrity (or truthfulness) of the information released by using generalization and suppression techniques. We introduce the concept of minimal generalization, which captures the property of the release process not to distort the data more than needed to achieve k-anonymity, and present an algorithm for the computation of such a generalization. We also discuss possible preference policies to choose among different minimal generalizations.
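The k-anonymity condition described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm and does not search for a minimal generalization. It assumes the common operational reading of k-anonymity (every combination of quasi-identifier values that appears in the table appears at least k times), and the table, the attribute names `sex`, `zip`, and `disease`, and the helper functions are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def qi_key(row, quasi_identifiers):
    """Project a row onto its quasi-identifier attributes."""
    return tuple(row[attr] for attr in quasi_identifiers)


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination present occurs at least k times."""
    counts = Counter(qi_key(row, quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())


def generalize_zip(rows, digits_kept):
    """Generalization: mask trailing ZIP digits with '*' (values stay truthful, only coarser)."""
    generalized = []
    for row in rows:
        zip_code = row["zip"]
        masked = zip_code[:digits_kept] + "*" * (len(zip_code) - digits_kept)
        generalized.append({**row, "zip": masked})
    return generalized


def suppress_rare(rows, quasi_identifiers, k):
    """Suppression: drop rows whose quasi-identifier combination occurs fewer than k times."""
    counts = Counter(qi_key(row, quasi_identifiers) for row in rows)
    return [row for row in rows if counts[qi_key(row, quasi_identifiers)] >= k]


# Hypothetical de-identified microdata: names removed, quasi-identifiers kept.
TABLE = [
    {"sex": "F", "zip": "94138", "disease": "flu"},
    {"sex": "F", "zip": "94139", "disease": "asthma"},
    {"sex": "M", "zip": "94141", "disease": "flu"},
    {"sex": "M", "zip": "94142", "disease": "cold"},
    {"sex": "M", "zip": "95000", "disease": "flu"},
]
QI = ("sex", "zip")

if __name__ == "__main__":
    print(is_k_anonymous(TABLE, QI, 2))         # raw table
    coarse = generalize_zip(TABLE, 4)
    print(is_k_anonymous(coarse, QI, 2))        # after generalization only
    released = suppress_rare(coarse, QI, 2)
    print(is_k_anonymous(released, QI, 2), len(released))
```

In this toy table, masking one ZIP digit still leaves one row with a unique quasi-identifier combination, so that row is suppressed before release. Generalizing further (for example, keeping fewer ZIP digits) would trade suppression for coarser data, which is the kind of choice the preference policies mentioned in the abstract are meant to settle.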
Pages: 1010-1027
Number of pages: 18