Privacy-Preserving Data Sharing in Cloud Computing

Springer Science and Business Media LLC - Volume 25, Issue 3 - Pages 401-414 - 2010
Wang, Hui
Department of Computer Science, Stevens Institute of Technology, Castle Point on Hudson, Hoboken, U.S.A.

Abstract

Storing and sharing databases in the cloud raises serious concerns about individual privacy. We consider two kinds of privacy risk: presence leakage, by which attackers can explicitly identify individuals as being in (or not in) the database, and association leakage, by which attackers can unambiguously associate individuals with sensitive information. Existing privacy-preserving data sharing techniques, however, either fail to protect presence privacy or incur considerable information loss. In this paper, we propose a novel technique, Ambiguity, that protects both presence privacy and association privacy with low information loss. We formally define the privacy model and quantify the privacy guarantee of Ambiguity against both presence leakage and association leakage. We show, both theoretically and empirically, that the information loss of Ambiguity is always less than that of the classic generalization-based anonymization technique. We further propose an improved scheme, PriView, that achieves lower information loss than Ambiguity. We propose efficient algorithms to construct both the Ambiguity and PriView schemes. Extensive experiments demonstrate the effectiveness and efficiency of both schemes.
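The two risks named in the abstract can be made concrete with a toy example. The Python sketch below is an illustration only and is not the paper's Ambiguity or PriView scheme: the `Record` fields, the sample table, and the attacker model (an attacker who knows a target's exact age and ZIP code and sees an unprotected table) are all assumptions introduced here.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One row of a hypothetical shared table (field names are assumptions)."""
    age: int          # quasi-identifier
    zipcode: str      # quasi-identifier
    disease: str      # sensitive attribute


# Hypothetical unprotected table, invented for illustration.
TABLE = [
    Record(34, "07030", "flu"),
    Record(34, "07030", "flu"),
    Record(51, "07030", "asthma"),
    Record(51, "07030", "diabetes"),
]


def attacker_view(table, age, zipcode):
    """Return what an attacker who knows (age, zipcode) learns from the table.

    The first value says whether any row matches the target (presence);
    the second is the set of sensitive values the target could have.
    """
    matches = [r for r in table if (r.age, r.zipcode) == (age, zipcode)]
    present = len(matches) > 0
    candidates = {r.disease for r in matches}
    return present, candidates


def association_is_unambiguous(candidates):
    """True when exactly one sensitive value remains for the target."""
    return len(candidates) == 1


if __name__ == "__main__":
    for age in (34, 51, 40):
        present, candidates = attacker_view(TABLE, age, "07030")
        print(age, present, sorted(candidates),
              association_is_unambiguous(candidates))
```

In this toy table, a 34-year-old target is revealed as present and linked to a single disease (both kinds of leakage), a 51-year-old target is revealed as present but has two candidate diseases (presence leakage only), and a 40-year-old target is revealed as absent, which the abstract also counts as presence leakage.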
