Applying Differential Privacy to Search Queries in a Policy Based Interactive Framework

Date

2009-11-06

Citation of Original Publication

Palanivel Andiappan Kodeswaran and Evelyne Viegas, Applying Differential Privacy to Search Queries in a Policy Based Interactive Framework, ACM International Workshop on Privacy and Anonymity for Very Large Datasets, Pages 25-32, 2009, DOI: 10.1145/1651449.1651455

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Web search logs are of growing importance to researchers because they help in understanding search behavior and search engine performance. However, search logs typically contain sensitive information about users, so considerable caution must be exercised before releasing them to the research community. Current approaches to releasing search logs focus on either protecting the privacy of users or enhancing the utility of the data to researchers. In this work, we address the privacy-utility tradeoff by providing safe access to search logs instead of releasing them. We propose a policy-based, safe interactive framework built on semantic policies and differential privacy that gives researchers access to search logs while maintaining the privacy of the users. Semantic policies are used to infer the higher-level information that can be mined from a dataset, based on the fields a researcher accesses. The accessed fields are then used to build research profile(s) that guide the amount of privacy enforced through differential privacy. We show the additional utility that can be obtained in our framework through two demonstrative experiments that involve access to user-level information. Our results indicate that valid research can be conducted in our framework without forgoing the privacy of individuals.
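
As a rough illustration of how a research profile might guide the amount of privacy enforced, the following minimal sketch answers a count query with Laplace noise, a standard mechanism for differential privacy. It is not the framework described in the paper: the profile names, the epsilon values, the `noisy_count` function, and the assumption that each user changes the count by at most 1 (sensitivity 1) are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical research profiles mapped to a privacy parameter epsilon.
# Names and values are illustrative assumptions, not taken from the paper.
# A smaller epsilon means more noise and therefore stronger privacy.
PROFILE_EPSILON = {
    "aggregate_only": 1.0,  # assumed profile: only aggregate fields accessed
    "user_level": 0.1,      # assumed profile: user-level fields accessed
}


def noisy_count(true_count, profile, rng, sensitivity=1.0):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon.

    Assumes a counting query in which adding or removing one user changes
    the result by at most `sensitivity`.
    """
    epsilon = PROFILE_EPSILON[profile]
    scale = sensitivity / epsilon
    # The difference of two independent exponential samples with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    for profile in PROFILE_EPSILON:
        print(profile, noisy_count(100, profile, rng))
```

In this sketch, a profile that touches user-level fields receives a smaller epsilon, so its answers are noisier than those returned to a profile that only reads aggregate fields.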