Optimizing Privacy-Accuracy Tradeoff for Privacy Preserving Distance-Based Classification

Date

2012-04

Citation of Original Publication

Kim, Dongjin; Chen, Zhiyuan; Gangopadhyay, Aryya; Optimizing Privacy-Accuracy Tradeoff for Privacy Preserving Distance-Based Classification; International Journal of Information Security and Privacy (IJISP) 6(2), 16-33, April 2012; https://doi.org/10.4018/jisp.2012040102

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Privacy concerns often prevent organizations from sharing data for data mining purposes. There is a rich literature on privacy preserving data mining techniques that protect privacy while still allowing accurate mining. Many such techniques have parameters that must be set correctly to achieve the desired balance between privacy protection and the quality of mining results. However, there has been little research on how to tune these parameters effectively. This paper studies the problem of tuning the group size parameter for a popular privacy preserving distance-based mining technique: the condensation method. The contributions are: 1) a class-wise condensation method that selects an appropriate group size based on heuristics and avoids generating groups with mixed classes; and 2) a rule-based approach that uses binary search and several rules to further optimize the setting of the group size parameter. The experimental results demonstrate the effectiveness of the authors’ approach.
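The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the grouping heuristics nor the tuning rules, so everything below is an assumption made for illustration. The function names (`classwise_condense`, `largest_acceptable_k`), the nearest-neighbor grouping, and the per-dimension Gaussian sampling are hypothetical simplifications, and the binary search assumes that accuracy does not increase as the group size grows.

```python
import math
import random
from collections import defaultdict


def _sample_like(group, rng):
    """Draw len(group) synthetic records from per-dimension Gaussians
    fitted to the group (a simplification that ignores correlations)."""
    dims = len(group[0])
    n = len(group)
    means = [sum(r[d] for r in group) / n for d in range(dims)]
    stds = [math.sqrt(sum((r[d] - means[d]) ** 2 for r in group) / n)
            for d in range(dims)]
    return [tuple(rng.gauss(m, s) for m, s in zip(means, stds))
            for _ in range(n)]


def classwise_condense(records, labels, k, seed=0):
    """Return synthetic (record, label) pairs. Groups are formed inside
    one class at a time, so no group ever mixes classes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec, lab in zip(records, labels):
        by_class[lab].append(rec)

    synthetic = []
    for lab, pool in by_class.items():
        pool = list(pool)
        while pool:
            if len(pool) < 2 * k:
                # Too few records for two full groups: keep them together,
                # so a group is smaller than k only if the class itself is.
                group, pool = pool, []
            else:
                # Pick a random record and group it with its k-1 nearest
                # neighbors of the same class (assumed heuristic).
                anchor = pool.pop(rng.randrange(len(pool)))
                pool.sort(key=lambda r: math.dist(r, anchor))
                group, pool = [anchor] + pool[:k - 1], pool[k - 1:]
            synthetic.extend((rec, lab) for rec in _sample_like(group, rng))
    return synthetic


def largest_acceptable_k(evaluate, k_min, k_max, min_accuracy):
    """Binary search for the largest group size whose accuracy is still
    acceptable. `evaluate(k)` is a caller-supplied function; the search
    assumes its value does not increase as k grows."""
    best = None
    lo, hi = k_min, k_max
    while lo <= hi:
        mid = (lo + hi) // 2
        if evaluate(mid) >= min_accuracy:
            best = mid          # acceptable: try a larger, more private k
            lo = mid + 1
        else:
            hi = mid - 1        # accuracy too low: shrink the group size
    return best
```

In use, `evaluate` would typically condense the training data with group size `k`, train a distance-based classifier such as nearest neighbor on the synthetic records, and return its accuracy on held-out data. A larger group size generally means stronger privacy, which is why the search looks for the largest value that still meets the accuracy target.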