Data perturbation and feature selection in preserving privacy
- Resource Type
- Conference
- Authors
- Jahan, Thanveer; Narsimha, G.; Rao, C. V. Guru
- Source
- 2012 Ninth International Conference on Wireless and Optical Communications Networks (WOCN), pp. 1-6, Sep. 2012
- Subject
- Communication, Networking and Broadcast Technologies
Computing and Processing
Components, Circuits, Devices and Systems
Signal Processing and Analysis
Perturbation
Feature selection
SVD
SSVD
SVM
ID3
C4.5
- Language
- ISSN
- 1811-3923
2151-7703
2151-7681
Privacy preservation plays a vital role in designing security-related data mining applications, and protecting sensitive information in data mining has become an important issue. Data distortion, also called data perturbation, is a critical component widely used to protect sensitive data. Many approaches try to preserve privacy by adding noise or by applying matrix decomposition methods. In this paper we propose data distortion methods based on singular value decomposition (SVD) and sparsified singular value decomposition (SSVD), combined with feature selection to reduce the feature space. Various privacy metrics have been proposed to measure the difference between the original dataset and the distorted dataset, as well as the degree of privacy protection. Our experiments use a real-world dataset. The results show that sparsified singular value decomposition combined with feature selection is a feasible solution that could better preserve privacy. Extracting accurate information from datasets allows data mining algorithms to support reasonable decisions. The mining utility of the perturbed data is tested with well-known classifiers such as SVM, ID3, and C4.5.
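
As a rough illustration of the two distortion methods named in the abstract, the following Python sketch uses NumPy to build a rank-k SVD approximation of a data matrix and a sparsified variant. The abstract does not give the paper's exact procedure, so several things here are assumptions: the function names (`svd_distort`, `ssvd_distort`, `value_difference`), the rank `k`, the threshold `eps`, the synthetic data, and the common formulation of SSVD in which small-magnitude entries of the truncated singular vector matrices are set to zero. The relative Frobenius-norm difference is shown as one plausible distortion measure, not necessarily one of the metrics used in the paper. Feature selection is not shown.

```python
import numpy as np

def svd_distort(A, k):
    """Rank-k SVD approximation of A, used as the distorted dataset."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def ssvd_distort(A, k, eps):
    """Sparsified SVD (assumed formulation): truncate to rank k, then
    zero out entries of U_k and V_k^T whose magnitude is below eps."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk = U[:, :k].copy()
    Vtk = Vt[:k, :].copy()
    Uk[np.abs(Uk) < eps] = 0.0
    Vtk[np.abs(Vtk) < eps] = 0.0
    return Uk @ np.diag(s[:k]) @ Vtk

def value_difference(A, A_bar):
    """Relative Frobenius-norm difference between original and distorted data
    (an illustrative distortion measure, chosen for this sketch)."""
    return np.linalg.norm(A - A_bar, "fro") / np.linalg.norm(A, "fro")

# Synthetic stand-in for a real dataset: 50 records, 8 numeric attributes.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 8))

A_svd = svd_distort(A, k=4)
A_ssvd = ssvd_distort(A, k=4, eps=0.05)

print("SVD  distortion:", value_difference(A, A_svd))
print("SSVD distortion:", value_difference(A, A_ssvd))
```

Because the truncated SVD is the best rank-k approximation in the Frobenius norm, the sparsified version can only distort the data as much or more under this measure. In a setup like the one the abstract describes, a classifier would then be trained on the distorted matrix to check how much mining utility is retained.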