Data anonymization is currently the most widely used privacy protection technique: it protects data privacy while preserving as much data utility and computational efficiency as possible. However, existing data anonymization models all follow a binary classification scheme, and this either-or treatment is often too extreme, causing a large amount of unnecessary information loss. To address this problem, this paper proposes a novel classification anonymity model based on three-way decisions. First, building on the k-anonymity model, we introduce the concepts of anonymity upper and lower bounds and of fuzzy data. Second, we bring the idea of three-way decisions into the data anonymization process and use delayed decisions to account for the marginal, fuzzy data that may arise in practical decision making, which yields a novel three-way classification anonymity model, the (Uk,Lk)-classification anonymity model. Then, to verify the usability of the proposed model, we draw on the idea of differential privacy and reprocess the fuzzy data by adding noise during the delayed decision. Experimental results show that the proposed model effectively improves data utility and is better suited to practical application scenarios.
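The three-way split described above can be illustrated with a minimal sketch. The abstract does not give the exact decision rule, so the following is an assumption rather than the paper's algorithm: records are grouped into equivalence classes by their quasi-identifier values, a class with at least Uk records is accepted as already anonymous, a class with fewer than Lk records is rejected, and a class in between is treated as fuzzy data whose decision is deferred. The function names, the `upper_k` and `lower_k` parameters (standing in for Uk and Lk), and the choice of perturbing one numeric attribute with Laplace noise are all hypothetical, and no formal differential privacy guarantee is claimed.

```python
import random
from collections import defaultdict


def three_way_partition(records, quasi_ids, upper_k, lower_k):
    """Split equivalence classes into accept / defer / reject regions.

    Assumed rule (not taken from the paper):
      size >= upper_k            -> accept (treated as anonymous)
      lower_k <= size < upper_k  -> defer  (fuzzy data, decided later)
      size < lower_k             -> reject (needs further anonymization)
    """
    if lower_k > upper_k:
        raise ValueError("lower_k must not exceed upper_k")

    # Group records that share the same quasi-identifier values.
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[q] for q in quasi_ids)].append(rec)

    regions = {"accept": [], "defer": [], "reject": []}
    for group in groups.values():
        size = len(group)
        if size >= upper_k:
            regions["accept"].append(group)
        elif size < lower_k:
            regions["reject"].append(group)
        else:
            regions["defer"].append(group)
    return regions


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_deferred(group, numeric_attr, epsilon, sensitivity, rng):
    """Return copies of the deferred records with a noisy numeric attribute.

    Hypothetical reprocessing step: the noise scale sensitivity / epsilon
    mirrors the Laplace mechanism, applied here per record for illustration.
    """
    scale = sensitivity / epsilon
    return [
        {**rec, numeric_attr: rec[numeric_attr] + laplace_noise(scale, rng)}
        for rec in group
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    data = (
        [{"age": "30-39", "zip": "100**", "income": 50.0}] * 3
        + [{"age": "40-49", "zip": "200**", "income": 70.0}] * 2
        + [{"age": "50-59", "zip": "300**", "income": 90.0}]
    )
    regions = three_way_partition(data, ["age", "zip"], upper_k=3, lower_k=2)
    for group in regions["defer"]:
        print(perturb_deferred(group, "income", epsilon=1.0, sensitivity=1.0, rng=rng))
```

Setting `upper_k` equal to `lower_k` empties the deferred region and reduces the sketch to the binary accept-or-reject behaviour that the abstract criticizes, which is one way to compare the two schemes on the same data.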