Differential Privacy High-dimensional Data Publishing Method Based on Bayesian Network

Title: Differential Privacy High-dimensional Data Publishing Method Based on Bayesian Network
Publication Type: Conference Paper
Year of Publication: 2022
Authors: Lu, Xiaotian; Piao, Chunhui; Han, Jianghe
Conference Name: 2022 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI)
Date Published: July
Keywords: Bayes methods, Bayesian Network, composability, Correlation, Differential privacy, elastic privacy budget allocation, high-dimensional data publishing, Human Behavior, maximum information coefficient, privacy, pubcrawl, Publishing, resilience, Resiliency, Scalability, Support vector machines, Training data
Abstract: Ensuring high data availability while preserving privacy is a research hotspot in privacy-preserving data publishing. To address the unstable data availability of existing differential privacy high-dimensional data publishing methods based on Bayesian networks, this paper proposes MEPrivBayes, an improved privacy-preserving data publishing method with two main improvements. First, because random selection of the first node of the Bayesian network makes the network structure unstable, the paper proposes a method for first node selection and Bayesian network construction based on the Maximum Information Coefficient matrix. Second, the paper proposes an elastic privacy budget allocation algorithm: differential privacy budget coefficients are preset for all branch nodes and all leaf nodes of the Bayesian network, the influence of each branch node on its child nodes and the average correlation between each leaf node and all other nodes are calculated, and a privacy budget allocation strategy is derived from these values. To evaluate the method, an SVM multi-classifier is trained on the privacy-preserving data, and its prediction accuracy is measured with the original data set as input. The experimental results show that MEPrivBayes achieves higher data availability than the classical PrivBayes method. In particular, when the privacy budget is small (that is, the noise is large), the availability of data published by MEPrivBayes decreases less.
DOI: 10.1109/ICCEAI55464.2022.00132
Citation Key: lu_differential_2022
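
Note on the abstract: the remark that a small privacy budget means large noise, and the idea of dividing one total budget unevenly across network nodes, can be illustrated with a minimal Python sketch. This is not the paper's algorithm. The proportional split, the node names, the weights, the example counts, and the sensitivity of 1 are all assumptions made for illustration; the abstract does not give the actual MEPrivBayes formulas. The only standard facts relied on are that Laplace noise has scale sensitivity / epsilon and that, under sequential composition, per-node budgets sum to the total budget.

```python
import random


def allocate_budget(total_epsilon, weights):
    """Split total_epsilon across nodes in proportion to positive weights.

    Assumption: a plain proportional split stands in for the paper's
    elastic allocation rule, which the abstract does not spell out.
    Under sequential composition the per-node budgets sum to total_epsilon.
    """
    if total_epsilon <= 0:
        raise ValueError("total_epsilon must be positive")
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("every node needs a positive weight")
    total_weight = sum(weights.values())
    return {node: total_epsilon * w / total_weight for node, w in weights.items()}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_counts(counts, epsilon, rng, sensitivity=1.0):
    """Add Laplace noise with scale sensitivity / epsilon to each count.

    Assumption: sensitivity defaults to 1 (one record changes one count by 1).
    """
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical node weights, e.g. a preset coefficient times a correlation score.
    weights = {"age": 0.5, "education": 0.3, "income": 0.2}
    budgets = allocate_budget(1.0, weights)
    for node, eps in budgets.items():
        # Nodes with a smaller budget receive noisier counts on average.
        print(node, round(eps, 3), perturb_counts([120, 45, 35], eps, rng))
```

In this sketch a node with a larger weight receives a larger share of epsilon and therefore less noise, which is the general effect an uneven budget allocation aims for.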