Biblio

List
Filter

Found 105 results

Filters: Keyword is privacy protection [Clear All Filters]

2023-06-30

Shi, Er-Mei, Liu, Jia-Xi, Ji, Yuan-Ming, Chang, Liang. 2022. DP-BEGAN: A Generative Model of Differential Privacy Algorithm. 2022 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI). :168–172.

In recent years, differential privacy has gradually become a standard definition in the field of data privacy protection. Differential privacy does not need to make assumptions about the prior knowledge of privacy adversaries, so it has a more stringent effect than existing privacy protection models and definitions. This good feature has been used by researchers to solve the in-depth learning problem restricted by the problem of privacy and security, making an important breakthrough, and promoting its further large-scale application. Combining differential privacy with BEGAN, we propose the DP-BEGAN framework. The differential privacy is realized by adding carefully designed noise to the gradient of Gan model training, so as to ensure that Gan can generate unlimited synthetic data that conforms to the statistical characteristics of source data and does not disclose privacy. At the same time, it is compared with the existing methods on public datasets. The results show that under a certain privacy budget, this method can generate higher quality privacy protection data more efficiently, which can be used in a variety of data analysis tasks. The privacy loss is independent of the amount of synthetic data, so it can be applied to large datasets.

2023-05-12

Yu, Juan. 2022. Research on Location Information and Privacy Protection Based on Big Data. 2022 International Conference on Industrial IoT, Big Data and Supply Chain (IIoTBDSC). :226–229.

In the context of big data era, in order to prevent malicious access and information leakage during data services, researchers put forward a location big data encryption method based on privacy protection in practical exploration. According to the problems arising from the development of information network in recent years, users often encounter the situation of randomly obtaining location information in the network environment, which not only threatens their privacy security, but also affects the effective transmission of information. Therefore, this study proposed the privacy protection as the core position of big data encryption method, must first clear position with large data representation and positioning information, distinguish between processing position information and the unknown information, the fuzzy encryption theory, dynamic location data regrouping, eventually build privacy protection as the core of the encryption algorithm. The empirical results show that this method can not only effectively block the intrusion of attack data, but also effectively control the error of position data encryption.

2023-04-14

Li, Xiling, Ma, Zhaofeng, Luo, Shoushan. 2022. Blockchain-Oriented Privacy Protection with Online and Offline Verification in Cross-Chain System. 2022 International Conference on Blockchain Technology and Information Security (ICBCTIS). :177–181.

User privacy is an attractive and valuable task to the success of blockchain systems. However, user privacy protection's performance and data capacity have not been well studied in existing access control models of blockchain systems because of traceability and openness of the P2P network. This paper focuses on investigating performance and data capacity from a blockchain infrastructure perspective, which adds secondary encryption to shield confidential information in a non-invasive way. First, we propose an efficient asymmetric encryption scheme by combining homomorphic encryption and state-of-the-art multi-signature key aggregation to preserve privacy. Second, we use smart contracts and CA infrastructure to achieve attribute-based access control. Then, we use the non-interactive zero-knowledge proof scheme to achieve secondary confidentiality explicitly. Finally, experiments show our scheme succeeds better performance in data capacity and system than other schemes. This scheme improves availability and robust scalability, solves the problem of multi-signature key distribution and the unlinkability of transactions. Our scheme has established a sound security cross-chain system and privacy confidentiality mechanism and that has more excellent performance and higher system computing ability than other schemes.

2023-03-31

Du, Juan. 2021. Research on Enterprise Information Security and Privacy Protection in Big Data Environment. 2021 3rd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI). :324–327.

With the development of information technology, extracting important data that people need from the vast information has become the key to a successful era. Therefore, big data technology is increasingly recognized by the public. While creating a lot of commercial value for enterprises, it also brings huge challenges to information security and privacy. In the big data environment, data has become an important medium for corporate decision-making, and information security and privacy protection have become the “army battleground” in corporate competition. Therefore, information security and privacy protection are getting more and more attention from enterprises, which also determines whether enterprises can occupy a place in the fiercely competitive market. This article analyzes the information security and privacy protection issues of enterprises in the big data environment from three aspects. Starting from the importance and significance of big data protection, it analyzes the security and privacy issues of big data in enterprise applications, and finally conducts information security and privacy protection for enterprises. Privacy protection puts forward relevant suggestions.

2023-02-17

Lu, Shaofeng, Lv, Chengzhe, Wang, Wei, Xu, Changqing, Fan, Huadan, Lu, Yuefeng, Hu, Yulong, Li, Wenxi. 2022. Secret Numerical Interval Decision Protocol for Protecting Private Information and Its Application. 2022 Asia Conference on Algorithms, Computing and Machine Learning (CACML). :726–731.

Cooperative secure computing based on the relationship between numerical value and numerical interval is not only the basic problems of secure multiparty computing but also the core problems of cooperative secure computing. It is of substantial theoretical and practical significance for information security in relation to scientific computing to continuously investigate and construct solutions to such problems. Based on the Goldwasser-Micali homomorphic encryption scheme, this paper propose the Morton rule, according to the characteristics of the interval, a double-length vector is constructed to participate in the exclusive-or operation, and an efficient cooperative decision-making solution for integer and integer interval security is designed. This solution can solve more basic problems in cooperative security computation after suitable transformations. A theoretical analysis shows that this solution is safe and efficient. Finally, applications that are based on these protocols are presented.

2023-01-20

Wang, Wei, Yao, Jiming, Shao, Weiping, Xu, Yangzhou, Peng, Shaowu. 2022. Efficient 5G Network Slicing Selection with Privacy in Smart Grid. 2022 IEEE 10th Joint International Information Technology and Artificial Intelligence Conference (ITAIC). 10:916—922.

To fulfill different requirements from various services, the smart grid typically uses 5G network slicing technique for splitting the physical network into multiple virtual logical networks. By doing so, end users in smart grid can select appropriate slice that is suitable for their services. Privacy has vital significance in network slicing selection, since both the end user and the network entities are afraid that their sensitive slicing features are leaked to an adversary. At the same time, in the smart grid, there are many low-power users who are not suitable for complex security schemes. Therefore, both security and efficiency are basic requirements for 5G slicing selection schemes. Considering both security and efficiency, we propose a 5G slicing selection security scheme based on matching degree estimation, called SS-MDE. In SS-MDE, a set of random numbers is used to hide the feature information of the end user and the AMF which can provide privacy protection for exchanged slicing features. Moreover, the best matching slice is selected by calculating the Euclid distance between two slices. Since the algorithms used in SS-MDE include only several simple mathematical operations, which are quite lightweight, SS-MDE can achieve high efficiency. At the same time, since third-party attackers cannot extract the slicing information, SS-MDE can fulfill security requirements. Experimental results show that the proposed scheme is feasible in real world applications.

Wang, Mei. 2022. Big Data Analysis and Mining Technology of Smart Grid Based on Privacy Protection. 2022 6th International Conference on Computing Methodologies and Communication (ICCMC). :868—871.

Aiming at the big data security and privacy protection issues in the smart grid, the current key technologies for big data security and privacy protection in smart grids are sorted out, and a privacy-protecting smart grid association rule is proposed according to the privacy-protecting smart grid big data analysis and mining technology route The mining plan specifically analyzes the risk factors in the operation of the new power grid, and discusses the information security of power grid users from the perspective of the user, focusing on the protection of privacy and security, using safe multi-party calculation of the support and confidence of the association rules. Privacy-protecting smart grid big data mining enables power companies to improve service quality to 7.5% without divulging customer private information.

2023-01-05

Zhao, Jing, Wang, Ruwu. 2022. FedMix: A Sybil Attack Detection System Considering Cross-layer Information Fusion and Privacy Protection. 2022 19th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON). :199–207.

Sybil attack is one of the most dangerous internal attacks in Vehicular Ad Hoc Network (VANET). It affects the function of the VANET network by maliciously claiming or stealing multiple identity propagation error messages. In order to prevent VANET from Sybil attacks, many solutions have been proposed. However, the existing solutions are specific to the physical or application layer's single-level data and lack research on cross-layer information fusion detection. Moreover, these schemes involve a large number of sensitive data access and transmission, do not consider users' privacy, and can also bring a severe communication burden, which will make these schemes unable to be actually implemented. In this context, this paper introduces FedMix, the first federated Sybil attack detection system that considers cross-layer information fusion and provides privacy protection. The system can integrate VANET physical layer data and application layer data for joint analyses simultaneously. The data resides locally in the vehicle for local training. Then, the central agency only aggregates the generated model and finally distributes it to the vehicles for attack detection. This process does not involve transmitting and accessing any vehicle's original data. Meanwhile, we also designed a new model aggregation algorithm called SFedAvg to solve the problems of unbalanced vehicle data quality and low aggregation efficiency. Experiments show that FedMix can provide an intelligent model with equivalent performance under the premise of privacy protection and significantly reduce communication overhead, compared with the traditional centralized training attack detection model. In addition, the SFedAvg algorithm and cross-layer information fusion bring better aggregation efficiency and detection performance, respectively.

2022-10-03

Xu, Ruikun. 2021. Location Based Privacy Protection Data Interference Method. 2021 International Conference on Electronic Information Technology and Smart Agriculture (ICEITSA). :89–93.

In recent years, with the rise of the Internet of things industry, a variety of user location-based applications came into being. While users enjoy these convenient services, their location information privacy is also facing a great threat. Therefore, the research on location privacy protection in the Internet of things has become a hot spot for scholars. Privacy protection microdata publishing is a hot spot in data privacy protection research. Data interference is an effective solution for privacy protection microdata publishing. Aiming at privacy protection clustering problem, a privacy protection data interference method is proposed. In this paper, the location privacy protection algorithm is studied, with the purpose of providing location services and protecting the data interference of users' location privacy. In this paper, the source location privacy protection protocol (PR \_ CECRP) algorithm with controllable energy consumption is proposed to control the energy consumption of phantom routing strategy. In the routing process from the source node to the phantom node, the source data packet forwarding mechanism based on sector area division is adopted, so that the random routing path is generated and the routing energy consumption and transmission delay are effectively controlled.

Wang, Yang. 2021. TSITE IP: A Case Study of Intellectual Property Distributed Platform based on Cloud Services. 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys). :1876–1880.

In recent years, the “whole chain” development level of China's intellectual property creation, protection and application has been greatly improved. At the same time, cloud computing technology is booming, and intellectual property data distributed platforms based on cloud storage are emerging one after another. Firstly, this paper introduces the domestic intellectual property cloud platform services from the perspectives of government, state-owned enterprises and private enterprises; Secondly, four typical distributed platforms provided by commercial resources are selected to summarize the problems faced by the operation mode of domestic intellectual property services; Then, it compares and discusses the functions and service modes of domestic intellectual property distributed platform, and takes TSITE IP as an example, puts forward the design and construction strategies of intellectual property protection, intellectual property operation service distributed platform and operation service mode under the background of information age. Finally, according to the development of contemporary information technology, this paper puts forward challenges and development direction for the future development of intellectual property platform.

2022-09-20

Wang, Zisen, Liang, Ying, Xie, Xiaojie, Liu, Zhengjun. 2021. Privacy Protection Method for Experts' Evaluation Ability Calculation of Peer Review. 2021 International Conference on Communications, Information System and Computer Engineering (CISCE). :611—615.

Most of the existing calculation method of expert evaluation ability directly call data onto calculation, which leads to the risk of privacy leakage of expert review information and affects the peer review environment. With regard to this problem, a privacy protection method of experts' evaluation ability calculation of peer review is proposed. Privacy protection and data usability are adjusted according to privacy preferences. Using Gauss distribution and combining with the distributive law of real evaluation data, the virtual projects are generated, and the project data are anonymized according to the virtual projects. Laplace distribution is used to add noise to the evaluation sub score for perturbation, and the evaluation data are obfuscation according to the perturbation sub score. Based on the protected project data and evaluation data, the expert evaluation ability is calculated, and the review privacy is protected. The experimental results show that the proposed method can effectively balance the privacy protection and the accuracy of the calculation results.

2022-08-26

Sun, Zice, Wang, Yingjie, Tong, Xiangrong, Pan, Qingxian, Liu, Wenyi, Zhang, Jiqiu. 2021. Service Quality Loss-aware Privacy Protection Mechanism in Edge-Cloud IoTs. 2021 13th International Conference on Advanced Computational Intelligence (ICACI). :207—214.

With the continuous development of edge computing, the application scope of mobile crowdsourcing (MCS) is constantly increasing. The distributed nature of edge computing can transmit data at the edge of processing to meet the needs of low latency. The trustworthiness of the third-party platform will affect the level of privacy protection, because managers of the platform may disclose the information of workers. Anonymous servers also belong to third-party platforms. For unreal third-party platforms, this paper recommends that workers first use the localized differential privacy mechanism to interfere with the real location information, and then upload it to an anonymous server to request services, called the localized differential anonymous privacy protection mechanism (LDNP). The two privacy protection mechanisms further enhance privacy protection, but exacerbate the loss of service quality. Therefore, this paper proposes to give corresponding compensation based on the authenticity of the location information uploaded by workers, so as to encourage more workers to upload real location information. Through comparative experiments on real data, the LDNP algorithm not only protects the location privacy of workers, but also maintains the availability of data. The simulation experiment verifies the effectiveness of the incentive mechanism.

2022-07-14

Gong, Changqing, Dong, Zhaoyang, Gani, Abdullah, Qi, Han. 2021. Quantum Ciphertext Dimension Reduction Scheme for Homomorphic Encrypted Data. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :903—910.

At present, in the face of the huge and complex data in cloud computing, the parallel computing ability of quantum computing is particularly important. Quantum principal component analysis algorithm is used as a method of quantum state tomography. We perform feature extraction on the eigenvalue matrix of the density matrix after feature decomposition to achieve dimensionality reduction, proposed quantum principal component extraction algorithm (QPCE). Compared with the classic algorithm, this algorithm achieves an exponential speedup under certain conditions. The specific realization of the quantum circuit is given. And considering the limited computing power of the client, we propose a quantum homomorphic ciphertext dimension reduction scheme (QHEDR), the client can encrypt the quantum data and upload it to the cloud for computing. And through the quantum homomorphic encryption scheme to ensure security. After the calculation is completed, the client updates the key locally and decrypts the ciphertext result. We have implemented a quantum ciphertext dimensionality reduction scheme implemented in the quantum cloud, which does not require interaction and ensures safety. In addition, we have carried out experimental verification on the QPCE algorithm on IBM's real computing platform. Experimental results show that the algorithm can perform ciphertext dimension reduction safely and effectively.

2022-05-09

Huang, Liangqun, Xu, Lei, Zhu, Liehuang, Gai, Keke. 2021. A Blockchain-Assisted Privacy-Preserving Cloud Computing Method with Multiple Keys. 2021 IEEE 6th International Conference on Smart Cloud (SmartCloud). :19–25.

How to analyze users' data without compromising individual privacy is an important issue in cloud computing. In order to protect privacy and enable the cloud to perform computing, users can apply homomorphic encryption schemes to their data. Most of existing homomorphic encryption-based cloud computing methods require that users' data are encrypted with the same key. While in practice, different users may prefer to use different keys. In this paper, we propose a privacy-preserving cloud computing method which adopts a double-trapdoor homomorphic encryption scheme to deal with the multi-key issue. The proposed method uses two cloud servers to analyze users' encrypted data. And we propose to use blockchain to monitor the information exchanged between the servers. Security analysis shows that the introduction of blockchain can help to prevent the two servers from colluding with each other, hence data privacy is further enhanced. And we conduct simulations to demonstrate the feasibility of the propose method.

2022-05-06

Lei, Qinyi, Sun, Qi, Zhao, Linyan, Hong, Dehua, Hu, Cailiang. 2021. Power Grid Data Confirmation Model Based on Behavior Characteristics. 2021 IEEE 5th Information Technology,Networking,Electronic and Automation Control Conference (ITNEC). 5:1252–1256.

The power grid has high requirements for data security, and data security audit technology is facing challenges. Because the server in the power grid operating environment is considered untrustworthy and does not have the authority to obtain the secret key, the encrypted data cannot be parsed and the data processing ability of the data center is restricted. In response to the above problems, the power grid database encryption system was designed, and the access control module and the encryption module that should be written based on SQL statements were explained. The database encryption system was developed using the Java language and deployed in the cloud environment. Finally, the method was proved by experiments. feasibility.

2022-02-09

Deng, Han, Wang, Zhechon, Zhang, Yazhen. 2021. Overview of Privacy Protection Data Release Anonymity Technology. 2021 7th IEEE Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :151–156.

The collection of digital information by governments, companies and individuals creates tremendous opportunities for knowledge and information-based decision-making. Driven by mutual benefit and laws and regulations, there is a need for data exchange and publication between all parties. However, data in its original form usually contains sensitive information about individuals and publishing such data would violate personal privacy. Privacy Protection Data Distribution (PPDP) provides methods and tools to release useful information while protecting data privacy. In recent years, PPDP has received extensive attention from the research community, and many solutions have been proposed for different data release scenarios. How to ensure the availability of data under the premise of protecting user privacy is the core problem to be solved in this field. This paper studies the existing achievements of privacy protection data release anonymity technology, focusing on the existing anonymity technology in three aspects of high-dimensional, high-deficiency, and complex relational data, and analyzes and summarizes them.

2021-06-30

Yan, Chenyang, Zhang, Yulei, Wang, Hongshuo, Yu, Shaoyang. 2020. A Safe and Efficient Message Authentication Scheme In The Internet Of Vehicles. 2020 International Conference on Information Science, Parallel and Distributed Systems (ISPDS). :10—13.

In order to realize the security authentication of information transmission between vehicle nodes in the vehicular ad hoc network, based on the certificateless public key cryptosystem and aggregate signature, a privacy-protected certificateless aggregate signature scheme is proposed, which eliminates the complicated certificate maintenance cost. This solution also solves the key escrow problem. By Communicating with surrounding nodes through the pseudonym of the vehicle, the privacy protection of vehicle users is realized. The signature scheme satisfies the unforgeability of an adaptive selective message attack under a random prophetic machine. The scheme meets message authentication, identity privacy protection, resistance to reply attacks.

2021-06-02

Xu, Yizheng. 2020. Application Research Based on Machine Learning in Network Privacy Security. 2020 International Conference on Computer Information and Big Data Applications (CIBDA). :237—240.

As the hottest frontier technology in the field of artificial intelligence, machine learning is subverting various industries step by step. In the future, it will penetrate all aspects of our lives and become an indispensable technology around us. Among them, network security is an area where machine learning can show off its strengths. Among many network security problems, privacy protection is a more difficult problem, so it needs more introduction of new technologies, new methods and new ideas such as machine learning to help solve some problems. The research contents for this include four parts: an overview of machine learning, the significance of machine learning in network security, the application process of machine learning in network security research, and the application of machine learning in privacy protection. It focuses on the issues related to privacy protection and proposes to combine the most advanced matching algorithm in deep learning methods with information theory data protection technology, so as to introduce it into biometric authentication. While ensuring that the loss of matching accuracy is minimal, a high-standard privacy protection algorithm is concluded, which enables businesses, government entities, and end users to more widely accept privacy protection technology.

2021-04-27

Khokhlov, I., Reznik, L.. 2020. What is the Value of Data Value in Practical Security Applications. 2020 IEEE Systems Security Symposium (SSS). :1—8.

Data value (DV) is a novel concept that is introduced as one of the Big Data phenomenon features. While continuing an investigation of the DV ontology and its relationship with the data quality (DQ) on the conceptual level, this paper researches possible applications and use of the DV in the practical design of security and privacy protection systems and tools. We present a novel approach to DV evaluation that maps DQ metrics into DV value. Developed methods allow DV and DQ use in a wide range of application domains. To demonstrate DQ and DV concept employment in real tasks we present two real-life scenarios. The first use case demonstrates the DV use in crowdsensing application design. It shows up how DV can be calculated by integrating various metrics characterizing data application functionality, accuracy, and security. The second one incorporates the privacy consideration into DV calculus by exploring the relationship between privacy, DQ, and DV in the defense against web-site fingerprinting in The Onion Router (TOR) networks. These examples demonstrate how our methods of the DV and DQ evaluation may be employed in the design of real systems with security and privacy consideration.

Zhang, M., Chen, Y., Huang, J.. 2020. SE-PPFM: A Searchable Encryption Scheme Supporting Privacy-Preserving Fuzzy Multikeyword in Cloud Systems. IEEE Systems Journal. :1–9.

Cloud computing provides an appearing application for compelling vision in managing big-data files and responding queries over a distributed cloud platform. To overcome privacy revealing risks, sensitive documents and private data are usually stored in the clouds in a cipher-based manner. However, it is inefficient to search the data in traditional encryption systems. Searchable encryption is a useful cryptographic primitive to enable users to retrieve data in ciphertexts. However, the traditional searchable encryptions provide lower search efficiency and cannot carry out fuzzy multikeyword queries. To solve this issue, in this article, we propose a searchable encryption that supports privacy-preserving fuzzy multikeyword search (SE-PPFM) in cloud systems, which is built by asymmetric scalar-product-preserving encryptions and Hadamard product operations. In order to realize the functionality of efficient fuzzy searches, we employ Word2vec as the primitive of machine learning to obtain a fuzzy correlation score between encrypted data and queries predicates. We analyze and evaluate the performance in terms of token of multikeyword, retrieval and match time, file retrieval time and matching accuracy, etc. The experimental results show that our scheme can achieve a higher efficiency in fuzzy multikeyword ciphertext search and provide a higher accuracy in retrieving and matching procedure.

2021-03-29

Gupta, S., Buduru, A. B., Kumaraguru, P.. 2020. imdpGAN: Generating Private and Specific Data with Generative Adversarial Networks. 2020 Second IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :64–72.

Generative Adversarial Network (GAN) and its variants have shown promising results in generating synthetic data. However, the issues with GANs are: (i) the learning happens around the training samples and the model often ends up remembering them, consequently, compromising the privacy of individual samples - this becomes a major concern when GANs are applied to training data including personally identifiable information, (ii) the randomness in generated data - there is no control over the specificity of generated samples. To address these issues, we propose imdpGAN-an information maximizing differentially private Generative Adversarial Network. It is an end-to-end framework that simultaneously achieves privacy protection and learns latent representations. With experiments on MNIST dataset, we show that imdpGAN preserves the privacy of the individual data point, and learns latent codes to control the specificity of the generated samples. We perform binary classification on digit pairs to show the utility versus privacy trade-off. The classification accuracy decreases as we increase privacy levels in the framework. We also experimentally show that the training process of imdpGAN is stable but experience a 10-fold time increase as compared with other GAN frameworks. Finally, we extend imdpGAN framework to CelebA dataset to show how the privacy and learned representations can be used to control the specificity of the output.

2021-03-22

Wang, Z., Chen, L.. 2020. Re-encrypted Data Access Control Scheme Based on Blockchain. 2020 IEEE 6th International Conference on Computer and Communications (ICCC). :1757–1764.

Nowadays, massive amounts of data are stored in the cloud, how to access control the cloud data has become a prerequisite for protecting the security of cloud data. In order to address the problems of centralized control and privacy protection in current access control, we propose an access control scheme based on the blockchain and re-encryption technology, namely PERBAC-BC scheme. The access control policy is managed by the decentralized and immutability characteristics of blockchain, while the re-encryption is protected by the trusted computing characteristic of blockchain and the privacy is protected by the identity re-encryption technology. The overall structure diagram and detailed execution flow of the scheme are given in this paper. Experimental results show that, compared with the traditional hybrid encryption scheme, the time and space consumption is less when the system is expanded. Then, the time and space performance of each part of the scheme is simulated, and the security of blockchain is proved. The results also show that the time and space performance of the scheme are better and the security is stronger, which has certain stability and expandability.

2021-01-28

Esmeel, T. K., Hasan, M. M., Kabir, M. N., Firdaus, A.. 2020. Balancing Data Utility versus Information Loss in Data-Privacy Protection using k-Anonymity. 2020 IEEE 8th Conference on Systems, Process and Control (ICSPC). :158—161.

Data privacy has been an important area of research in recent years. Dataset often consists of sensitive data fields, exposure of which may jeopardize interests of individuals associated with the data. In order to resolve this issue, privacy techniques can be used to hinder the identification of a person through anonymization of the sensitive data in the dataset to protect sensitive information, while the anonymized dataset can be used by the third parties for analysis purposes without obstruction. In this research, we investigated a privacy technique, k-anonymity for different values of on different number columns of the dataset. Next, the information loss due to k-anonymity is computed. The anonymized files go through the classification process by some machine-learning algorithms i.e., Naive Bayes, J48 and neural network in order to check a balance between data anonymity and data utility. Based on the classification accuracy, the optimal values of and are obtained, and thus, the optimal and can be used for k-anonymity algorithm to anonymize optimal number of columns of the dataset.

Fan, M., Yu, L., Chen, S., Zhou, H., Luo, X., Li, S., Liu, Y., Liu, J., Liu, T.. 2020. An Empirical Evaluation of GDPR Compliance Violations in Android mHealth Apps. 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE). :253—264.

The purpose of the General Data Protection Regulation (GDPR) is to provide improved privacy protection. If an app controls personal data from users, it needs to be compliant with GDPR. However, GDPR lists general rules rather than exact step-by-step guidelines about how to develop an app that fulfills the requirements. Therefore, there may exist GDPR compliance violations in existing apps, which would pose severe privacy threats to app users. In this paper, we take mobile health applications (mHealth apps) as a peephole to examine the status quo of GDPR compliance in Android apps. We first propose an automated system, named HPDROID, to bridge the semantic gap between the general rules of GDPR and the app implementations by identifying the data practices declared in the app privacy policy and the data relevant behaviors in the app code. Then, based on HPDROID, we detect three kinds of GDPR compliance violations, including the incompleteness of privacy policy, the inconsistency of data collections, and the insecurity of data transmission. We perform an empirical evaluation of 796 mHealth apps. The results reveal that 189 (23.7%) of them do not provide complete privacy policies. Moreover, 59 apps collect sensitive data through different measures, but 46 (77.9%) of them contain at least one inconsistent collection behavior. Even worse, among the 59 apps, only 8 apps try to ensure the transmission security of collected data. However, all of them contain at least one encryption or SSL misuse. Our work exposes severe privacy issues to raise awareness of privacy protection for app users and developers.

2021-01-11

Wang, J., Wang, A.. 2020. An Improved Collaborative Filtering Recommendation Algorithm Based on Differential Privacy. 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS). :310–315.

In this paper, differential privacy protection method is applied to matrix factorization method that used to solve the recommendation problem. For centralized recommendation scenarios, a collaborative filtering recommendation model based on matrix factorization is established, and a matrix factorization mechanism satisfying ε-differential privacy is proposed. Firstly, the potential characteristic matrix of users and projects is constructed. Secondly, noise is added to the matrix by the method of target disturbance, which satisfies the differential privacy constraint, then the noise matrix factorization model is obtained. The parameters of the model are obtained by the stochastic gradient descent algorithm. Finally, the differential privacy matrix factorization model is used for score prediction. The effectiveness of the algorithm is evaluated on the public datasets including Movielens and Netflix. The experimental results show that compared with the existing typical recommendation methods, the new matrix factorization method with privacy protection can recommend within a certain range of recommendation accuracy loss while protecting the users' privacy information.