Biblio
To the best of our knowledge, the p-sensitive k-anonymity model is a sophisticated model for resisting linking attacks and homogeneity attacks in data publishing. However, if the distribution of sensitive values is skewed, the model has difficulty defending against skew attacks and may even be exposed to sensitive attacks. In practice, the privacy requirements of different sensitive values are not always identical, and a "one size fits all" unified privacy protection level may cause unnecessary information loss. To address these problems, this paper quantifies privacy requirements with the concept of IDF and pays particular attention to sensitive groups. Two enhanced anonymity models with personalized protection, the (p, αisg)-sensitive k-anonymity model and the (pi, αisg)-sensitive k-anonymity model, are then proposed to resist skew attacks and sensitive attacks. Furthermore, two clustering algorithms, one with global search and one with local search, are designed to implement the models. Experimental results show that the two enhanced models achieve notably better privacy at the cost of only a small loss in data utility.
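As a rough illustration of the baseline model this work builds on, the sketch below checks whether a partition of records satisfies p-sensitive k-anonymity (every equivalence class has at least k records and at least p distinct sensitive values); the per-group α thresholds and IDF weighting of the proposed models are not reproduced, and the field name "sensitive" is an assumption.

```python
def satisfies_p_sensitive_k_anonymity(equivalence_classes, k, p):
    """Check p-sensitive k-anonymity over a list of equivalence classes.

    Each equivalence class is a list of records; each record is a dict with
    a 'sensitive' field (an illustrative, hypothetical field name).
    """
    for group in equivalence_classes:
        if len(group) < k:                        # k-anonymity: group size >= k
            return False
        distinct = {r["sensitive"] for r in group}
        if len(distinct) < p:                     # p-sensitivity: >= p distinct sensitive values
            return False
    return True

# Toy example: two equivalence classes over a medical table.
classes = [
    [{"sensitive": "flu"}, {"sensitive": "cold"}, {"sensitive": "flu"}],
    [{"sensitive": "hiv"}, {"sensitive": "flu"}, {"sensitive": "cancer"}],
]
print(satisfies_p_sensitive_k_anonymity(classes, k=3, p=2))  # True
```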
Recently, a large number of research studies on privacy-preserving data publishing have been conducted. We find that most k-anonymity algorithms fail to consider the distribution characteristics of attribute values in the data and the differences in the contribution values of quasi-identifier attributes in service-oriented settings. In this paper, we illustrate how the distribution characteristics of attribute values and the differences in the contribution values of quasi-identifier attributes affect anonymization results. To maximize the utility of the released data, a service-oriented adaptive anonymity algorithm is proposed. We establish a model of reaction dispersion degree to quantify the characteristics of attribute value distribution and introduce the concept of utility weight related to the contribution values of quasi-identifier attributes. The priority coefficient and the characterization coefficient of partition quality are defined to adaptively optimize the selection of the split dimension and splitting value during anonymity-group partitioning, which reduces unnecessary information loss and further improves the utility of the anonymized data. The rationality and validity of the algorithm are verified by theoretical analysis and multiple experiments.
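The partition-based anonymization described above can be pictured with a minimal Mondrian-style split. The priority and characterization coefficients of the paper are replaced here by a simple widest-range heuristic, so this is only a sketch of the general mechanism under stated assumptions, not the proposed algorithm.

```python
def partition(records, qids, k):
    """Recursively split records on the quasi-identifier with the widest
    range until a split would leave a group smaller than k (Mondrian-style)."""
    def spread(attr):
        vals = [r[attr] for r in records]
        return max(vals) - min(vals)

    # The paper selects the dimension via priority/utility-weight coefficients;
    # here we simply take the attribute with the widest value range.
    best = max(qids, key=spread)
    vals = sorted(r[best] for r in records)
    median = vals[len(vals) // 2]
    left = [r for r in records if r[best] < median]
    right = [r for r in records if r[best] >= median]
    if len(left) < k or len(right) < k:
        return [records]                 # cannot split further without breaking k-anonymity
    return partition(left, qids, k) + partition(right, qids, k)

groups = partition(
    [{"age": a, "zip": z} for a, z in [(25, 10), (31, 12), (47, 30), (52, 33), (29, 11), (49, 31)]],
    qids=["age", "zip"], k=2,
)
print(len(groups))  # number of anonymity groups produced (2 for this toy data)
```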
The purpose of the General Data Protection Regulation (GDPR) is to provide improved privacy protection. If an app controls personal data from users, it needs to be compliant with GDPR. However, GDPR lists general rules rather than exact step-by-step guidelines about how to develop an app that fulfills the requirements. Therefore, there may exist GDPR compliance violations in existing apps, which would pose severe privacy threats to app users. In this paper, we take mobile health applications (mHealth apps) as a peephole to examine the status quo of GDPR compliance in Android apps. We first propose an automated system, named HPDROID, to bridge the semantic gap between the general rules of GDPR and the app implementations by identifying the data practices declared in the app privacy policy and the data relevant behaviors in the app code. Then, based on HPDROID, we detect three kinds of GDPR compliance violations, including the incompleteness of privacy policy, the inconsistency of data collections, and the insecurity of data transmission. We perform an empirical evaluation of 796 mHealth apps. The results reveal that 189 (23.7%) of them do not provide complete privacy policies. Moreover, 59 apps collect sensitive data through different measures, but 46 (77.9%) of them contain at least one inconsistent collection behavior. Even worse, among the 59 apps, only 8 apps try to ensure the transmission security of collected data. However, all of them contain at least one encryption or SSL misuse. Our work exposes severe privacy issues to raise awareness of privacy protection for app users and developers.
Increased availability of mobile cameras has made it possible for people to record far more of their lives on video. People often want to share these videos, but only with certain people who were co-present. Since the videos may be of a large event where the attendees are not necessarily known, we need a method for proving co-presence without revealing information before co-presence is proven. In this demonstration, we present a privacy-preserving method for comparing the similarity of two videos without revealing the contents of either video. The technique leverages the Similarity of Simultaneous Observation technique for detecting hidden webcams and modifies the existing algorithms so that they are computationally feasible to run under a fully homomorphic encryption scheme on modern mobile devices. The demonstration will consist of a variety of devices preloaded with our software. We will demonstrate the video-sharing software performing comparisons in real time, and we will also make the software available to Android devices via a QR code so that participants can record and exchange their own videos.
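As a rough, unencrypted sketch of the kind of similarity test involved, the snippet below correlates the per-frame mean brightness of two clips of the same scene; the actual Similarity of Simultaneous Observation algorithms and their fully homomorphic adaptation are not reproduced, and the synthetic "videos" are placeholders.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def brightness_series(video_frames):
    """Reduce each frame (an H x W grayscale array) to its mean brightness."""
    return np.array([frame.mean() for frame in video_frames])

def similarity(video_a, video_b):
    """Pearson correlation of the two brightness time series."""
    a, b = brightness_series(video_a), brightness_series(video_b)
    return float(np.corrcoef(a, b)[0, 1])

# Two synthetic 'videos' of the same scene: a shared lighting signal plus sensor noise.
scene = rng.uniform(0.2, 0.8, size=60)                         # 60 frames of ambient lighting
cam1 = [np.full((4, 4), s) + rng.normal(0, 0.01, (4, 4)) for s in scene]
cam2 = [np.full((4, 4), s) + rng.normal(0, 0.01, (4, 4)) for s in scene]
print(similarity(cam1, cam2))  # close to 1.0 for co-present recordings
```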
Privacy preservation is a challenging task given the huge amount of data available on social media. Data stored in a distributed or cloud environment must be kept confidential. In addition, representing such voluminous data as a graph makes keyword search convenient. The proposed work first reads the social media data and converts it into a graph. To protect the data from active attacks, the Advanced Encryption Standard (AES) algorithm is used to encrypt the graph. Search is then performed with two algorithms: a kNK keyword search algorithm and a top-k nearest keyword search algorithm. The first scheme fetches all the data corresponding to a keyword; the second fetches the nearest neighbors, which increases the efficiency of the search process. A shortest-path algorithm is used to find the minimum distance, and results are produced based on the minimum value. The proposed algorithm shows high performance for graph generation and searching and moderate performance for graph encryption.
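A minimal sketch of the nearest-keyword lookup idea described above, using networkx shortest-path distances over a small social graph; the AES graph-encryption layer and the paper's kNK/top-k schemes are not reproduced, and the node and keyword names are invented for illustration.

```python
import networkx as nx

# Toy social-media graph: nodes carry keyword sets, edges carry distances.
G = nx.Graph()
G.add_nodes_from([
    ("alice", {"keywords": {"privacy", "cloud"}}),
    ("bob", {"keywords": {"music"}}),
    ("carol", {"keywords": {"privacy"}}),
])
G.add_weighted_edges_from([("alice", "bob", 1.0), ("bob", "carol", 2.0)])

def nearest_nodes_with_keyword(graph, source, keyword, k=1):
    """Return the k nodes containing `keyword` that are closest to `source`
    by weighted shortest-path distance (Dijkstra)."""
    dist = nx.single_source_dijkstra_path_length(graph, source, weight="weight")
    hits = [(d, n) for n, d in dist.items()
            if n != source and keyword in graph.nodes[n]["keywords"]]
    return [n for _, n in sorted(hits)[:k]]

print(nearest_nodes_with_keyword(G, "bob", "privacy", k=2))  # ['alice', 'carol']
```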
The Internet of Things enables interaction between IoT devices and users through the cloud. The cloud provides services such as account monitoring, device management, and device control. As the center of the IoT platform, the cloud serves IoT devices and IoT applications through APIs, so permission verification for these APIs is essential. However, we found that some APIs are unverified, which allows unauthorized users to access cloud resources or control devices and threatens the security of both the devices and the cloud. To check for unauthorized access to APIs, we developed IoT-APIScanner, a framework that checks the permission verification of cloud APIs. Through observation, we found a large amount of interactive information between IoT applications and the cloud, including the APIs and their related parameters, so we extract them by analyzing the code of the IoT application and use them to mutate API test cases. Through these test cases, we can effectively check the permissions of the APIs. In our research, we extracted the APIs of 5 platforms in total. Among them, the proportion of APIs without permission verification reached 13.3%. Our research shows that attackers could use APIs without permission verification to obtain user privacy or take control of devices.
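A bare-bones illustration of the kind of permission-verification test implied above: call the same cloud API endpoint with and without an authorization token and compare the responses. The endpoint URL, parameters, and token are hypothetical placeholders, and this is not IoT-APIScanner's actual mutation logic.

```python
import requests

# Hypothetical cloud API endpoint and token; in the paper's setting, real values
# would be extracted from the IoT application's code.
API_URL = "https://cloud.example.com/api/v1/device/status"
PARAMS = {"device_id": "demo-device-001"}
TOKEN = "user-access-token"

def check_permission_verification(url, params, token):
    """Flag the API as unverified if an unauthenticated request succeeds
    with the same response as an authenticated one."""
    with_auth = requests.get(url, params=params,
                             headers={"Authorization": f"Bearer {token}"}, timeout=10)
    without_auth = requests.get(url, params=params, timeout=10)
    if without_auth.status_code == 200 and without_auth.text == with_auth.text:
        return "no permission verification"
    return "permission verification present"

print(check_permission_verification(API_URL, PARAMS, TOKEN))
```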
Ciphertext-Policy Attribute-Based Encryption (CP-ABE), a form of public-key encryption, has become a well-known data access control scheme for data security and confidentiality. It not only provides flexibility and scalability in access control mechanisms but also enhances security through fuzzy fine-grained access control. However, some schemes increase the key size to gain more security, which ultimately leads to long encryption and decryption times, and there is often no provision for handling man-in-the-middle attacks during data transfer. In this paper, a lightweight and more scalable encryption mechanism is provided that not only uses fewer resources for encoding and decoding but also improves security along with faster encryption and decryption. Moreover, the scheme provides an efficient key-sharing mechanism for secure transfer to avoid man-in-the-middle attacks. In addition, the inclusion of fuzzy policies allows approximate matching of user attributes, which makes the process fast and reliable and improves performance for legitimate users.
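The access-control side of CP-ABE can be pictured with a small policy-evaluation sketch: a ciphertext carries an attribute policy, and only users whose attribute set satisfies it should be able to decrypt. The pairing-based cryptography and the paper's fuzzy matching and key-sharing mechanism are deliberately omitted; this shows only the policy logic, under assumed attribute names.

```python
# A policy is either an attribute string or a nested tuple ("AND"/"OR", child, child, ...).
def satisfies(policy, attributes):
    """Return True if the user's attribute set satisfies the access policy."""
    if isinstance(policy, str):
        return policy in attributes
    op, *children = policy
    results = (satisfies(c, attributes) for c in children)
    return all(results) if op == "AND" else any(results)

# Example policy: (doctor AND cardiology) OR auditor
policy = ("OR", ("AND", "role:doctor", "dept:cardiology"), "role:auditor")
print(satisfies(policy, {"role:doctor", "dept:cardiology"}))  # True
print(satisfies(policy, {"role:doctor", "dept:oncology"}))    # False
```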
With increasing monitoring and regulation by platforms, communities with criminal interests are moving to the dark web, which hosts content ranging from whistle-blowing and privacy to drugs, terrorism, and hacking. Using post discussion data from six dark web forums, we construct six interaction graphs and use social network analysis tools to study these underground communities. We observe the structure of each network to highlight structural patterns and identify nodes of importance through network centrality analysis. Our findings suggest that in the majority of the forums some members are highly connected and form hubs, while most members have far fewer connections. When examining the posting activities of central nodes, we found that most of them post in sub-forums with broader topics, such as general discussions and tutorials. These members play different roles in the different forums, and within each forum we identified diverse user profiles.
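The hub-identification step described above can be sketched with networkx degree centrality over a toy interaction graph; the member names and the size of the hub list are invented for illustration.

```python
import networkx as nx

# Toy interaction graph: an edge means two forum members replied to each other.
edges = [("u1", "u2"), ("u1", "u3"), ("u1", "u4"), ("u2", "u3"), ("u5", "u1")]
G = nx.Graph(edges)

# Degree centrality: fraction of other members each node is connected to.
centrality = nx.degree_centrality(G)
hubs = [n for n, c in sorted(centrality.items(), key=lambda x: -x[1])[:2]]
print(hubs)  # the two most connected members, e.g. ['u1', 'u2']
```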
The main idea of the proposed work is to run a Partitional Clustering Algorithm (PCA) on the Hadoop Distributed File System to secure big data using a perturbation technique. Numerous clustering methods are available for categorizing information in big data; PCA discovers clusters based on an initial partition of the data. This approach makes it possible to safeguard the privacy of the perturbed data while still allowing the necessary computations and communication. Performance was analyzed on a healthcare database with respect to parameters such as precision, accuracy, and F-score. The results demonstrate that the method reduces the complexity of preserving privacy and achieves better accuracy than existing techniques.
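A rough sketch of the perturb-then-cluster idea under stated assumptions: additive random noise masks the raw values before a standard partitional clustering (k-means) runs on the perturbed data. This does not reproduce the paper's Hadoop-based pipeline or its healthcare dataset; the synthetic records and noise scale are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(seed=0)

# Synthetic stand-in for sensitive numeric records (e.g. health measurements).
records = rng.normal(loc=[[0, 0]] * 50 + [[5, 5]] * 50, scale=1.0)

# Perturbation: add zero-mean Gaussian noise so the raw values are never exposed.
perturbed = records + rng.normal(scale=0.5, size=records.shape)

# Partitional clustering on the perturbed data.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(perturbed)
print(np.bincount(labels))  # roughly a 50/50 split is expected
```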
Since trajectory data is widely collected and used for scientific research and business purposes, publishing trajectories without a proper privacy policy poses an acute threat to individuals. Several methods, such as k-anonymity, l-diversity, and t-closeness, have been studied recently, though each protects privacy by reducing the data in a way that depends on the features of the method. When strong privacy protection is required, these methods reduce data utility so much that the results of scientific research may be affected. In this research, we suggest a novel approach that tackles this dilemma by adding noise to trajectories in a vector-based grid environment.
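A minimal sketch, under assumed parameters, of adding noise to a trajectory on a grid: each GPS point is snapped to a grid cell and then perturbed with Laplace noise, a common building block for location privacy. The paper's vector-based grid construction is not reproduced; the cell size, noise scale, and coordinates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def perturb_trajectory(points, cell_size=0.01, scale=0.02):
    """Snap each (lat, lon) point to a grid of `cell_size` degrees, then add
    Laplace noise with the given scale to each snapped coordinate."""
    pts = np.asarray(points, dtype=float)
    snapped = np.round(pts / cell_size) * cell_size             # grid discretization
    noisy = snapped + rng.laplace(scale=scale, size=pts.shape)  # noise addition
    return noisy

trajectory = [(37.5665, 126.9780), (37.5651, 126.9895), (37.5599, 126.9940)]
print(perturb_trajectory(trajectory))
```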
Cloud computing is an Internet-based technology that has grown rapidly in recent years because of the popular and in-demand services required by institutions, organizations, and individuals. Structured, unstructured, and semi-structured data are being transferred to cloud servers at a record pace. Because of the high cost, space, and maintenance issues that big data brings, these institutions, businesses, and organizations are shifting ever larger workloads to cloud servers, and cloud computing has become a potential choice for data storage. In a cloud environment, data is clearly not yet completely secure from inside and outside attacks and intrusions, because cloud servers are under the control of a third party. The security of data therefore becomes an important concern when sensitive data is stored in a cloud environment. In this paper, we give an overview of the characteristics and state of the art of big data; discuss the top threats, open issues, and current challenges of data security and privacy, together with their impact on business, from a future research perspective; and review and analyze previous and recent frameworks and architectures for data security that have been established against these threats, with the aim of improving how data is kept and stored in the cloud environment.
As the amount of data in various industries and government sectors grows exponentially, the '7V' concept of big data aims to create new value by indiscriminately collecting and analyzing information from various fields. At the same time, as the ICT industry ecosystem matures, big data utilization is threatened by privacy attacks such as infringement caused by the sheer volume of data. Managing and sustaining a controllable level of privacy requires recommended de-identification techniques. This paper examines these de-identification processes and three commonly used types of privacy models. Furthermore, it presents use cases in which such technologies can be adopted and discusses future development directions.
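To make the de-identification step concrete, here is a small generalization/suppression example of the kind such techniques describe; the attribute names and generalization rules are illustrative assumptions, not the paper's use cases.

```python
def deidentify(record):
    """Generalize quasi-identifiers and suppress direct identifiers."""
    return {
        "name": "*",                                  # suppression of a direct identifier
        "age": f"{(record['age'] // 10) * 10}s",      # generalization to a 10-year band
        "zip": record["zip"][:3] + "**",              # partial suppression of the ZIP code
        "diagnosis": record["diagnosis"],             # sensitive value kept for analysis
    }

print(deidentify({"name": "Jane Doe", "age": 34, "zip": "06120", "diagnosis": "flu"}))
# {'name': '*', 'age': '30s', 'zip': '061**', 'diagnosis': 'flu'}
```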