Bibliography
The smart grid is a complex cyber-physical system (CPS) that poses challenges related to scale, integration, interoperability, processes, governance, and human elements. The US National Institute of Standards and Technology (NIST) and its government, university, and industry collaborators developed an approach, called the CPS Framework, for reasoning about CPS across multiple levels of concern and competency, including trustworthiness, privacy, reliability, and regulatory compliance. The approach uses ontology and reasoning techniques to achieve a greater understanding of the interdependencies among the elements of the CPS Framework model applied to use cases. This paper demonstrates that the approach extends naturally to automated and manual decision-making for smart grids: we apply it to smart grid use cases and illustrate how it can be used to analyze grid topologies and address concerns about the smart grid. Smart grid stakeholders whose decision making may be assisted by this approach include planners, designers, and operators.
This paper studies the deletion propagation problem in terms of minimizing view side-effect. The problem is fundamental to data lineage and quality management and can be a key step in analyzing view propagation and repairing data. The investigated problem is a variant of the standard deletion propagation problem: given a source database D, a set of key-preserving conjunctive queries Q, and the set of views V obtained by the queries in Q, we try to identify a set T of tuples from D whose elimination removes all the tuples in a given set of view deletions △V while preserving the other results. The complexity of this problem has been well studied for the case of a single query, and dichotomies, even trichotomies, have been developed for different settings. However, no results are known for multiple queries, which is the more realistic case. We study the complexity and approximability of minimizing the side-effect on the views, i.e., finding T to minimize the additional damage to V after removing all the tuples of △V. We focus on the class of key-preserving conjunctive queries, for which a dichotomy is known in the single-query case. Surprisingly, beyond the single-query case, this problem is NP-hard to approximate within any constant factor in terms of view side-effect, even for a non-trivial set of multiple project-free conjunctive queries. We propose an algorithm that approximates the problem within a bound depending on the numbers of tuples in both V and △V. We also identify a class of polynomially tractable inputs and provide a dynamic programming algorithm to solve the problem on them. Besides data lineage, the study of this problem also provides important foundations for computational issues in data repairing. Furthermore, we introduce some related applications of this problem, especially query-feedback-based data cleaning.
The lack of effective accountability mechanisms brings a series of security problems to today's Internet. In the Next Generation Internet based on IPv6, the combination of identity authentication and IP verification is the key to accountability. Source Address Validation Improvement (SAVI) can protect IP source addresses from being faked, but without an identity authentication mechanism and a reliable binding between IP addresses and accountable identities, accountability remains unreliable. To solve this problem, most research focuses on embedding an accountable identity into the IP address, which requires either modifying the DHCP client or performing the DHCP request process twice because user authentication and address assignment are handled separately. Different from previous research, this paper first analyzes the problems and requirements of combining Web Portal or 802.1X, the two main identity authentication (AAA) mechanisms, with accountable address assignment in the SAVI framework. Then a novel Cooperative mechanism for Accountable IP address assignment (CAIP) is proposed based on 802.1X and SAVI, which takes into account the validation of IP addresses and the authenticity and accountability of identities at the same time. Finally, we build prototype systems for both Fat AP and Thin AP wireless scenarios and simulate the performance of CAIP using large-scale campus network data logs. The experimental results show that the IP addresses and identities in CAIP are protected and accountable. Compared with previous research, CAIP is not only transparent to terminals and networks but also has a low impact on network equipment, which makes CAIP easy to deploy with high compatibility and low cost.
With the development of new technologies around the world, governments tend to communicate with people and businesses with the help of such technologies. Electronic government (e-government) is defined as the use of information technologies such as electronic networks, the Internet, and mobile phones by organizations and state institutions in order to establish broad communication between citizens, businesses, and different state institutions. The development of e-government starts with building a website to share information with users, which is considered the main infrastructure for further development. Website assessment is considered a way of improving service quality. Various international studies have introduced indexes for website assessment, but each considers only some dimensions of a website. In this paper, based on a careful review of previous studies, the most important indexes for website quality assessment are "Web Design", "Navigation", "Services", "Maintenance and Support", "Citizen Participation", "Information Quality", "Privacy and Security", "Responsiveness", and "Usability". Considering these indexes when designing a website facilitates user interaction with e-government websites.
Steganography is defined as the art of hiding secret data in a non-secret digital carrier called the cover media. Exchanging sensitive data without protection against intruders who may intercept it is dangerous. Therefore, the transmission of sensitive information and secrets must not rely only on the protection offered by existing communication channels; additional steps must be taken towards data protection. This article proposes an improved approach to video steganography. The improvement is made by searching for exact matches between the secret text and the RGB channels of the video frames, combined with random key-dependent data, thereby achieving the steganography performance criteria of invisibility, payload/capacity, and robustness.
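As a rough illustration of the exact-matching idea, the sketch below (Python with NumPy; all names and the key format are ours, not the paper's) locates each secret byte in the cover frames' RGB channels and records its position in a random-key-dependent order, so the cover video itself is never modified:

```python
import numpy as np

def embed_by_matching(secret: bytes, frames: list[np.ndarray], seed: int = 42):
    """Locate each secret byte inside the cover video instead of modifying it.

    For every secret byte we search the RGB channels of the frames for an
    exact match and record its coordinates; the resulting key (a list of
    positions chosen in a seed-dependent order) is what the receiver needs
    to reconstruct the message, while the cover stays untouched.
    """
    rng = np.random.default_rng(seed)
    key = []
    for byte in secret:
        # Search frames in a key-dependent (pseudo-random) order.
        for f in rng.permutation(len(frames)):
            pos = np.argwhere(frames[f] == byte)   # matches as (row, col, channel)
            if len(pos):
                r, c, ch = pos[rng.integers(len(pos))]
                key.append((int(f), int(r), int(c), int(ch)))
                break
        else:
            raise ValueError(f"no exact match found for byte {byte}")
    return key

def extract(key, frames):
    # Receiver side: read the channel values pointed to by the key.
    return bytes(int(frames[f][r, c, ch]) for f, r, c, ch in key)
```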
Analyzing clustering results may lead to privacy disclosure in big data mining. In this paper, we put forward a differential privacy-based data preprocessing method for protecting distance-based clustering. Firstly, the data distortion technique of differential privacy is used to prevent the distances in distance-based clustering from disclosing the relationships between records. Differential privacy may, however, affect the clustering results while protecting privacy. Therefore, an adaptive privacy budget parameter adjustment mechanism is applied to keep the balance between privacy protection and clustering quality. By solving the corresponding maximization and minimization problems, the differential privacy budget parameter can be obtained for different clustering algorithms. Finally, we conduct extensive experiments to evaluate the performance of the proposed method. The results demonstrate that our method can provide privacy protection while maintaining precise clustering results.
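A minimal sketch of the Laplace-mechanism preprocessing step is shown below (NumPy, attributes assumed scaled to [0, 1]; the paper's adaptive budget adjustment is not reproduced, epsilon is simply a caller-chosen parameter):

```python
import numpy as np

def perturb_for_clustering(X: np.ndarray, epsilon: float) -> np.ndarray:
    """Laplace-mechanism data distortion before distance-based clustering.

    Each attribute is assumed to be scaled to [0, 1], so the per-attribute
    sensitivity is 1.  Smaller epsilon means more noise (stronger privacy)
    and a larger distortion of pairwise distances; the adaptive mechanism
    in the paper would tune epsilon per algorithm instead of fixing it.
    """
    sensitivity = 1.0
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon, size=X.shape)
    return np.clip(X + noise, 0.0, 1.0)

# Usage: cluster the perturbed data with any distance-based algorithm, e.g.
# X_priv = perturb_for_clustering(X, epsilon=1.0), then run k-means on X_priv.
```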
K-anonymity is a popular model used in microdata publishing to protect individual privacy. This paper introduces the ideas of the ball tree and projection area density partition into the k-anonymity algorithm. The traditional kd-tree implements the division by forming hyper-rectangles, but a hyper-rectangle has corners, so it cannot guarantee that the records at the corners are most similar to the other records in the region. In this paper, the hyper-sphere formed by the ball tree is used to address this problem, and the projection area density partition is adopted to increase the density of the resulting record points. We implement our algorithm on the Gotrack dataset and the Adult dataset from the UCI repository. The experiments show that the k-anonymity algorithm based on the ball tree and projection area density partition obtains more anonymous groups and a lower generalization rate. The smaller K is, the more pronounced the advantage. The results indicate that our algorithm achieves higher data usability.
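The following greedy sketch illustrates only the sphere-shaped grouping idea, not the projection area density partition itself: each group is a small ball of at least k records around a seed, and its numeric attributes are generalized to the group's range (function and variable names are illustrative assumptions):

```python
import numpy as np

def k_anonymize_by_balls(X: np.ndarray, k: int):
    """Greedy ball-shaped partitioning for k-anonymity (illustrative only).

    Repeatedly pick an unassigned seed record, gather its k-1 nearest
    unassigned neighbours (a ball around the seed rather than a rectangle),
    and generalize every numeric attribute of the group to the [min, max]
    interval observed inside the ball.
    """
    n = len(X)
    unassigned = set(range(n))
    groups = []
    while len(unassigned) >= k:
        idx = np.array(sorted(unassigned))
        seed = idx[0]
        dist = np.linalg.norm(X[idx] - X[seed], axis=1)
        members = idx[np.argsort(dist)[:k]]        # seed plus its k-1 closest
        unassigned -= set(members.tolist())
        lo, hi = X[members].min(axis=0), X[members].max(axis=0)
        groups.append({"members": members, "generalization": list(zip(lo, hi))})
    # Leftover records (< k of them) would be merged into their nearest
    # group in a full implementation.
    return groups
```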
Expected and unexpected risks in cloud computing, which include data security, data segregation, and the lack of control and knowledge, have led to dilemmas in several fields. Among these dilemmas, the privacy problem is particularly paramount and has largely constrained the prevalence and development of cloud computing. Several privacy protection algorithms have been proposed, which generally fall into two categories: anonymity algorithms and differential privacy mechanisms. While much research has focused on the efficiency of these algorithms, few studies have emphasized their different orientations and demerits. Motivated by this emerging research challenge, we conduct a comprehensive survey of the two popular privacy protection algorithms, namely the K-Anonymity Algorithm and the Differential Privacy Algorithm. Based on their principles, implementations, and algorithmic orientations, we evaluate the two algorithms. Several expectations and comparisons are also presented based on the current cloud computing privacy environment and its future requirements.
Because cloud storage services are broadly used in enterprises for online sharing and collaboration, sensitive information in images or documents may easily leak outside the trusted enterprise premises through such cloud services. Existing solutions to this problem have not fully explored the tradeoffs among application performance, service scalability, and user data privacy. Therefore, we propose CloudDLP, a generic approach for enterprises to automatically sanitize sensitive data in images and documents in browser-based cloud storage. To the best of our knowledge, CloudDLP is the first system that automatically and transparently detects and sanitizes both sensitive images and textual documents without compromising user experience or application functionality on browser-based cloud storage. To prevent sensitive information from escaping the on-premises environment, CloudDLP utilizes deep learning methods to detect sensitive information in both images and textual documents. We have evaluated the proposed method on a number of typical cloud applications. Our experimental results show that it achieves transparent and automatic data sanitization on cloud storage services with relatively low overhead, while preserving most application functionality.
The rapid growth of Android malware apps poses a serious security threat to users, so it is important and urgent to detect Android malware effectively. Moreover, the increasing amount of unknown malware and evasion techniques also calls for novel detection methods. In this paper, we focus on API features and develop a novel method to detect Android malware. First, we propose a novel selection method for API features related to the malware class. However, such APIs also have legitimate uses in benign apps, which causes a false positive (FP) problem (benign apps misclassified as malware). Second, we further explore the structural relationships between these APIs and map them to a matrix interpreted as a hand-refined API-based feature graph. Third, a CNN-based classifier is developed for classifying the API-based feature graph. Evaluations on a real-world dataset containing 3,697 malware apps and 3,312 benign apps demonstrate that the selected API features are effective for Android malware classification: the top 20 APIs alone achieve a high F1 of 94.3% with a Random Forest classifier. When the available API features are few, classification performance, including the FPR, can be effectively improved by complementing them with our API-based feature graph and CNN classifier.
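As a hedged sketch of the feature-graph idea (not the paper's exact construction), the code below builds a pairwise co-occurrence matrix over an illustrative list of selected APIs and feeds it to a small CNN in PyTorch:

```python
import numpy as np
import torch
import torch.nn as nn

# Illustrative stand-in for the paper's selected top-N APIs.
SELECTED_APIS = ["sendTextMessage", "getDeviceId", "getSubscriberId",
                 "DexClassLoader", "exec", "crypto.Cipher"]

def api_feature_graph(app_api_calls: set[str]) -> np.ndarray:
    """Map an app's API usage to an N x N matrix over the selected API list.

    Cell (i, j) is 1 when both API i and API j appear in the app, so the
    matrix encodes pairwise co-occurrence; the paper's hand-refined
    structural relationships would further constrain which cells may be set.
    """
    present = np.array([api in app_api_calls for api in SELECTED_APIS],
                       dtype=np.float32)
    return np.outer(present, present)

class GraphCNN(nn.Module):
    """Small CNN that classifies the N x N API feature graph as benign/malware."""
    def __init__(self, n: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(8 * n * n, 2))

    def forward(self, x):                      # x: (batch, 1, n, n)
        return self.net(x)

# Usage: logits = GraphCNN(len(SELECTED_APIS))(
#     torch.tensor(api_feature_graph({"exec", "getDeviceId"}))[None, None])
```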
We present a novel, use-case-agnostic method for identifying and circumventing private data exposure across distributed and high-dimensional data repositories. Examples of distributed high-dimensional data repositories include medical research and treatment data, where often more than 300 descriptive attributes appear. As such, providing strong guarantees of data anonymity in these repositories is a hard constraint in adhering to privacy legislation. Yet, when applied to distributed high-dimensional data, existing anonymisation algorithms incur high levels of information loss and do not guarantee privacy, defeating the purpose of anonymisation. In this paper, we address this issue by using Bayesian networks to handle data transformation for anonymisation. By evaluating every attribute combination to determine the privacy exposure risk, the conditional probability linking attribute pairs is computed. Pairs with a high conditional probability expose a risk of de-anonymisation similar to quasi-identifiers and can be separated instead of deleted, as in previous algorithms. Attribute separation removes the risk of privacy exposure, and avoiding deletion results in a significant reduction in information loss. In other words, assimilating the conditional probability of outliers directly into the adjacency matrix in a greedy fashion is quick and thwarts de-anonymisation. Since identifying every privacy-violating attribute combination is a W[2]-complete problem, we optimise the procedure with a multigrid solver method by evaluating the conditional probabilities between attribute pairs and aggregating the state space explosion of attribute pairs through manifold learning. Finally, incremental processing of new data is achieved through inexpensive, continuous (delta) learning.
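A simplified sketch of the pairwise risk evaluation is given below (pandas; the threshold and column names are illustrative, and the Bayesian-network construction, multigrid solver, and manifold-learning steps are omitted):

```python
import pandas as pd
from itertools import permutations

def risky_attribute_pairs(df: pd.DataFrame, threshold: float = 0.9):
    """Flag attribute pairs whose conditional probability is high enough to
    behave like a quasi-identifier.

    For every ordered pair (A, B) we compute P(B | A) from the empirical
    contingency table; if some value of A pins down a value of B with
    probability above the threshold, the pair is a candidate for separation
    (storing A and B in different fragments) rather than deletion.
    """
    risky = []
    for a, b in permutations(df.columns, 2):
        cond = pd.crosstab(df[a], df[b], normalize="index")  # rows give P(B | A=a_i)
        peak = float(cond.to_numpy().max())
        if peak >= threshold:
            risky.append((a, b, peak))
    return risky

# Usage (toy data): zip -> city is fully determined, so it would be flagged.
# df = pd.DataFrame({"zip": ["94301", "94301", "10001"],
#                    "city": ["Palo Alto", "Palo Alto", "New York"],
#                    "diag": ["flu", "cold", "flu"]})
# print(risky_attribute_pairs(df))
```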
File update operations generate many invalid flash pages in Solid State Drives (SSDs) because of the out-of-place update feature. If these invalid flash pages are not securely deleted, they are left in a "missing" state, resulting in leakage of sensitive information. However, deleting these invalid pages in real time greatly reduces the performance of the SSD. In this paper, we propose a Per-File Secure Deletion (PSD) scheme for SSDs to achieve non-real-time secure deletion. PSD assigns a globally unique identifier (GUID) to each file to quickly locate its invalid data blocks and uses the Security-TRIM command to securely delete them. Moreover, we propose a PSD-MLC scheme for Multi-Level Cell (MLC) flash memory. PSD-MLC distributes the data blocks of a file across pairs of pages to avoid the influence of programming crosstalk between paired pages. We evaluate our schemes on different hardware platforms of flash media, and the results show that PSD and PSD-MLC have only a small impact on SSD performance. With the cache disabled and enabled, compared with a system without secure deletion, PSD decreases SSD throughput by 1.3% and 1.8%, respectively, and PSD-MLC decreases SSD throughput by 9.5% and 10.0%, respectively.
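The bookkeeping behind PSD can be pictured with the following simplified sketch (Python; security_trim is a hypothetical stand-in for the Security-TRIM command, and the real logic would live in the flash translation layer rather than application code):

```python
import uuid
from collections import defaultdict

def security_trim(pages):
    """Hypothetical stand-in for issuing Security-TRIM over a batch of
    (block, page) addresses; a real implementation lives in the FTL."""
    print(f"Security-TRIM over {len(pages)} invalid pages")

class PerFileSecureDeletion:
    """Bookkeeping sketch: map every file to a GUID so its invalid pages can
    be located and securely erased later, off the critical write path."""

    def __init__(self):
        self.valid = defaultdict(set)     # guid -> live (block, page) locations
        self.invalid = defaultdict(set)   # guid -> stale locations awaiting erase

    def create_file(self) -> str:
        return uuid.uuid4().hex           # globally unique file identifier

    def update(self, guid: str, old_pages, new_pages):
        # Out-of-place update: old pages become invalid but are NOT erased yet.
        self.valid[guid] -= set(old_pages)
        self.valid[guid] |= set(new_pages)
        self.invalid[guid] |= set(old_pages)

    def secure_delete_pass(self, guid: str):
        # Non-real-time pass: find all invalid pages of this file via its
        # GUID and erase them in one batch with Security-TRIM.
        if self.invalid[guid]:
            security_trim(self.invalid.pop(guid))
```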
Federated learning is a novel distributed learning framework in which a deep learning model is trained collaboratively among thousands of participants. Only model parameters are shared between the server and the participants, which prevents the server from directly accessing the private training data. However, we notice that the federated learning architecture is vulnerable to an active attack from insider participants, called a poisoning attack, where the attacker can act as a benign participant and upload poisoned updates to the server to affect the performance of the global model. In this work, we study and evaluate a poisoning attack on a federated learning system based on generative adversarial nets (GANs). The attacker first acts as a benign participant and stealthily trains a GAN to mimic prototypical samples from the other participants' training sets, which do not belong to the attacker. These generated samples, fully controlled by the attacker, are then used to craft poisoning updates, and the global model is compromised when the attacker uploads the scaled poisoning updates to the server. In our evaluation, we show that the attacker in our construction can successfully generate samples of other benign participants using the GAN, and the global model achieves more than 80% accuracy on both the poisoning tasks and the main tasks.
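The update-scaling step can be sketched as follows (plain Python over lists of NumPy weight arrays; the GAN training and label manipulation are omitted, and the boost factor is an assumption about how the scaling might be chosen, not the paper's exact rule):

```python
import numpy as np

def poisoned_update(global_weights, attacker_weights, boost: float):
    """Scale the attacker's local update before uploading it to the server.

    In federated averaging the server averages updates from many clients, so
    a single malicious update is diluted; multiplying the attacker's update
    by a boost factor (roughly the number of participants) lets the poisoned
    direction survive aggregation.  attacker_weights would come from training
    on mislabeled GAN-generated samples mimicking the victims' data.
    """
    return [gw + boost * (aw - gw)
            for gw, aw in zip(global_weights, attacker_weights)]

# Usage: upload = poisoned_update(w_global, w_attacker_local, boost=n_participants)
```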
Signal processing in the encrypted domain has become an important means of protecting privacy in an untrusted network environment. Due to the limitations of the underlying encryption methods, many sophisticated and useful algorithms are not well supported. Considering that QR decomposition is widely used in many fields, in this paper we propose to implement QR decomposition in the homomorphic encrypted domain. We first realize some necessary primitive operations in the homomorphic encrypted domain, including division and square root. The Gram-Schmidt process is then studied in the encrypted domain, and we implement QR decomposition in the encrypted domain by using this secure implementation of the Gram-Schmidt process. We conduct experiments to demonstrate the effectiveness and analyze the performance of the proposed outsourced QR decomposition.
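For reference, the plaintext Gram-Schmidt procedure that the encrypted-domain version would replay is sketched below (NumPy); each inner product, division, and square root would be replaced by its homomorphic counterpart among the paper's primitive operations:

```python
import numpy as np

def gram_schmidt_qr(A: np.ndarray):
    """Classical Gram-Schmidt QR decomposition on plaintext data.

    Every operation below (inner products, divisions, square roots) has a
    homomorphic counterpart, so the same control flow can be replayed over
    ciphertexts once those primitive operations are available.
    """
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].astype(float).copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # projection coefficient
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.sqrt(v @ v)          # needs the encrypted square root
        Q[:, j] = v / R[j, j]             # needs the encrypted division
    return Q, R

# Sanity check on plaintext: for a random A, A ≈ Q @ R and Q.T @ Q ≈ I.
```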