Biblio
We explore a new security model for secure computation on large datasets. We assume that two servers have been employed to compute on private data that was collected from many users, and, in order to improve the efficiency of their computation, we establish a new tradeoff with privacy. Specifically, instead of claiming that the servers learn nothing about the input values, we claim that what they do learn from the computation preserves the differential privacy of the input. Leveraging this relaxation of the security model allows us to build a protocol that leaks some information in the form of access patterns to memory, while also providing a formal bound on what is learned from the leakage. We then demonstrate that this leakage is useful in a broad class of computations. We show that computations such as histograms, PageRank and matrix factorization, which can be performed in common graph-parallel frameworks such as MapReduce or Pregel, benefit from our relaxation. We implement a protocol for securely executing graph-parallel computations and evaluate its performance on the three examples above, demonstrating marked improvement over prior implementations of these computations.
A distinguisher is employed by an adversary to probe the privacy property of a cryptographic primitive. If a cryptographic primitive is private, no distinguisher algorithm enables an adversary to distinguish the encodings it generates with non-negligible advantage. Recently, two privacy-preserving matrix transformations first proposed by Salinas et al. have been widely used to achieve matrix-related verifiable (outsourced) computation in data protection. Salinas et al. proved that these transformations are private (in terms of indistinguishability). In this paper, we first propose the concept of a linear distinguisher and two constructions of linear distinguisher algorithms. Then, we take those two matrix transformations (including Salinas et al.'s original work and Yu et al.'s modification) as example targets and analyze their privacy when our linear distinguisher algorithms are employed by adversaries. The results show that those transformations are not private, even against passive eavesdropping.
While the control of individuals over their personal data is increasingly seen as an essential component of their privacy, the word "control" is usually used in a very vague way, both by lawyers and by computer scientists. This lack of precision may lead to misunderstandings and makes it difficult to check compliance. To address this issue, we propose a formal framework based on capacities to specify the notion of control over personal data and to reason about control properties. We illustrate our framework with social network systems and show that it makes it possible to characterize the types of control over personal data that they provide to their users and to compare them in a rigorous way.
Blockchain-based cryptocurrencies secure a decentralized consensus protocol by incentives. The protocol participants, called miners, generate (mine) a series of blocks, each containing monetary transactions created by system users. As incentive for participation, miners receive newly minted currency and transaction fees paid by transaction creators. Blockchain bandwidth limits lead users to pay increasing fees in order to prioritize their transactions. However, most prior work focused on models where fees are negligible. In a notable exception, Carlsten et al. [17] postulated that if incentives come only from fees then a mining gap would form: miners would avoid mining when the available fees are insufficient. In this work, we analyze cryptocurrency security in realistic settings, taking into account all elements of expenses and rewards. To study when gaps form, we analyze the system as a game we call the gap game. We analyze the game with a combination of symbolic and numeric analysis tools in a wide range of scenarios. Our analysis confirms Carlsten et al.'s postulate; indeed, we show that gaps form well before fees are the only incentive, and analyze the implications on security. Perhaps surprisingly, we show that different miners choose different gap sizes to optimize their utility, even when their operating costs are identical. Alarmingly, we see that the system incentivizes large miner coalitions, reducing system decentralization. We describe the required conditions to avoid the incentive misalignment, providing guidelines for future cryptocurrency design.
Nowadays, many applications involve big data, and big data analysis methods appear in many fields. As a preliminary attempt to address the challenge of big data analysis, this paper presents a distributed online learning algorithm based on differential privacy. Since online learning can effectively process sensitive data, we introduce the concept of differential privacy into distributed online learning algorithms, with the aim of ensuring data privacy during online learning and preventing adversarial nodes from inferring any important data information. In particular, for different adversary models, we consider different types of graphs, tolerating either a limited number of adversaries near each regular node or a globally limited number of adversaries.
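The abstract does not give the exact mechanism, but the standard way to make an online gradient step differentially private is to clip the gradient and add calibrated noise before sharing it. The following is a minimal sketch of that idea under those assumptions (Laplace mechanism, L1 clipping); it is not the paper's algorithm.

```python
# Minimal sketch of a differentially private online-gradient step:
# clip the local gradient to bound its sensitivity, then add Laplace
# noise with scale sensitivity / epsilon before using or sharing it.
import numpy as np

def dp_online_step(w, grad, lr, epsilon, sensitivity, rng):
    """One private update via the Laplace mechanism."""
    noisy_grad = grad + rng.laplace(0.0, sensitivity / epsilon, size=grad.shape)
    return w - lr * noisy_grad

rng = np.random.default_rng(0)
w = np.zeros(5)
for t in range(1, 101):                              # simulated data stream
    x, y = rng.normal(size=5), rng.normal()
    grad = (w @ x - y) * x                           # squared-loss gradient
    grad = grad / max(1.0, np.linalg.norm(grad, 1))  # clip to L1 sensitivity 1
    w = dp_online_step(w, grad, lr=1.0 / np.sqrt(t), epsilon=0.5,
                       sensitivity=1.0, rng=rng)
```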
Wireless sensor networks consist of various sensors deployed to monitor the physical world, and many existing security schemes use traditional cryptography to protect message content and contextual information. Here, however, we are concerned with the location security of nodes. In this paper, we propose an anonymous routing strategy for preserving location privacy (ARPLP), which sets a proxy source node to hide the location of the real source node. The real source node randomly selects several neighbors as receivers until the packets are transmitted to the proxy source, and the proxy source is selected at random, so the adversary finds it difficult to obtain the location of the real source node. Meanwhile, our scheme sets a branch area around the sink, which can confuse the adversary by increasing the number of routing branches. Analysis and simulation experiments show that our scheme reduces traffic consumption and communication delay while improving the security of the source node and base station.
The prevalent use of mobile applications that exploit location information to improve the quality of their service has raised privacy issues, particularly regarding the extraction of users' points of interest. Many studies in the literature present algorithms that protect the users of such applications. However, these solutions often require a high level of expertise to be understood and tuned properly. In this paper, the first control-based approach to this problem is presented. The protection algorithm is considered as the "physical" plant and its parameters as control signals that make it possible to guarantee privacy despite the user's mobility pattern. The remainder of the paper presents the first control formulation of a POI-related privacy measure, as well as dynamic modeling and a simple yet efficient PI control strategy. An evaluation using simulated mobility records shows the relevance and efficiency of the presented approach.
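To make the control framing concrete, here is a minimal sketch of a PI loop in this spirit: a measured POI-privacy level is compared against a setpoint, and the controller adjusts the protection mechanism's obfuscation parameter. All names, gains, and the privacy measure are illustrative assumptions, not the paper's formulation.

```python
# Illustrative PI loop: the protection algorithm is the "plant",
# its obfuscation parameter is the control signal, and the measured
# POI-privacy level is the controlled output.
class PIController:
    def __init__(self, kp, ki, setpoint):
        self.kp, self.ki, self.setpoint = kp, ki, setpoint
        self.integral = 0.0

    def update(self, measured_privacy, dt):
        error = self.setpoint - measured_privacy
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral  # parameter adjustment

ctrl = PIController(kp=0.8, ki=0.2, setpoint=0.9)
obfuscation = 0.1
for step in range(20):
    measured = min(1.0, obfuscation)  # stand-in for a real POI-privacy measure
    obfuscation = max(0.0, obfuscation + ctrl.update(measured, dt=1.0))
```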
A 2D compressive sensing and hyper-chaos based image compression-encryption algorithm is proposed. The 2D image is compressively sampled and encrypted using two measurement matrices. A chaos-based measurement matrix construction is employed: the construction is controlled by the initial and control parameters of the chaotic system, which serve as the secret key for encryption. The linear measurements of the sparse coefficients of the image are then subjected to a hyper-chaos based diffusion, which yields the cipher image. Numerical simulation and security analysis are performed to verify the validity and reliability of the proposed algorithm.
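A common construction of a key-controlled chaotic measurement matrix (not necessarily the authors' exact one) iterates a chaotic map from secret initial conditions and reshapes the sequence into the matrix. A hedged sketch using the logistic map:

```python
# Chaos-based measurement matrix: a logistic map seeded by the secret
# key (x0, mu) generates a deterministic pseudo-random sequence that is
# reshaped and normalized into the CS measurement matrix.
import numpy as np

def chaotic_measurement_matrix(m, n, x0=0.347, mu=3.99, burn_in=1000):
    x, seq = x0, np.empty(m * n)
    for _ in range(burn_in):           # discard the transient
        x = mu * x * (1 - x)
    for i in range(m * n):
        x = mu * x * (1 - x)
        seq[i] = x
    phi = (2 * seq - 1).reshape(m, n)  # map (0, 1) -> (-1, 1)
    return phi / np.sqrt(m)            # energy normalization

Phi = chaotic_measurement_matrix(m=64, n=256)  # the key (x0, mu) fixes Phi
```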
Compressed sensing (CS) integrates sampling and compression into a single step to reduce the amount of processed data. However, CS reconstruction generally suffers from high complexity. To solve this problem, compressive signal processing (CSP) has recently been proposed to implement some signal processing tasks directly in the compressive domain without reconstruction. Among various CSP techniques, compressive detection achieves signal detection based on the CS measurements. This paper investigates the compressive detection problem for random signals when the measurements are corrupted. Unlike current studies, which only consider dense noise, our study considers both dense noise and sparse error. The theoretical performance is derived, and simulations are provided to verify the derived theoretical results.
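For orientation, a toy illustration of compressive detection (not the paper's detector): an energy statistic is computed directly on the measurements and compared to a threshold. This sketch assumes dense noise only; the paper's contribution of handling an additional sparse error term is omitted.

```python
# Toy compressive energy detector: decide H1 (signal present) when the
# measurement energy ||y||^2 exceeds a threshold, with y = Phi @ (s + n).
import numpy as np

rng = np.random.default_rng(1)
n, m = 256, 64
Phi = rng.normal(0, 1 / np.sqrt(m), (m, n))  # random measurement matrix

def detect(y, threshold):
    return np.sum(y ** 2) > threshold        # energy statistic vs. threshold

noise = rng.normal(0, 0.1, n)
signal = rng.normal(0, 1.0, n)
print(detect(Phi @ noise, threshold=m * 0.05))             # H0: usually False
print(detect(Phi @ (signal + noise), threshold=m * 0.05))  # H1: True
```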
The security and confidentiality of data can be guaranteed by using a technique called watermarking. In this study, compressive sampling is designed and analyzed for video watermarking. Before the compression process, the watermark is encoded with a Bose-Chaudhuri-Hocquenghem (BCH) code. After that, the watermark is processed using the Discrete Sine Transform (DST) and Discrete Wavelet Transform (DWT). The watermark is inserted into the host video using the Stationary Wavelet Transform (SWT) and Singular Value Decomposition (SVD) methods. Our system achieves a PSNR of 47.269 dB, an MSE of 1.712, and a BER of 0.080, and it is resistant to Gaussian blur and rescaling noise attacks.
Click-through rate prediction is an essential task in industrial applications such as online advertising. Recently, deep learning based models have been proposed that follow a similar Embedding&MLP paradigm: large-scale sparse input features are first mapped into low-dimensional embedding vectors, then transformed into fixed-length vectors in a group-wise manner, and finally concatenated and fed into a multilayer perceptron (MLP) to learn the nonlinear relations among features. In this way, user features are compressed into a fixed-length representation vector, regardless of the candidate ads. This fixed-length vector becomes a bottleneck that makes it difficult for Embedding&MLP methods to capture users' diverse interests from rich historical behaviors. In this paper, we propose a novel model, the Deep Interest Network (DIN), which tackles this challenge by designing a local activation unit that adaptively learns the representation of user interests from historical behaviors with respect to a certain ad. This representation vector varies across ads, greatly improving the expressive ability of the model. Besides, we develop two techniques, mini-batch aware regularization and a data adaptive activation function, which help train industrial deep networks with hundreds of millions of parameters. Experiments on two public datasets as well as an Alibaba production dataset with over 2 billion samples demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with state-of-the-art methods. DIN has been successfully deployed in the online display advertising system at Alibaba, serving the main traffic.
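A sketch of the local activation unit in PyTorch, to make the mechanism concrete: each historical behavior embedding is scored against the candidate-ad embedding, and the scores weight a sum over behaviors, so the resulting user representation varies with the ad. The hidden sizes and the exact score features are illustrative, not the paper's configuration.

```python
# Local activation unit (DIN-style): ad-dependent pooling of behaviors.
import torch
import torch.nn as nn

class LocalActivationUnit(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # score each behavior from [behavior, ad, behavior * ad]
        self.mlp = nn.Sequential(nn.Linear(3 * dim, 36), nn.PReLU(),
                                 nn.Linear(36, 1))

    def forward(self, behaviors, ad):
        # behaviors: (B, T, d) history embeddings; ad: (B, d) candidate
        ad_t = ad.unsqueeze(1).expand_as(behaviors)
        scores = self.mlp(torch.cat([behaviors, ad_t, behaviors * ad_t], -1))
        return (scores * behaviors).sum(dim=1)  # (B, d), varies with the ad

unit = LocalActivationUnit(dim=8)
user_vec = unit(torch.randn(4, 10, 8), torch.randn(4, 8))
```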
We introduce a new sub-linear space sketch—the Weight-Median Sketch—for learning compressed linear classifiers over data streams while supporting the efficient recovery of large-magnitude weights in the model. This enables memory-limited execution of several statistical analyses over streams, including online feature selection, streaming data explanation, relative deltoid detection, and streaming estimation of pointwise mutual information. Unlike related sketches that capture the most frequently-occurring features (or items) in a data stream, the Weight-Median Sketch captures the features that are most discriminative of one stream (or class) compared to another. The Weight-Median Sketch adopts the core data structure used in the Count-Sketch, but, instead of sketching counts, it captures sketched gradient updates to the model parameters. We provide a theoretical analysis that establishes recovery guarantees for batch and online learning, and demonstrate empirical improvements in memory-accuracy trade-offs over alternative memory-budgeted methods, including count-based sketches and feature hashing.
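The core data-structure idea can be sketched compactly: keep a Count-Sketch table, but accumulate sketched SGD updates instead of counts, and read a weight back as a median over rows. The hash functions below are simple illustrative choices, not the paper's.

```python
# Weight-Median-Sketch-style structure: Count-Sketch over gradient updates.
import numpy as np

class WeightMedianSketch:
    def __init__(self, depth, width, seed=0):
        rng = np.random.default_rng(seed)
        self.table = np.zeros((depth, width))
        self.salts = rng.integers(1, 2**31, size=depth)
        self.width = width

    def _bucket_sign(self, r, j):
        h = hash((int(self.salts[r]), j))        # per-row hash of feature j
        return h % self.width, 1 if (h >> 16) & 1 else -1

    def update(self, j, grad_j, lr):
        for r in range(self.table.shape[0]):
            b, s = self._bucket_sign(r, j)
            self.table[r, b] -= lr * s * grad_j  # sketched SGD step

    def weight(self, j):
        ests = []
        for r in range(self.table.shape[0]):
            b, s = self._bucket_sign(r, j)
            ests.append(s * self.table[r, b])
        return float(np.median(ests))            # median kills collision noise

sk = WeightMedianSketch(depth=5, width=128)
sk.update(j=42, grad_j=-0.7, lr=0.1)
print(sk.weight(42))  # approximately +0.07
```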
Differences between sensor devices and camera position offsets lead to geometric differences between the images to be matched. The traditional SIFT image matching algorithm produces a large number of incorrect matching point pairs, so its matching accuracy is low. To solve this problem, a SIFT image matching method based on the Maximum Likelihood Estimation Sample Consensus (MLESAC) algorithm is proposed. Compared with traditional SIFT feature matching, SURF feature matching and RANSAC feature matching, the proposed algorithm can effectively remove false matching feature point pairs during image matching. Experimental results show that the proposed algorithm achieves higher matching accuracy and faster matching.
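A hedged sketch of this kind of pipeline with OpenCV: SIFT features, ratio-test matching, then a robust model fit to reject false matches. OpenCV does not expose MLESAC by name, so cv2.RANSAC is used here as a stand-in for the robust estimator; the structure is the same.

```python
# SIFT matching with robust outlier rejection (RANSAC as MLESAC stand-in).
import cv2
import numpy as np

def robust_sift_match(img1, img2):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2)
            if m.distance < 0.75 * n.distance]   # Lowe ratio test
    src = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return [g for g, keep in zip(good, mask.ravel()) if keep]  # inliers only
```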
With the growth of mobile equipment and transmission data, the Common Public Radio Interface (CPRI) between the Building Baseband Unit (BBU) and the Remote Radio Unit (RRU) must carry ever-increasing amounts of data. Compressing the data on the CPRI is essential if more data is to be transferred without congestion while restricting fiber consumption. A data compression scheme based on the Discrete Sine Transform (DST) and Lloyd-Max quantization is proposed for the distributed Base Station (BS) architecture. The time-domain samples are transformed by the DST according to the characteristics of Orthogonal Frequency Division Multiplexing (OFDM) baseband signals, and the resulting coefficients are quantized by a Lloyd-Max quantizer. Simulation results show that the proposed scheme can operate at various Compression Ratios (CRs) while keeping the Error Vector Magnitude (EVM) within the limits set by 3GPP.
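A minimal sketch of that compression chain under stated assumptions: DST on a block of time-domain samples, followed by scalar quantization with a Lloyd-trained codebook. Block size and bit depth are illustrative, and the Lloyd iteration below is the simple centroid form rather than an optimized Lloyd-Max design.

```python
# DST + Lloyd quantization sketch for a block of baseband samples.
import numpy as np
from scipy.fft import dst, idst

def lloyd(samples, levels, iters=50):
    codebook = np.linspace(samples.min(), samples.max(), levels)
    for _ in range(iters):
        idx = np.abs(samples[:, None] - codebook[None, :]).argmin(axis=1)
        for k in range(levels):
            if np.any(idx == k):
                codebook[k] = samples[idx == k].mean()  # centroid step
    return codebook

rng = np.random.default_rng(0)
block = rng.normal(size=1024)                 # stand-in for I/Q samples
coeffs = dst(block, type=2, norm="ortho")     # DST of the time-domain block
cb = lloyd(coeffs, levels=16)                 # 4-bit scalar quantizer
quantized = cb[np.abs(coeffs[:, None] - cb[None, :]).argmin(axis=1)]
recovered = idst(quantized, type=2, norm="ortho")
evm = np.linalg.norm(recovered - block) / np.linalg.norm(block)
```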
Compressed sensing (CS) can recover a signal that is sparse in some representation from samples taken at a rate far below the Nyquist rate. However, limited by the accuracy of atom matching in traditional reconstruction algorithms, CS struggles to reconstruct the original signal at high resolution. Meanwhile, researchers have found that trained neural networks have a strong ability to solve such inverse problems. We therefore propose a Super-Resolution Convolutional Neural Network (SRCNN) consisting of three convolutional layers, each with a fixed number of kernels and its own specific function. A classical compressed sensing algorithm first processes the input image; the output images are then refined by the SRCNN. Using the proposed SRCNN, we obtain higher-resolution images. Simulation results show that the proposed method improves PSNR and visual quality.
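A sketch of the three-layer structure described above, in PyTorch. The kernel sizes (9-1-5) and channel counts follow the original SRCNN design; the abstract does not give the exact values used here.

```python
# Three-layer SRCNN: patch extraction, nonlinear mapping, reconstruction.
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, padding=4), nn.ReLU(),  # patch extraction
            nn.Conv2d(64, 32, kernel_size=1), nn.ReLU(),            # nonlinear mapping
            nn.Conv2d(32, 1, kernel_size=5, padding=2),             # reconstruction
        )

    def forward(self, x):
        return self.net(x)

model = SRCNN()
out = model(torch.randn(1, 1, 33, 33))  # CS-reconstructed patch in, refined patch out
```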
This paper presents a novel Kriged Compressive Sensing (KCS) approach for the reconstruction of underwater acoustic intensity fields sampled by multiple gliders following sawtooth sampling patterns. Blank areas in between the sampling trajectories may cause unsatisfying reconstruction results. The KCS method leverages spatial statistical correlation properties of the acoustic intensity field being sampled to improve the compressive reconstruction process. Virtual data samples generated from a kriging method are inserted into the blank areas. We show that by using the virtual samples along with real samples, the acoustic intensity field can be reconstructed with higher accuracy when coherent spatial patterns exist. Corresponding algorithms are developed for both unweighted and weighted KCS methods. By distinguishing the virtual samples from real samples through weighting, the reconstruction results can be further improved. Simulation results show that both algorithms can improve the reconstruction results according to the PSNR and SSIM metrics. The methods are applied to process the ocean ambient noise data collected by the Sea-Wing acoustic gliders in the South China Sea.
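A hedged sketch of the KCS idea: fill the blank regions between sampling tracks with virtual samples from a spatial interpolant (scipy's RBFInterpolator serves as a kriging stand-in here), then hand real and down-weighted virtual samples to any CS solver. The kernel, field, and weights are all illustrative assumptions.

```python
# Virtual-sample generation and weighted sample assembly for KCS-style recovery.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
real_xy = rng.uniform(0, 1, (200, 2))                 # glider track samples
real_val = np.sin(4 * real_xy[:, 0]) * np.cos(3 * real_xy[:, 1])  # toy field

interp = RBFInterpolator(real_xy, real_val, kernel="gaussian", epsilon=3.0)
virt_xy = rng.uniform(0, 1, (100, 2))                 # points in blank areas
virt_val = interp(virt_xy)                            # virtual samples

xy = np.vstack([real_xy, virt_xy])
vals = np.concatenate([real_val, virt_val])
weights = np.concatenate([np.ones(200), 0.3 * np.ones(100)])  # trust real more
# (xy, vals, weights) now define a weighted sampling system for CS reconstruction.
```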
Traditional image compressed sensing (CS) coding frameworks solve an inverse problem based on measurement coding tools (prediction, quantization, entropy coding, etc.) and optimization-based image reconstruction. These frameworks face the challenge of improving coding efficiency at the encoder while suffering from high computational complexity at the decoder. In this paper, we move a step forward and propose a novel deep network based CS coding framework for natural images, which consists of three sub-networks: a sampling sub-network, an offset sub-network and a reconstruction sub-network, responsible for sampling, quantization and reconstruction, respectively. By cooperatively utilizing these sub-networks, the framework can be trained end-to-end with a proposed rate-distortion optimization loss function. The proposed framework not only improves coding performance but also dramatically reduces the computational cost of image reconstruction. Experimental results on benchmark datasets demonstrate that the proposed method achieves superior rate-distortion performance against state-of-the-art methods.
We describe a new way of compressing two-party communication protocols to get protocols with potentially smaller communication. We show that every communication protocol that communicates C bits and reveals I bits of information about the participants' private inputs to an observer who watches the communication can be simulated by a new protocol that communicates at most poly(I) $\cdot$ loglog(C) bits. Our result is tight up to polynomial factors, as it matches the recent work separating communication complexity from external information cost.
The integrity and reliability of speech data are important for probative use. Watermarking technology supplies an alternative to digital signatures for guaranteeing the authenticity of data. This work proposes a novel digital watermarking scheme based on a reversible compression algorithm with sample scanning to detect tampering in the time domain. To detect tampering precisely, the speech data is divided into fixed-length frames, and content-based hash information for each frame is calculated and embedded into the speech data for verification. A Huffman compression algorithm is applied to the four least significant bits of each sample after pulse-code modulation, achieving low distortion and high capacity for the hidden payload. Experiments on audio quality, detection precision and robustness against attacks show that tampering is detected with an error of around 0.032 s for a 10 s speech clip. Distortion is imperceptible, with an average signal-to-noise ratio of 22.068 dB for the Huffman-based method and 24.139 dB for the intDCT-based method, and average MOS scores of 3.478 and 4.378, respectively. The bit error rate (BER) between the stego data and the attacked stego data, in both the time and frequency domains, is approximately 28.6% on average, which indicates the robustness of the proposed hiding method.
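A simplified sketch of the frame-wise verification idea: hash each fixed-length frame's content and embed a truncated hash tag into low-order bits of the same frame. The paper's scheme additionally Huffman-compresses the low bits to make room reversibly; that step, and the frame/tag sizes below, are assumptions of this sketch.

```python
# Frame-wise tamper detection: embed a per-frame hash tag in the LSB plane.
import hashlib
import numpy as np

FRAME, TAG_BITS = 1600, 32  # illustrative: 0.1 s frames at 16 kHz, 32-bit tag

def embed_tags(samples):
    out = samples.copy()
    for s in range(0, len(out) - FRAME + 1, FRAME):
        frame = out[s:s + FRAME]                      # view into `out`
        tag = hashlib.sha256((frame >> 1).tobytes()).digest()  # hash sans LSBs
        bits = np.unpackbits(np.frombuffer(tag, np.uint8))[:TAG_BITS]
        frame[:TAG_BITS] = (frame[:TAG_BITS] & ~1) | bits      # write LSB tag
    return out

def verify(samples):
    ok = []
    for s in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[s:s + FRAME]
        tag = hashlib.sha256((frame >> 1).tobytes()).digest()
        bits = np.unpackbits(np.frombuffer(tag, np.uint8))[:TAG_BITS]
        ok.append(bool(np.all((frame[:TAG_BITS] & 1) == bits)))
    return ok  # False marks a tampered frame

audio = np.random.default_rng(0).integers(0, 2**15, 8000, dtype=np.int16)
print(verify(embed_tags(audio)))  # all True for untampered audio
```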
The term "Cyber-Physical System" (CPS) has been used in recent years to describe a type of system that makes use of powerful communication networks to functionally combine systems previously thought of as independent. The common theme of CPSs is that, through communication, they can make decisions together and achieve common goals. Yet, in contrast to traditional system types such as embedded systems, the functional dependence between CPSs can change dynamically at runtime. Hence, their functional dependence may cause unforeseen runtime behavior, e.g., when a CPS becomes unavailable but others depend on its correct operation. During development of any individual CPS, this runtime behavior must therefore be predicted, and the system must be developed with an appropriate level of robustness. Since research at present is mainly concerned with the impact of functional dependence in CPSs on development, the impact on runtime behavior is mere conjecture. In this paper, we present AirborneCPS, a simulation tool for functionally dependent CPSs that simulates runtime behavior and aids in the identification of undesired functional interaction.
Converged Multi-Level Secure systems allow users to interact with and freely move between applications and data of varying sensitivity on a single user interface. They promise unprecedented usability and security, especially in security-critical environments like Defence. Yet these promises rely on hard assumptions about secure user behaviour. We present initial work to test the validity of these assumptions in the absence of deception by an adversary. We conducted a user study with 21 participants on the Cross Domain Desktop Compositor. Chief amongst our findings is that the vast majority of participants (19 of 21) behave securely, even when doing so requires more effort than to behave insecurely. Our findings suggest that there is large scope for further research on converged Multi-Level Secure systems, and highlight the value of user studies to complement formal security analyses of critical systems.
Assurance is a demonstration that a complex system (such as a car or a communication network) possesses an important property, such as safety or security, with a high level of confidence. In contrast to currently dominant approaches to building assurance cases, which are focused on goal structuring and/or logical inference, we propose considering assurance as a model transformation (MT) enterprise: saying that a system possesses an assured property amounts to saying that a particular assurance view of the system, comprising the assurance data, satisfies acceptance criteria posed as assurance constraints. While the MT realizing this view is very complex, we show that it can be decomposed into elementary MTs via a hierarchy of refinement steps. The transformations at the bottom level are ordinary MTs that can be executed for data specifying the system, thus providing the assurance data to be checked against the assurance constraints. In this way, assurance amounts to traversing the hierarchy from top to bottom and assuring the correctness of each MT on the path. Our approach has a precise mathematical foundation (rooted in process algebra and category theory), a necessity if we are to model precisely and then analyze our assurance cases. We discuss the practical applicability of the approach and argue that it has several advantages over existing approaches.
Contemporary enterprise systems focus primarily on performance and development/maintenance costs. Dealing with cyber-threats and system compromise is relegated to good coding (i.e., defensive programming) and a secure environment (e.g., patched OS, firewalls, etc.). This approach, while a necessary start, is not sufficient: such security relies on no missteps, while compromise needs only a single flaw. Consequently, we must design for compromise and mitigate its impact. One approach is to utilize fine-grained modularization and isolation, in which decomposition ensures that compromise of a single module presents limited and known risk of data/resource theft and denial. We propose mechanisms for automating such modular composition and consider its system performance impact.
Complex software is built by composing components implementing largely independent blocks of functionality. However, once the sources are compiled into an executable, that modularity is lost. This is unfortunate for code recipients, for whom knowing the components has many potential benefits, such as improved program understanding for reverse-engineering, identifying shared code across different programs, binary code reuse, and authorship attribution. A novel approach for decomposing such source-free program executables into components is proposed here. Given an executable, the approach first statically builds a decomposition graph, where nodes are functions and edges capture three types of relationships: code locality, data references, and function calls. It then applies a graph-theoretic approach to partition the functions into disjoint components. A prototype implementation, BCD, demonstrates the approach's efficacy: evaluating BCD on 25 C++ binary programs to recover the methods belonging to each class achieves high precision and recall.
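A sketch of the decomposition-graph idea using networkx: nodes are functions, weighted edges encode the three relationship types, and a community-detection pass (a stand-in for BCD's own partitioning step) yields candidate components. The function names and weights are hypothetical.

```python
# Decomposition graph over functions, partitioned into disjoint components.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.Graph()
# Hypothetical evidence edges; weights reflect relationship strength.
G.add_edge("String::append", "String::size", weight=3)  # code locality
G.add_edge("String::append", "String::data", weight=2)  # data reference
G.add_edge("Parser::run", "String::append", weight=1)   # function call
G.add_edge("Parser::run", "Parser::peek", weight=3)
G.add_edge("Parser::peek", "Parser::token", weight=2)

components = greedy_modularity_communities(G, weight="weight")
for i, comp in enumerate(components):
    print(f"component {i}: {sorted(comp)}")  # recovered function groupings
```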
We present improved protocols for the conversion of secret-shared bit-vectors into secret-shared integers and vice versa, for use as subroutines in secure multiparty computation (SMC) protocols and in protocols verifying the adherence of parties to prescribed SMC protocols. The protocols are primarily designed for three-party computation with an honest majority. We evaluate our protocols as part of the Sharemind three-party protocol set and see a general reduction of verification overheads, thereby increasing the practicality of covertly or actively secure Sharemind protocols.
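For intuition, a toy illustration of the easy half of bit-vector-to-integer conversion: once each bit is additively shared over the large ring Z_{2^k}, composing the integer is a purely local linear combination, x = sum_i 2^i * b_i. The hard, interactive part that protocols like these optimize, converting XOR shares of each bit into ring shares, is not shown here.

```python
# Local recombination of ring-shared bits into a shared integer (3 parties).
import random

K = 32
MOD = 2 ** K

def share(value, parties=3):
    """Additively share a value over Z_{2^K}."""
    shares = [random.randrange(MOD) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

bits = [1, 0, 1, 1]                            # x = 0b1101 = 13
bit_shares = [share(b) for b in bits]          # each bit shared over the ring

# Each party locally combines its bit shares with powers of two.
party_shares = [sum((1 << i) * bit_shares[i][p] for i in range(len(bits))) % MOD
                for p in range(3)]
print(sum(party_shares) % MOD)                 # reconstructs 13
```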