Biblio
In cloud computing, computationally weak users are always willing to outsource costly computations to a cloud, and at the same time they need to check the correctness of the result provided by the cloud. Such activities motivate the occurrence of verifiable computation (VC). Recently, Parno, Raykova and Vaikuntanathan showed any VC protocol can be constructed from an attribute-based encryption (ABE) scheme for a same class of functions. In this paper, we propose two practical and efficient semi-adaptively secure key-policy attribute-based encryption (KP-ABE) schemes with constant-size ciphertexts. The semi-adaptive security requires that the adversary designates the challenge attribute set after it receives public parameters but before it issues any secret key query, which is stronger than selective security guarantee. Our first construction deals with small universe while the second one supports large universe. Both constructions employ the technique underlying the prime-order instantiation of nested dual system groups, which are based on the \$d\$-linear assumption including SXDH and DLIN assumptions. In order to evaluate the performance, we implement our ABE schemes using \$\textbackslashtextsf\Python\\$ language in Charm. Compared with previous KP-ABE schemes with constant-size ciphertexts, our constructions achieve shorter ciphertext and secret key sizes, and require low computation costs, especially under the SXDH assumption.
We study a sensor network setting in which samples are encrypted individually using different keys and maintained on a cloud storage. For large systems, e.g. those that generate several millions of samples per day, fine-grained sharing of encrypted samples is challenging. Existing solutions, such as Attribute-Based Encryption (ABE) and Key Aggregation Cryptosystem (KAC), can be utilized to address the challenge, but only to a certain extent. They are often computationally expensive and thus unlikely to operate at scale. We propose an algorithmic enhancement and two heuristics to improve KAC's key reconstruction cost, while preserving its provable security. The improvement is particularly significant for range and down-sampling queries – accelerating the reconstruction cost from quadratic to linear running time. Experimental study shows that for queries of size 32k samples, the proposed fast reconstruction techniques speed-up the original KAC by at least 90 times on range and down-sampling queries, and by eight times on general (arbitrary) queries. It also shows that at the expense of splitting the query into 16 sub-queries and correspondingly issuing that number of different aggregated keys, reconstruction time can be reduced by 19 times. As such, the proposed techniques make KAC more applicable in practical scenarios such as sensor networks or the Internet of Things.
We study a sensor network setting in which samples are encrypted individually using different keys and maintained on a cloud storage. For large systems, e.g. those that generate several millions of samples per day, fine-grained sharing of encrypted samples is challenging. Existing solutions, such as Attribute-Based Encryption (ABE) and Key Aggregation Cryptosystem (KAC), can be utilized to address the challenge, but only to a certain extent. They are often computationally expensive and thus unlikely to operate at scale. We propose an algorithmic enhancement and two heuristics to improve KAC's key reconstruction cost, while preserving its provable security. The improvement is particularly significant for range and down-sampling queries – accelerating the reconstruction cost from quadratic to linear running time. Experimental study shows that for queries of size 32k samples, the proposed fast reconstruction techniques speed-up the original KAC by at least 90 times on range and down-sampling queries, and by eight times on general (arbitrary) queries. It also shows that at the expense of splitting the query into 16 sub-queries and correspondingly issuing that number of different aggregated keys, reconstruction time can be reduced by 19 times. As such, the proposed techniques make KAC more applicable in practical scenarios such as sensor networks or the Internet of Things.
As everyone knows vulnerability detection is a very difficult and time consuming work, so taking advantage of the unlabeled data sufficiently is needed and helpful. According the above reality, in this paper a method is proposed to predict buffer overflow based on semi-supervised learning. We first employ Antlr to extract AST from C/C++ source files, then according to the 22 buffer overflow attributes taxonomies, a 22-dimension vector is extracted from every function in AST, at last, the vector is leveraged to train a classifier to predict buffer overflow vulnerabilities. The experiment and evaluation indicate our method is correct and efficient.
Malware evolves perpetually and relies on increasingly so- phisticated attacks to supersede defense strategies. Data-driven approaches to malware detection run the risk of becoming rapidly antiquated. Keeping pace with malware requires models that are periodically enriched with fresh knowledge, commonly known as retraining. In this work, we propose the use of Venn-Abers predictors for assessing the quality of binary classification tasks as a first step towards identifying antiquated models. One of the key benefits behind the use of Venn-Abers predictors is that they are automatically well calibrated and offer probabilistic guidance on the identification of nonstationary populations of malware. Our framework is agnostic to the underlying classification algorithm and can then be used for building better retraining strategies in the presence of concept drift. Results obtained over a timeline-based evaluation with about 90K samples show that our framework can identify when models tend to become obsolete.
New hardware primitives such as Intel SGX secure a user-level process in presence of an untrusted or compromised OS. Such "enclaved execution" systems are vulnerable to several side-channels, one of which is the page fault channel. In this paper, we show that the page fault side-channel has sufficient channel capacity to extract bits of encryption keys from commodity implementations of cryptographic routines in OpenSSL and Libgcrypt – leaking 27% on average and up to 100% of the secret bits in many case-studies. To mitigate this, we propose a software-only defense that masks page fault patterns by determinising the program's memory access behavior. We show that such a technique can be built into a compiler, and implement it for a subset of C which is sufficient to handle the cryptographic routines we study. This defense when implemented generically can have significant overhead of up to 4000X, but with help of developer-assisted compiler optimizations, the overhead reduces to at most 29.22% in our case studies. Finally, we discuss scope for hardware-assisted defenses, and show one solution that can reduce overheads to 6.77% with support from hardware changes.
Vehicular users are expected to consume large amounts of data, for both entertainment and navigation purposes. This will put a strain on cellular networks, which will be able to cope with such a load only if proper caching is in place; this in turn begs the question of which caching architecture is the best-suited to deal with vehicular content consumption. In this paper, we leverage a large-scale, crowd-sourced trace to (i) characterize the vehicular traffic demand, in terms of overall magnitude and content breakup; (ii) assess how different caching approaches perform against such a real-world load; (iii) study the effect of recommendation systems and local content items. We define a price-of-fog metric, expressing the additional caching capacity to deploy when moving from traditional, centralized caching architectures to a "fog computing" approach, where caches are closer to the network edge. We find that for location-specific items, such as the ones that vehicular users are most likely to request, such a price almost disappears. Vehicular networks thus make a strong case for the adoption of mobile-edge caching, as we are able to reap the benefit thereof – including a reduction in the distance travelled by data, within the core network – with little or none of the associated disadvantages.
Unease over data privacy will retard consumer acceptance of IoT deployments. The primary source of discomfort is a lack of user control over raw data that is streamed directly from sensors to the cloud. This is a direct consequence of the over-centralization of today's cloud-based IoT hub designs. We propose a solution that interposes a locally-controlled software component called a privacy mediator on every raw sensor stream. Each mediator is in the same administrative domain as the sensors whose data is being collected, and dynamically enforces the current privacy policies of the owners of the sensors or mobile users within the domain. This solution necessitates a logical point of presence for mediators within the administrative boundaries of each organization. Such points of presence are provided by cloudlets, which are small locally-administered data centers at the edge of the Internet that can support code mobility. The use of cloudlet-based mediators aligns well with natural personal and organizational boundaries of trust and responsibility.
The wide presence of large graph data and the increasing popularity of storing data in the cloud drive the needs for graph query processing on a remote cloud. But a fundamental challenge is to process user queries without compromising sensitive information. This work focuses on privacy preserving subgraph matching in a cloud server. The goal is to minimize the overhead on both cloud and client sides for subgraph matching, without compromising users' sensitive information. To that end, we transform an original graph \$G\$ into a privacy preserving graph Gk, which meets the requirement of an existing privacy model known as k-automorphism. By making use of the symmetry in a k-automorphic graph, a subgraph matching query can be efficiently answered using a graph Go, a small subset of Gk. This approach saves both space and query cost in the cloud server. We also anonymize the query graphs to protect their label information using label generalization technique. To reduce the search space for a subgraph matching query, we propose a cost model to select the more effective label combinations. The effectiveness and efficiency of our method are demonstrated through extensive experimental results on real datasets.
The lifelogging activity enables a user, the lifelogger, to passively capture multimodal records from a first-person perspective and ultimately create a visual diary encompassing every possible aspect of her life with unprecedented details. In recent years it has gained popularity among different groups of users. However, the possibility of ubiquitous presence of lifelogging devices especially in private spheres has raised serious concerns with respect to personal privacy. Different practitioners and active researchers in the field of lifelogging have analysed the issue of privacy in lifelogging and proposed different mitigation strategies. However, none of the existing works has considered a well-defined privacy threat model in the domain of lifelogging. Without a proper threat model, any analysis and discussion of privacy threats in lifelogging remains incomplete. In this paper we aim to fill in this gap by introducing a first-ever privacy threat model identifying several threats with respect to lifelogging. We believe that the introduced threat model will be an essential tool and will act as the basis for any further research within this domain.
Biometric verification systems have security issues regarding the storage of biometric data in that a user's biometric features cannot be changed into other ones even when a system is compromised. To address this issue, it may be safe to store the biometrics data on a reliable remote server instead of storing them in a local device. However, this approach may raise a privacy issue. In this paper, we propose a biometric verification system where the biometric data are stored in a remote server in an encrypted form and the similarity of the user input to the registered biometric data is computed in an encrypted domain using a homomorphic encryption. We evaluated the performance of the proposed system through an implementation on an Android-based smartphone and an i7-based server.
Vehicle trajectories are one of the most important data in location-based services. The quality of trajectories directly affects the services. However, in the real applications, trajectory data are not always sampled densely. In this paper, we study the problem of recovering the entire route between two distant consecutive locations in a trajectory. Most existing works solve the problem without using those informative historical data or solve it in an empirical way. We claim that a data-driven and probabilistic approach is actually more suitable as long as data sparsity can be well handled. We propose a novel route recovery system in a fully probabilistic way which incorporates both temporal and spatial dynamics and addresses all the data sparsity problem introduced by the probabilistic method. It outperforms the existing works with a high accuracy (over 80%) and shows a strong robustness even when the length of routes to be recovered is very long (about 30 road segments) or the data is very sparse.
Blind signature can be deployed to preserve user anonymity and is widely used in digital cash and e-voting. As an interactive protocol, blind signature schemes require high efficiency. In this paper, we propose a code-based blind signature scheme with high efficiency as it can produce a valid signature without many loops unlike existing code-based signature schemes. We then prove the security of our scheme in the random oracle model and analyze the efficiency of our scheme. Since a code-based signature scheme is post-quantum cryptography, therefore, the scheme is also able to resist quantum attacks.
Blind signature can be deployed to preserve user anonymity and is widely used in digital cash and e-voting. As an interactive protocol, blind signature schemes require high efficiency. In this paper, we propose a code-based blind signature scheme with high efficiency as it can produce a valid signature without many loops unlike existing code-based signature schemes. We then prove the security of our scheme in the random oracle model and analyze the efficiency of our scheme. Since a code-based signature scheme is post-quantum cryptography, therefore, the scheme is also able to resist quantum attacks.
Web applications are a frequent target of successful attacks. In most web frameworks, the damage is amplified by the fact that application code is responsible for security enforcement. In this paper, we design and evaluate Radiatus, a shared-nothing web framework where application-specific computation and storage on the server is contained within a sandbox with the privileges of the end-user. By strongly isolating users, user data and service availability can be protected from application vulnerabilities. To make Radiatus practical at the scale of modern web applications, we introduce a distributed capabilities system to allow fine-grained secure resource sharing across the many distributed services that compose an application. We analyze the strengths and weaknesses of a shared-nothing web architecture, which protects applications from a large class of vulnerabilities, but adds an overhead of 60.7% per server and requires an additional 31MB of memory per active user. We demonstrate that the system can scale to 20K operations per second on a 500-node AWS cluster.
Modern world has witnessed a dramatic increase in our ability to collect, transmit and distribute real-time monitoring and surveillance data from large-scale information systems and cyber-physical systems. Detecting system anomalies thus attracts significant amount of interest in many fields such as security, fault management, and industrial optimization. Recently, invariant network has shown to be a powerful way in characterizing complex system behaviours. In the invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components. Structures and evolutions of the invariance network, in particular the vanishing correlations, can shed important light on locating causal anomalies and performing diagnosis. However, existing approaches to detect causal anomalies with the invariant network often use the percentage of vanishing correlations to rank possible casual components, which have several limitations: 1) fault propagation in the network is ignored; 2) the root casual anomalies may not always be the nodes with a high-percentage of vanishing correlations; 3) temporal patterns of vanishing correlations are not exploited for robust detection. To address these limitations, in this paper we propose a network diffusion based framework to identify significant causal anomalies and rank them. Our approach can effectively model fault propagation over the entire invariant network, and can perform joint inference on both the structural, and the time-evolving broken invariance patterns. As a result, it can locate high-confidence anomalies that are truly responsible for the vanishing correlations, and can compensate for unstructured measurement noise in the system. Extensive experiments on synthetic datasets, bank information system datasets, and coal plant cyber-physical system datasets demonstrate the effectiveness of our approach.
An Egyptian statue on display at the Manchester Museum mysteriously spins on its axis every day; it is eventually discovered that this is due to anisotropic friction forces, and that the motile power comes from imperceptible mechanical waves caused by visitors' footsteps and nearby traffic. This phenomena involves microscopic ratchets, and is pervasive in the microscopic world - this is basically how muscles contract. It was the source of inspiration to think about everyday objects that move by harvesting external vibration rather than using mechanical traction and steering wheels. We propose here a strategy for displacing objects by attaching relatively small vibration sources. After learning how several random bursts of vibration affect its pose, an optimization algorithm discovers the optimal sequence of vibration patterns required to (slowly but surely) move the object to a very different specified position. We describe and demonstrate two application scenarios, namely assisted transportation of heavy objects with little effort on the part of the human and self arranging furniture, useful for instance to clean classrooms or restaurants during vacant hours.
Anti-tampering is a form of software protection conceived to detect and avoid the execution of tampered programs. Tamper detection assesses programs' integrity with load or execution-time checks. Avoidance reacts to tampered programs by stopping or rendering them unusable. General purpose reactions (such as halting the execution) stand out like a lighthouse in the code and are quite easy to defeat by an attacker. More sophisticated reactions, which degrade the user experience or the quality of service, are less easy to locate and remove but are too tangled with the program's business logic, and are thus difficult to automate by a general purpose protection tool. In the present paper, we propose a novel approach to anti-tampering that (i) fully automatically applies to a target program, (ii) uses Remote Attestation for detection purposes and (iii) adopts a server-side reaction that is difficult to block by an attacker. By means of Client/Server Code Splitting, a crucial part of the program is removed from the client and executed on a remote trusted server in sync with the client. If a client program provides evidences of its integrity, the part moved to the server is executed. Otherwise, a server-side reaction logic may (temporarily or definitely) decide to stop serving it. Therefore, a tampered client application can not continue its execution. We assessed our automatic protection tool on a case study Android application. Experimental results show that all the original and tampered executions are correctly detected, reactions are promptly applied, and execution overhead is on an acceptable level.
This paper describes the ASPIRE reference architecture designed to tackle one major problem in this domain: the lack of a clear process and an open software architecture for the composition and deployment of multiple software protections on software applications.
This paper describes the ASPIRE reference architecture designed to tackle one major problem in this domain: the lack of a clear process and an open software architecture for the composition and deployment of multiple software protections on software applications.
We present ReproZip, the recommended packaging tool for the SIGMOD Reproducibility Review. ReproZip was designed to simplify the process of making an existing computational experiment reproducible across platforms, even when the experiment was put together without reproducibility in mind. The tool creates a self-contained package for an experiment by automatically tracking and identifying all its required dependencies. The researcher can share the package with others, who can then use ReproZip to unpack the experiment, reproduce the findings on their favorite operating system, as well as modify the original experiment for reuse in new research, all with little effort. The demo will consist of examples of non-trivial experiments, showing how these can be packed in a Linux machine and reproduced on different machines and operating systems. Demo visitors will also be able to pack and reproduce their own experiments.
In order to realize the accurate positioning and recognition effectively of the analog circuit, the feature extraction of fault information is an extremely important port. This arrival based on the experimental circuit which is designed as a failure mode to pick-up the fault sample set. We have chosen two methods, one is the combination of wavelet transform and principal component analysis, the other is the factorial analysis for the fault data's feature extraction, and we also use the extreme learning machine to train and diagnose the data, to compare the performance of these two methods through the accuracy of the diagnosis. The results of the experiment shows that the data which we get from the experimental circuit, after dealing with these two methods can quickly get the fault location.