Biblio
An Accelerated Processing Unit (APU) is a heterogeneous multicore processor that integrates general-purpose CPU cores and a GPU on a single chip. It also supports the Heterogeneous System Architecture (HSA), which provides coherent, physically shared memory between the CPU and the GPU. In this paper, we present the design and implementation of a high-performance IPsec gateway using a low-cost commodity embedded APU. The HSA support of the APU eliminates the data-copy overhead between the CPU and the GPU, which is unavoidable in previous discrete-GPU approaches. The gateway is implemented in OpenCL to exploit the GPU and uses zero-copy packet I/O APIs in DPDK. The IPsec gateway handles real-world network traffic in which each packet carries a different workload. The proposed packet scheduling algorithm significantly improves GPU utilization for such traffic, and it works not only for APUs but also for discrete GPUs. With three CPU cores and one GPU in the APU, the IPsec gateway achieves a throughput of 10.36 Gbps with an average latency of 2.79 ms when performing AES-CBC+HMAC-SHA1 on incoming packets of 1024 bytes.
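The per-packet crypto workload such a gateway offloads can be illustrated in simplified form with the following Python sketch of AES-CBC encryption plus HMAC-SHA1 authentication (not the authors' OpenCL/DPDK code; key sizes and the 1024-byte payload are assumptions matching the abstract, the real ESP header/trailer format is omitted, and PKCS7 padding stands in for ESP padding):

```python
import os
from cryptography.hazmat.primitives import hashes, hmac, padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

enc_key = os.urandom(16)   # AES-128 key (assumed size)
auth_key = os.urandom(20)  # HMAC-SHA1 key

def protect(payload: bytes) -> bytes:
    iv = os.urandom(16)
    padder = padding.PKCS7(128).padder()
    padded = padder.update(payload) + padder.finalize()
    enc = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).encryptor()
    ciphertext = iv + enc.update(padded) + enc.finalize()
    mac = hmac.HMAC(auth_key, hashes.SHA1())
    mac.update(ciphertext)
    return ciphertext + mac.finalize()[:12]  # IPsec truncates the SHA1 tag to 96 bits

packet = os.urandom(1024)  # payload size used in the reported benchmark
print(len(protect(packet)))
```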
We present simple, practical, and powerful new techniques for garbled circuits. These techniques result in significant concrete and asymptotic improvements over the state of the art for several natural kinds of computations. For arithmetic circuits over the integers, our construction yields garbled circuits with free addition, weighted threshold gates with cost independent of fan-in, and exponentiation by a fixed exponent with cost independent of the exponent. For boolean circuits, our construction gives an exponential improvement over the state of the art for threshold gates (including AND/OR gates) of high fan-in. Our construction can be efficiently instantiated with practical symmetric-key primitives (e.g., AES), and it is proven secure under assumptions similar to those of the Free-XOR garbling scheme (Kolesnikov & Schneider, ICALP 2008). We give an extensive comparison between our scheme and state-of-the-art garbling schemes applied to boolean circuits.
Compression is desirable for network applications because it saves bandwidth. However, when data is compressed before being encrypted, the amount of compression leaks information about the amount of redundancy in the plaintext. This side channel has led to the “Browser Reconnaissance and Exfiltration via Adaptive Compression of Hypertext (BREACH)” attack on web traffic protected by the TLS protocol. The general guidance for preventing this attack is to disable HTTP compression, preserving confidentiality but sacrificing bandwidth. As a more sophisticated countermeasure, fixed-dictionary compression was introduced in 2015, enabling compression while protecting high-value secrets, such as cookies, from attack. The fixed-dictionary compression method is a cryptographically sound countermeasure against the BREACH attack, since it is proven secure in a suitable security model. In this project, we integrate the fixed-dictionary compression method as a countermeasure against the BREACH attack in a real-world client-server setting. Further, we measure the performance of the fixed-dictionary compression algorithm against the DEFLATE compression algorithm. The results show that it is possible to save some amount of bandwidth with reasonable compression/decompression time compared to DEFLATE operations. The countermeasure is easy to implement and deploy, making it a practical direction for mitigating the BREACH attack efficiently rather than stripping off HTTP compression entirely.
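For intuition, the length side channel that motivates these countermeasures can be demonstrated with a few lines of Python using DEFLATE via zlib; the reflected-input page below is a hypothetical example, not part of the paper's evaluation:

```python
# Guesses sharing a longer prefix with the secret tend to compress to shorter
# output when compressed together with it; fixed-dictionary compression is
# designed to remove exactly this cross-redundancy between attacker input and secrets.
import zlib

SECRET = "sessionid=7f3a9c"

def response_length(guess: str) -> int:
    body = f"<input value='{guess}'> ... Set-Cookie: {SECRET}"
    return len(zlib.compress(body.encode()))  # DEFLATE, as used by HTTP compression

print(response_length("sessionid=7"))  # shares a prefix with the secret
print(response_length("sessionid=9"))  # wrong guess, often compresses worse
```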
The Internet of Things (IoT) is the next big boom in the networking field. The vision of IoT is to connect everyday objects (which have the ability to sense and actuate) to the Internet. This may or may not involve humans. The IoT field is still maturing and has many open issues. We focus on the security issues. Because the devices have low computational power and limited memory, the existing security mechanisms (which are a necessity) must be optimized accordingly, or a clean-slate approach must be followed. This survey paper focuses on the security aspects of IoT and also discusses the open challenges in this field.
To provide secure data communication in today's cyberspace, stronger encryption techniques are a necessity, helping people protect their sensitive information from cryptanalysts. This paper proposes a novel symmetric block cipher algorithm that uses multiple access circular queues (MACQs) of variable lengths to diffuse information to a greater extent. The keys are randomly generated and are of variable lengths depending upon the size of each MACQ. A number of iterations of circular rotations, swapping of elements, and XORing the key with queue elements are performed on each MACQ. An S-box is used so that the relationship between the key and the ciphertext remains indeterminate or obscure. Together, these operations help transform the cipher into a much more complex and secure block cipher. The paper thus proposes an encryption algorithm that is both secure and fast.
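A toy Python sketch of the described diffusion operations on one circular queue is shown below; the round count, key generation, and swap rule are illustrative assumptions, not the paper's exact algorithm, and decryption is not shown:

```python
import os
from collections import deque

def diffuse(block: bytes, rounds: int = 4) -> bytes:
    q = deque(block)
    key = os.urandom(len(block))                  # key length matches this MACQ
    for r in range(rounds):
        q.rotate(r + 1)                           # circular rotation
        q[0], q[-1] = q[-1], q[0]                 # swapping of elements
        q = deque(b ^ k for b, k in zip(q, key))  # XOR the key with queue elements
    return bytes(q)

print(diffuse(b"sixteen byte blk").hex())
```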
Today, cloud vendors host third-party black-box services whose developers usually provide only textual descriptions or purely syntactical interface specifications. Cloud vendors that give substantial support to third-party developers for integrating hosted services into new software solutions would have a unique selling feature over their competitors. However, to reliably determine whether a service is reusable, comprehensive service specifications are needed. Characteristic of comprehensive, in contrast to syntactical, specifications are the formalization of ontological and behavioral semantics, homogeneity according to a global ontology, and a service grounding that links the abstract service description to its technical realization. Homogeneous, semantic specifications make it possible to reliably identify reusable services, whereas the service grounding is needed for the technical service integration. In general, comprehensive specifications are not available and have to be derived. Existing automated approaches are restricted to certain characteristics of comprehensiveness. In my PhD, I consider an automated approach to derive fully fledged comprehensive specifications for black-box services. Ontological semantics are derived from syntactical interface specifications. Behavioral semantics are mined from call logs that cloud vendors create to monitor the hosted services. The specifications are harmonized over a global ontology. The service grounding is established using traceability information. The approach enables third-party developers to compose services into complex systems and creates new sales channels for cloud and service providers.
Finding differences between programs with similar functionality is an important security problem, as such differences can be used for fingerprinting or for creating evasion attacks against security software like Web Application Firewalls (WAFs), which are designed to detect malicious inputs to web applications. In this paper, we present SFADIFF, a black-box differential testing framework based on Symbolic Finite Automata (SFA) learning. SFADIFF can automatically find differences between a set of programs with comparable functionality. Unlike existing differential testing techniques, instead of searching for each difference individually, SFADIFF infers SFA models of the target programs using black-box queries and systematically enumerates the differences between the inferred SFA models. All differences between the inferred models are checked against the corresponding programs. Any difference between the models that does not result in a difference between the corresponding programs is used as a counterexample for further refinement of the inferred models. SFADIFF's model-based approach, unlike existing differential testing tools, also supports fully automated root cause analysis in a domain-independent manner. We evaluate SFADIFF in three different settings for finding discrepancies between: (i) three TCP implementations, (ii) four WAFs, and (iii) HTML/JavaScript parsing implementations in WAFs and web browsers. Our results demonstrate that SFADIFF is able to identify and enumerate the differences systematically and efficiently in all these settings. We show that SFADIFF is able to find differences not only between different WAFs but also between different versions of the same WAF. SFADIFF also discovers three previously unknown differences between the HTML/JavaScript parsers of two popular WAFs (PHPIDS 0.7 and Expose 2.4.0) and the corresponding parsers of Google Chrome, Firefox, Safari, and Internet Explorer. We confirm that all these differences can be used to evade the WAFs and launch successful cross-site scripting attacks.
Processes to automate the selection of appropriate algorithms for various matrix computations are described. In particular, processes to check for, and certify, various matrix properties of black-box matrices are presented. These include sparsity patterns and structural properties that allow "superfast" algorithms to be used in place of black-box algorithms. Matrix properties that hold generically, and that allow the use of matrix preconditioning to be reduced or eliminated, can also be checked for and certified, notably in the small-field case, where this presently has the greatest impact on the efficiency of the computation.
Tensors are a multi-linear generalization of matrices to their d-way counterparts, and they have recently been receiving intense interest due to their natural representation of high-dimensional data and the availability of fast tensor decomposition algorithms. Given the input-output data of a nonlinear system/circuit, this paper presents a nonlinear model identification and simulation framework built on top of the Volterra series and its seamless integration with tensor arithmetic. By exploiting partially symmetric polyadic decompositions of sparse Toeplitz tensors, the proposed framework permits a pleasantly scalable way to incorporate high-order Volterra kernels. Such an approach largely eludes the curse of dimensionality and allows computationally fast modeling and simulation beyond weakly nonlinear systems. The black-box nature of the model also hides structural information of the system/circuit and encapsulates it in compact tensors. Numerical examples are given to verify the efficacy, efficiency, and generality of this tensor-based modeling and simulation framework.
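As background, the following minimal numpy sketch simulates a truncated, second-order discrete Volterra series with dense kernels; the memory length and kernel values are arbitrary placeholders, and the paper's contribution is to represent such high-order kernels compactly via tensor decompositions rather than the dense arrays used here:

```python
import numpy as np

M = 4                                    # memory length (assumed)
rng = np.random.default_rng(0)
h1 = rng.standard_normal(M)              # first-order kernel
h2 = 0.1 * rng.standard_normal((M, M))   # second-order kernel (a 2-way tensor)

def volterra_output(x: np.ndarray) -> np.ndarray:
    y = np.zeros_like(x)
    for n in range(len(x)):
        past = np.array([x[n - k] if n - k >= 0 else 0.0 for k in range(M)])
        y[n] = h1 @ past + past @ h2 @ past   # linear term + quadratic term
    return y

print(volterra_output(np.sin(np.linspace(0.0, 2.0 * np.pi, 16))))
```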
One important goal of black-box complexity theory is the development of complexity models that allow deriving meaningful lower bounds for whole classes of randomized search heuristics. Complementing classical runtime analysis, black-box models help us understand how algorithmic choices such as the population size, the variation operators, or the selection rules influence the optimization time. One example of such a result is the Ω(n log n) lower bound for unary unbiased algorithms on functions with a unique global optimum [Lehre/Witt, GECCO 2010], which tells us that higher-arity operators or biased sampling strategies are needed when trying to beat this bound. For lack of analysis techniques, almost no non-trivial bounds are known for other restricted models. Proving such bounds therefore remains one of the main challenges in black-box complexity theory. With this paper we contribute to the technical toolbox for lower bound computations by proposing a new type of information-theoretic argument. We consider the permutation- and bit-invariant version of LeadingOnes and prove that its (1+1) elitist black-box complexity is Ω(n^2), a bound that is matched by (1+1)-type evolutionary algorithms. The (1+1) elitist complexity of LeadingOnes is thus considerably larger than its unrestricted one, which is known to be of order n log log n [Afshani et al., 2013].
While compilers offer a fair trade-off between productivity and executable performance in single-threaded execution, their optimizations remain fragile when addressing compute-intensive code for parallel architectures with deep memory hierarchies. Moreover, these optimizations operate as black boxes, impenetrable to the user, leaving no alternative to time-consuming and error-prone manual optimization when an imprecise cost model or a weak analysis results in a bad optimization decision. To address this issue, we propose a technique that automatically translates an arbitrary polyhedral optimization, used internally by the loop-level optimization frameworks of several modern compilers, into a sequence of comprehensible syntactic transformations, as long as the optimization focuses on scheduling loop iterations. This approach opens the black box of polyhedral frameworks, enabling users to examine, refine, replay, and even design complex optimizations semi-automatically in partnership with the compiler.
Synchronous replication is critical for today's enterprise IT organization. It is mandated by regulation in several countries for some types of organizations, including banks and insurance companies. The technology has been available for a long time, but due to the speed of light and maximum latency limitations, it is usually limited to a distance of 50-100 miles. Flight data recorders, also known as black boxes, have long been used to record the last actions that happened in airplanes at times of disaster. We present an integration between an Enterprise Data Recorder and an asynchronous replication mechanism, which allows breaking the functional limits that the speed of light imposes on synchronous replication.
In classical runtime analysis it has been observed that certain working principles of an evolutionary algorithm cannot be understood by only looking at the asymptotic order of the runtime, but that more precise estimates are needed. In this work we demonstrate that the same observation applies to black-box complexity analysis. We prove that the unary unbiased black-box complexity of the classic OneMax function class is n ln(n) – cn ± o(n) for a constant c between 0.2539 and 0.2665. Our analysis yields a simple (1+1)-type algorithm achieving this runtime bound via a fitness-dependent mutation strength. When translated into a fixed-budget perspective, our algorithm with the same budget computes a solution that asymptotically is 13% closer to the optimum (given that the budget is at least 0.2675n).
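A minimal Python sketch of such a (1+1)-type elitist algorithm on OneMax with a fitness-dependent mutation strength is given below; the specific flip-count rule is an illustrative choice, not the exact mutation-strength schedule derived in the paper:

```python
import random

def onemax_ea(n: int, seed: int = 0) -> int:
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fitness = sum(x)
    evaluations = 0
    while fitness < n:
        # flip more bits when far from the optimum, a single bit when close
        k = max(1, round(n / (2 * (fitness + 1))))
        y = list(x)
        for i in rng.sample(range(n), k):
            y[i] ^= 1
        evaluations += 1
        fy = sum(y)
        if fy >= fitness:            # elitist acceptance
            x, fitness = y, fy
    return evaluations

print(onemax_ea(100))  # number of evaluations until the optimum is found
```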
Finding and proving lower bounds on black-box complexities is one of the hardest problems in the theory of randomized search heuristics. Until recently, there were no general ways of doing this, except for information-theoretic arguments similar to the one of Droste, Jansen and Wegener. In a recent paper by Buzdalov, Kever and Doerr, a theorem is proven which may yield tighter bounds on unrestricted black-box complexity using certain problem-specific information. To use this theorem, one should split the search process into a finite number of states, describe transitions between states, and for each state specify (and prove) the maximum number of different answers to any query. We augment these state constraints with one more kind of constraint, namely the maximum number of different currently possible optima. An algorithm for computing lower bounds based on these constraints is presented. We also empirically show improved lower bounds on the black-box complexity of OneMax and Mastermind.
Searchable symmetric encryption (SSE) enables a client to store a database on an untrusted server while supporting keyword search in a secure manner. Despite the rapidly increasing interest in SSE technology, experiments indicate that the performance of the known schemes scales badly to large databases. Somewhat surprisingly, this is not due to their usage of cryptographic tools, but rather due to their poor locality (where locality is defined as the number of non-contiguous memory locations the server accesses with each query). The only known schemes that do not suffer from poor locality suffer either from an impractical space overhead or from an impractical read efficiency (where read efficiency is defined as the ratio between the number of bits the server reads with each query and the actual size of the answer). We construct the first SSE schemes that simultaneously enjoy optimal locality, optimal space overhead, and nearly-optimal read efficiency. Specifically, for a database of size N, under the modest assumption that no keyword appears in more than N^(1 − 1/log log N) documents, we construct a scheme with read efficiency Õ(log log N). This essentially matches the lower bound of Cash and Tessaro (EUROCRYPT ’14) showing that any SSE scheme must be sub-optimal in either its locality, its space overhead, or its read efficiency. In addition, even without making any assumptions on the structure of the database, we construct a scheme with read efficiency Õ(log N). Our schemes are obtained via a two-dimensional generalization of the classic balanced allocations (“balls and bins”) problem that we put forward. We construct nearly-optimal two-dimensional balanced allocation schemes, and then combine their algorithmic structure with subtle cryptographic techniques.
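For background, the classic one-dimensional "balls and bins" two-choice allocation that the paper generalizes to two dimensions can be sketched in a few lines of Python (ball and bin counts are arbitrary illustrative values):

```python
import random

def two_choice_max_load(balls: int, bins: int, seed: int = 0) -> int:
    rng = random.Random(seed)
    load = [0] * bins
    for _ in range(balls):
        i, j = rng.randrange(bins), rng.randrange(bins)
        load[i if load[i] <= load[j] else j] += 1  # place the ball in the lighter bin
    return max(load)

print(two_choice_max_load(10000, 1000))
```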
Security in Mobile Ad Hoc Networks is still an ongoing research topic in the scientific community, and it is difficult to provide an overall security solution. In this paper we assess the feasibility of distributed firewall solutions in Mobile Ad Hoc Networks. Attention is also given to different security solutions in ad hoc networks. We propose a security architecture that secures the network on several layers and is the most secure solution among the analyzed approaches. For this purpose we use a distributed public key infrastructure, a distributed firewall, and an intrusion detection system. Our architecture uses both symmetric and asymmetric cryptography, and we present performance measurements and a security analysis of our solution.
Of the two current Wi-Fi security models (Enterprise and Personal), the Enterprise model (802.1X) offers an effective framework for authenticating and controlling the user traffic to a protected network, while the Personal model (802.11) offers the cheapest and easiest-to-set-up solution. However, the drawback of the Personal model is that all access points and client radio NICs on the wireless LAN must use the same encryption key. A major underlying problem of the 802.11 standard is that the pre-shared keys are cumbersome to change. If those keys are not updated frequently, unauthorized users with modest resources can crack the key within a short timeframe and breach the network security. The purpose of this paper is to propose and implement an effective method for the system administrator to manage the users connected to a router, update the keys, and distribute them to trusted clients using a Freescale embedded system with infrared and Bluetooth modules.
The aim of this research is to advance active user authentication using keystroke dynamics. Through this research, we assess the performance and influence of various keystroke features on keystroke dynamics authentication systems. In particular, we investigate the performance of keystroke features on a subset of the most frequently used English words. The performance of four features, i) key duration, ii) flight time latency, iii) digraph time latency, and iv) word total time duration, is analyzed. Experiments are performed to measure the performance of each feature individually as well as of different subsets of these features. Four machine learning techniques are employed for assessing keystroke authentication: a two-class support vector machine (TC-SVM), a one-class support vector machine (OC-SVM), a k-nearest neighbor classifier (k-NN), and a Naive Bayes classifier (NB). Experimental data are logged for 28 users. The experimental results show that key duration offers the best performance among the four keystroke features, followed by word total time. Furthermore, our results show that TC-SVM and k-NN perform best among the four classifiers.
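The four timing features can be sketched in Python as follows, assuming a hypothetical per-word event format of (key, press_time, release_time) tuples with times in milliseconds; classifiers such as SVM, k-NN, or Naive Bayes would then be trained on vectors built from these features:

```python
def keystroke_features(events):
    durations = [rel - prs for _, prs, rel in events]   # key duration
    flights = [events[i + 1][1] - events[i][2]          # flight time latency:
               for i in range(len(events) - 1)]         #   release -> next press
    digraphs = [events[i + 1][1] - events[i][1]         # digraph time latency:
                for i in range(len(events) - 1)]        #   press -> next press
    word_total = events[-1][2] - events[0][1]           # word total time duration
    return durations, flights, digraphs, word_total

sample = [("t", 0, 95), ("h", 160, 250), ("e", 330, 410)]
print(keystroke_features(sample))
```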