Bibliography
Named-Data Networking (NDN) has emerged as a clean-slate Internet proposal on the wave of Information-Centric Networking. Although NDN's data plane seems to offer many advantages, e.g., native support for multicast communications and flow balance, it also makes the network infrastructure vulnerable to a specific DDoS attack, the Interest Flooding Attack (IFA). In IFAs, a botnet issuing unsatisfiable content requests can be set up effortlessly to exhaust routers' resources and cause a severe performance drop for legitimate users. Several countermeasures have addressed this security threat so far; however, their efficacy has been proven only under simplistic assumptions about the attack model. We therefore propose a more complete attack model and design an advanced IFA. We show the effectiveness of our novel attack scheme by extensively assessing some state-of-the-art countermeasures. Further, we release the software that performs this attack as an open-source tool, to help design more robust defense mechanisms in the future.
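The core of an IFA is conceptually simple: request names that no producer can satisfy, so each Interest lingers in routers' Pending Interest Tables. The minimal Python sketch below only illustrates that idea of generating unsatisfiable names; it is not the released tool, and the victim prefix and suffix length are hypothetical placeholders.

```python
# Conceptual sketch only: generate unsatisfiable Interest names under a
# hypothetical victim prefix. A real sender (omitted here) would express
# these Interests through an NDN forwarder.
import random
import string

VICTIM_PREFIX = "/example/videos"  # hypothetical target namespace

def unsatisfiable_name(suffix_len=16):
    # A fresh random suffix defeats both content-store hits and producers.
    suffix = "".join(random.choices(string.ascii_lowercase, k=suffix_len))
    return f"{VICTIM_PREFIX}/{suffix}"

for _ in range(5):
    print("would express Interest for:", unsatisfiable_name())
```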
Emerging computing relies heavily on secure back-end storage for the massive volumes of big data originating from Internet of Things (IoT) smart devices and Cloud-hosted web applications. The Structured Query Language (SQL) Injection Attack (SQLIA) remains an intruder's exploit of choice for pilfering confidential data from back-end databases, with damaging ramifications. Existing approaches predate the new era of Internet-scale big data mining and therefore lack the ability to cope with new attack signatures concealed in large volumes of web requests over time. Moreover, these approaches rely on string lookups within the boundary of an on-premise application domain, and they do not apply to roaming Cloud-hosted services, from the edge Software-Defined Network (SDN) to application endpoints with large numbers of web-request hits. A Machine Learning (ML) approach provides scalable big data mining for SQLIA detection and prevention. Unfortunately, the absence of a corpus for training a classifier is a well-known issue in SQLIA research applying Artificial Intelligence (AI) techniques. This paper presents an application context pattern-driven corpus for training a supervised learning model. The model is trained with two ML algorithms, Two-Class Support Vector Machine (TC SVM) and Two-Class Logistic Regression (TC LR), implemented on Microsoft Azure Machine Learning (MAML) studio, to mitigate SQLIA. The scheme is then evaluated empirically using Receiver Operating Characteristic (ROC) curves.
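As a rough illustration of the two-class setup described above, the sketch below trains both model families on a toy corpus, using scikit-learn as a stand-in for the MAML studio pipeline; the request strings and labels are hypothetical placeholders, not the paper's corpus.

```python
# Minimal sketch, assuming a labeled corpus of web-request payloads;
# scikit-learn stands in for the MAML studio components named above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.svm import SVC

requests = [
    "SELECT name FROM products WHERE id = 7",   # benign query
    "UPDATE cart SET qty = 2 WHERE item = 9",   # benign query
    "GET /search?q=winter+boots",               # benign web request
    "' OR '1'='1' --",                          # tautology SQLIA
    "1; DROP TABLE users --",                   # piggybacked SQLIA
    "' UNION SELECT password FROM admin --",    # union-based SQLIA
]
labels = [0, 0, 0, 1, 1, 1]  # 0 = legitimate, 1 = SQLIA

# Character n-grams capture injection tokens such as quotes and comments.
X = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)).fit_transform(requests)

for model in (SVC(kernel="linear"), LogisticRegression(max_iter=1000)):
    model.fit(X, labels)
    scores = model.decision_function(X)
    print(type(model).__name__, "training ROC AUC:", roc_auc_score(labels, scores))
```

With a real corpus one would of course hold out a test set and plot the full ROC curve rather than score the training data.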
Services today are configured through policies that capture expected behaviors. However, because of subtle and changing stakeholder requirements, producing and maintaining policies is nontrivial. Policy errors are surprisingly common and cause avoidable security vulnerabilities.
We propose Aragorn, an approach that applies formal argumentation to produce policies that balance stakeholder concerns. We demonstrate empirically that, compared to the traditional approach to specifying policies, Aragorn performs (1) better on coverage, correctness, and quality; (2) equally well on learnability, effort/coverage, and difficulty; and (3) slightly worse on time and effort needed. Thus, Aragorn demonstrates the potential of capturing policy rationales as arguments.
We seek to address the challenge of engineering socially intelligent personal agents that are privacy-aware. We propose Arnor, a method that includes a metamodel based on social constructs. Arnor incorporates social norms and goes beyond existing agent-oriented software engineering (AOSE) methods by systematically capturing how a personal agent's actions influence the social experience it delivers. We conduct two empirical studies to evaluate Arnor. First, via a multiphase developer study, we show that Arnor simplifies application development. Second, via simulation experiments, we show that Arnor delivers a better privacy-preserving social experience to end users than personal agents engineered using a traditional AOSE method.
Onion sites on the darkweb operate using the Tor Hidden Service (HS) protocol to shield their locations on the Internet, which (among other features) enables these sites to host malicious and illegal content while being resistant to legal action and seizure. Identifying and monitoring such illicit sites in the darkweb is of high relevance to the Computer Security and Law Enforcement communities. We have developed an automated infrastructure that crawls and indexes content from onion sites into a large-scale data repository, called LIGHTS, with over 100M pages. In this paper, we describe the Automated Tool for Onion Labeling (ATOL), a novel scalable analysis service developed to conduct a thematic assessment of the content of onion sites in the LIGHTS repository. ATOL has three core components: (a) a novel keyword discovery mechanism (ATOLKeyword), which extends analyst-provided keywords for different categories by suggesting new descriptive and discriminative keywords that are relevant to those categories; (b) a classification framework (ATOLClassify), which uses the discovered keywords to map onion site content to a set of categories when sufficient labeled data is available; and (c) a clustering framework (ATOLCluster), which can leverage information from multiple external heterogeneous knowledge sources, ranging from domain expertise to Bitcoin transaction data, to categorize onion content in the absence of sufficient supervised data. The paper presents empirical results of ATOL on onion datasets derived from the LIGHTS repository, and additionally benchmarks ATOL's algorithms on the publicly available 20 Newsgroups dataset to demonstrate the reproducibility of its results. On the LIGHTS dataset, ATOLClassify gives a 12% performance gain over an analyst-provided baseline, while ATOLCluster gives a 7% improvement over state-of-the-art semi-supervised clustering algorithms. We also discuss how ATOL has been deployed and externally evaluated, as part of the LIGHTS system.
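To make the keyword-discovery idea concrete, the sketch below scores terms with a chi-squared test against category labels, one common way to surface discriminative keywords; the pages, categories, and resulting terms are invented placeholders, not the LIGHTS data or ATOLKeyword's actual algorithm.

```python
# Illustrative discriminative-keyword discovery via a chi-squared test,
# in the spirit of (but not identical to) ATOLKeyword.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

pages = [
    "bitcoin escrow marketplace vendor reviews",   # hypothetical onion text
    "forum boards privacy discussion threads",
    "escrow payments vendor listings bitcoin",
    "discussion privacy anonymity threads forum",
]
categories = np.array([1, 0, 1, 0])  # 1 = marketplace, 0 = forum

vec = CountVectorizer()
X = vec.fit_transform(pages)
scores, _ = chi2(X, categories)              # higher = more discriminative
terms = np.array(vec.get_feature_names_out())
print(terms[np.argsort(scores)[::-1][:5]])   # top candidate keywords
```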
Testing and fixing Web Application Firewalls (WAFs) are two relevant and complementary challenges for security analysts. Automated testing helps to cost-effectively detect vulnerabilities in a WAF by generating effective test cases, i.e., attacks. Once vulnerabilities have been identified, the WAF needs to be fixed by augmenting its rule set to filter attacks without blocking legitimate requests. However, existing research suggests that rule sets are very difficult to understand and too complex to be fixed manually. In this paper, we formalise the problem of fixing vulnerable WAFs as a combinatorial optimisation problem. To solve it, we propose an automated approach that combines machine learning with multi-objective genetic algorithms. Given a set of legitimate requests and a set of SQL injection attacks that bypass the WAF, our approach automatically infers regular expressions that, when added to the WAF's rule set, prevent many attacks while letting legitimate requests go through. Our empirical evaluation based on both open-source and proprietary WAFs shows that the generated filter rules are effective at blocking previously identified and successful SQL injection attacks (recall between 54.6% and 98.3%), while triggering in most cases no or few false positives (false positive rate between 0% and 2%).
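The two competing objectives, blocking attacks while not blocking legitimate traffic, can be illustrated with a toy fitness evaluation of a single candidate rule; the regular expression and request samples below are hypothetical, not output of the paper's genetic algorithm.

```python
# Toy two-objective fitness for a candidate WAF filter rule: maximize
# recall on bypassing attacks, minimize false positives on legitimate
# requests. All strings here are invented for illustration.
import re

candidate_rule = re.compile(r"union\s+select|or\s+1\s*=\s*1", re.IGNORECASE)
attacks = ["' UNION SELECT secret FROM t --", "x' OR 1=1 --"]
legitimate = ["select a union of plans", "note: or 1=1 is a tautology"]

recall = sum(bool(candidate_rule.search(a)) for a in attacks) / len(attacks)
fp_rate = sum(bool(candidate_rule.search(r)) for r in legitimate) / len(legitimate)
print(f"recall={recall:.2f}, false-positive rate={fp_rate:.2f}")
```

A multi-objective search over many such candidates would keep the Pareto front of (recall, false-positive rate) pairs rather than collapsing them into one score.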
The Bonseyes EU H2020 collaborative project aims to develop a platform consisting of a Data Marketplace, a Deep Learning Toolbox, and Developer Reference Platforms for organizations wanting to adopt Artificial Intelligence. The project focuses on using artificial intelligence in low-power Internet of Things (IoT) devices ("edge computing"), embedded computing systems, and data center servers ("cloud computing"). It will bring about orders-of-magnitude improvements in the efficiency, performance, reliability, security, and productivity of designing and programming artificial intelligence systems that incorporate Smart Cyber-Physical Systems (CPS). In addition, it will solve a causality problem for organizations that lack access to Data and Models. Its open software architecture will facilitate adoption of the whole concept on a wider scale. Four complementary demonstrators will be built to evaluate the effectiveness and technical feasibility of the Bonseyes platform and to quantify the real-world improvements in efficiency, security, performance, effort, and cost of adding AI to products and services. The Bonseyes platform's capabilities are intended to align with the European FI-PPP activities and to take advantage of its flagship project, FIWARE. This paper describes the project's motivation, goals, and preliminary work.
An operating system kernel written in the Rust language would have extremely fine-grained isolation boundaries, have no memory leaks, and be safe from a wide range of security threats and memory bugs. Previous efforts toward this end concluded that writing a kernel requires changing Rust. This paper reaches a different conclusion: no changes to Rust are needed, and a kernel can be implemented with a very small amount of unsafe code encapsulated behind safe abstractions. It describes how three sample kernel mechanisms (DMA, USB, and buffer caches) can be built using these abstractions.
Embedded software is found everywhere, from our highly visible mobile devices to the confines of our cars in the form of smart sensors. Embedded software companies are under huge pressure to produce safe applications that limit risks, and testing is absolutely critical to alleviate concerns regarding safety and user privacy. This requires using large test suites throughout the development process, increasing time-to-market and ultimately hindering competitiveness. Speeding up test execution is, therefore, of paramount importance for embedded software developers. This is traditionally achieved by running multiple tests in parallel on large-scale clusters of computers. However, this approach is costly in terms of infrastructure maintenance and energy consumption, and it is at times inconvenient, as developers have to wait for their tests to be scheduled on a shared resource. We propose exploiting GPUs (Graphics Processing Units) for running embedded software tests. GPUs are readily available in most computers and offer tremendous amounts of parallelism, making them an ideal target for embedded software testing. In this paper, we demonstrate, for the first time, how test executions of embedded C programs can be performed automatically on a GPU, without involving the end user. We take a compiler-assisted approach which automatically compiles the C program into GPU kernels for parallel execution of the input tests. Using this technique, we achieve an average speedup of 16× compared to CPU execution of input tests across nine programs from an industry-standard embedded benchmark suite.
For the last several decades, the rapid development of information technology and computer performance has accelerated the generation, transport, and accumulation of digital data, which has come to be called "Big Data". In this context, researchers and companies are eager to utilize the data to create new value or to manage a wide range of issues, and much focus is being placed on "Data Science" to extract useful information (knowledge) from digital data. Data Science has developed since the 1800s from several independent fields, such as Mathematics/Operations Research, Computer Science, Data Engineering, Visualization, and Statistics. In addition, Artificial Intelligence has converged with this stream in recent years. Meanwhile, national projects have been established to utilize data for society amid concerns about security and privacy. In this paper, through a detailed analysis of the history of this field, we discuss the processes of development and integration among related fields, as well as comparative aspects between Japan and the United States. The paper also includes a brief discussion of future directions.
The risk posed by insider threats has usually been approached by analyzing the behavior of users solely in the cyber domain. In this paper, we show the viability of using physical movement logs, collected via a building access control system, together with an understanding of the layout of the building housing the system's assets, to detect malicious insider behavior that manifests itself in the physical domain. In particular, we propose a systematic framework that uses contextual knowledge about the system and its users, learned from historical data gathered from a building access control system, to select suitable models for representing movement behavior. We then explore the online usage of the learned models, together with knowledge about the layout of the building being monitored, to detect malicious insider behavior. Finally, we show the effectiveness of the developed framework using real-life data traces of user movement in railway transit stations.
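A minimal sketch of the modeling idea, assuming a first-order Markov model over building zones (the paper selects suitable models from historical data; the zones and traces below are invented for illustration): transition probabilities are learned from badge logs, and movement sequences with low likelihood under the model are flagged.

```python
# Toy insider-movement model: learn zone-to-zone transition counts from
# historical badge logs, then score new traces by their likelihood.
from collections import Counter, defaultdict

history = [
    ["lobby", "office", "lab", "office", "lobby"],
    ["lobby", "office", "lobby"],
    ["lobby", "office", "lab", "lobby"],
]

counts = defaultdict(Counter)
for trace in history:
    for src, dst in zip(trace, trace[1:]):
        counts[src][dst] += 1

def likelihood(trace):
    p = 1.0
    for src, dst in zip(trace, trace[1:]):
        total = sum(counts[src].values())
        p *= counts[src][dst] / total if total else 0.0
    return p

print(likelihood(["lobby", "office", "lab"]))       # typical movement
print(likelihood(["lobby", "lab", "server_room"]))  # unusual: scores zero
```

A deployment would use smoothing and contextual features (time of day, role) rather than raw counts, in line with the context-aware framework described above.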
Educational games have a potentially significant role to play in the increasing efforts to expand access to computer science education. Computational thinking is an area of particular interest, including the development of problem-solving strategies such as divide and conquer. Existing games designed to teach computational thinking generally consist of either open-ended exploration with little direct guidance or a linear series of puzzles with plenty of direct guidance but little exploration. Educational research indicates that the most effective approach may be a hybrid of these two structures. We present Dragon Architect, an educational computational thinking game, and use it as context for a discussion of key open problems in the design of games to teach computational thinking. These problems include how to directly teach computational thinking strategies, how to achieve a balance between exploration and direct guidance, and how to incorporate engaging social features. We also discuss several important design challenges we have encountered during the design of Dragon Architect. We contend that the problems we describe are relevant to anyone making educational games or systems that need to teach complex concepts and skills.
The image displayed on a visual display unit (VDU) can be retrieved, without leaving a trace, from radiated and conducted emissions at some distance. In this paper, the maximum eavesdropping distance for unintentional electromagnetic (EM) radiation and conduction signals that carry information is estimated theoretically under realistic parameters. First, the maximum eavesdropping distance for unintentional EM radiation is estimated based on the reception capability of a log-periodic antenna connected to a receiver, experimental data, free-space attenuation, and the additional attenuation along the propagation path. Then, based on a multi-conductor transmission model and experimental results, the maximum eavesdropping distance for conducted emission is derived theoretically. The estimates demonstrate that information technology equipment (ITE) may still pose a threat of information leakage even when it meets current EMC requirements.
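For context, a standard ingredient in such link-budget estimates is the free-space path loss, which the abstract refers to as free-space attenuation; the formula below is the textbook expression, not the paper's complete model:

\[ \mathrm{FSPL}\,(\mathrm{dB}) = 20\log_{10} d_{\mathrm{km}} + 20\log_{10} f_{\mathrm{MHz}} + 32.44 \]

where \(d_{\mathrm{km}}\) is the distance in kilometres and \(f_{\mathrm{MHz}}\) the frequency in megahertz. The received signal thus weakens by about 6 dB per doubling of distance, which, combined with receiver sensitivity and any additional path attenuation, bounds the maximum eavesdropping distance.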
Here we explore the applicability of the traditional sliding-window convolutional neural network (CNN) detection pipeline and of region-based object detection techniques, such as the Faster Region-based CNN (R-CNN) and Region-based Fully Convolutional Networks (R-FCN), to the problem of object detection in X-ray security imagery. Within this context, given limited dataset availability, we employ a transfer learning paradigm for network training, tackling both single and multiple object detection problems over a number of R-CNN/R-FCN variants. The use of a first-stage region proposal within Faster R-CNN and R-FCN provides superior results to the traditional sliding-window-driven CNN (SWCNN) approach. Using Faster R-CNN with VGG16, pretrained on the ImageNet dataset, we achieve 88.3 mAP on a six-class X-ray object detection problem. Using R-FCN with ResNet-101 yields 96.3 mAP on the two-class firearm detection problem, requiring 0.1 seconds of computation per image. Overall, we illustrate the comparative performance of these techniques as object localization strategies within cluttered X-ray security imagery.
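A hedged sketch of the transfer-learning setup follows, using torchvision's off-the-shelf Faster R-CNN with a ResNet-50 FPN backbone as a stand-in for the paper's VGG16/ResNet-101 configurations: load pretrained weights, then replace the box-predictor head for the six object classes plus background.

```python
# Sketch of detector transfer learning with torchvision; the paper's
# actual backbones and training framework differ.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Pretrained detector (newer torchvision versions use weights="DEFAULT").
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# Swap the classification head: 6 X-ray object classes + 1 background.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=7)

# ... fine-tune on X-ray images with bounding-box annotations ...
```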
Meta-programs are programs that generate other programs, but in weakly type-safe systems, type-checking a meta-program only establishes its own type safety, and generated programs need additional type-checking after generation. Strong type safety of a meta-program implies type safety of any generated object program, a property with important engineering benefits. Current strongly type-safe systems suffer from expressivity limitations and cannot support many meta-programs found in practice, for example, the automatic generation of lenses. To overcome this, we move away from the idea of staged meta-programming. Instead, we use an off-the-shelf dependently typed language as the meta-language and a relatively standard, intrinsically well-typed representation of the object language. We scale this approach to practical meta-programming by choosing a high-level, explicitly typed intermediate representation as the object language, rather than a surface programming language. We implement our approach as a library for the Glasgow Haskell Compiler (GHC) and evaluate it on several meta-programs, including a deriveLenses meta-program taken from a real-world Haskell lens library. Our evaluation demonstrates expressivity beyond the state of the art and applicability to real settings, at little cost in terms of code size.
Applications for the analysis of biomedical data are complex programs that often consist of multiple components. Reusing existing solutions from external code repositories or program libraries is common in algorithm development. To ease reproducibility, as well as the transfer of algorithms and required components into distributed infrastructures, Linux containers are increasingly used in environments that are at least partly connected to the internet. However, concerns about untrusted applications remain and are of high importance when medical data is processed. Additionally, the portability of the containers must be ensured by using only security technologies that do not require additional kernel modules. In this paper, we describe measures and a solution to secure the execution of an example biomedical application for the normalization of multidimensional biosignal recordings. The application, the required runtime environment, and the security mechanisms are installed in a Docker-based container. A fine-grained restricted environment (sandbox) for executing the application and preventing unwanted behaviour is created inside the container. The sandbox is based on the filtering of system calls, which are required to interact with the operating system to access potentially restricted resources, e.g., the filesystem or network. Due to the low-level character of system calls, creating an adequate rule set for the sandbox is challenging. The presented solution therefore includes a monitoring component that collects the data required to define the rules for the application sandbox. A performance evaluation of the application shows no significant impact from the resulting sandbox, while detailed monitoring may increase runtime by over 420%.
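As an illustration of the kind of artifact such a workflow produces, the sketch below emits a minimal allow-list profile in Docker's documented seccomp format; the syscall subset is a hypothetical starting point of the sort the monitoring component would yield, not the paper's actual rule set.

```python
# Emit a minimal deny-by-default seccomp profile for a Docker container.
# The allowed syscalls below are an invented example of what monitoring
# an application might reveal; a real rule set would be derived per app.
import json

profile = {
    "defaultAction": "SCMP_ACT_ERRNO",  # deny everything by default
    "syscalls": [{
        "names": ["read", "write", "openat", "close", "fstat",
                  "mmap", "brk", "exit_group"],
        "action": "SCMP_ACT_ALLOW",
    }],
}

with open("sandbox-profile.json", "w") as f:
    json.dump(profile, f, indent=2)

# Apply with: docker run --security-opt seccomp=sandbox-profile.json <image>
```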
The challenge of maintaining the confidentiality of data stored and processed in a remote database or cloud is quite urgent. Homomorphic encryption may solve this problem, because it allows computing certain functions over encrypted data without first decrypting it. Fully homomorphic encryption schemes have a number of limitations, such as noise accumulation and ciphertext expansion as operations are performed, and the range of supported operations is limited. Many homomorphic encryption schemes and their modifications have been investigated by now; more than 25 reports on homomorphic encryption schemes were published on the Cryptology ePrint Archive in 2016 alone. We present an overview of current fully homomorphic encryption schemes and analyze the specific database operations that homomorphic cryptosystems allow. We also investigate the possibility of sorting over encrypted data and present our approach to comparing data encrypted with the Multi-bit FHE scheme.
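To make the notion of computing on ciphertexts concrete, the toy below demonstrates the classic multiplicative homomorphism of textbook RSA, which is not one of the surveyed FHE schemes and is wildly insecure at this key size: the product of two ciphertexts decrypts to the product of the plaintexts.

```python
# Toy illustration: textbook RSA satisfies E(a) * E(b) mod n = E(a*b mod n).
# Tiny, insecure demo parameters; requires Python 3.8+ for pow(e, -1, phi).
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17                  # public exponent, coprime with phi
d = pow(e, -1, phi)     # private exponent (modular inverse)

def enc(m): return pow(m, e, n)
def dec(c): return pow(c, d, n)

a, b = 7, 12
product_of_ciphertexts = (enc(a) * enc(b)) % n
assert dec(product_of_ciphertexts) == (a * b) % n  # multiply without decrypting
```

Fully homomorphic schemes extend this idea to both addition and multiplication, at the cost of the noise growth and ciphertext expansion discussed above.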
This paper presents a new fractional-order hidden strange attractor generated by a chaotic system without equilibria. The proposed non-equilibrium fractional-order chaotic system (FOCS) is asymmetric, dissimilar, and topologically inequivalent to typical chaotic systems, and it challenges the conventional notion that unstable equilibria are mandatory for chaos to exist. The new fractional-order model displays rich bifurcation behaviour, undergoing a period-doubling route to chaos with the fractional order α as the bifurcation parameter. The dynamics of the hidden attractor are studied with the aid of phase portraits, sensitivity to initial conditions, the fractal Lyapunov dimension, the maximum Lyapunov exponent spectrum, and bifurcation analysis. The minimum commensurate dimension required to display chaos is determined. With a view to utilizing the system in chaos-based cryptology and information coding, a synchronisation control scheme is designed. Finally, the theoretical analyses are validated by numerical simulation results, which are in good agreement.
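The paper's exact equations are not reproduced in this abstract; for reference, a commensurate FOCS of the kind described takes the generic form

\[ {}^{C}D^{\alpha} x_i(t) = f_i(x_1, \dots, x_n), \qquad 0 < \alpha \le 1, \quad i = 1, \dots, n, \]

where the Caputo fractional derivative of order \(0 < \alpha < 1\) is defined as

\[ {}^{C}D^{\alpha} x(t) = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} \frac{\dot{x}(\tau)}{(t-\tau)^{\alpha}} \, d\tau , \]

so varying the single order \(\alpha\) sweeps the bifurcation parameter, and the minimum commensurate dimension for chaos is \(n\alpha\) at the smallest \(\alpha\) that still yields chaotic dynamics.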