Bibliography
The Internet Engineering Task Force (IETF) is working on the 6LoWPAN standard, which allows smart devices to be connected to the Internet using the large address space of IPv6. 6LoWPAN acts as a bridge between resource-constrained devices and the Internet. The entire IoT space is vulnerable to local threats as well as threats from the Internet. Due to the random deployment of network nodes and the absence of tamper-resistant shielding, resource-constrained IoT elements face potential insider attacks even in the presence of front-line defense mechanisms that involve cryptographic protocols. To detect such insider nodes, an Intrusion Detection System (IDS) is required as a second line of defense. In this paper, we analyze such potential insider attacks while reviewing IDS-based countermeasures, and we propose a baseline for designing an IDS for 6LoWPAN-based IoT systems.
In this study, we report on techniques and analyses that enable us to capture Internet-wide activity at individual IP address-level granularity by relying on server logs of a large commercial content delivery network (CDN) that serves close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015, these logs recorded client activity involving 1.2 billion unique IPv4 addresses, the highest ever measured, in agreement with recent estimates. Monthly client IPv4 address counts showed constant growth for years prior, but since 2014, the IPv4 count has stagnated while IPv6 counts have grown. Thus, it seems we have entered an era marked by increased complexity, one in which the sole enumeration of active IPv4 addresses is of little use to characterize recent growth of the Internet as a whole. With this observation in mind, we consider new points of view in the study of global IPv4 address activity. First, our analysis shows significant churn in active IPv4 addresses: the set of active IPv4 addresses varies by as much as 25% over the course of a year. Second, by looking across the active addresses in a prefix, we are able to identify and attribute activity patterns to network restructurings, user behaviors, and, in particular, various address assignment practices. Third, by combining spatio-temporal measures of address utilization with measures of traffic volume, and sampling-based estimates of relative host counts, we present novel perspectives on worldwide IPv4 address activity, including empirical observation of under-utilization in some areas, and complete utilization, or exhaustion, in others.
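To make the churn notion concrete, here is a minimal sketch (our illustration; the metric definition and the address sets below are assumptions, not the paper's exact methodology) that measures how much two sets of active addresses differ between observation windows:

```python
# Sketch: quantifying churn between two observation windows as the fraction
# of the union of active addresses not shared by both windows. The address
# sets are toy values; the paper derives them from CDN server logs.

def churn(active_a: set, active_b: set) -> float:
    """Fraction of addresses in the union active in only one window."""
    union = active_a | active_b
    if not union:
        return 0.0
    return len(active_a ^ active_b) / len(union)

jan = {"198.51.100.7", "203.0.113.4", "192.0.2.1"}
jul = {"198.51.100.7", "203.0.113.9", "192.0.2.1"}
print(f"churn: {churn(jan, jul):.0%}")  # 50% of the union changed
```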
Networks are a fundamental tool for understanding and modeling complex systems in physics, biology, neuroscience, engineering, and social science. Many networks are known to exhibit rich, lower-order connectivity patterns that can be captured at the level of individual nodes and edges. However, higher-order organization of complex networks, at the level of small network subgraphs, remains largely unknown. Here, we develop a generalized framework for clustering networks on the basis of higher-order connectivity patterns. This framework provides mathematical guarantees on the optimality of obtained clusters and scales to networks with billions of edges. The framework reveals higher-order organization in a number of networks, including information propagation units in neuronal networks and hub structure in transportation networks. Results show that networks exhibit rich higher-order organizational structures that are exposed by clustering based on higher-order connectivity patterns.

Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
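To illustrate the core of node2vec, the following sketch implements the biased second-order random walk controlled by the return parameter p and in-out parameter q (assuming an unweighted graph; the full method additionally feeds such walks into a skip-gram model to learn the embeddings):

```python
import random
import networkx as nx

def node2vec_walk(G, start, walk_length, p=1.0, q=1.0):
    """One biased random walk as in node2vec: p and q interpolate
    between BFS-like (local) and DFS-like (outward) exploration."""
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        nbrs = list(G.neighbors(cur))
        if not nbrs:
            break
        if len(walk) == 1:                    # first step is uniform
            walk.append(random.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:                     # return to previous node
                weights.append(1.0 / p)
            elif G.has_edge(x, prev):         # stay close (BFS-like)
                weights.append(1.0)
            else:                             # move outward (DFS-like)
                weights.append(1.0 / q)
        walk.append(random.choices(nbrs, weights=weights)[0])
    return walk

G = nx.karate_club_graph()
print(node2vec_walk(G, start=0, walk_length=10, p=0.25, q=4.0))
```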
When reasoning about software security, researchers and practitioners use the phrase "attack surface" as a metaphor for risk. Enumerate and minimize the ways attackers can break in, the metaphor says, and risk is reduced and the system is better protected. But software systems are much more complicated than their surfaces. We propose function- and file-level attack surface metrics, proximity and risky walk, that enable fine-grained risk assessment. Our risky walk metric is highly configurable: we use PageRank on a probability-weighted call graph to simulate attacker behavior of finding or exploiting a vulnerability. We provide evidence-based guidance for deploying these metrics, including an extensive parameter tuning study. We conducted an empirical study on two large open source projects, FFmpeg and Wireshark, to investigate the potential correlation between our metrics and historical post-release vulnerabilities. We found our metrics to be statistically significantly associated with vulnerable functions/files with a small-to-large Cohen's d effect size. Our prediction model achieved an increase of 36% (in FFmpeg) and 27% (in Wireshark) in the average value of F-measure over a base model built with SLOC and coupling metrics. Our prediction model outperformed comparable models from prior literature with notable improvements: 58% reduction in false negative rate, 81% reduction in false positive rate, and 548% increase in F-measure. These metrics advance vulnerability prevention by (a) being flexible in terms of granularity, (b) performing better than the vulnerability prediction literature, and (c) being tunable so that practitioners can tailor the metrics to their products and better assess security risk.
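As a sketch of the risky-walk idea (the call graph, edge weights, and parameters below are invented for illustration, not the paper's tuned configuration), one can run personalized PageRank over a probability-weighted call graph, biasing the walk to start at attacker-facing entry points:

```python
import networkx as nx

# Hypothetical call graph: edge weights approximate the probability an
# attacker follows each call; "parse_input" is an attacker-facing entry.
call_graph = nx.DiGraph()
call_graph.add_weighted_edges_from([
    ("parse_input", "decode_frame", 0.9),
    ("parse_input", "log_message", 0.1),
    ("decode_frame", "alloc_buffer", 0.7),
    ("decode_frame", "check_header", 0.3),
])

# Personalization concentrates restarts at the entry point, so scores
# reflect how reachable each function is from the attack surface.
entry = "parse_input"
personalization = {n: 1.0 if n == entry else 0.0 for n in call_graph}
scores = nx.pagerank(call_graph, alpha=0.85, weight="weight",
                     personalization=personalization)
for fn, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{fn:15s} {score:.3f}")
```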
Conventional security methods such as passwords and ID cards are now rapidly being replaced by biometrics for identifying a person. Biometrics uses the physiological or behavioral characteristics of a person. The use of biometrics raises critical privacy and security concerns that, due to the noisy nature of biometrics, cannot be addressed using standard cryptographic methods. The loss of an enrollment biometric to an attacker is a security hazard because it may allow the attacker to gain unauthorized access to the system. A biometric template can be stolen, and an intruder can then access the biometric system using fake input. Hence, it becomes essential to design biometric systems with secure templates, so that if the biometric template in an application is compromised, the biometric signal itself is not lost forever and a new template can be issued. One way is to combine biometrics and cryptography, or to use transformed data instead of the original biometric template. However, traditional cryptographic methods are not directly usable in biometrics because of intra-class variation. Biometric cryptosystems can apply fuzzy vault, fuzzy commitment, helper data, and secure sketch techniques, whereas cancelable biometrics uses distorting transforms, Bio-Hashing, and Bio-Encoding techniques. In this paper, a biometric cryptosystem with fuzzy vault and fuzzy commitment techniques is presented for a fingerprint recognition system.
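The following toy sketch illustrates the fuzzy-commitment flavor of such schemes (a Juels-Wattenberg-style construction with a trivial repetition code; real systems use stronger error-correcting codes over genuine biometric features, and the bitstrings here are invented values):

```python
import hashlib

def encode(bits):                  # 3x repetition code: each bit tripled
    return [b for bit in bits for b in (bit,) * 3]

def decode(bits):                  # majority vote per 3-bit group
    return [int(sum(bits[i:i + 3]) >= 2) for i in range(0, len(bits), 3)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def commit(secret_bits, biometric_bits):
    codeword = encode(secret_bits)
    helper = xor(codeword, biometric_bits)             # public helper data
    tag = hashlib.sha256(bytes(secret_bits)).hexdigest()
    return helper, tag

def verify(helper, tag, fresh_biometric_bits):
    candidate = decode(xor(helper, fresh_biometric_bits))
    return hashlib.sha256(bytes(candidate)).hexdigest() == tag

secret = [1, 0, 1, 1]
enrolled = [0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0]        # 12-bit toy template
noisy = enrolled.copy()
noisy[2] ^= 1                                          # one sensor-noise flip
helper, tag = commit(secret, enrolled)
print(verify(helper, tag, noisy))                      # True despite noise
```

The key property: the stored helper data and hash reveal neither the template nor the secret directly, and a compromised commitment can be revoked by re-enrolling with a fresh secret.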
Synchronous replication is critical for today's enterprise IT organization. It is mandated by regulation in several countries for some types of organizations, including banks and insurance companies. The technology has been available for a long time, but due to speed-of-light and maximum-latency limitations, it is usually limited to a distance of 50-100 miles. Flight data recorders, also known as black boxes, have long been used to record the last actions that happened in airplanes at times of disaster. We present an integration between an Enterprise Data Recorder and an asynchronous replication mechanism, which allows breaking the functional limits that the speed of light imposes on synchronous replication.
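A back-of-the-envelope calculation shows where the 50-100 mile limit comes from: every synchronous write must wait a full round trip to the remote site, and light in optical fiber covers roughly 200,000 km/s (assuming a refractive index of about 1.5; the distances are illustrative):

```python
# Minimum added latency per synchronous write, from propagation delay alone.
C_VACUUM_KM_S = 299_792              # speed of light in vacuum, km/s
FIBER_SPEED = C_VACUUM_KM_S / 1.5    # ~200,000 km/s in optical fiber

for miles in (50, 100, 1000):
    km = miles * 1.609
    rtt_ms = 2 * km / FIBER_SPEED * 1000
    print(f"{miles:5d} miles -> minimum RTT {rtt_ms:.2f} ms per write")
```

At 50-100 miles the round trip adds under 2 ms per write, which most applications tolerate; at 1000 miles it exceeds 16 ms, which is why synchronous replication breaks down at continental distances.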
After a software system is compromised, it can be difficult to understand what vulnerabilities attackers exploited. Any information residing on that machine cannot be trusted as attackers may have tampered with it to cover their tracks. Moreover, even after an exploit is known, it can be difficult to determine whether it has been used to compromise a given machine. Aviation has long used black boxes to better understand the causes of accidents, enabling improvements that reduce the likelihood of future accidents. Many attacks introduce abnormal control flows to compromise systems. In this paper, we present BlackBox, a monitoring system for COTS software. Our techniques enable BlackBox to efficiently monitor unexpected and potentially harmful control flow in COTS binaries. BlackBox constructs dynamic profiles of an application's typical control flows to filter the vast majority of expected control flow behavior, leaving a manageable amount of data that can be logged across the network to remote devices. Modern applications make extensive use of dynamically generated code, some of which varies greatly between executions. We introduce support for code generators that can detect security-sensitive behaviors while allowing BlackBox to avoid logging the majority of ordinary behaviors. We have implemented BlackBox in DynamoRIO. We evaluate the runtime overhead of BlackBox, and show that it can effectively monitor recent versions of Microsoft Office and Google Chrome. We show that in ROP, COOP, and state-of-the-art JIT injection attacks, BlackBox logs the pivotal actions by which the attacker takes control, and can also blacklist those actions to prevent repeated exploits.
Explicit non-linear transformations of existing steganalysis features are shown to boost their ability to detect steganography in combination with existing simple classifiers, such as the FLD-ensemble. The non-linear transformations are learned from a small number of cover features using Nyström approximation on pilot vectors obtained with kernelized PCA. The best performance is achieved with the exponential form of the Hellinger kernel, which improves the detection accuracy by up to 2-3% for spatial-domain content-adaptive steganography. Since the non-linear map depends only on the cover source and its learning has a low computational complexity, the proposed approach is a practical, low-cost method for boosting the accuracy of existing detectors built as binary classifiers. The map can also be used to significantly reduce the feature dimensionality (by up to a factor of ten) without performance loss with respect to the non-transformed features.
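The following numpy sketch illustrates one way such a map can be built (our reading of the construction: the exact kernel form, the choice of gamma, and the pilot selection used in the paper may differ; features are assumed nonnegative, as with histogram-based steganalysis features):

```python
import numpy as np

def exp_hellinger_kernel(X, Y, gamma=1.0):
    """Exponential Hellinger-type kernel: exp(-gamma * squared Hellinger
    distance) between rows of X and rows of Y (nonnegative features)."""
    d2 = ((np.sqrt(X)[:, None, :] - np.sqrt(Y)[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystroem_map(pilots, gamma=1.0, dim=None):
    """Nystrom feature map from pilot vectors via kernelized PCA."""
    K_mm = exp_hellinger_kernel(pilots, pilots, gamma)
    s, U = np.linalg.eigh(K_mm)
    order = np.argsort(s)[::-1][:dim]         # keep leading components
    s, U = s[order], U[:, order]
    W = U / np.sqrt(np.maximum(s, 1e-12))     # acts as K_mm^{-1/2}
    return lambda X: exp_hellinger_kernel(X, pilots, gamma) @ W

rng = np.random.default_rng(0)
pilots = rng.random((64, 100))                # 64 pilot cover features
phi = nystroem_map(pilots, gamma=0.5, dim=32)  # also reduces dimensionality
print(phi(rng.random((5, 100))).shape)        # (5, 32)
```

The transformed features phi(X) can then be fed to any simple linear classifier, which is where the reported accuracy gains arise.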
Kernel hardening has been an important topic since many applications and security mechanisms often consider the kernel as part of their Trusted Computing Base (TCB). Among various hardening techniques, Kernel Address Space Layout Randomization (KASLR) is the most effective and widely adopted defense mechanism that can practically mitigate various memory corruption vulnerabilities, such as buffer overflow and use-after-free. In principle, KASLR is secure as long as no memory leak vulnerability exists and high entropy is ensured. In this paper, we introduce a highly stable timing attack against KASLR, called DrK, that can precisely de-randomize the memory layout of the kernel without violating any such assumptions. DrK exploits a hardware feature called Intel Transactional Synchronization Extensions (TSX) that is readily available in most modern commodity CPUs. One surprising behavior of TSX, which is essentially the root cause of this security loophole, is that it aborts a transaction without notifying the underlying kernel even when the transaction fails due to a critical error, such as a page fault or an access violation, which traditionally requires kernel intervention. DrK turns this property into a precise timing channel that can determine the mapping status (i.e., mapped versus unmapped) and execution status (i.e., executable versus non-executable) of the privileged kernel address space. In addition to its surprising accuracy and precision, DrK is universally applicable to all OSes, even in virtualized environments, and generates no visible footprint, making it difficult to detect in practice. We demonstrated that DrK can break the KASLR of all major OSes (i.e., Windows, Linux, and OS X) with near-perfect accuracy in under a second. Finally, we propose potential countermeasures that can effectively prevent or mitigate the DrK attack. We urge our community to be aware of the potential threat of having Intel TSX, which is present in most recent Intel CPUs (100% of workstation and 60% of high-end Intel CPUs since Skylake) and is even available on Amazon EC2 (X1).
We develop a systematic approach for analyzing client-server applications that aim to hide sensitive user data from untrusted servers. We then apply it to Mylar, a framework that uses multi-key searchable encryption (MKSE) to build Web applications on top of encrypted data. We demonstrate that (1) the Popa-Zeldovich model for MKSE does not imply security against either passive or active attacks; (2) Mylar-based Web applications reveal users' data and queries to passive and active adversarial servers; and (3) Mylar is generically insecure against active attacks due to system design flaws. Our results show that the problem of securing client-server applications against actively malicious servers is challenging and still unsolved. We conclude with general lessons for the designers of systems that rely on property-preserving or searchable encryption to protect data from untrusted servers.
In a secret sharing scheme, a dealer shares a secret s among n parties such that an adversary corrupting up to t parties does not learn s, while any t+1 parties can efficiently recover s. Over a long period of time all parties may be corrupted, thus violating the threshold, which is accounted for in Proactive Secret Sharing (PSS). PSS schemes periodically rerandomize (refresh) the shares of the secret and invalidate old ones. PSS retains confidentiality even when all parties are corrupted over the lifetime of the secret, but no more than t during a certain window of time, called the refresh period. Existing PSS schemes only guarantee secrecy in the presence of an honest majority, with fewer than n/2 total corruptions during a refresh period; an adversary corrupting a single additional party, even if only passively, obtains the secret. This work is the first feasibility result demonstrating PSS tolerating a dishonest majority: it introduces the first PSS scheme secure against t passive adversaries without recovery of lost shares; it can also recover from honest faulty parties losing their shares; and when tolerating e faults the scheme tolerates t passive corruptions. A non-robust version of the scheme can tolerate t active adversaries, and mixed adversaries that control a combination of passively and actively corrupted parties that are a majority, but where fewer than n/2-e of such corruptions are active. We achieve these high thresholds with O(n^4) communication when sharing a single secret, and O(n^3) communication when sharing multiple secrets in batches.
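A toy sketch of the proactive-refresh idea follows (single-dealer, semi-honest, over a prime field; real PSS protocols distribute the refresh among the parties and additionally handle share recovery and active corruptions): each period, the parties add a fresh sharing of zero, re-randomizing every share while preserving the secret.

```python
import random

P = 2**127 - 1                       # prime field modulus

def share(secret, t, n):
    """Shamir sharing: degree-t polynomial with f(0) = secret."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def refresh(shares, t):
    """Add a fresh sharing of zero: shares change, the secret does not."""
    zero_shares = share(0, t, len(shares))
    return [(x, (y + z) % P) for (x, y), (_, z) in zip(shares, zero_shares)]

def reconstruct(shares):
    """Lagrange interpolation at 0 from any t+1 points."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

old = share(42, t=2, n=5)
new = refresh(old, t=2)
assert reconstruct(new[:3]) == 42            # any t+1 = 3 fresh shares work
# Mixing shares from different periods yields garbage (w.h.p.), which is
# exactly why old, leaked shares become useless after a refresh.
assert reconstruct(old[:2] + new[2:3]) != 42
```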
We develop a novel model-based hardware-in-the-loop (HIL) framework for optimising the energy consumption of embedded software controllers. Controller and plant models are specified as networks of parameterised timed input/output automata and translated into executable code. The controller is encoded into the target embedded hardware, which is connected to a power monitor and interacts with the simulation of the plant model. The framework then generates a power consumption model that maps controller transitions to distributions over power measurements, and is used to optimise the timing parameters of the controller without compromising a given safety requirement. The novelty of our approach is that we measure the real power consumption of the controller and use the data thus obtained for energy optimisation. We employ timed Petri nets as an intermediate representation of the executable specification, which facilitates efficient code generation and fast simulations. Our framework uniquely combines the advantages of rigorous specifications with accurate power measurements and methods for online model estimation, thus enabling the automated design of correct and energy-efficient controllers.
In the Internet of Things (IoT), Internet-connected things provide an influx of data and resources that offer unlimited possibilities for applications and services. Smart City IoT systems refer to things that are distributed over wide physical areas covering a whole city. While this new breed of data and resources looks promising, building applications in such large-scale IoT systems is a difficult task due to the distributed and dynamic nature of the entities involved, such as sensing and actuating devices, people, and computing resources. In this paper, we explore the process of developing Smart City IoT applications from a coordination-based perspective. We show that a distributed coordination model that oversees such a large group of distributed components is necessary for building Smart City IoT applications. In particular, we propose Adaptive Distributed Dataflow, a novel Dataflow-based programming model that focuses on coordinating city-scale distributed systems that are highly heterogeneous and dynamic.
Lightweight cryptography has been widely utilized in the resource-constrained embedded devices of Cyber-Physical System (CPS) terminals. The hostile and unattended environments of many scenarios make those endpoints easy targets for hardware-based attacks. As a resource-efficient countermeasure against fault attacks, parity-based Concurrent Error Detection (CED) is often integrated with the security-critical algorithms in CPS terminals. The parity bit changes if an odd number of faults occurs during cipher execution. In this paper, we analyze the fault-detection effectiveness of a parity-CED-protected cipher (PRESENT) using laser fault injection. The experimental results show that laser perturbation of the encryption can easily flip an even number of data bits, in which case the faults cannot be detected by parity. Due to the similarity of different parity structures, our attack can bypass almost all parity protections in block ciphers. Some suggestions are given to enhance the security of parity implementations.
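A minimal sketch of why this works: a single parity bit is the XOR of all data bits, so any even-weight fault leaves it unchanged (the state and fault masks below are illustrative 8-bit values, not actual PRESENT state):

```python
def parity(x: int) -> int:
    """XOR of all bits of x: 1 if an odd number of bits are set."""
    return bin(x).count("1") & 1

state = 0b10110010
check = parity(state)                 # parity recorded before the fault

single_fault = state ^ 0b00000100     # 1 bit flipped by the laser
double_fault = state ^ 0b00011000     # 2 bits flipped by the laser

print(parity(single_fault) != check)  # True  -> fault detected
print(parity(double_fault) != check)  # False -> fault evades parity CED
```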
Android is the most commonly used mobile device operating system. The core of Android, the System Server (SS), is a multi-threaded process that provides most of the system services. Based on a new understanding of the security risks introduced by the callback mechanism in system services, we have discovered a general type of design flaw. A vulnerability detection tool has been designed and implemented based on static taint analysis. We applied the tool to all 80 system services in the SS of Android 5.1.0. With its help, we have discovered six previously unknown vulnerabilities, which are further confirmed on Android 2.3.7-6.0.1. According to our analysis, about 97.3% of the entire 1.4 billion real-world Android devices are vulnerable. Our proof-of-concept attack proves that the vulnerabilities can enable a malicious app to freeze critical system functionalities or soft-reboot the system immediately, a neat type of denial-of-service attack. We also proved that the attacks can be conducted at mission-critical moments to achieve meaningful goals, such as anti anti-virus, anti process-killer, or hindering app updates or system patching. After being informed, Google confirmed our findings promptly. Several suggestions on how to use callbacks safely are also proposed to Google.
The ability to identify mobile apps in network traffic has significant implications in many domains, including traffic management, malware detection, and maintaining user privacy. App identification methods in the literature typically use deep packet inspection (DPI) and analyze HTTP headers to extract app fingerprints. However, these methods cannot be used if the HTTP traffic is encrypted. We investigate whether Android apps can be identified from their launch-time network traffic using only TCP/IP headers. We first capture the network traffic of 86,109 app launches by repeatedly running 1,595 apps on 4 distinct Android devices. We then use supervised learning methods used previously in the web page identification literature to identify the apps that generated the traffic. We find that: (i) popular Android apps can be identified with 88% accuracy using the packet sizes of the first 64 packets they generate, when the learning methods are trained and tested on data collected from the same device; (ii) when data from an unseen device (but with a similar operating system/vendor) is used for testing, the apps can be identified with 67% accuracy; (iii) the app identification accuracy does not drop significantly even if the training data are stale by several days; and (iv) the accuracy does drop quite significantly if the operating system/vendor is very different. We discuss the implications of our findings as well as open issues.
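A sketch of the classification setup follows (with synthetic stand-in data; the paper's exact learner and feature processing may differ): each launch is represented by the sizes of its first 64 packets and labeled with the app that produced it.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

N_PACKETS = 64
rng = np.random.default_rng(1)

def fake_launch(app_id):
    """Stand-in for a real capture: per-app packet-size pattern."""
    base = rng.normal(loc=200 + 50 * app_id, scale=30, size=N_PACKETS)
    return np.clip(base, 40, 1500)        # plausible Ethernet payload bounds

# 10 hypothetical apps, 50 launches each.
X = np.array([fake_launch(app) for app in range(10) for _ in range(50)])
y = np.repeat(np.arange(10), 50)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[::2], y[::2])                   # train/test split by alternation
print(f"accuracy: {clf.score(X[1::2], y[1::2]):.2f}")
```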
Modern multicore processors feature easily accessible temperature sensors that provide useful information for dynamic thermal management. These sensors were recently shown to be a potential security threat, since otherwise isolated applications can exploit them to establish a thermal covert channel and leak restricted information. Previous research reported experiments documenting the feasibility of (low-rate) communication over this channel, but did not further analyze its fundamental characteristics. For this reason, the important questions of quantifying the channel capacity and achievable rates remain unanswered. To address these questions, we devise and exploit a new methodology that leverages both theoretical results from information theory and experimental data to study these thermal covert channels on modern multicores. We use spectral techniques to analyze data from two representative platforms and estimate the capacity of the channels from a source application to temperature sensors on the same or different cores. We estimate the capacity to be on the order of 300 bits per second (bps) for the same-core channel, i.e., when reading the temperature on the same core where the source application runs, and on the order of 50 bps for the 1-hop channel, i.e., when reading the temperature of the core physically next to the one where the source application runs. Moreover, we show a communication scheme that achieves rates of more than 45 bps on the same-core channel and more than 5 bps on the 1-hop channel, with less than 1% error probability. The highest rate shown in previous work was 1.33 bps on the 1-hop channel with 11% error probability.
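As a reference point for such capacity estimates, a standard information-theoretic sketch (our gloss, not necessarily the paper's exact estimator) treats the sensor readings as a band-limited channel with additive Gaussian noise:

\[ C = \int_0^{B} \log_2\!\left(1 + \frac{S(f)}{N(f)}\right) df, \]

where \(S(f)\) and \(N(f)\) are the power spectral densities of the thermal signal and the noise, and \(B\) is the channel bandwidth. Spectral estimation of \(S(f)/N(f)\) from experimental traces then yields numerical capacity figures like those quoted above.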
The C programming language never performs automatic bounds checking, in order to speed up execution. But bounds checking is necessary: if an index goes out of bounds, serious errors such as endless loops or buffer overflows may occur during execution. When arrays are used in a program, the index of an array must stay within the boundary of the array, but programmers often omit array bounds checking or perform it incorrectly. In this paper, we perform static analysis based on taint analysis and data flow analysis to detect which arrays do not have correct bounds checking in a program. We implement our approach in an automatic static tool, Carraybound, and the experimental results show that Carraybound works effectively and efficiently.
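A miniature illustration of the analysis idea follows (over an invented toy IR, not Carraybound's actual representation of C code): taint indices derived from input, record bounds comparisons, and flag array accesses whose index is tainted and unguarded.

```python
# Toy IR: ("input", var), ("assign", dst, src), ("check", var, bound),
# ("index", array, var). All names are hypothetical.
program = [
    ("input",  "i"),                 # i = read from user (tainted)
    ("assign", "j", "i"),            # j = i  (taint propagates)
    ("check",  "j", "LEN"),          # if (j < LEN) ...
    ("index",  "buf", "j"),          # buf[j]  -- guarded, OK
    ("index",  "buf", "i"),          # buf[i]  -- tainted, unguarded!
]

tainted, checked = set(), set()
for stmt in program:
    op = stmt[0]
    if op == "input":
        tainted.add(stmt[1])
    elif op == "assign" and stmt[2] in tainted:
        tainted.add(stmt[1])         # data-flow step: taint flows to dst
    elif op == "check":
        checked.add(stmt[1])         # bounds comparison guards this var
    elif op == "index":
        arr, idx = stmt[1], stmt[2]
        if idx in tainted and idx not in checked:
            print(f"warning: {arr}[{idx}] lacks a bounds check")
```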
In this paper, we propose new types of cascading attacks against the smart grid that use control command disaggregation and core smart grid services. Although there have been tremendous research efforts on injection attacks against the smart grid, to our knowledge most studies focus on false meter data injection, while false command and false feedback injection attacks have scarcely been investigated. In addition, control command disaggregation has not been addressed from a security point of view, despite the fact that it is becoming one of the core concepts in the smart grid, and hence analysing its security implications is crucial to smart grid security. Our cascading attacks use false control command, false feedback or false meter data injection, and cascade the effects of such injections throughout the smart grid subsystems and components. Our analysis and evaluation results show that the proposed attacks can cause serious service disruptions in the smart grid. The evaluation has been performed on a widely used smart grid simulation platform.
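A small sketch of the disaggregation step shows why falsifying the aggregate cascades: one injected value fans out to every subordinate device (the topology, weights, and setpoints below are invented for illustration):

```python
# Hypothetical feeders and their shares of an aggregate power setpoint.
feeder_weights = {"feeder_A": 0.5, "feeder_B": 0.3, "feeder_C": 0.2}

def disaggregate(total_mw: float) -> dict:
    """Split an aggregate control command across feeders proportionally."""
    return {f: total_mw * w for f, w in feeder_weights.items()}

legitimate = disaggregate(100.0)   # operator intends 100 MW
injected = disaggregate(180.0)     # attacker forges the aggregate command

for feeder in feeder_weights:
    delta = injected[feeder] - legitimate[feeder]
    print(f"{feeder}: setpoint off by {delta:+.1f} MW after disaggregation")
```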
Without violating existing app security enforcement, malicious modules inside apps, such as a library or an external class, can steal private data and abuse sensitive capabilities meant for other modules inside the same apps. These so-called "module-level attacks" are quickly emerging, fueled by the pervasive use of third-party code in apps and the lack of module-level security enforcement on mobile platforms. To systematically thwart the threats, we build CASE, an automatic app patching tool used by app developers to enable module-level security in their apps built for COTS Android devices. During runtime, patched apps enforce developer-supplied security policies that regulate interactions among modules at the granularity of a Java class. Requiring no changes or special support from the Android OS, the enforcement is complete in covering inter-module crossings in apps and is robust against malicious Java and native app modules. We evaluate CASE with 420 popular apps and a set of Android's unit tests. The results show that CASE is fully compatible with the tested apps and incurs an average performance overhead of 4.9%.
At the core of the "Big Data" revolution lie frameworks and systems that allow for the massively parallel processing of large amounts of data. Ironically, while they have been designed for processing large amounts of data, these systems are at the same time major producers of data: to support the administration and management of these huge-scale systems, they are configured to generate detailed log and monitoring data, periodically capturing the system state across all nodes, components and jobs in the system. While such logging information is used routinely by sysadmins for ad hoc troubleshooting and problem diagnosis, we point out that there is tremendous value in analyzing such data from a research point of view. In this talk, we will go over several case studies that demonstrate how measuring and analyzing measurement data from production systems can provide new insights into how systems work and fail, and how these new insights can help in designing better systems.
Smart grid aims to improve control and monitoring routines to ensure reliable and efficient supply of electricity. The rapid advancements in the information and communication technologies of Supervisory Control And Data Acquisition (SCADA) networks, however, have resulted in complex cyber-physical systems. This added complexity has broadened the attack surface of power-related applications, amplifying their susceptibility to cyber threats. A particular class of system integrity attacks against the smart grid is False Data Injection (FDI). In a successful FDI attack, an adversary compromises the readings of grid sensors in such a way that errors introduced into estimates of state variables remain undetected. This paper presents an end-to-end case study of how to instantiate real FDI attacks against the Alternating Current (AC), i.e., nonlinear, State Estimation (SE) process. The attack is realized through firmware modifications of the microprocessor-based remote terminal systems, falsifying the data transmitted to the SE routine, and proceeds regardless of perfect or imperfect knowledge of the current system state. The case study concludes with an investigation of an attack on the IEEE 14-bus system using load data from the New York Independent System Operator (NYISO).
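For reference, the standard stealth condition behind such attacks (stated here in its textbook form for nonlinear SE; the paper's construction additionally covers imperfect knowledge of the system state) is as follows. With measurements \(z = h(x) + e\) and estimate \(\hat{x}\), an attack vector

\[ a = h(\hat{x} + c) - h(\hat{x}) \]

for some state perturbation \(c\) shifts the estimate to \(\hat{x} + c\) while leaving the bad-data detection residual unchanged, since \(r_a = z + a - h(\hat{x} + c) = z - h(\hat{x}) = r\). The injected errors in the state variables therefore pass residual-based detection.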
Call traces, i.e., sequences of function calls and returns, are fundamental to a wide range of program analyses such as bug reproduction, fault diagnosis, performance analysis, and many others. The conventional approach to collecting call traces, which instruments each function call and return site, incurs large space and time overhead. Our approach aims at reducing the recording overhead by instrumenting only a small number of call sites while keeping the ability to recover the full trace. We propose a call trace model and a logged call trace model based on an LL(1) grammar, which enables us to define the criteria for a feasible solution to call trace collection. Based on the two models, we prove that collecting call traces with minimal instrumentation is an NP-hard problem. We then propose an efficient approach to obtaining a suboptimal solution. We implemented our approach as a tool, Casper, and evaluated it using the DaCapo benchmark suite. The experimental results show that our approach incurs significantly lower runtime (and space) overhead than two state-of-the-art approaches.
Malicious Android applications pose serious threats to mobile security. They threaten the data confidentiality and system integrity of Android devices. Monitoring runtime activities serves as an important technique for analyzing dynamic app behaviors. We design a triggering relation model for dynamically analyzing network traffic on Android devices. Our model enables one to infer the dependency of outbound network requests from the device. We describe a new machine learning approach for discovering the dependency of network requests. These request-level dependence relations are used to detect stealthy malware activities. Malicious requests are identified by their lack of dependency on legitimate triggers. Our prototype is evaluated on 14 GB of network traffic data and system logs collected from an Android tablet. Experimental results show that our solution achieves a high accuracy (99.1%) in detecting malicious requests sent by new malicious apps.
SQL injection is one of the major threats to the security of web applications. Attackers try to gain unauthorized access to the database, which holds vital and private information about the users. Many researchers have proposed various techniques and practices to protect web applications from attackers. There is a plethora of techniques available to perform SQL injection, and usually not everyone is familiar with every attack; hence, this kind of attack remains among the most prevalent. In this paper, we present the types of SQL injection attacks and the most dominant ways to prevent them.
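For concreteness, the classic tautology-based injection and its standard defense, parameterized queries, look as follows (toy schema and credentials, shown with Python's sqlite3):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 's3cret')")

name, password = "alice", "' OR '1'='1"     # attacker-supplied input

# Vulnerable: string concatenation lets the input rewrite the query logic,
# turning the WHERE clause into a tautology.
query = f"SELECT * FROM users WHERE name = '{name}' AND password = '{password}'"
print(db.execute(query).fetchall())         # returns the row: login bypassed

# Safe: placeholders keep the input as data, never as SQL syntax.
safe = "SELECT * FROM users WHERE name = ? AND password = ?"
print(db.execute(safe, (name, password)).fetchall())  # [] -> login rejected
```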