Biblio
After a program has crashed and terminated abnormally, it typically leaves behind a snapshot of its crashing state in the form of a core dump. While a core dump carries a large amount of information and has long been used for software debugging, it barely serves as an informative aid in locating software faults, particularly memory corruption vulnerabilities. A memory corruption vulnerability is a special type of software fault that an attacker can exploit to manipulate the content at a certain memory location. As such, a core dump may contain a certain amount of corrupted data, which increases the difficulty of identifying useful debugging information (e.g., the crash point and stack traces). Without a proper mechanism to deal with this problem, a core dump can be practically useless for software failure diagnosis. In this work, we develop CREDAL, an automatic tool that employs the source code of a crashing program to enhance core dump analysis and turn a core dump into an informative aid for tracking down memory corruption vulnerabilities. Specifically, CREDAL systematically analyzes a potentially corrupted core dump and identifies the crash point and stack frames. For a core dump carrying corrupted data, it goes beyond the crash point and stack trace: CREDAL further pinpoints the variables holding corrupted data using the source code of the crashing program along with the stack frames. To assist software developers (or security analysts) in tracking down a memory corruption vulnerability, CREDAL also performs analysis and highlights the code fragments corresponding to the data corruption. To demonstrate the utility of CREDAL, we use it to analyze 80 crashes corresponding to 73 memory corruption vulnerabilities archived in the Offensive Security Exploit Database. We show that CREDAL can accurately pinpoint the crash point and (fully or partially) restore a stack trace even when the crashing program's stack carries corrupted data. In addition, we demonstrate that CREDAL can potentially reduce the manual effort of finding the code fragment that is likely to contain the memory corruption vulnerability.
The modern world has witnessed a dramatic increase in our ability to collect, transmit, and distribute real-time monitoring and surveillance data from large-scale information systems and cyber-physical systems. Detecting system anomalies thus attracts a significant amount of interest in many fields such as security, fault management, and industrial optimization. Recently, the invariant network has been shown to be a powerful way of characterizing complex system behaviours. In an invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components. The structure and evolution of the invariant network, in particular the vanishing correlations, can shed important light on locating causal anomalies and performing diagnosis. However, existing approaches to detecting causal anomalies with the invariant network often use the percentage of vanishing correlations to rank possible causal components, which has several limitations: 1) fault propagation in the network is ignored; 2) the root causal anomalies may not always be the nodes with a high percentage of vanishing correlations; 3) temporal patterns of vanishing correlations are not exploited for robust detection. To address these limitations, in this paper we propose a network-diffusion-based framework to identify significant causal anomalies and rank them. Our approach can effectively model fault propagation over the entire invariant network, and can perform joint inference on both the structural and the time-evolving broken invariance patterns. As a result, it can locate high-confidence anomalies that are truly responsible for the vanishing correlations, and can compensate for unstructured measurement noise in the system. Extensive experiments on synthetic datasets, bank information system datasets, and coal plant cyber-physical system datasets demonstrate the effectiveness of our approach.
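As an illustration of the diffusion idea, the sketch below propagates per-node vanishing-correlation scores over an invariant network with a random-walk-with-restart style iteration and ranks nodes by the resulting score. This is a generic sketch under that assumption, not the paper's exact formulation; the toy network, scores, and restart parameter are invented for the example.

```python
# Generic random-walk-with-restart sketch for propagating vanishing-correlation
# scores over an invariant network. Illustrative only; NOT the paper's exact model.
import numpy as np

def diffusion_rank(adj, broken_score, restart=0.3, tol=1e-8, max_iter=1000):
    """adj: symmetric adjacency matrix of the invariant network.
    broken_score: per-node fraction of broken (vanishing) invariant edges."""
    # Column-normalize so scores diffuse along edges of the network.
    col_sums = adj.sum(axis=0)
    W = adj / np.where(col_sums == 0, 1, col_sums)
    e = broken_score / broken_score.sum()          # seed (restart) distribution
    r = e.copy()
    for _ in range(max_iter):
        r_new = (1 - restart) * W @ r + restart * e
        if np.abs(r_new - r).sum() < tol:
            break
        r = r_new
    return r                                       # higher score => more likely root cause

# Toy invariant network: node 0 connects to nodes 1 and 2; node 1 has mostly broken edges.
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
broken = np.array([0.2, 0.9, 0.1])
print(diffusion_rank(adj, broken))
```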
Workflow-centric tracing captures the workflow of causally-related events (e.g., work done to process a request) within and among the components of a distributed system. As distributed systems grow in scale and complexity, such tracing is becoming a critical tool for understanding distributed system behavior. Yet, there is a fundamental lack of clarity about how such infrastructures should be designed to provide maximum benefit for important management tasks, such as resource accounting and diagnosis. Without research into this important issue, there is a danger that workflow-centric tracing will not reach its full potential. To help, this paper distills the design space of workflow-centric tracing and describes key design choices that can help or hinder a tracing infrastructure's utility for important tasks. Our design space and the design choices we suggest are based on our experiences developing several previous workflow-centric tracing infrastructures.
We propose an exact algorithm for model-free diagnosis with an application to fault localization in digital circuits. We assume that a faulty circuit and a correctness specification, e.g., in terms of an un-optimized reference circuit, are available. Our algorithm computes the exact set of all minimal diagnoses up to cardinality k, considering all possible assignments to the primary inputs of the circuit. This exact diagnosis problem can be naturally formulated and solved using an oracle for Quantified Boolean Satisfiability (QSAT). Our algorithm instead uses Boolean Satisfiability (SAT) to compute the exact result more efficiently. We implemented the approach and present experimental results for determining fault candidates of digital circuits with seeded faults at the gate level. The experiments show that the presented SAT-based approach outperforms state-of-the-art techniques for solving instances of the QSAT problem by several orders of magnitude while having the same accuracy. Moreover, in contrast to QSAT, the SAT-based algorithm has any-time behavior, i.e., at any time during the computation, an approximation of the exact result is available that can be used as a starting point for debugging. The result improves as time progresses until eventually the exact result is obtained.
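As a very rough illustration of the SAT-based flavour of this kind of approach, the sketch below enumerates minimal diagnoses over per-gate "abnormal" selector variables using a SAT oracle (PySAT) and simple subset checks, for one fixed input/output observation. It is a brute-force toy, not the paper's algorithm: the paper's method is far more efficient and, crucially, also reasons over all primary-input assignments. The two-buffer circuit encoding and variable numbering are invented for the example.

```python
# Brute-force sketch of SAT-based diagnosis enumeration with "abnormal" selector
# variables, for ONE fixed observation. Illustrative only; not the paper's algorithm.
from itertools import combinations
from pysat.solvers import Minisat22

# Variables: 1=in, 2=mid, 3=out, 4=ab(gate1), 5=ab(gate2).
# Each gate is a buffer: healthy (ab false) => output equals input.
clauses = [
    [4, -1, 2], [4, 1, -2],   # gate1: ab1 or (mid <-> in)
    [5, -2, 3], [5, 2, -3],   # gate2: ab2 or (out <-> mid)
    [1],                      # observed input  = 1
    [-3],                     # observed output = 0 (inconsistent if all gates healthy)
]
ab_vars = [4, 5]

def minimal_diagnoses(clauses, ab_vars, k):
    diagnoses = []
    with Minisat22(bootstrap_with=clauses) as solver:
        for size in range(k + 1):
            for subset in combinations(ab_vars, size):
                # Skip supersets of an already-found (smaller) diagnosis.
                if any(set(d) <= set(subset) for d in diagnoses):
                    continue
                assumptions = [v if v in subset else -v for v in ab_vars]
                if solver.solve(assumptions=assumptions):
                    diagnoses.append(subset)
    return diagnoses

print(minimal_diagnoses(clauses, ab_vars, k=1))   # -> [(4,), (5,)]
```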
Energy use of buildings represents roughly 40% of overall energy consumption. Most national agendas contain goals related to reducing energy consumption and the carbon footprint. Timely and accurate fault detection and diagnosis (FDD) in building management systems (BMS) has the potential to reduce energy consumption costs by approximately 15-30%. Most FDD methods are data-based, meaning that their performance is tightly linked to the quality and availability of relevant data. In our experience, data on faults and relevant events are very sparse and inadequate, mostly because of the lack of will and incentive among those who would need to keep track of faults. In this paper we introduce the idea of using crowdsourcing to support FDD data collection processes, and illustrate our idea through a mobile application that has been implemented for this purpose. Furthermore, we propose a strategy for successfully deploying this building occupants' crowdsourcing application.
For accurate fault localization and recognition in analog circuits, the feature extraction of fault information is an extremely important step. Our approach is based on an experimental circuit into which failure modes are injected in order to collect the fault sample set. We compare two feature extraction methods: the combination of wavelet transform and principal component analysis, and factor analysis. An extreme learning machine is then trained on the extracted features to perform diagnosis, and the two methods are compared in terms of diagnostic accuracy. The experimental results show that, after processing with either of these two methods, the data obtained from the experimental circuit allow the fault location to be identified quickly.
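As a generic illustration of the train-and-diagnose pipeline sketched above (dimensionality-reduced features fed to an extreme learning machine), the snippet below uses synthetic data; it is not the paper's code, and the wavelet step is omitted.

```python
# Generic PCA + extreme learning machine (ELM) pipeline on synthetic fault data.
# Illustrative only; not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))            # raw fault signatures (e.g., sampled responses)
y = rng.integers(0, 4, size=200)          # fault class labels (4 fault modes)

# PCA: project onto the top principal components of the centered data.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
features = Xc @ Vt[:10].T                 # keep 10 components

# ELM: random hidden layer, output weights fitted by least squares.
n_hidden = 50
W = rng.normal(size=(features.shape[1], n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(features @ W + b)             # hidden-layer activations
T = np.eye(4)[y]                          # one-hot targets
beta, *_ = np.linalg.lstsq(H, T, rcond=None)

pred = np.argmax(np.tanh(features @ W + b) @ beta, axis=1)
print("training accuracy:", (pred == y).mean())
```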
This paper considers the problem of system-level fault diagnosis in highly dynamic networks. The existing fault diagnostic models deal mainly with static faults and have limited capabilities to handle dynamic networks. These fault diagnostic models are based on timers that work on a simple timeout mechanism to identify the node status, and often make simplistic assumptions for system implementations. To overcome the above problems, we propose a time-free comparison-based diagnostic model. Unlike the traditional models, the proposed model does not rely on timers and is more suitable for use in dynamic network environments. We also develop a novel comparison-based fault diagnosis protocol for identifying and diagnosing dynamic faults. The performance of the protocol has been analyzed and its correctness has been proved.
Fault diagnosis in IT environments is complicated because (i) most monitors have shared specificity (e.g., high memory utilization can result from a large number of causes), (ii) it is hard to deploy and maintain enough sensors to ensure adequate coverage, and (iii) some functionality may be provided as-a-service by external parties, with limited visibility and simultaneous availability of alert data. To systematically incorporate uncertainty and to fuse information from multiple sources, we propose the use of the Dempster-Shafer Theory (DST) of evidential reasoning for fault diagnosis and show its efficacy in the context of a distributed application.
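The core fusion operation in such a DST-based approach is Dempster's rule of combination. A minimal sketch follows; the frame of discernment, monitors, and mass values are invented for illustration and do not reflect the paper's actual monitor-to-hypothesis mappings.

```python
# Minimal sketch of Dempster's rule of combination for fusing two evidence sources.
from itertools import product

def combine(m1, m2):
    """Combine two mass functions given as {frozenset(hypotheses): mass}."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    # Normalize by the non-conflicting mass (Dempster's normalization).
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Two monitors with shared specificity over possible causes {disk, memory, network}.
m_cpu   = {frozenset({"memory", "disk"}): 0.7, frozenset({"memory", "disk", "network"}): 0.3}
m_alert = {frozenset({"memory"}): 0.6,         frozenset({"memory", "disk", "network"}): 0.4}
print(combine(m_cpu, m_alert))   # mass concentrates on {memory}
```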
Self-describing key-value data formats such as JSON are becoming increasingly popular as application developers choose to avoid the rigidity imposed by the relational model. Database systems designed for these self-describing formats, such as MongoDB, encourage users to use denormalized, heavily nested data models so that relationships across records and other schema information need not be predefined or standardized. Such data models contribute to long-term development complexity, as their lack of explicit entity and relationship tracking burdens new developers unfamiliar with the dataset. Furthermore, the large amount of data repetition present in such data layouts can introduce update anomalies and poor scan performance, which reduce both the quality and performance of analytics over the data. In this paper we present an algorithm that automatically transforms the denormalized, nested data commonly found in NoSQL systems into traditional relational data that can be stored in a standard RDBMS. This process includes a schema generation algorithm that discovers relationships across the attributes of the denormalized datasets in order to organize those attributes into relational tables. It further includes a matching algorithm that discovers sets of attributes that represent overlapping entities and merges those sets together. These algorithms reduce data repetition, allow the use of data analysis tools targeted at relational data, accelerate scan-intensive algorithms over the data, and help users gain a semantic understanding of complex, nested datasets.
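To give a flavour of the kind of transformation involved, the sketch below mechanically shreds a nested JSON record into flat per-table rows with parent references. It is a simplified illustration only; the paper's schema generation (relationship discovery) and entity matching/merging algorithms are not reproduced, and the table and column naming scheme here is an assumption.

```python
# Simplified shredding of a denormalized, nested JSON record into flat relational rows.
# Illustrative only; not the paper's algorithm.
import json

def shred(record, table, row_id, tables, parent=None):
    """Recursively split one nested record into per-table rows."""
    row = {"_id": row_id, "_parent": parent}
    for key, value in record.items():
        if isinstance(value, dict):
            shred(value, f"{table}_{key}", f"{row_id}.{key}", tables, parent=row_id)
        elif isinstance(value, list):
            for i, item in enumerate(value):
                if isinstance(item, dict):
                    shred(item, f"{table}_{key}", f"{row_id}.{key}[{i}]", tables, parent=row_id)
                else:
                    tables.setdefault(f"{table}_{key}", []).append(
                        {"_id": f"{row_id}.{key}[{i}]", "_parent": row_id, "value": item})
        else:
            row[key] = value
    tables.setdefault(table, []).append(row)

doc = json.loads('{"name": "Ada", "orders": [{"sku": "A1", "qty": 2}], "address": {"city": "Boston"}}')
tables = {}
shred(doc, "customer", "c1", tables)
for name, rows in tables.items():
    print(name, rows)
```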
With data becoming available in larger quantities and at higher rates, new data processing paradigms have been proposed to handle high-volume, fast-moving data. Data Stream Processing is one such paradigm, wherein transient data streams flow through sets of continuous queries, only returning results when the data is of interest to the querier. To avoid the large costs associated with maintaining the infrastructure required for processing these data streams, many companies outsource their computation to third-party cloud services. This outsourcing, however, can lead to private data being accessed by parties that a data provider may not trust. The literature offers solutions to this confidentiality and access control problem, but they fall short of providing a complete solution due to either immense overheads or the trust requirements placed on the third-party services. To address these issues, we have developed PolyStream, an enhancement to existing data stream management systems that enables data providers to specify attribute-based access control policies that are cryptographically enforced while simultaneously allowing many types of in-network data processing. We detail the access control models and mechanisms used by PolyStream, and describe a novel use of security punctuations that enables flexible, online policy management and key distribution. We detail how queries are submitted and executed using an unmodified Data Stream Management System, and show through an extensive evaluation that PolyStream yields a 550x performance gain versus StreamForce, the state-of-the-art system presented at CODASPY 2014, while providing greater functionality to the querier.
The World Wide Web has become the most common platform for building applications and delivering content. Yet despite years of research, the web continues to face severe security challenges related to data integrity and confidentiality. Rather than continuing the exploit-and-patch cycle, we propose addressing these challenges at an architectural level, by supplementing the web's existing connection-based and server-based security models with a new approach: content-based security. With this approach, content is directly signed and encrypted at rest, enabling it to be delivered via any path and then validated by the browser. We explore how this new architectural approach can be applied to the web and analyze its security benefits. We then discuss a broad research agenda to realize this vision and the challenges that must be overcome.
This paper introduces the design and implementation of a security scheme for the Internet of Things (IoT) based on ECQV implicit certificates and the Datagram Transport Layer Security (DTLS) protocol. In the proposed security scheme, the elliptic-curve-cryptography-based ECQV implicit certificate plays a key role, allowing mutual authentication and key establishment between two resource-constrained IoT devices. We present how IoT devices obtain ECQV implicit certificates and use them for authenticated key exchange in DTLS. An evaluation of the execution time of the implementation is also conducted to assess the efficiency of the solution.
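For reference, the step that makes ECQV implicit certificates attractive for constrained devices is public-key reconstruction: the certificate Cert_U carries only a reconstruction point P_U rather than an explicit public key plus a CA signature, and any verifier recomputes the device's public key from P_U and the CA public key Q_CA. A sketch of the standard reconstruction relations, with notation assumed to follow the usual ECQV description (H a hash of the certificate, n the group order, k_U the requester's ephemeral secret, r the CA's private contribution returned with the certificate):

\[ e = H(\mathrm{Cert}_U), \qquad Q_U = e \cdot P_U + Q_{CA}, \qquad d_U = (e \cdot k_U + r) \bmod n, \]

so that d_U is a valid private key matching the reconstructed public key Q_U.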
We study a sensor network setting in which samples are encrypted individually using different keys and maintained on a cloud storage. For large systems, e.g. those that generate several millions of samples per day, fine-grained sharing of encrypted samples is challenging. Existing solutions, such as Attribute-Based Encryption (ABE) and Key Aggregation Cryptosystem (KAC), can be utilized to address the challenge, but only to a certain extent. They are often computationally expensive and thus unlikely to operate at scale. We propose an algorithmic enhancement and two heuristics to improve KAC's key reconstruction cost, while preserving its provable security. The improvement is particularly significant for range and down-sampling queries, reducing the reconstruction cost from quadratic to linear running time. Our experimental study shows that for queries of 32k samples, the proposed fast reconstruction techniques speed up the original KAC by at least 90 times on range and down-sampling queries, and by eight times on general (arbitrary) queries. It also shows that, at the expense of splitting the query into 16 sub-queries and correspondingly issuing that number of different aggregated keys, reconstruction time can be reduced by 19 times. As such, the proposed techniques make KAC more applicable in practical scenarios such as sensor networks or the Internet of Things.
The semantics of online authentication in the web are rather straightforward: if Alice has a certificate binding Bob's name to a public key, and if a remote entity can prove knowledge of Bob's private key, then (barring key compromise) that remote entity must be Bob. However, in reality, many websites, including the majority of the most popular ones, are hosted at least in part by third parties such as Content Delivery Networks (CDNs) or web hosting providers. Put simply: administrators of websites who deal with (extremely) sensitive user data are giving their private keys to third parties. Importantly, this sharing of keys is undetectable by most users, and widely unknown even among researchers. In this paper, we perform a large-scale measurement study of key sharing in today's web. We analyze the prevalence with which websites trust third-party hosting providers with their secret keys, as well as the impact that this trust has on responsible key management practices, such as revocation. Our results reveal that key sharing is extremely common, with a small handful of hosting providers having keys from the majority of the most popular websites. We also find that hosting providers often manage their customers' keys, and that they tend to react more slowly yet more thoroughly to compromised or potentially compromised keys.
Security threats may hinder the large-scale adoption of emerging Internet of Things (IoT) technologies. Although efforts have already been made in the direction of data integrity preservation, confidentiality, and privacy, several issues are still open. The existing solutions are mainly based on encryption techniques, but no attention is actually paid to key management. A clever key distribution system, along with a key replacement mechanism, is essential for assuring a secure approach. In this paper, two popular key management systems, conceived for wireless sensor networks, are integrated into a real IoT middleware and compared in order to evaluate their performance in terms of overhead, delay, and robustness against malicious attacks.
The storage and management of secrets (encryption keys, passwords, etc) are significant open problems in the age of ephemeral, cloud-based computing infrastructure. How do we store and control access to the secrets necessary to configure and operate a range of modern technologies without sacrificing security and privacy requirements or significantly curtailing the desirable capabilities of our systems? To answer this question, we propose Tutamen: a next-generation secret-storage service. Tutamen offers a number of desirable properties not present in existing secret-storage solutions. These include the ability to operate across administrative domain boundaries and atop minimally trusted infrastructure. Tutamen also supports access control based on contextual, multi-factor, and alternate-band authentication parameters. These properties have allowed us to leverage Tutamen to support a variety of use cases not easily realizable using existing systems, including supporting full-disk encryption on headless servers and providing fully-featured client-side encryption for cloud-based file-storage services. In this paper, we present an overview of the secret-storage challenge, Tutamen's design and architecture, the implementation of our Tutamen prototype, and several of the applications we have built atop Tutamen. We conclude that Tutamen effectively eases the secret-storage burden and allows developers and systems administrators to achieve previously unattainable security-oriented goals while still supporting a wide range of feature-oriented requirements.
Technological changes bring great efficiencies and opportunities; however, they also bring new threats and dangers that users are often ill prepared to handle. Some individuals have training at work or school, while others have family or friends to help them. However, there are few widely known or ubiquitous educational programs to inform and motivate users to develop safe cybersecurity practices. Additionally, little is known about learning strategies in this domain. Understanding how active Internet users have learned their security practices can give insight into more effective learning methods. I surveyed 800 online labor workers to discover their learning processes. They shared how they had to construct their own schema and negotiate meaning in a complex domain. Findings suggest a need to help users build a dynamic mental model of security. Participants recommend encouraging participatory and constructive learning, multimodal dissemination, and ubiquitous opportunities for learning security behaviors.
Information Systems curricula require ongoing and frequent review [2] [11]. Furthermore, such curricula must be flexible because of the fast-paced, dynamic nature of the workplace. Such flexibility can be maintained by modernizing course content, including exchanging hardware or software for newer versions. Alternatively, flexibility can arise from incorporating new information into curricula from other disciplines. One field where the pace of change is extremely high is cybersecurity [3]. Students are left with outdated skills when curricula lag behind the pace of change in industry. For example, cryptography is a required learning objective in the DHS/NSA Center of Academic Excellence (CAE) knowledge criteria [1]. However, the overarching curriculum associated with basic ciphers has gone unchanged for decades. Indeed, a general problem in cybersecurity education is that students lack fundamental knowledge in areas such as ciphers [5]. In response, researchers have developed a variety of interactive classroom visualization tools [5] [8] [9]. Such tools visualize the standard approach to frequency analysis of simple substitution ciphers, which includes reviewing the most common single letters in the ciphertext. While fundamental ciphers such as the monoalphabetic substitution cipher have not been updated (these are historical ciphers), collective understanding of how humans interact with language has changed. Updated understanding in both English language pedagogy [10] [12] and automated cryptanalysis of substitution ciphers [4] potentially renders the interactive classroom visualization tools incomplete or outdated. Classroom visualization tools are powerful teaching aids, particularly for abstract concepts. Existing research has established that such tools promote an active learning environment that translates to not only effective learning conditions but also higher student retention rates [7]. However, visualization tools require extensive planning and design when used to actively engage students with detailed, specific knowledge units such as ciphers [7] [8]. Accordingly, we propose a heatmap-based frequency analysis visualization solution that (a) incorporates digraph and trigraph language-processing norms and (b) enhances the active learning pedagogy inherent in visualization tools. Preliminary results indicate that study participants take approximately 15% longer to learn the heatmap-based frequency analysis technique compared to traditional frequency analysis but demonstrate a 50% increase in efficacy when tasked with solving simple substitution ciphers. Further, a heatmap-based solution contributes positively to the field insofar as educators have an additional tool to use in the classroom. As well, the heatmap visualization tool may allow researchers to comparatively examine the efficacy of visualization tools in the cryptanalysis of monoalphabetic substitution ciphers.
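For a concrete sense of the digraph statistics such a heatmap visualizes, the short sketch below counts adjacent letter pairs in a ciphertext and arranges them as a 26x26 matrix ready for heatmap rendering. It is a generic illustration only; the sample ciphertext is a simple Caesar-shifted sentence invented for the example, and the classroom tool's actual interface is not reproduced.

```python
# Count ciphertext digraphs into a 26x26 matrix (rows = first letter, cols = second).
import string
import numpy as np

def digraph_matrix(text):
    # Keep letters only (so pairs may span word boundaries).
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    counts = np.zeros((26, 26), dtype=int)
    for a, b in zip(letters, letters[1:]):
        counts[ord(a) - 65, ord(b) - 65] += 1
    return counts

# Caesar-shifted sample ciphertext (shift of 4), invented for illustration.
ciphertext = "XLIVI MW RS WYGL XLMRK EW E JVII PYRGL"
M = digraph_matrix(ciphertext)

# Report the five most frequent digraphs -- the cells a heatmap would highlight.
for flat in np.argsort(M, axis=None)[::-1][:5]:
    i, j = divmod(int(flat), 26)
    print(string.ascii_uppercase[i] + string.ascii_uppercase[j], int(M[i, j]))
```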
Recent years have witnessed a flourishing of hands-on cybersecurity labs and competitions. The information technology (IT) education community has recognized their significant role in boosting students' interest in security and enhancing their security knowledge and skills. Compared to the focus on individual-based education materials, much less attention has been paid to the development of tools and materials suitable for team-based security practices, which, however, prevail in real-world environments. One major bottleneck is the lack of suitable platforms for this type of practice in the IT education community. In this paper, we propose a low-cost, team-oriented cybersecurity practice platform called Platoon. The Platoon platform allows for quickly and automatically creating one or more virtual networks that mimic real-world corporate networks using a regular computer. The virtual environment created by Platoon is suitable for cybersecurity labs, competitions, and projects. The performance data and user feedback collected from our cyber-defense exercises indicate that Platoon is practical and useful for enhancing students' security learning outcomes.
The demand for trained cybersecurity operators is growing more quickly than traditional programs in higher education can fill it. At the same time, unemployment among returning military veterans has become a nationally discussed problem. We describe the design and launch of New Skills for a New Fight (NSNF), an intensive, one-year program to train military veterans for the cybersecurity field. This non-traditional program, which leverages the experience that veterans gained in military service, includes recruitment and selection, a base of knowledge in the form of four university courses taken in a simultaneous cohort mode, a period of hands-on cybersecurity training, industry certifications, and a practical internship in a Security Operations Center (SOC). Twenty veterans entered this pilot program in January 2016 and will complete it in less than a year's time. Initially funded by a global financial services company, the program provides veterans with an expense-free preparation for an entry-level cybersecurity job.
This panel will discuss and debate what role(s) the information technology discipline should have in cybersecurity. Diverse viewpoints will be considered including current and potential ACM curricular recommendations, current and potential ABET and NSA accreditation criteria, the emerging cybersecurity discipline(s), consideration of government frameworks, the need for a multi-disciplinary approach to cybersecurity, and what aspects of cybersecurity should be under information technology's purview.
In this special session, members of the ACM Joint Task Force on Cyber Education to Develop Undergraduate Curricular Guidance will provide an overview of the task force mission, objectives, and work plan. After the overview, task force members will engage session participants in the curricular development process.
By connecting devices, people, vehicles, and infrastructures everywhere in a city, governments and their partners can improve community wellbeing and other economic and financial aspects (e.g., cost and energy savings). Nonetheless, smart cities are complex ecosystems that comprise many different stakeholders (network operators, managed service providers, logistic centers...) who must work together to provide the best services and unlock the commercial potential of the IoT. This is one of the major challenges facing today's smart city movement, and more generally the IoT as a whole. Indeed, while new smart connected objects hit the market every day, they mostly feed "vertical silos" (e.g., vertical apps, siloed apps...) that are closed to the rest of the IoT, thus hampering developers from producing new added value across multiple platforms. Within this context, the contribution of this paper is twofold: (i) to present the EU vision and ongoing activities to overcome the problem of vertical silos; (ii) to introduce recent IoT standards used as part of a recent Horizon 2020 IoT project to address this problem. The implementation of these standards for enhanced sporting event management in a smart city/government context (FIFA World Cup 2022) is developed, presented, and evaluated as a proof of concept.
We propose a new voting scheme, BeleniosRF, that offers both receipt-freeness and end-to-end verifiability. It is receipt-free in a strong sense, meaning that even dishonest voters cannot prove how they voted. We provide a game-based definition of receipt-freeness for voting protocols with non-interactive ballot casting, which we name strong receipt-freeness (sRF). To our knowledge, sRF is the first game-based definition of receipt-freeness in the literature, and it has the merit of being particularly concise and simple. Built upon the Helios protocol, BeleniosRF inherits its simplicity and does not require any anti-coercion strategy from the voters. We implement BeleniosRF and show its feasibility on a number of platforms, including desktop computers and smartphones.
This paper presents a holistic approach to cyber resilience as a means of preparing for the "unknown unknowns". Principles of augmented cyber risk management and a resilience management model at the national level are presented, with elaboration on multi-stakeholder engagement and partnership for the implementation of a national cyber resilience collaborative framework. The complementarity of governance, law, and business/industry initiatives is outlined, with examples of the collaborative resilience model for the Bulgarian national strategy and its multi-national engagements.