Bibliography
The fabrication process introduces inherent variability into the attributes of transistors (in particular length, width, and oxide thickness). As a result, every chip is physically unique. This physical uniqueness of microelectronic components can be used for multiple security applications. Physically Unclonable Functions (PUFs) are built to extract the physical uniqueness of microelectronic components and make it usable for secure applications. However, the microelectronic components used in PUF designs suffer from external, environmental variations that impact PUF behavior. Variations of temperature gradients during manufacturing can bias the PUF responses. Variations of temperature or thermal noise during PUF operation change the behavior of the circuit and can introduce errors into PUF responses. Detailed knowledge of the behavior of PUFs operating under various environmental factors is needed to reliably extract and demonstrate the uniqueness of chips. In this work, we present a detailed and exhaustive analysis of the behavior of two PUF designs: a ring-oscillator PUF and a timing-path-violation PUF. We have implemented both PUFs on Xilinx FPGAs and analyzed their behavior while varying temperature and supply voltage. Our experiments quantify the robustness of each design, demonstrate their sensitivity to temperature, and show the impact that supply voltage has on the uniqueness of the analyzed PUFs.
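As a minimal illustration of how a ring-oscillator PUF turns process variation into bits, the sketch below compares the frequencies of paired oscillators and emits one response bit per pair. The Gaussian variation model and all numbers are illustrative assumptions, not the paper's implementation.

```java
import java.util.Random;

// Minimal simulation of ring-oscillator PUF response extraction:
// each bit is the sign of the frequency difference of an RO pair.
public class RoPufSketch {
    public static void main(String[] args) {
        Random variation = new Random(42);  // stands in for per-chip process variation
        int bits = 64;
        double[] freqA = new double[bits], freqB = new double[bits];
        for (int i = 0; i < bits; i++) {
            // nominal 100 MHz plus chip-unique Gaussian variation (assumed model)
            freqA[i] = 100e6 + variation.nextGaussian() * 1e5;
            freqB[i] = 100e6 + variation.nextGaussian() * 1e5;
        }
        StringBuilder response = new StringBuilder();
        for (int i = 0; i < bits; i++) {
            // response bit = 1 if oscillator A is faster than its paired oscillator B
            response.append(freqA[i] > freqB[i] ? '1' : '0');
        }
        System.out.println("PUF response: " + response);
    }
}
```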
Although computing students may enjoy it when their instructors teach using analogies, it is unknown to what extent these analogies are useful for their learning. This study examines the value of analogies used to introduce three introductory computing topics. The value of these analogies may be evident during the teaching process itself (short term), in subsequent exams (long term), or in students' ability to apply their understanding to related non-technical areas (transfer). Comparing results between an experimental group (analogy) and a control group (no analogy), we find potential value for analogies in short-term learning. However, no solid evidence was found to support analogies as valuable for students in the long term or for knowledge transfer. Specific demographic groups were examined, and promising preliminary findings are presented.
The web is flooded with multimedia sources such as images, videos, animations, and audio, which has in turn led computer vision researchers to focus on extracting content from these sources. Scene text recognition basically involves two major steps, namely text localization and text recognition. This paper presents an end-to-end text recognition approach to extract characters from complex natural scenes. Candidate objects are localized using Maximally Stable Extremal Regions (MSER), edges are identified using the Canny edge detector, binary classification with a connected-component method then segregates text from non-text objects, and finally stroke analysis is applied to analyze the style of each character, leading to character recognition. Experimental results were obtained by testing the approach on the ICDAR 2015 dataset, where text was recognized in most of the scene images with good precision.
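A hedged sketch of the first two pipeline stages (MSER localization and Canny edge detection) using OpenCV's Java bindings; the input file name and thresholds are illustrative, and the later connected-component and stroke-analysis stages are only indicated in comments.

```java
import org.opencv.core.*;
import org.opencv.features2d.MSER;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
import java.util.ArrayList;
import java.util.List;

// First two stages of the pipeline: MSER region localization and Canny edges.
public class TextLocalizationSketch {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME); // requires OpenCV's Java bindings
        Mat scene = Imgcodecs.imread("scene.jpg");    // hypothetical input image
        Mat gray = new Mat();
        Imgproc.cvtColor(scene, gray, Imgproc.COLOR_BGR2GRAY);

        // Localize candidate (possibly textual) regions with MSER.
        MSER mser = MSER.create();
        List<MatOfPoint> regions = new ArrayList<>();
        MatOfRect boxes = new MatOfRect();
        mser.detectRegions(gray, regions, boxes);

        // Identify edges with the Canny detector (thresholds are illustrative).
        Mat edges = new Mat();
        Imgproc.Canny(gray, edges, 100, 200);

        System.out.println("MSER candidate regions: " + regions.size());
        // Subsequent stages (connected-component text/non-text classification
        // and stroke analysis) would filter and recognize these candidates.
    }
}
```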
In this article, I present the questions that I seek to answer in my PhD research. I propose to analyze natural language text with the help of semantic annotations and to mine important events for navigating large text corpora. Semantic annotations such as named entities, geographic locations, and temporal expressions can help us mine events from a given corpus. These events provide a useful means of discovering the knowledge locked in such corpora. I pose three problems that can help unlock this knowledge vault in semantically annotated text corpora: (i) identifying important events; (ii) semantic search; and (iii) event analytics.
We study the trade-off between the benefits obtained by communication and the risks due to exposure of the transmitter's location. To study this problem, we introduce a game between two teams of mobile agents, the P-bots team and the E-bots team. The E-bots attempt to eavesdrop and collect information while evading the P-bots; the P-bots attempt to prevent this by patrolling and pursuing. The game models a typical use case of micro-robots, i.e., their use for (industrial) espionage. We evaluate strategies for both teams, using analysis and simulations.
Software security is an important concern as the world moves toward information technology. Detecting software vulnerabilities is a difficult and resource-consuming task. Automatic vulnerability prediction would therefore help development teams predict vulnerability-prone components and prioritize security inspection efforts. Software source code metrics and data mining techniques have recently been used to predict vulnerability-prone components. Some previous studies used a set of unit complexity and coupling metrics to predict vulnerabilities. In this study, we first compare the predictive power of these two groups of metrics in cross-project vulnerability prediction, where the prediction model is built from the datasets of completely different projects and used to detect vulnerabilities in another project. The experimental results show that unit complexity metrics are stronger vulnerability predictors than coupling metrics. We then propose a new set of coupling metrics called Included Vulnerable Header (IVH) metrics. These new coupling metrics, which consider the interaction of application modules with the outside of the application, predict vulnerabilities considerably better than regular coupling metrics. Furthermore, adding IVH metrics to the set of complexity metrics improves the recall of the best predictor from 60.9% to 87.4%, yielding the best set of metrics for cross-project vulnerability prediction.
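To make the cross-project setting concrete, here is a minimal sketch with illustrative numbers and a deliberately trivial nearest-centroid classifier (not the study's actual models): train on one project's labeled unit-complexity metrics, then classify units of a different project.

```java
// Cross-project prediction in miniature: learn centroids of vulnerable vs.
// clean units from a training project, classify units of a target project.
public class CrossProjectSketch {
    static double centroid(double[] complexities) { // mean complexity of one class of units
        double sum = 0;
        for (double c : complexities) sum += c;
        return sum / complexities.length;
    }
    public static void main(String[] args) {
        // Training project: cyclomatic complexity of vulnerable / clean units (toy data).
        double cVuln  = centroid(new double[]{38, 51, 44});
        double cClean = centroid(new double[]{7, 12, 9, 15});

        // Target project (a completely different code base): classify each unit
        // by the nearer centroid learned from the training project.
        for (double complexity : new double[]{41, 8, 22}) {
            boolean predicted = Math.abs(complexity - cVuln) < Math.abs(complexity - cClean);
            System.out.printf("complexity=%.0f -> %s%n", complexity,
                    predicted ? "vulnerability-prone" : "likely clean");
        }
    }
}
```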
In this paper, we introduce Entropy/IP: a system that discovers Internet address structure based on analyses of a subset of IPv6 addresses known to be active, i.e., training data, gleaned by readily available passive and active means. The system is completely automated and employs a combination of information-theoretic and machine learning techniques to probabilistically model IPv6 addresses. We present results showing that our system is effective in exposing structural characteristics of portions of the active IPv6 Internet address space, populated by clients, services, and routers. In addition to visualizing the address structure for exploration, the system uses its models to generate candidate addresses for scanning. For each of 15 evaluated datasets, we train on 1K addresses and generate 1M candidates for scanning. We achieve some success in 14 datasets, finding up to 40% of the generated addresses to be active. In 11 of these datasets, we find active network identifiers (e.g., /64 prefixes or "subnets") not seen in training. Thus, we provide the first evidence that it is practical to discover subnets and hosts by scanning probabilistically selected areas of the IPv6 address space not known to contain active hosts a priori.
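A small sketch of the information-theoretic ingredient of such a system: computing the Shannon entropy of each nibble position across a set of fully expanded IPv6 addresses. The toy addresses are illustrative; Entropy/IP combines this kind of entropy analysis with machine-learned models of address segments.

```java
import java.util.HashMap;
import java.util.Map;

// Shannon entropy per nibble position over a (toy) IPv6 training sample.
// High-entropy positions vary freely; low-entropy positions reveal structure.
public class NibbleEntropySketch {
    public static void main(String[] args) {
        String[] addrs = { // fully expanded addresses, hex without colons (32 nibbles)
            "20010db8000000010000000000000001",
            "20010db8000000020000000000000001",
            "20010db80000000a0000000000000001",
        };
        for (int pos = 0; pos < 32; pos++) {
            Map<Character, Integer> counts = new HashMap<>();
            for (String a : addrs)
                counts.merge(a.charAt(pos), 1, Integer::sum);
            double h = 0;
            for (int c : counts.values()) {
                double p = (double) c / addrs.length;
                h -= p * Math.log(p) / Math.log(2); // entropy in bits
            }
            if (h > 0) System.out.printf("nibble %2d: H = %.2f bits%n", pos, h);
        }
    }
}
```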
Virtual reality allows users to experience unusual immersive environments. Several aspects of design for virtual reality still need more investigation, such as transitioning between environments. Multiple studies have shown that physical movement in a virtual environment supports immersion and presence. Our setup will allow a comparative study, in terms of user preference and comfort, of coupling virtual camera movements with simultaneous physical movements of the user. This work in progress uses a within-subject experimental design for evaluating interaction prototypes based on the Oculus Rift DK2, in which participants will be tasked with transitioning between different environments: once using physical motion merely to trigger the transition, and once with the virtual camera movement coupled to the physical motion. Qualitative and quantitative data will be collected using questionnaires and in-game metrics. Pretests of a similar setup were used to establish minimal levels of comfort.
The Internet of Things (IoT) is slowly, but steadily, changing the way we interact with our surroundings. Smart cities, smart environments, and smart buildings are just a few macroscopic examples of how smart ecosystems are increasingly involved in our daily life, each one offering a different set of information. The decentralization and scattering of this information can be exploited to optimize the on-demand information retrieval process of mobile nodes. We propose an approach focused on defining competence domains in smart systems, where the responsibility for providing a specific piece of information to a mobile node is defined by spatial constraints. By exploiting the interplay and duality of cloud computing and fog computing, we introduce an approach that uses the spatial allocation of data in smart systems to optimize mobile nodes' information retrieval.
The notion of edge computing introduces new computing functions away from centralized locations and closer to the network edge, thus facilitating new applications and services. This enhanced computing paradigm provides application developers with new opportunities not available otherwise. In this talk, I will discuss why placing computation functions at the extreme edge of our network infrastructure, i.e., in wireless access points and home set-top boxes, is particularly beneficial for a large class of emerging applications. I will discuss a specific approach, called ParaDrop, to implement such edge computing functionality, and use examples from different domains (smarter homes, sustainability, and intelligent transportation) to illustrate the new opportunities around this concept.
Quality assurance for highly configurable systems is challenging due to the exponentially growing configuration space. Interactions among multiple options can lead to surprising behaviors, bugs, and security vulnerabilities. Systematically analyzing all configurations might nevertheless be possible if most options do not interact, or if interactions follow specific patterns that analysis tools can exploit. To better understand interactions in practice, we analyze program traces to characterize and identify where interactions occur on control flow and data. To this end, we developed a dynamic analysis for Java based on variability-aware execution and monitored executions of multiple small to medium-sized programs. We find that the essential configuration complexity of these programs is indeed much lower than the combinatorial explosion of the configuration space indicates. However, we also discover that the interaction characteristics that allow scalable and complete analyses are more nuanced than what is exploited by existing state-of-the-art quality assurance strategies.
In recent years, simple password-based authentication systems have increasingly proven ineffective for many classes of real-world devices. As a result, many researchers have concentrated their efforts on the design of new biometric authentication systems. This trend has been further accelerated by the advent of mobile devices, which offer numerous sensors and capabilities to implement a variety of mobile biometric authentication systems. Along with the advances in biometric authentication, however, attacks have also become much more sophisticated, and many biometric techniques have ultimately proven inadequate in the face of advanced attackers in practice. In this paper, we investigate the effectiveness of sensor-enhanced keystroke dynamics, a recent mobile biometric authentication mechanism that combines a particularly rich set of features. In our analysis, we consider different types of attacks, with a focus on advanced attacks that draw from general population statistics. Such attacks have already proven effective in drastically reducing the accuracy of many state-of-the-art biometric authentication systems. We implemented a statistical attack against sensor-enhanced keystroke dynamics and evaluated its impact on detection accuracy. On one hand, our results show that sensor-enhanced keystroke dynamics are generally robust against statistical attacks, with a marginal equal-error-rate impact (<0.14%). On the other hand, our results show that, surprisingly, keystroke timing features non-trivially weaken the security guarantees provided by sensor features alone. Our findings suggest that sensor dynamics may be a stronger biometric authentication mechanism against recently proposed practical attacks.
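A minimal sketch of the kind of statistical attack evaluated here: rather than observing the victim, the attacker samples forged timing-feature vectors from assumed population-level statistics. All numbers are illustrative; the actual attack models richer sensor and timing features.

```java
import java.util.Random;

// Statistical forgery: synthesize keystroke samples from population-level
// feature statistics (Gaussian per-feature means/stddevs, values illustrative).
public class StatisticalForgerySketch {
    public static void main(String[] args) {
        // Hypothetical population statistics for three timing features (ms):
        // key hold time, key-down to key-down, key-up to key-down.
        double[] mean = {95.0, 210.0, 115.0};
        double[] std  = {20.0, 60.0, 55.0};
        Random rng = new Random();
        for (int attempt = 0; attempt < 5; attempt++) {
            double[] forged = new double[mean.length];
            for (int f = 0; f < mean.length; f++)
                forged[f] = mean[f] + rng.nextGaussian() * std[f];
            // Each forged vector is submitted to the authentication system;
            // samples near the population mode tend to fall inside many
            // users' acceptance regions, degrading detection accuracy.
            System.out.printf("attempt %d: hold=%.0f dd=%.0f ud=%.0f%n",
                    attempt, forged[0], forged[1], forged[2]);
        }
    }
}
```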
The current trend in large scientific computing problems is to align as much as possible with a Single Program Multiple Data (SPMD) scheme when the application algorithms are conducive to parallelization and vectorization. This reduces code complexity because the processors (or computational nodes) perform the same instructions, which allows for better performance as algorithms work on local data sets instead of continuously transferring data from one locality to another. However, certain applications, such as stencil problems, need to move data to or from remote localities. This involves an additional degree of complexity, as one must know with which localities to exchange data. To solve this issue, Fortran has extended its scalar element indexing approach to distributed structures of elements. In this extension, a structure of scalar elements is attributed a "co-index" and lives in a specific locality. A co-index provides the application with enough information to retrieve the corresponding data reference. In C++, containers present themselves as a "smarter" alternative to Fortran arrays, but there are still no corresponding standardized features similar to the Fortran co-indexing approach. In this paper, we present an implementation of such features in HPX, a general-purpose C++ runtime system for applications of any scale. We describe how the combination of the HPX features and the current C++ Standard makes it easy to define a high-performance API similar to Co-Array Fortran.
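A conceptual sketch of the co-indexing idea (in Java rather than C++, and simulating localities within one process; in HPX a remote access would be an asynchronous operation): a logical array is split into per-locality segments, and the co-index selects the segment a reference resolves to.

```java
import java.util.HashMap;
import java.util.Map;

// A logical array partitioned into per-locality segments; the co-index
// identifies which locality's segment a reference points into.
public class CoArraySketch {
    private final Map<Integer, double[]> segments = new HashMap<>();

    CoArraySketch(int localities, int localSize) {
        for (int loc = 0; loc < localities; loc++)
            segments.put(loc, new double[localSize]); // one segment "lives" per locality
    }
    // Element i of the segment that lives on locality coIndex.
    double get(int coIndex, int i)           { return segments.get(coIndex)[i]; }
    void   set(int coIndex, int i, double v) { segments.get(coIndex)[i] = v; }

    public static void main(String[] args) {
        CoArraySketch a = new CoArraySketch(4, 8);
        a.set(2, 0, 3.14);               // write into locality 2's segment
        System.out.println(a.get(2, 0)); // the co-index suffices to locate the data
    }
}
```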
We created detailed profiles of the energy consumed by common operations on Java List, Map, and Set abstractions. The results show that the alternative data types for these abstractions differ significantly in energy consumption depending on the operation. For example, an ArrayList consumes less energy than a LinkedList if items are inserted in the middle or at the end, but consumes more energy than a LinkedList if items are inserted at the start of the list. To explain the results, we explored the memory usage and the bytecode executed during an operation. Expensive computation tasks in the analyzed bytecode traces appeared to have an energy impact, but memory usage did not contribute. We evaluated our profiles by using them to selectively replace Collections types used in six applications and libraries. We found that choosing the wrong Collections type, as indicated by our profiles, can cost up to 300% more energy than the most efficient choice. Our work shows that the usage context of a data structure and our measured energy profiles can be used to decide between alternative Collections implementations.
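The asymmetry described above is easy to reproduce; the sketch below uses wall-clock time as a crude stand-in for the paper's hardware-level energy measurements (loop sizes are illustrative).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Inserting at the start, middle, and end of ArrayList vs. LinkedList does
// very different amounts of work, which is what the energy profiles reflect.
public class CollectionsInsertSketch {
    static long timeInserts(List<Integer> list, String where) {
        long t0 = System.nanoTime();
        for (int i = 0; i < 50_000; i++) {
            int idx = where.equals("start") ? 0
                    : where.equals("middle") ? list.size() / 2
                    : list.size();           // "end"
            list.add(idx, i);                // ArrayList shifts elements; LinkedList walks nodes
        }
        return (System.nanoTime() - t0) / 1_000_000;
    }
    public static void main(String[] args) {
        for (String where : new String[]{"start", "middle", "end"}) {
            System.out.printf("ArrayList  %-6s: %4d ms%n", where, timeInserts(new ArrayList<>(), where));
            System.out.printf("LinkedList %-6s: %4d ms%n", where, timeInserts(new LinkedList<>(), where));
        }
    }
}
```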
The Internet of Things (IoT) is an embedded-system design paradigm that connects a variety of devices, sensors, and physical objects to a larger connected network (e.g., the Internet) without requiring human-to-human or human-to-computer interaction. While the IoT is expected to expand the user's connectivity and everyday convenience, serious security considerations come into account when using the IoT for distributed authentication. Furthermore, the incorporation of biometrics into IoT design raises concerns about cost and about implementing a user-friendly design. In this paper, we focus on the use of electrocardiogram (ECG) signals to implement distributed biometric authentication within an IoT system model. Our observations show that ECG biometrics are highly reliable, more secure, and easier to implement than other biometrics.
k-anonymity is an efficient way to anonymize relational data to protect privacy against re-identification attacks. When applying k-anonymity to transaction data, each item is considered a quasi-identifier attribute, which raises a high-dimensionality problem and increases both the computational complexity and the information loss of anonymization. In this paper, an efficient anonymity system is designed to anonymize transaction data not only with lower information loss but also with reduced computational complexity. An extensive experiment is carried out to show the efficiency of the designed approach compared to state-of-the-art anonymity algorithms in terms of runtime and information loss. Experimental results indicate that the proposed anonymization system outperforms the compared algorithms in all respects.
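For reference, a minimal sketch of the k-anonymity property itself: every combination of quasi-identifier values must occur at least k times. The records and quasi-identifiers are toy examples; the paper's contribution is the anonymization algorithm, not this check.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Check whether a table satisfies k-anonymity over its quasi-identifiers.
public class KAnonymityCheckSketch {
    public static void main(String[] args) {
        String[][] records = {            // quasi-identifiers: {age range, zip prefix}
            {"20-29", "130**"}, {"20-29", "130**"},
            {"30-39", "148**"}, {"30-39", "148**"}, {"30-39", "148**"},
        };
        int k = 2;
        Map<String, Integer> groups = new HashMap<>();
        for (String[] r : records)
            groups.merge(Arrays.toString(r), 1, Integer::sum); // count each QI combination
        boolean kAnonymous = groups.values().stream().allMatch(n -> n >= k);
        System.out.println("groups = " + groups);
        System.out.println(k + "-anonymous: " + kAnonymous);
    }
}
```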
Contemporary vehicles are being equipped with an increasing number of Electronic Control Units (ECUs) and wireless connectivity. Although these have enhanced vehicle safety and efficiency, they come with new vulnerabilities. In this paper, we unveil an important new vulnerability applicable to several in-vehicle networks, including Controller Area Network (CAN), the de facto standard in-vehicle network protocol. Specifically, we propose a new type of Denial-of-Service (DoS) attack, called the bus-off attack, which exploits the error-handling scheme of in-vehicle networks to disconnect or shut down good/uncompromised ECUs. This is an important attack that must be thwarted, since, once an ECU is compromised, the attack is easy to mount on safety-critical ECUs while its prevention is very difficult. In addition to the discovery of this new vulnerability, we analyze its feasibility using actual in-vehicle network traffic, and demonstrate the attack on a CAN bus prototype as well as on two real vehicles. Based on our analysis and experimental results, we also propose and evaluate a mechanism to detect and prevent the bus-off attack.
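A sketch of the error-handling dynamics the attack exploits, based on the standard CAN fault-confinement rules (transmit error counter +8 per transmit error, -1 per success, error-passive above 127, bus-off above 255). The attack scenario is simplified here to an adversary corrupting every victim frame.

```java
// How repeatedly forced bit errors drive a victim ECU's transmit error
// counter (TEC) past the bus-off threshold defined by the CAN standard.
public class BusOffSketch {
    public static void main(String[] args) {
        int tec = 0;
        for (int frame = 1; ; frame++) {
            boolean attackedFrame = true;  // adversary collides with every victim frame
            tec += attackedFrame ? 8 : -1; // +8 on transmit error, -1 on success
            if (tec > 127 && tec <= 255)
                System.out.println("frame " + frame + ": TEC=" + tec + " (error-passive)");
            if (tec > 255) {
                System.out.println("frame " + frame + ": TEC=" + tec + " -> BUS-OFF");
                break;                     // victim ECU disconnects itself from the bus
            }
        }
    }
}
```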
Software-defined networking (SDN) programs must simultaneously describe static forwarding behavior and dynamic updates in response to events. Event-driven updates are critical to get right, but difficult to implement correctly due to the high degree of concurrency in networks. Existing SDN platforms offer weak guarantees that can break application invariants, leading to problems such as dropped packets, degraded performance, security violations, etc. This paper introduces EVENT-DRIVEN CONSISTENT UPDATES that are guaranteed to preserve well-defined behaviors when transitioning between configurations in response to events. We propose NETWORK EVENT STRUCTURES (NESs) to model constraints on updates, such as which events can be enabled simultaneously and causal dependencies between events. We define an extension of the NetKAT language with mutable state, give semantics to stateful programs using NESs, and discuss provably-correct strategies for implementing NESs in SDNs. Finally, we evaluate our approach empirically, demonstrating that it gives well-defined consistency guarantees while avoiding expensive synchronization and packet buffering.
Failing to properly isolate components in the same address space has resulted in a substantial number of vulnerabilities. Enforcing the principle of least privilege for memory accesses can selectively isolate software components to restrict the attack surface and prevent unintended cross-component memory corruption. However, the boundaries and interactions between software components are hard to reason about, and existing approaches have failed to stop attackers from exploiting vulnerabilities caused by poor isolation. We present the secure memory views (SMV) model: a practical and efficient model for secure and selective memory isolation in monolithic multithreaded applications. SMV is a third-generation privilege separation technique that offers explicit access control of memory and allows concurrent threads within the same process to partially share or fully isolate their memory space in a controlled and parallel manner, following application requirements. An evaluation of our prototype in the Linux kernel (TCB < 1,800 LOC) shows negligible runtime performance overhead in real-world applications, including the Cherokee web server (<0.69%), the Apache httpd web server (<0.93%), and the Mozilla Firefox web browser (<1.89%), with at most 12 LOC changed.
Evolutionary Computation (EC) has been used with great success on various real-world problems. One domain abundant with difficult problems is cryptology. Cryptology can be divided into cryptography, which, informally speaking, considers methods for ensuring secrecy (but also authenticity, privacy, etc.), and cryptanalysis, which deals with methods for breaking cryptographic systems. Although not always in an obvious way, EC can be applied to problems from both domains. This tutorial will first give a brief introduction to cryptology intended for a general audience (therefore omitting proofs and the mathematics behind many concepts). Afterwards, we concentrate on several topics from cryptography that have been successfully tackled with EC so far and discuss why those topics are suitable for applying EC. However, care must be taken, since there are a number of problems that seem impossible to solve with EC, and one needs to realize the limitations of the heuristics. We will discuss the choice of appropriate EC techniques (GA, GP, CGP, ES, multi-objective optimization, etc.) for various problems and evaluate the importance of that choice. Furthermore, we will discuss the gap between the cryptographic community and the EC community and what that means for the results. In doing so, we will place special emphasis on the perspective that cryptography presents a source of benchmark problems for the EC community. To conclude, we will present a number of topics we consider strong research choices that can have a real-world impact. In that part, we give special attention to cryptographic problems where the cryptographic community has successfully applied EC, but which have remained out of the focus of the EC community. This tutorial will also present live demos of EC in action on cryptographic problems. We will present several problems and ways of encoding solutions, discuss the impact of algorithm choice, and finally run some experiments to show the results and discuss how to assess them from a cryptographic perspective.
Regression test selection (RTS) aims to reduce regression testing time by only re-running the tests affected by code changes. Prior research on RTS can be broadly split into dynamic and static techniques. A recently developed dynamic RTS technique called Ekstazi is gaining some adoption in practice, and its evaluation shows that selecting tests at a coarser, class-level granularity provides better results than selecting tests at a finer, method-level granularity. As dynamic RTS is gaining adoption, it is timely to also evaluate static RTS techniques, some of which were proposed over three decades ago but not extensively evaluated on modern software projects. This paper presents the first extensive study that evaluates the performance benefits of static RTS techniques and their safety; a technique is safe if it selects to run all tests that may be affected by code changes. We implemented two static RTS techniques, one class-level and one method-level, and compare several variants of these techniques. We also compare these static RTS techniques against Ekstazi, a state-of-the-art, class-level, dynamic RTS technique. The experimental results on 985 revisions of 22 open-source projects show that the class-level static RTS technique is comparable to Ekstazi, with similar performance benefits, but at the risk of being unsafe sometimes. In contrast, the method-level static RTS technique performs rather poorly.
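A toy sketch of class-level static RTS: build a class dependency graph and select every test whose transitive dependencies reach a changed class. The graph and class names are illustrative; real tools extract the graph from bytecode, and unanalyzable constructs such as reflection are one source of the unsafety noted above.

```java
import java.util.*;

// Select tests by transitive reachability of changed classes in a
// statically extracted class dependency graph.
public class StaticRtsSketch {
    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>(); // class -> classes it uses
        deps.put("FooTest", List.of("Foo"));
        deps.put("BarTest", List.of("Bar"));
        deps.put("Foo", List.of("Util"));
        deps.put("Bar", List.of());

        Set<String> changed = Set.of("Util");             // classes edited in this revision
        for (String test : List.of("FooTest", "BarTest")) {
            // DFS from the test through the dependency graph.
            Deque<String> stack = new ArrayDeque<>(List.of(test));
            Set<String> seen = new HashSet<>();
            boolean affected = false;
            while (!stack.isEmpty()) {
                String c = stack.pop();
                if (!seen.add(c)) continue;
                if (changed.contains(c)) { affected = true; break; }
                stack.addAll(deps.getOrDefault(c, List.of()));
            }
            System.out.println(test + (affected ? ": selected" : ": skipped"));
        }
    }
}
```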
In the age of the Internet of Things (IoT), a random number generator plays an important role in generating encryption keys and authenticating embedded equipment. The random numbers are required to be statistically uniformly distributed and unpredictable. To satisfy these requirements, a physical true random number generator (TRNG) is used. In this paper, we implement a TRNG using SR latches on a 40 nm CMOS ASIC. The TRNG generates random numbers by exclusive-ORing (XORing) the outputs of 256 SR latches. We evaluate the generated random numbers using statistical tests in accordance with BSI AIS 20/31, an IID (independent and identically distributed) test, and entropy estimation in accordance with NIST SP 800-90B, while varying the supply voltage and environmental temperature within the rated values. The TRNG passed all the tests except in a few cases. From this experiment, we find that the TRNG is robust against environmental change. The power consumption is 18.8 µW at 2.5 MHz. This TRNG is suitable for embedded systems to improve security in IoT systems.
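A simulation-level sketch of the generator's structure: each output bit is the XOR of 256 latch outputs, so even strongly biased individual latches yield a nearly unbiased bit (the piling-up effect). The latch model below is just a biased coin flip standing in for physical metastability.

```java
import java.util.Random;

// XOR-combining many weakly random, biased latch outputs into one nearly
// unbiased random bit, mirroring the TRNG's 256-latch structure.
public class LatchTrngSketch {
    public static void main(String[] args) {
        Random metastability = new Random(); // stand-in for physical noise
        StringBuilder out = new StringBuilder();
        for (int n = 0; n < 64; n++) {
            int bit = 0;
            for (int latch = 0; latch < 256; latch++) {
                // Each simulated latch resolves to 1 with a biased probability of 0.55.
                int latchOut = metastability.nextDouble() < 0.55 ? 1 : 0;
                bit ^= latchOut; // XOR shrinks the bias exponentially with latch count
            }
            out.append(bit);
        }
        System.out.println("random bits: " + out);
    }
}
```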
Recently, PIN-TRUST, a method for predicting future trust relationships between users, was proposed. PIN-TRUST outperforms existing trust prediction methods by exploiting all types of interactions between users as well as their reciprocation. In this paper, we validate whether its consideration of the reciprocation of interactions is really effective in trust prediction. Furthermore, we consider a new concept, the "uncertainty" of untrustworthy users, devised to reflect the difficulty of modeling the activities of untrustworthy users in PIN-TRUST. We also validate the effectiveness of this uncertainty concept. Through the validation, we reveal that considering the reciprocation of interactions is effective for trust prediction with PIN-TRUST, and that it is necessary to treat the uncertainty of untrustworthy users the same as that of other users.
Many solutions for composability and compositionality rely on specifying the interface of a component using bandwidth. Some previous works specify a period (P) and a budget (Q) as the interface of a component: Q/P provides a bandwidth (the share of a processor that the component may request), while P specifies the time granularity of the allocation of this processing capacity. Other works add a deadline parameter, which can provide tighter bounds on how this processing capacity is distributed. Yet other works use the parameters α and Δ, where α is the bandwidth and Δ specifies how smoothly this bandwidth is distributed. It is known [4] that such bandwidth-like interfaces carry a cost: there are tasksets that could be guaranteed schedulable if tasks were scheduled directly on the processor, but for which no bandwidth-like interface can guarantee schedulability. It is also known that this penalty can be unbounded, i.e., the use of bandwidth-like interfaces may require a processor that is k times faster, and one can show this for any k. This raises the question: "What is the average-case performance penalty of bandwidth-like interfaces?" This paper addresses that question. We answer it by randomly generating tasksets and, for each taskset, computing a lower bound on how much faster a processor needs to be when a bandwidth-like scheme is used; we do not consider any specific bandwidth-like scheme, but instead derive an expression for a lower bound that holds for any such scheme. For the distributions considered in this paper, we find that (i) the experimental results depend on the experimental setup, (ii) this lower bound on the penalty was never larger than 4.0, (iii) for one experimental setup, it exceeded 2.4 for every taskset, (iv) the histogram of this penalty appears to be unimodal, and (v) for implicit-deadline sporadic tasks, this lower bound on the penalty was exactly 1.
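As a worked illustration of why bandwidth-like interfaces carry a penalty, consider the supply bound of the (α, Δ) model; the task parameters below are illustrative and not drawn from the paper's experiments.

```latex
% Bounded-delay interface: in any interval of length t, the component is
% guaranteed at least the following amount of processor time.
\[
  \mathrm{sbf}(t) \;=\; \max\bigl(0,\ \alpha\,(t-\Delta)\bigr)
\]
% Example: a task with WCET C = 1 and deadline D = 2 is trivially schedulable
% directly on a unit-speed processor. Through an interface with Delta = 1,
% meeting the deadline requires alpha * (2 - 1) >= 1, i.e., alpha = 1: the
% interface must reserve the entire processor even though the task needs only
% half of its long-run capacity (C/D = 0.5).
```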
Fault tolerance is a key challenge in building the first exascale system. To understand the potential impact of failures on next-generation systems, significant effort has been devoted to collecting, characterizing, and analyzing failures on current systems. These studies require large volumes of data and complex analysis. Because the occurrence of failures in large-scale systems is unpredictable, failures are commonly modeled as a stochastic process. Failure data from current systems is examined in an attempt to identify the underlying probability distribution and its statistical properties. In this paper, we use modeling to examine the impact of failure distributions on the time-to-solution and the optimal checkpoint interval of applications that use coordinated checkpoint/restart. Using this approach, we show that as failures become more frequent, the failure distribution has a larger influence on application performance. We also show that as failure times become less tightly grouped (i.e., as the standard deviation increases), the underlying probability distribution has a greater impact on application performance. Finally, we show that computing the checkpoint interval based on the assumption that failures are exponentially distributed has only a modest impact on application performance even when failures are drawn from a different distribution. Our work provides critical analysis and guidance for the process of analyzing failure data in the context of coordinated checkpoint/restart. Specifically, the data presented in this paper helps to distinguish cases where the failure distribution has a strong influence on application performance from those where it has relatively little impact.
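The exponential-failure checkpoint interval referred to above is commonly computed with the classic first-order approximation attributed to Young (later refined by Daly); a worked instance with illustrative numbers:

```latex
% First-order approximation of the optimal checkpoint interval for a system
% with mean time between failures M and checkpoint cost delta (Young 1974):
\[
  \tau_{\mathrm{opt}} \;\approx\; \sqrt{2\,\delta M}
\]
% Example (illustrative numbers): with delta = 5 min and M = 24 h = 1440 min,
%   tau_opt ~ sqrt(2 * 5 * 1440) = sqrt(14400) = 120 min,
% i.e., checkpoint roughly every two hours.
```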