Biblio
In this article, we describe a neighbour disjoint multipath (NDM) scheme that is shown to be more resilient to node or link failures than the two well-known node-disjoint and edge-disjoint multipath techniques. A centralised NDM was first conceptualised in our initial published work, utilising the spatial diversity among multiple paths to ensure robustness against localised poor channel quality or node failures. Here, we further introduce a distributed version of our NDM algorithm adapted to the characteristics of low-power and lossy networks (LLNs). We implement our distributed NDM algorithm in Contiki OS on top of LOADng, a lightweight on-demand ad hoc distance vector routing protocol. We compare this implementation's performance with the standard IPv6 Routing Protocol for Low-power and Lossy Networks (RPL), and also with basic LOADng, running in the Cooja simulator. Standard performance metrics such as packet delivery ratio, end-to-end latency, overhead, and average routing table size are identified for the comparison. The results and observations are provided for a few different application traffic patterns, and serve to quantify the improvements in robustness arising from NDM. The results are confirmed by experiments using a public sensor network testbed with over 100 nodes.
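As a hedged illustration of the path-disjointness notions this abstract contrasts (a sketch only, not the authors' NDM algorithm: the grid topology, the helper `neighbour_disjoint`, and the use of networkx are assumptions made for this example), the snippet below finds node-disjoint paths and then checks whether two of them are also neighbour-disjoint, i.e., no node of one path is adjacent to a node of the other:

```python
# Illustrative sketch of node-disjoint vs. neighbour-disjoint paths.
import networkx as nx

def neighbour_disjoint(G, p1, p2):
    """True if the path interiors share no node and no adjacency across paths."""
    inner1, inner2 = set(p1[1:-1]), set(p2[1:-1])
    if inner1 & inner2:
        return False                      # not even node-disjoint
    closed1 = inner1 | {n for v in inner1 for n in G.neighbors(v)}
    return not (closed1 & inner2)         # no node of p2 neighbours p1

G = nx.grid_2d_graph(5, 5)                # toy mesh standing in for a WSN topology
src, dst = (0, 0), (4, 4)
paths = list(nx.node_disjoint_paths(G, src, dst))
print(len(paths), "node-disjoint paths found")
print("first two are neighbour-disjoint:", neighbour_disjoint(G, paths[0], paths[1]))
```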
The performance of clustering is a crucial challenge, especially for pattern recognition. Model aggregation has a positive impact on the efficiency of data clustering: aggregating the resulting clustering models yields less cluttered decision boundaries. In this paper, we study an aggregation scheme that improves the stability and accuracy of clustering, making it possible to find a reliable and robust clustering model. We demonstrate the advantages of our aggregation method by running Fuzzy C-Means (FCM) clustering on the Reuters-21578 corpus. Experimental studies showed that our scheme optimized the bias-variance trade-off of the selected model and achieved enhanced clustering for unstructured textual resources.
The threat that malicious insiders pose to organisations is a significant problem. In this paper, we investigate the task of detecting such insiders through a novel method of modelling a user's normal behaviour in order to detect anomalies in that behaviour that may be indicative of an attack. Specifically, we make use of Hidden Markov Models to learn what constitutes normal behaviour, and then use them to detect significant deviations from that behaviour. Our results show that this approach is indeed successful at detecting insider threats, and in particular is able to accurately learn a user's behaviour. These initial tests improve on existing research and may provide a useful approach in addressing this part of the insider-threat challenge.
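A minimal sketch of the detection idea, assuming a discrete HMM over categorical user actions (the two-state model, the action alphabet, and all parameter values below are illustrative placeholders, not the paper's learned models): score each observed sequence with the forward algorithm and flag those whose log-likelihood is unusually low.

```python
# Score user-action sequences against an HMM of "normal" behaviour;
# sequences with very low log-likelihood are candidate anomalies.
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM.
    pi: (S,) initial state probs, A: (S,S) transitions, B: (S,V) emissions."""
    alpha = pi * B[:, obs[0]]
    loglik = 0.0
    for t in range(1, len(obs)):
        c = alpha.sum()                 # scale to avoid underflow
        loglik += np.log(c)
        alpha = (alpha / c) @ A * B[:, obs[t]]
    return loglik + np.log(alpha.sum())

# Toy model: 2 hidden "activity modes", 4 observable action types.
pi = np.array([0.6, 0.4])
A  = np.array([[0.9, 0.1],
               [0.2, 0.8]])
B  = np.array([[0.7, 0.2, 0.05, 0.05],
               [0.1, 0.3, 0.3,  0.3]])

normal_day  = np.array([0, 0, 1, 0, 1, 0, 0])
unusual_day = np.array([3, 2, 3, 3, 2, 3, 3])
for day in (normal_day, unusual_day):
    print(forward_loglik(day, pi, A, B))   # the unusual day scores far lower
```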
A nonrepudiation protocol from a sender S to a set of potential receivers {R1, R2, ..., Rn} performs two functions. First, this protocol enables S to send to every potential receiver Ri a copy of file F along with a proof that can convince an unbiased judge that F was indeed sent by S to Ri. Second, this protocol also enables each Ri to receive from S a copy of file F and to send back to S a proof that can convince an unbiased judge that F was indeed received by Ri from S. When a nonrepudiation protocol from S to {R1, R2, ..., Rn} is implemented in a cloud system, the communications between S and the set of potential receivers {R1, R2, ..., Rn} are not carried out directly. Rather, these communications are carried out through a cloud C. In this paper, we present a nonrepudiation protocol that is implemented in a cloud system and show that this protocol is correct. We also show that this protocol has two clear advantages over nonrepudiation protocols that are not implemented in cloud systems.
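The two proofs can be pictured with ordinary digital signatures. The sketch below is a deliberate simplification, not the paper's protocol: S signs the file digest bound to the receiver's identity (proof of origin) and Ri counter-signs a receipt digest, with the cloud C merely relaying the messages; identities and message formats are invented. It uses the Ed25519 API of the Python `cryptography` package.

```python
# Simplified proof-of-origin / proof-of-receipt exchange.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

sender_key   = Ed25519PrivateKey.generate()    # S's signing key
receiver_key = Ed25519PrivateKey.generate()    # R1's signing key

F = b"contract.pdf contents"
digest = hashlib.sha256(F).digest()

# Proof of origin: S -> C -> R1, bound to the receiver's identity.
msg_to_r1    = digest + b"|to:R1"
proof_origin = sender_key.sign(msg_to_r1)

# Proof of receipt: R1 -> C -> S.
msg_to_s      = digest + b"|recv-by:R1"
proof_receipt = receiver_key.sign(msg_to_s)

# A judge verifies both with the public keys (verify raises if invalid).
sender_key.public_key().verify(proof_origin, msg_to_r1)
receiver_key.public_key().verify(proof_receipt, msg_to_s)
print("both proofs verify")
```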
Modern malware is designed with mutation characteristics, namely polymorphism and metamorphism, which cause an enormous growth in the number of variants of malware samples. Categorization of malware samples on the basis of their behaviors is essential for the computer security community, because analysts receive a huge number of malware samples every day, and the signature extraction process is usually based on the malicious parts characterizing malware families. Microsoft released a malware classification challenge in 2015 with a huge dataset of nearly 0.5 terabytes, containing more than 20K malware samples. The analysis of this dataset inspired the development of a novel paradigm that is effective in categorizing malware variants into their actual family groups. This paradigm is presented and discussed in the present paper, with emphasis on the phases related to the extraction and selection of a set of novel features for the effective representation of malware samples. Features can be grouped according to different characteristics of malware behavior, and their fusion is performed according to a per-class weighting paradigm. The proposed method achieved a very high accuracy (approximately 0.998) on the Microsoft Malware Challenge dataset.
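Per-class weighted fusion can be pictured as follows; the feature-group names, scores, and weights are invented for illustration and do not reproduce the paper's features or its learned weighting:

```python
# Toy per-class weighted fusion of feature-group scores for one sample.
import numpy as np

scores = {                                   # per-group class scores (3 families)
    "opcode n-grams":   np.array([0.2, 0.5, 0.3]),
    "section entropy":  np.array([0.1, 0.7, 0.2]),
    "API call strings": np.array([0.3, 0.4, 0.3]),
}
weights = {                                  # weights[g][c]: trust in group g for class c
    "opcode n-grams":   np.array([1.0, 2.0, 1.0]),
    "section entropy":  np.array([0.5, 1.5, 1.0]),
    "API call strings": np.array([1.0, 1.0, 2.0]),
}
fused = sum(weights[g] * s for g, s in scores.items())
print("predicted family:", int(np.argmax(fused)))
```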
In recent years, we have seen a number of successful attacks against high-profile targets, some of which have even caused severe physical damage. These examples have shown us that resourceful and determined attackers can penetrate virtually any system, even those that are secured by an "air-gap." Consequently, in order to minimize the impact of stealthy attacks, defenders have to focus not only on strengthening the first lines of defense but also on deploying effective intrusion-detection systems. Intrusion-detection systems can play a key role in protecting sensitive computer systems since they give defenders a chance to detect and mitigate attacks before they can cause substantial losses. However, an over-sensitive intrusion-detection system, which produces a large number of false alarms, imposes prohibitively high operational costs on a defender, since alarms need to be investigated manually. Thus, defenders have to strike the right balance between maximizing security and minimizing costs. Optimizing the sensitivity of intrusion-detection systems is especially challenging when multiple interdependent computer systems have to be defended against a strategic attacker, who can choose which systems to target in order to maximize losses and minimize the probability of detection. We model this scenario as an attacker-defender security game and study the problem of finding optimal intrusion-detection thresholds.
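A toy, non-strategic rendering of the threshold trade-off (the score distributions, costs, and attack prior are assumptions; the paper's game model with a strategic attacker is richer): sweep the detection threshold and pick the value that minimizes expected cost.

```python
# Expected-cost curve over detection thresholds for a single system.
import numpy as np
from scipy.stats import norm

C_alarm, C_breach = 1.0, 50.0      # cost per false alarm vs. per missed attack
p_attack = 0.01                    # prior probability of an attack per period

thresholds = np.linspace(-2, 6, 200)
fp   = 1 - norm.cdf(thresholds, loc=0, scale=1)   # benign scores ~ N(0,1)
miss = norm.cdf(thresholds, loc=3, scale=1)       # attack scores ~ N(3,1)

cost = (1 - p_attack) * C_alarm * fp + p_attack * C_breach * miss
best = thresholds[np.argmin(cost)]
print(f"cost-minimizing threshold is approximately {best:.2f}")
```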
Cellular towers capture logs of mobile subscribers whenever their devices connect to the network. When the logs show data traffic at a cell tower generated by a device, they reveal that this device is close to the tower. The logs can then be used to trace the locations of mobile subscribers for different applications, such as studying customer behaviour, improving location-based services, or helping urban planning. However, the logs often suffer from an oscillation phenomenon. Oscillations may happen when a device, even one that is not moving, does not connect only to the nearest cell tower but instead switches unpredictably among multiple cell towers because of random noise, load balancing, or simply dynamic changes in signal strength. Detecting and removing oscillations is a challenge when analyzing location data collected from the cellular network. In this paper, we propose an algorithm called SOL (Stable, Oscillation, Leap periods) aimed at discovering and reducing oscillations in the collected logs. We apply our algorithm to real datasets containing about 18.9 TB of traffic logs generated by more than 3 million mobile subscribers, covering about 21,000 cell towers, and collected over 27 days from both GSM and UMTS networks in northern China. Experimental results demonstrate the ability of SOL to effectively reduce oscillations in cellular network logs.
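A simplified heuristic in the spirit of SOL's stable/oscillation/leap classification (the window size, switch-count threshold, and tower-set bound are invented parameters, not the published algorithm): collapse rapid ping-ponging among a few towers into a single record at the dominant tower.

```python
# Collapse oscillating tower-connection records within a short window.
from collections import Counter

def suppress_oscillation(events, window_s=300, min_switches=4):
    """events: list of (timestamp_s, tower_id), sorted by time."""
    cleaned, i = [], 0
    while i < len(events):
        j = i
        while j + 1 < len(events) and events[j + 1][0] - events[i][0] <= window_s:
            j += 1
        towers = [t for _, t in events[i:j + 1]]
        switches = sum(a != b for a, b in zip(towers, towers[1:]))
        if switches >= min_switches and len(set(towers)) <= 3:
            dominant = Counter(towers).most_common(1)[0][0]
            cleaned.append((events[i][0], dominant))   # one stable record
        else:
            cleaned.extend(events[i:j + 1])
        i = j + 1
    return cleaned

logs = [(0, 'A'), (30, 'B'), (60, 'A'), (90, 'B'), (120, 'A'), (150, 'B'), (400, 'C')]
print(suppress_oscillation(logs))   # the A/B ping-pong collapses to one record
```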
This paper proposes a method for segmentation of nuclei of single/isolated and overlapping/touching immature white blood cells from microscopic images of B-Lineage acute lymphoblastic leukemia (ALL) prepared from peripheral blood and bone marrow aspirate. We propose a deep belief network approach for the segmentation of these nuclei. Simulation results and comparison with some of the existing methods demonstrate the efficacy of the proposed method.
With the massive amounts of data available today, it is common to store and process data using multiple machines. Parallel programming platforms such as MapReduce and its variants are popular frameworks for handling such large data. We present the first provably efficient algorithms to compute, store, and query data structures for range queries and approximate nearest neighbor queries in a popular parallel computing abstraction that captures the salient features of MapReduce and other massively parallel communication (MPC) models. In particular, we describe algorithms for kd-trees, range trees, and BBD-trees that require only O(1) rounds of communication for both preprocessing and querying while staying competitive in terms of running time and workload with their classical counterparts. Our algorithms are randomized, but they can be made deterministic at some increase in their running time and workload while keeping the number of communication rounds constant.
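The constant-round flavour of such constructions can be illustrated in one dimension (a hedged analogue, not the paper's kd-tree, range-tree, or BBD-tree algorithms): one round partitions points into per-machine slabs using sampled splitters, and a query round touches only the slabs that the range overlaps.

```python
# One-dimensional, constant-round range counting across simulated machines.
import bisect, random

points = sorted(random.uniform(0, 100) for _ in range(10_000))
m = 10                                         # number of machines
splitters = [points[len(points) * k // m] for k in range(1, m)]
machines = [[] for _ in range(m)]
for p in points:                               # "round 1": shuffle by slab
    machines[bisect.bisect_right(splitters, p)].append(p)

def range_count(lo, hi):
    """'Round 2': only slabs overlapping [lo, hi] are queried."""
    first = bisect.bisect_right(splitters, lo)
    last = bisect.bisect_right(splitters, hi)
    return sum(
        bisect.bisect_right(machines[i], hi) - bisect.bisect_left(machines[i], lo)
        for i in range(first, last + 1)
    )

print(range_count(25.0, 75.0))   # expect roughly half the points
```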
Emerging and future healthcare policies are fueling an application-driven shift toward long-term monitoring of biosignals by means of embedded ultra-low-power Wireless Body Sensor Networks (WBSNs). To take off, these applications need new technologies allowing the development of extremely power-efficient bio-sensing nodes. The PHIDIAS project aims at unlocking the development of ultra-low-power bio-sensing WBSNs by tackling multiple interlocking technological breakthroughs: (i) the development of new signal processing models and methods based on the recently proposed compressive sampling paradigm, which allows the design of energy-minimal computational architectures and analog front-ends; (ii) the efficient hardware implementation of components, both analog and digital, building upon an innovative ultra-low-power signal processing front-end; (iii) the evaluation of the global power reduction using a system-wide integration of hardware and software components focused on compressed-sensing-based bio-signal analysis. PHIDIAS brought together a mixed consortium of academic and industrial research partners representing pan-European excellence in different fields impacting the energy-aware optimization of WBSNs, including experts in signal processing and digital/analog IC design. In this way, PHIDIAS pioneered a unique holistic approach, ensuring that key breakthroughs worked out in a cooperative way toward the global objective of the project.
Phishing is one of the most dangerous information security threats in the world today, with losses topping 5.9 billion dollars in 2013. Evolving from the original concept of phishing, spear phishing also attempts to scam individuals online; however, it uses personalized mail to yield a far higher success rate. This paper suggests an increased threat of spear-phishing success due to the presence of social media. Assessing this new threat is important not only to individuals but also to companies whose employees may be specifically targeted through their social media accounts. The paper presents the design and implementation of an architecture to determine the phishing susceptibility of a user through their social media accounts, and methods to reduce the threat. Preliminary testing shows that social media provide a publicly accessible resource for assessing targeted individuals for phishing attacks through their accounts.
Cryptographic hash functions are used to protect the integrity of information. Hash functions are often designed by using existing block ciphers as compression functions, owing to the challenges and difficulties encountered in constructing new hash functions from scratch. However, the key generation required by the encryption process incurs a huge computational cost, which affects the efficiency of the hash function. This paper proposes a new, secure, and efficient compression function based on a pseudorandom function that takes two n-bit inputs and produces one n-bit output (2n-to-n bits). In addition, a new keyed hash function with three variants is proposed (PinTar 128 bits, 256 bits, and 512 bits) which uses the proposed compression function as its underlying building block. Statistical analysis shows that the compression function is an efficient one-way random function. Similarly, statistical analysis of the keyed hash function shows that it has a strong avalanche property and is resistant to exhaustive key-search attacks. The proposed keyed hash function can be used as a candidate building block for developing security systems.
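An illustrative 2n-to-n construction in the same spirit (not the PinTar design itself; HMAC-SHA-256 stands in for the pseudorandom function, and the padding rule is an assumption): key the PRF with one n-bit half, apply it to the other, and chain the output to hash longer keyed messages.

```python
# PRF-based 2n-to-n compression chained Merkle-Damgard style (n = 256 bits).
import hmac, hashlib

def compress(left: bytes, right: bytes) -> bytes:
    """2n-to-n compression: PRF keyed with one half, applied to the other."""
    return hmac.new(left, right, hashlib.sha256).digest()

def keyed_hash(key: bytes, message: bytes, n_bytes: int = 32) -> bytes:
    state = hashlib.sha256(key).digest()            # n-bit initial value from key
    # pad with 0x80 then zeros up to a whole number of n-bit blocks
    message += b"\x80" + b"\x00" * (-(len(message) + 1) % n_bytes)
    for i in range(0, len(message), n_bytes):
        state = compress(state, message[i:i + n_bytes])
    return state

print(keyed_hash(b"secret key", b"hello world").hex())
```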
Continuing progress and integration levels in silicon technologies make possible complete end-user systems consisting of an extremely high number of cores on a single chip, targeting either embedded or high-performance computing. However, without new paradigms of energy- and thermally-efficient designs, producing information and communication systems capable of meeting the computing, storage, and communication demands of emerging applications will be unlikely. The broad topic of power and thermal management of massive multicore chips is actively being pursued by a number of researchers worldwide, from a variety of different perspectives, ranging from workload modeling to efficient on-chip network infrastructure design to resource allocation. Successful solutions will likely adopt and encompass elements from all or at least several levels of abstraction. Starting from these ideas, we consider a holistic approach in establishing the Power-Thermal-Performance (PTP) trade-offs of massive multicore processors by considering three inter-related but varying angles, viz., on-chip traffic modeling, novel Networks-on-Chip (NoC) architecture, and resource allocation/mapping. On-line workload learning (mathematical modeling, analysis, and prediction) is fundamental for endowing the many-core platforms with self-optimizing capabilities [2][3]. This built-in intelligence capability of many-cores calls for monitoring the interactions between the set of running applications and the architectural (core and uncore) components, the online construction of mathematical models for the observed workloads, and determining the best resource allocation decisions given the limited amount of information about user-to-application-to-system dynamics. However, workload modeling is not a trivial task. Centralized approaches for analyzing and mining workloads can easily run into scalability issues with an increasing number of monitored processing elements and uncore (routers and interface queues) components, since they can either lead to significant traffic and energy overhead or require dedicated system infrastructure. In contrast, learning the most compact mathematical representation of the workload can be done in a distributed manner (within the proximity of the observation/sensing) as long as the mathematical techniques are flexible and exploit the mathematical characteristics of the workloads (degree of periodicity, degree of fractal and temporal scaling) [3]. As one can notice, this strategy does not postulate a priori the mathematical expressions (e.g., a specific order of the autoregressive moving average (ARMA) model). Instead, the periodicity and fractality of the observed computation (e.g., instructions per cycle, last-level cache misses, branch prediction successes and failures, TLB accesses/misses) and communication (request-reply latency, queue utilization, memory queuing delay) metrics dictate the number of coefficients, the linearity or nonlinearity of the dynamical state equations, and the noise terms (e.g., Gaussian distributed) [3]. In other words, dedicated minimal logic can be allocated to interact with the local sensor to analyze the incoming workload at run-time, determine the required number of parameters and their values as a function of their characteristics, and communicate only the workload model parameters to a hierarchical optimization module (autonomous control architecture).
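One ingredient of such workload characterization can be made concrete: estimating the Hurst exponent of a per-core metric trace with the aggregated-variance method, a standard estimator used here purely for illustration (the cited works employ richer multi-fractal machinery). H near 0.5 indicates short-range memory; H above 0.5 indicates the long-range dependence discussed above.

```python
# Aggregated-variance estimate of the Hurst exponent of a workload trace.
import numpy as np

def hurst_aggvar(x, scales=(1, 2, 4, 8, 16, 32)):
    """Slope of log Var(aggregated series) vs. log scale equals 2H - 2."""
    x = np.asarray(x, dtype=float)
    logm, logv = [], []
    for m in scales:
        agg = x[: len(x) // m * m].reshape(-1, m).mean(axis=1)
        logm.append(np.log(m))
        logv.append(np.log(agg.var()))
    slope = np.polyfit(logm, logv, 1)[0]
    return 1 + slope / 2

rng = np.random.default_rng(0)
white = rng.normal(size=4096)               # uncorrelated stand-in workload
print(f"H(white noise) = {hurst_aggvar(white):.2f}")   # expect about 0.5
```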
For instance, capturing the fractal characteristics of the core and uncore workloads led to the development of a more efficient power management strategy [1] than those based on PID or model predictive control. In order to develop a compact and accurate mathematical framework for analyzing and modeling the incoming workload, we describe a general probabilistic approach that models the statistics of the increments in the magnitude of a stochastic process (associated with a specific workload metric) and the intervals of time (inter-event times) between successive changes in the stochastic process [3]. We show that the statistics of these two components of the stochastic process allow us to derive state equations and capture either short-range or long-range memory properties. To test the benefits of this new workload modeling approach, we describe its integration into a multi-fractal optimal control framework for solving power management for a 64-core NoC-based manycore platform and contrast it with mono-fractal and non-fractal schemes [3]. A scalable, low-power, and high-bandwidth on-chip communication infrastructure is essential to sustain the predicted growth in the number of embedded cores in a single die. New interconnection fabrics are key to continued performance improvements and energy reduction of manycore chips, and an efficient and robust NoC architecture is one of the key steps towards achieving that goal. An NoC architecture that incorporates emerging interconnect paradigms will be an enabler for low-power, high-bandwidth manycore chips. Innovative interconnect paradigms based on optical technologies, RF/wireless methods, carbon nanotubes, or 3D integration are promising alternatives that may indeed overcome obstacles that impede continued advances of the manycore paradigm. These innovations will open new opportunities for research in NoC designs with emerging interconnect infrastructures. In this regard, wireless NoC (WiNoC) is a promising direction for designing energy-efficient multicore architectures. WiNoC not only helps in improving the energy efficiency and performance, but also opens up opportunities for implementing power management strategies. WiNoCs enable implementation of the two most popular power management mechanisms, viz., dynamic voltage and frequency scaling (DVFS) and voltage frequency islands (VFI). The wireless links in the WiNoC establish one-hop shortcuts between distant nodes and facilitate energy savings in data exchange [3]. The wireless shortcuts attract a significant amount of the overall traffic within the network. The amount of traffic detoured is substantial, and the low-power wireless links enable energy savings. However, the overall energy dissipation within the network is still dominated by the data traversing the wireline links. Hence, by incorporating DVFS on these wireline links we can save more energy. Moreover, by incorporating suitable congestion-aware routing with DVFS, we can avoid thermal hotspots in the system [4]. It should be noted that for large system sizes the hardware overhead in terms of on-chip voltage regulators and synchronizers is much higher in DVFS than in VFI. WiNoC-enabled VFI designs mitigate some of the full-system performance degradation inherent in VFI-partitioned multicore designs, and can even eliminate it entirely for certain applications [5].
The VFI-partitioned designs used in conjunction with a novel NoC architecture like WiNoC can achieve significant energy savings while minimizing the impact on the achievable performance. On-chip power density and temperature trends are continuously increasing due to the high integration density of nano-scale transistors and the failure of Dennard scaling as a result of diminishing voltage scaling. Hence, all computing is temperature-constrained computing, and it is therefore key to employ thermal management techniques that keep chip temperatures within safe limits while meeting the constraints of spatial/temporal thermal gradients and avoiding wear-out effects [8]. We introduced the novel concept of Dark Silicon Patterning, i.e., spatio-temporal control of the power states of different cores [9]. Sophisticated patterning and thread-to-core mapping decisions are made considering the knowledge of process variations and the lateral heat dissipation of power-gated cores, in order to enhance the performance of multi-threaded workloads through dynamic core count scaling (DCCS). This is enabled by a lightweight online prediction of the chip's thermal profile for a given patterning candidate. We also present an enhanced temperature-aware resource management technique that, besides the active and dark states of cores, also exploits various grey states (i.e., using different voltage-frequency levels) in order to achieve high performance for mixed ILP-TLP workloads under peak temperature constraints. High-ILP applications benefit from high V-f and boosting levels, while high-TLP applications benefit from a larger number of concurrently active cores. As the scaling trends move from multi-core to many-core processors, centralized solutions become infeasible and distributed techniques are required. In [6], we proposed an agent-based distributed temperature-aware resource management technique called TAPE. It assigns a so-called agent to each core, a software or hardware entity that acts on behalf of the core. Following the principles of economic theory, these agents negotiate with each other to trade their power budgets in order to fulfil the performance requirements of their tasks while keeping TPeak ≤ Tcritical. In case of thermal violations, task migration or V-f throttling is triggered, and a penalty is applied to the trading process to improve the decision making.
Cyber Physical Systems (CPS) security testbeds serve as a platform for evaluating and validating novel CPS security tools and technologies, accelerating the transition of state-of-the-art research to industrial practice. The engineering of CPS security testbeds requires significant investments in money, time, and modeling effort to provide a scalable, high-fidelity, real-time attack-defense platform. Therefore, there is a strong need in academia and industry to create remotely accessible testbeds that support a range of use cases pertaining to CPS security of the grid, including vulnerability assessments, impact analysis, product testing, attack-defense exercises, and operator training. This paper describes the implementation architecture and capabilities of a remote access and experiment orchestration framework developed for the PowerCyber CPS security testbed at Iowa State University (ISU). The paper then describes several engineering challenges in the development of such remotely accessible testbeds for Smart Grid CPS security experimentation. Finally, the paper provides a brief case study, with screenshots, showing a particular use-case scenario on the remote access framework.
With the prevalence of personal Bluetooth devices, potential breaches of user privacy have become an increasing concern. To date, sniffing Bluetooth traffic has been widely considered an extremely intricate task due to Bluetooth's indiscoverable mode, vendor-dependent adaptive hopping behavior, and the interference in the open 2.4 GHz band. In this paper, we present BlueEar, a practical Bluetooth traffic sniffer. BlueEar features a novel dual-radio architecture in which two Bluetooth-compliant radios coordinate with each other to learn the hopping sequence of indiscoverable Bluetooth networks, predict adaptive hopping behavior, and mitigate the impact of RF interference. Experimental results show that BlueEar consistently maintains a packet capture rate higher than 90% in real-world environments, where the target Bluetooth network exhibits diverse hopping behaviors in the presence of dynamic interference from coexisting Wi-Fi devices. In addition, we discuss the privacy implications of the BlueEar system and present a practical countermeasure that effectively reduces the sniffer's packet capture rate to 20%. The proposed countermeasure can be easily implemented on the Bluetooth master device and requires no modification to slave devices such as keyboards and headsets.
Here we model the indirect costs of deploying security controls in small-to-medium enterprises (SMEs) to manage cyber threats. SMEs may not have the in-house skills and collective capacity to operate controls efficiently, resulting in inadvertent data leakage and exposure to compromise. Aside from financial costs, attempts to maintain security can impact morale, system performance, and retraining requirements, which are modelled here. Managing the overall complexity and effectiveness of an SME's security controls has the potential to reduce unintended leakage. The UK Cyber Essentials Scheme informs basic control definitions, and Available Responsibility Budget (ARB) is modelled to understand how controls can be prioritised for both security and usability. Human factors of security and practical experience of security management for SMEs inform the modelling of deployment challenges across a set of SME archetypes differing in size, complexity, and use of IT. Simple combinations of controls are matched to archetypes, balancing capabilities to protect data assets with the effort demands placed upon employees. Experiments indicate that two-factor authentication can be readily adopted by many SMEs and their employees to protect core assets, followed by correct access privileges and anti-malware software. Service and technology providers emerge as playing an important role in improving access to usable security controls for SMEs.
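The budget idea can be rendered as a toy greedy selection (the control names, benefit and effort scores, and budget value are invented, not the paper's model or data): adopt controls in order of benefit per unit of employee effort while the Available Responsibility Budget lasts.

```python
# Greedy prioritisation of controls under a responsibility budget.
controls = [                        # (name, security benefit, employee effort)
    ("two-factor authentication", 9, 2),
    ("access privilege review",   7, 3),
    ("anti-malware software",     6, 2),
    ("full-disk encryption",      5, 4),
    ("security training",         4, 5),
]

def prioritise(controls, arb):
    chosen, spent = [], 0
    # best benefit-per-effort first
    for name, benefit, effort in sorted(controls, key=lambda c: c[1] / c[2], reverse=True):
        if spent + effort <= arb:
            chosen.append(name)
            spent += effort
    return chosen

print(prioritise(controls, arb=7))   # a small SME archetype's budget
```

On these invented numbers, the greedy pass adopts two-factor authentication first, mirroring the abstract's observation that it is the most readily adopted control.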
Using stolen or weak credentials to bypass authentication is one of the top 10 network threats, as shown in recent studies. Disguised as legitimate users, attackers use stealthy techniques such as rootkits and covert channels to gain persistent access to a target system. However, such attacks are often detected only after the system misuse stage, i.e., after the attackers have already executed attack payloads such as: i) stealing secrets, ii) tampering with system services, and iii) disrupting the availability of production services.
In this talk, we analyze a real-world credential-stealing attack observed at the National Center for Supercomputing Applications. We show the disadvantages of traditional detection techniques, such as signature-based and anomaly-based detection, for such attacks. Our approach is a complement to existing detection techniques. We investigate the use of probabilistic graphical models, specifically factor graphs, to integrate security logs from multiple sources for more accurate detection. Finally, we propose a security testbed architecture to: i) simulate variants of known attacks that may happen in the future, ii) replay such attack variants in an isolated environment, and iii) collect and share security logs of such replays with the security research community.
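The log-fusion step can be pictured with a miniature factor-graph-style computation (the events, factor values, and prior are invented; the actual model integrates many more log sources and latent variables): one factor per observed event multiplies into a belief over a binary "compromised" variable.

```python
# Fuse evidence from multiple log sources into one posterior belief.
import numpy as np

# factor = [weight if benign, weight if compromised] for an observed event
factors = {
    "ssh login from new country": np.array([0.2, 0.8]),
    "download of exploit tool":   np.array([0.1, 0.9]),
    "normal working-hours usage": np.array([0.7, 0.3]),
}
prior = np.array([0.99, 0.01])      # most sessions are benign

belief = prior.copy()
for f in factors.values():
    belief *= f                      # multiply in each factor
belief /= belief.sum()               # normalize
print(f"P(compromised | logs) = {belief[1]:.3f}")
```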
Presented at the Illinois Information Trust Institute Joint Trust and Security and Science of Security Seminar, May 3, 2016.
With the help of rapidly developing technology, DNA sequencing is becoming less expensive. As a consequence, research in genomics has gained speed in paving the way to personalized (genomic) medicine, and geneticists need large collections of human genomes to further increase this speed. Furthermore, individuals are using their genomes to learn about their (genetic) predispositions to diseases, their ancestries, and even their (genetic) compatibilities with potential partners. This trend has also led to the launch of health-related websites and online social networks (OSNs) in which individuals share their genomic data (e.g., OpenSNP or 23andMe). On the other hand, genomic data carries much sensitive information about its owner. By analyzing the DNA of an individual, it is now possible to learn about his disease predispositions (e.g., for Alzheimer's or Parkinson's), ancestries, and physical attributes. The threat to genomic privacy is magnified by the fact that a person's genome is correlated with his family members' genomes, leading to interdependent privacy risks. This short tutorial will help computer scientists better understand the privacy and security challenges of today's genomic era. We will first highlight the significance of genomic data and the threats to genomic privacy. Then, we will present high-level descriptions of the proposed solutions for protecting the privacy of genomic data, and we will discuss future research directions. No prerequisite knowledge of biology or genomics is required for attendees. We only require attendees to have a slight background in cryptography and statistics.
To appear
Unease over data privacy will retard consumer acceptance of IoT deployments. The primary source of discomfort is a lack of user control over raw data that is streamed directly from sensors to the cloud. This is a direct consequence of the over-centralization of today's cloud-based IoT hub designs. We propose a solution that interposes a locally-controlled software component called a privacy mediator on every raw sensor stream. Each mediator is in the same administrative domain as the sensors whose data is being collected, and dynamically enforces the current privacy policies of the owners of the sensors or mobile users within the domain. This solution necessitates a logical point of presence for mediators within the administrative boundaries of each organization. Such points of presence are provided by cloudlets, which are small locally-administered data centers at the edge of the Internet that can support code mobility. The use of cloudlet-based mediators aligns well with natural personal and organizational boundaries of trust and responsibility.
In recent times, we have seen a proliferation of personal data. This can be attributed not just to a larger proportion of our lives moving online, but also to the rise of ubiquitous sensing through mobile and IoT devices. Alongside this surge, concerns over privacy, trust, and security are increasingly expressed as different parties attempt to take advantage of this rich assortment of data. The Databox seeks to enable all the advantages of personal data analytics while at the same time enforcing **accountability** and **control** in order to protect a user's privacy. In this work, we propose and delineate a personal networked device that allows users to **collate**, **curate**, and **mediate** their personal data.
Named-Data Networking (NDN) is the most prominent clean-slate proposal for the Future Internet. Nevertheless, NDN routing schemes present scalability concerns due to the number of stored routes and control messages required. In this work, we present a controller-based routing protocol, CRoS, using a formal method to specify it unambiguously and to validate it in order to prove its correctness. Our proposal encodes signaling information in content names, avoiding control-message overhead, and reduces router memory requirements by storing only the routes for simultaneously consumed prefixes. Additionally, the protocol installs a new route on all routers in a path with a single route request to the controller, avoiding replication of routing information and automating router provisioning. As a result, we provide a description of the protocol using the Specification and Description Language, and we validate the protocol, proving that the behavior of CRoS is free of deadlocks and livelocks. Furthermore, the validation guarantees that the scheme ensures a valid working path from consumer to producer, even though it does not assure the shortest path.



