Bibliography
As one of the security components in cyber situational awareness systems, the Intrusion Detection System (IDS) is deployed by many organizations in their networks to counter the impact of network attacks. Regardless of the tools and technologies used to generate security alarms, an IDS can provide a situational overview of network traffic. However, most organizations lack the techniques and follow-up analysis needed to make the generated alarm data more valuable to the security team for handling attacks and reducing risk to the organization. This paper proposes an IDS Metrics Framework for cyber situational awareness systems that incorporates current technologies and techniques for creating metrics that help security advisors make the right decisions. The framework comprises the various tools and techniques used to evaluate the data. The evaluation is then measured against one or more reference points to produce an outcome that supports the decision-making process of a cyber situational awareness system. The framework also offers Graphical User Interface (GUI) tools that produce graphical displays and provide a platform for analysis and decision-making by security teams.
The purpose of this paper is to examine the influence of lexical entrainment when communicating with a conversational agent. We consider two types of cognitive information processing: top-down processing, which depends on prior knowledge, and bottom-up processing, which depends on the partner's behavior. The two work in a mutually complementary manner in interpersonal cognition. We hypothesized that which mode of processing people rely on depends on the agent's behavior. We designed a word-choice task in which participants and the agent alternately described and selected pictures, and we manipulated two factors: first, the expectation of the agent's intelligence, induced by the experimenter's instructions, as top-down processing; second, the agent's behavior, varied in the degree of intellectual impression it conveyed, as bottom-up processing. The results show that people select words differently depending on the agent's expressed behavior, supporting our hypothesis. The findings obtained in this study could inform new guidelines for human-to-agent language interfaces.
Using mobile sinks to collect sensed data in Wireless Sensor Networks (WSNs) is an effective technique for significantly improving network lifetime. We investigate the problem of collecting sensed data with a mobile sink in a WSN containing unreachable regions, such that the network lifetime is maximized and the total tour length is minimized. We propose a polynomial-time heuristic, an Integer Linear Programming (ILP)-based heuristic, and a Mixed-Integer Non-Linear Programming (MINLP)-based algorithm for constructing a shortest-path routing forest for the sensor nodes in unreachable regions; two energy-efficient heuristics for partitioning the sensor nodes in reachable regions into disjoint clusters; and an efficient approach for converting the tour construction problem into a Travelling Salesman Problem (TSP). We performed extensive simulations on 100 instances with 100, 150, 200, 250, and 300 sensor nodes in an urban area and a forest area. The simulation results show that the average lifetime over all network instances achieved by the polynomial-time heuristic is 74% of that achieved by the ILP-based heuristic and 65% of that obtained by the MINLP-based algorithm, and that our tour construction heuristic significantly outperforms the state-of-the-art tour construction heuristic EMPS.
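As a concrete illustration of the last step, the sketch below reduces tour construction to a TSP over cluster-head positions and solves it with a simple nearest-neighbour heuristic. Everything here (the random toy clustering, the greedy heuristic, the node coordinates) is an assumption for illustration; it is not the paper's ILP/MINLP construction.

    # Hypothetical sketch: pick cluster heads, then build the mobile sink's
    # tour as a TSP over those heads via nearest-neighbour. Illustrative only.
    import math, random

    def distance(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def cluster_heads(nodes, k):
        """Toy clustering: sample k heads; each node reports to its nearest head."""
        heads = random.sample(nodes, k)
        members = {h: [] for h in heads}
        for n in nodes:
            members[min(heads, key=lambda h: distance(h, n))].append(n)
        return heads, members

    def nearest_neighbour_tour(points, start):
        """Greedy TSP heuristic over the cluster-head positions."""
        tour, todo = [start], set(points) - {start}
        while todo:
            nxt = min(todo, key=lambda p: distance(tour[-1], p))
            tour.append(nxt)
            todo.remove(nxt)
        return tour + [start]            # the sink returns to its start point

    random.seed(1)
    nodes = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(200)]
    heads, members = cluster_heads(nodes, k=10)
    tour = nearest_neighbour_tour(heads, start=heads[0])
    print(sum(distance(a, b) for a, b in zip(tour, tour[1:])))  # tour length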
The number of resource-limited wireless devices used in many areas of the Internet of Things is growing rapidly, raising concerns about privacy and security. Various lightweight block ciphers have been proposed; this work presents a modified lightweight block cipher algorithm in which a Linear Feedback Shift Register (LFSR) replaces the key generation function of the XTEA1 algorithm. Using the same evaluation conditions, we analyzed the software implementation of the modified XTEA using FELICS (Fair Evaluation of Lightweight Cryptographic Systems), a benchmarking framework that measures RAM footprint, ROM occupation, and execution time on three widely used embedded devices: an 8-bit AVR microcontroller, a 16-bit MSP microcontroller, and a 32-bit ARM microcontroller. The implementation results show that the modified cipher has lower software requirements than the original XTEA, enhancing both the security level and the software performance.
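Since the abstract's key change is swapping the key schedule for an LFSR, a minimal Python sketch of a Galois LFSR used as a round-key generator may help; the 32-bit register width, tap mask, and seeding below are assumptions for illustration, not the parameters of the modified XTEA1.

    # Minimal sketch: a 32-bit Galois LFSR as a stand-in key generator.
    # Tap mask and seed are illustrative assumptions, not the paper's values.
    def lfsr32_step(state, taps=0xA3000000):
        """Advance a 32-bit Galois LFSR by one bit and return the new state."""
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= taps
        return state

    def round_keys(seed, rounds):
        """Derive one 32-bit subkey per round by clocking the LFSR."""
        keys, state = [], seed & 0xFFFFFFFF
        for _ in range(rounds):
            for _ in range(32):          # clock 32 bits per subkey
                state = lfsr32_step(state)
            keys.append(state)
        return keys

    print([hex(k) for k in round_keys(0xDEADBEEF, 4)])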
Embedded electronic devices and sensors such as smartphones, smart watches, medical implants, and Wireless Sensor Nodes (WSN) are making the “Internet of Things” (IoT) a reality. Such devices often require cryptographic services such as authentication, integrity, and non-repudiation, which are provided by Public-Key Cryptography (PKC). As these devices are severely resource-constrained, choosing a suitable cryptographic system is challenging. Pairing-Based Cryptography (PBC) is among the best candidates for implementing PKC in lightweight devices. In this research, we present a fast and energy-efficient implementation of PBC based on Barreto-Naehrig (BN) curves and optimal Ate pairing using hardware/software co-design. Our solution consists of a hardware-based Montgomery multiplier and pairing software running on an ARM Cortex-A9 processor in a Zynq-7020 System-on-Chip (SoC). The multiplier is protected against simple power analysis (SPA) and differential power analysis (DPA), and can be instantiated with a variable number of processing elements (PE). Our solution improves performance (in terms of latency) over an open-source software PBC implementation by factors of 2.34 and 2.02 for 256- and 160-bit field sizes, respectively, as measured on the Zynq-7020 SoC.
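For readers unfamiliar with the multiplier at the core of this design, here is a short software sketch of Montgomery multiplication in Python; it shows the arithmetic only (no SPA/DPA countermeasures) and uses a small example modulus rather than a 256-bit BN-curve prime.

    # Montgomery multiplication sketch: computes a*b*R^{-1} mod n without a
    # division by n. Arithmetic only; no side-channel protection.
    def montgomery_setup(n, k):
        R = 1 << k                       # R = 2^k > n, gcd(R, n) = 1 (n odd)
        n_prime = pow(-n, -1, R)         # n' = -n^{-1} mod R (Python >= 3.8)
        return R, n_prime

    def mont_mul(a, b, n, R, n_prime, k):
        t = a * b
        m = ((t & (R - 1)) * n_prime) & (R - 1)
        u = (t + m * n) >> k             # t + m*n is exactly divisible by R
        return u - n if u >= n else u

    n, k = 2**32 - 5, 32                 # small odd modulus for illustration
    R, n_prime = montgomery_setup(n, k)
    a, b = 123456789, 987654321
    aR, bR = (a * R) % n, (b * R) % n    # convert operands into Montgomery form
    abR = mont_mul(aR, bR, n, R, n_prime, k)
    assert mont_mul(abR, 1, n, R, n_prime, k) == (a * b) % n  # convert back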
Text analytics systems often rely heavily on detecting entity mentions in documents and linking them to knowledge bases for downstream applications such as sentiment analysis, question answering, and recommender systems. A major challenge for this task is accurately detecting entities in new languages with limited labeled resources. In this paper we present an accurate and lightweight multilingual named entity recognition (NER) and linking (NEL) system. The contributions of this paper are three-fold: 1) lightweight named entity recognition with competitive accuracy; 2) candidate entity retrieval that uses search click-log data and entity embeddings to achieve high precision with a low memory footprint; and 3) efficient entity disambiguation. Our system achieves state-of-the-art performance on TAC KBP 2013 multilingual data and on English AIDA CONLL data.
Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrumentation mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.
Multi-state logic presents a promising avenue for more-than-Moore scaling, since efficient implementation of multi-valued logic (MVL) can significantly reduce switching and interconnection requirements and result in significant benefits compared to binary CMOS. So far, traditional approaches lag behind binary CMOS due to: (a) reliance on logic decomposition approaches [4][5][6] that result in many multi-valued minterms [4], complex polynomials [5], and decision diagrams [6], which are difficult to implement, and (b) emulation of multi-valued computation and communication through binary switches and media, which requires data conversion and large circuits. In this paper, we propose a fundamentally different approach to MVL decomposition, merging concepts from data science and nanoelectronics to tackle these problems. (a) First, we perform linear regression on all inputs and outputs of a multi-valued function and find an expression that fits most input-output combinations. For the unmatched combinations, we perform successive regressions to find further linear expressions. Next, using our novel visual pattern-matching technique, we find conditions, based on the inputs and outputs, for selecting each expression. These expressions, along with their associated selection criteria, ensure that the correct output is reached for all possible inputs of a given function. Our choice of regression model for finding linear expressions, coefficients, and conditions allows efficient hardware implementation. We then discuss an approach for solving problem (b) and show an example quaternary sum circuit. Our estimates show a 65.6% saving in switching components compared with a 4-bit CMOS adder.
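To make the regression step more tangible, the toy Python sketch below fits a linear expression to the truth table of a quaternary sum digit and then regresses again on the unmatched rows; the function, rounding rule, and stopping point are illustrative assumptions, not the paper's exact procedure.

    # Toy sketch of the successive-regression idea for a radix-4 function:
    # fit a linear expression, keep the rows it matches exactly, and regress
    # again on the leftovers. Function and rounding are illustrative.
    import itertools
    import numpy as np

    radix = 4
    rows = np.array(list(itertools.product(range(radix), repeat=2)))  # all (a, b)
    y = (rows[:, 0] + rows[:, 1]) % radix                             # quaternary sum digit

    def fit_linear(X, t):
        A = np.column_stack([X, np.ones(len(X))])   # model: y ~ c0*a + c1*b + c2
        coef, *_ = np.linalg.lstsq(A, t, rcond=None)
        pred = np.rint(A @ coef).astype(int)        # round to the nearest logic level
        return coef, pred

    coef, pred = fit_linear(rows, y)
    matched = pred == y
    print(f"expression 1 covers {matched.sum()}/{len(y)} input combinations")

    # Successive regression on the combinations the first expression missed.
    if not matched.all():
        coef2, pred2 = fit_linear(rows[~matched], y[~matched])
        print(f"expression 2 covers {(pred2 == y[~matched]).sum()} of the rest")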
This paper introduces M4GB, a new efficient algorithm for computing Gröbner bases. Like Faugère's algorithm F4, it extends Buchberger's algorithm, describing how to store already computed (tail-)reduced multiples of basis polynomials to prevent redundant work in the reduction step, and how to exploit efficient linear algebra for that step. In comparison to F4, it removes further redundant work in the processing of reducible monomials. Furthermore, instead of translating the reduction of many critical pairs into the row reduction of one large matrix, our algorithm is described more natively and is efficient while processing critical pairs one by one. As a consequence, M4GB typically has to process fewer critical pairs than F4, and it reduces the time and data complexity 'staircase', related to the increasing degree of regularity over a sequence of problems, that one observes for F4. To achieve high efficiency, M4GB is designed to operate only on tail-reduced polynomials, i.e., polynomials whose terms, except the leading term, are all non-reducible. This allows it to perform full reduction directly in the computation of a term-polynomial multiplication, where all computations are done on coefficient vectors over the non-reducible monomials. We implemented a version of the algorithm tailored to dense overdefined polynomial systems as a proof of concept and made our source code publicly available. We compared our implementation against those of FGBlib, Magma, and OpenF4 on various dense Fukuoka MQ challenge problems that we were able to compute in reasonable time and memory, and observed that M4GB uses the least total CPU time and the least memory of all these implementations, often by a significant factor. In the Fukuoka MQ challenges, the starting challenges of Type V and Type VI have 16 equations, a size chosen based on an extrapolated computational runtime of more than a month using Magma. M4GB allowed us to set new records for these challenges, breaking Type V (over F_{2^8}) up to 18 equations and Type VI (over F_{31}) up to 19 equations, each computed within 11 days on our dual-Xeon system.
Live migration is one of the key technologies to improve data center utilization, power efficiency, and maintenance. Various live migration algorithms have been proposed, each exhibiting distinct characteristics in terms of completion time, amount of data transferred, virtual machine (VM) downtime, and VM performance degradation. To make matters worse, not only the migration algorithm but also the applications running inside the migrated VM affect the different performance metrics. With service-level agreements and operational constraints in place, choosing the optimal live migration technique has so far been an open question. In this work, we propose an adaptive machine learning-based model that is able to predict with high accuracy the key characteristics of live migration as a function of the migration algorithm and the workload running inside the VM. We discuss the important input parameters for accurately modeling the target metrics, and describe how to profile them with little overhead. Compared to existing work, we are not only able to model all commonly used migration algorithms but also to predict important metrics that have not been considered so far, such as the performance degradation of the VM. In a comparison with the state-of-the-art, we show that the proposed model outperforms existing work by a factor of 2 to 5.
Denial-of-service (DoS) attacks are a serious threat to network security. These attacks are often sourced from virtual machines in the cloud, rather than from the attacker's own machine, to achieve anonymity and higher network bandwidth. Past research has focused on analyzing traffic on the destination (victim's) side using predefined thresholds. These approaches have significant disadvantages: they are only passive defenses applied after the attack, they cannot use the outbound statistical features of attacks, and they make it hard to trace back to the attacker. In this paper, we propose a DoS attack detection system on the source side in the cloud, based on machine learning techniques. The system leverages statistical information from both the cloud server's hypervisor and the virtual machines to prevent network packets from being sent out to the outside network. We evaluate nine machine learning algorithms and carefully compare their performance. Our experimental results show that more than 99.7% of four kinds of DoS attacks are successfully detected. Our approach does not degrade performance and can easily be extended to a broader range of DoS attacks.
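A hedged sketch of such an evaluation loop in Python/scikit-learn is given below; the synthetic features and labels stand in for the hypervisor and VM statistics used in the paper, and the four models shown are only a representative subset of the kind of algorithms one might compare.

    # Sketch: compare several classifiers on flow-level statistical features.
    # Feature names and data are placeholders, not the paper's feature set.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 6))            # e.g. packet rate, byte rate, fan-out
    y = (X[:, 0] + X[:, 1] > 1).astype(int)   # synthetic "attack" label

    models = {
        "random_forest": RandomForestClassifier(n_estimators=100),
        "svm": SVC(),
        "naive_bayes": GaussianNB(),
        "knn": KNeighborsClassifier(),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
        print(f"{name}: {scores.mean():.3f}")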
The application of machine learning for the detection of malicious network traffic has been well researched over the past several decades; it is particularly appealing when the traffic is encrypted because traditional pattern-matching approaches cannot be used. Unfortunately, the promise of machine learning has been slow to materialize in the network security domain. In this paper, we highlight two primary reasons why this is the case: inaccurate ground truth and a highly non-stationary data distribution. To demonstrate and understand the effect that these pitfalls have on popular machine learning algorithms, we design and carry out experiments that show how six common algorithms perform when confronted with real network data. With our experimental results, we identify the situations in which certain classes of algorithms underperform on the task of encrypted malware traffic classification. We offer concrete recommendations for practitioners given the real-world constraints outlined. From an algorithmic perspective, we find that the random forest ensemble method outperformed competing methods. More importantly, feature engineering was decisive; we found that iterating on the initial feature set, and including features suggested by domain experts, had a much greater impact on the performance of the classification system. For example, linear regression using the more expressive feature set easily outperformed the random forest method using a standard network traffic representation on all criteria considered. Our analysis is based on millions of TLS-encrypted sessions collected over 12 months from a commercial malware sandbox and two geographically distinct, large enterprise networks.
Tactical Mobile Ad-hoc NETworks (T-MANETs) are mainly used in self-configuring automatic vehicles and robots (also called nodes) for rescue and military operations. A highly dynamic network architecture, node unreliability, node misbehavior, and an open wireless medium make it very difficult to assume that nodes in the ad-hoc network will cooperate or comply with routing rules. The routing protocols in T-MANETs are unprotected and consequently exposed to various kinds of node misbehavior (such as selfishness and denial of service). This paper introduces a comprehensive analysis of the packet dropping attack, covering three types of misbehavior conducted by insiders in T-MANETs, namely black hole, gray hole, and selfish behaviours. An insider threat model is added to a state-of-the-art routing protocol (DSR), and the effect of the packet dropping attack on the performance of DSR in the T-MANET is analyzed. This paper contributes to the existing knowledge by enabling further security research to understand the behaviour of the main threats in MANETs, which depend on node defection in packet forwarding. The packet dropping attack is simulated using the Network Simulator 2 (NS2). It was found that network throughput drops considerably under black hole and gray hole attacks, whereas selfish nodes delay the network flow. Moreover, the packet drop rate and energy consumption rate are higher for black hole and gray hole attacks.
Communication networks can be the target of organized and distributed attacks, such as flooding-type DDoS attacks, in which malicious users aim to cripple a network server or a network domain. For the attack to have a major effect on the network, malicious users must act in a coordinated and time-correlated manner; for instance, the members of a flooding attack increase their message transmission rates not only rapidly but also synchronously. Even though detection and prevention of flooding attacks are well studied at the network and transport layers, the emergence and wide deployment of new systems such as VoIP (Voice over IP) have turned flooding attacks at the session layer into a new defense challenge. In this study, a structured-sparsity-based group anomaly detection system is proposed that can not only detect synchronized attacks but also separate malicious groups from normal users by jointly estimating their members, structure, and starting and ending points. Although we mainly focus on the security of SIP (Session Initiation Protocol) servers/proxies, which are widely used for signaling in VoIP systems, the proposed scheme can easily be adapted to any type of communication network system at any layer.
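One standard way to formalize "structured sparsity" in this setting is a group-lasso-style estimator, sketched below in LaTeX; the exact penalty and grouping structure used in the paper may differ.

    % Group-sparse decomposition of observed rates Y into an anomaly part X
    % (generic illustrative formulation; the paper's estimator may differ).
    \min_{\mathbf{X}} \; \tfrac{1}{2}\,\|\mathbf{Y}-\mathbf{X}\|_F^2
      \;+\; \lambda \sum_{g \in \mathcal{G}} w_g \,\|\mathbf{X}_g\|_2

Here Y collects per-user transmission rates over time, each group g is a candidate set of users together with a time window, and the l2 norm over whole groups drives entire groups to zero unless their members deviate jointly; this joint selection is what reveals group membership and the attack's start and end points.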
This paper presents a method for extracting important byte sequences from malware samples by applying a convolutional neural network (CNN) to images converted from binary data. By incorporating a technique called the attention mechanism into the CNN, the method enables calculation of an "attention map", which shows the regions of the image with higher importance for classification. The extracted regions of higher importance can provide useful information for human analysts who investigate the functionality of unknown malware samples. Results of our evaluation experiment on a malware dataset show that the proposed method provides higher classification accuracy than a conventional method. Furthermore, analysis of malware samples based on the calculated attention maps confirmed that the extracted sequences provide useful information for manual analysis.
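The sketch below shows one common way to wire an attention mechanism into a CNN classifier in PyTorch: a 1x1 convolution produces a spatial attention map that both weights the features and can be inspected by an analyst. The layer sizes, image size, and nine-class output are assumptions for illustration, not the paper's architecture.

    # Illustrative attention-over-feature-maps CNN for byte-image classification.
    import torch
    import torch.nn as nn

    class AttentionCNN(nn.Module):
        def __init__(self, n_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.attn = nn.Conv2d(64, 1, 1)        # 1x1 conv -> spatial scores
            self.classifier = nn.Linear(64, n_classes)

        def forward(self, x):
            f = self.features(x)                       # (B, 64, H, W)
            a = self.attn(f).flatten(2).softmax(-1)    # attention map over H*W
            v = (f.flatten(2) * a).sum(-1)             # attention-weighted pooling
            return self.classifier(v), a               # logits + map for analysts

    model = AttentionCNN(n_classes=9)
    img = torch.randn(1, 1, 64, 64)                # a 64x64 greyscale byte image
    logits, attn_map = model(img)
    print(attn_map.view(16, 16).argmax())          # most "important" region index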
There are currently few methods applicable to malware classification problems that do not require domain knowledge. In this work, we develop a new feature vector representation, SHWeL, by extending the recently proposed Lempel-Ziv Jaccard Distance (LZJD). SHWeL vectors improve upon LZJD's accuracy, outperform byte n-grams, and allow us to build efficient algorithms for both training (a weakness of byte n-grams) and inference (a weakness of LZJD). Furthermore, SHWeL allows us to directly tackle the class imbalance problem, which is common in malware-related tasks. Compared to existing methods like SMOTE, SHWeL provides significantly improved accuracy while reducing algorithmic complexity to O(N). Because our approach is developed without the use of domain knowledge, it can easily be re-applied to any new domain where byte sequences need to be classified.
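For context, a minimal Python rendering of the Lempel-Ziv Jaccard Distance that SHWeL builds on is shown below: each byte string is turned into its set of LZ78-style substrings, and two strings are compared by Jaccard distance over those sets. SHWeL's hashing, weighting, and imbalance handling are omitted.

    # Minimal LZJD sketch: LZ78-style substring set + Jaccard distance.
    def lz_set(data: bytes) -> set:
        seen, start = set(), 0
        for end in range(1, len(data) + 1):
            piece = data[start:end]
            if piece not in seen:       # first time this substring appears
                seen.add(piece)
                start = end             # start building the next substring
        return seen

    def lzjd(a: bytes, b: bytes) -> float:
        sa, sb = lz_set(a), lz_set(b)
        return 1.0 - len(sa & sb) / len(sa | sb)    # Jaccard distance

    print(lzjd(b"abcabcabcd", b"abcabcabce"))       # similar inputs -> small distance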
Information and communication technologies are extensively used to monitor and control electric microgrids. Although such innovations enhance the self-healing, resilience, and efficiency of the energy infrastructure, they bring emerging security threats that constitute a critical challenge. In the context of microgrids, cyber vulnerabilities may be exploited by malicious users to manipulate system parameters, meter measurements, and price information. In particular, malware may be used to acquire direct access to monitoring and control devices in order to destabilize the microgrid ecosystem. In this paper, we use a sandbox to analyze the vulnerability to malware of the embedded smart devices involved, by monitoring potential malicious behaviors at different abstraction levels. In this direction, the CoSSMic project represents a relevant case study.
Open Source Software developer communities are susceptible to challenges related to volatility, distributed coordination, and the interplay between commercial and ideological interests. Here, community managers play a vital role in growing, shepherding, and coordinating the developers' work. This study investigates the varied tasks that community managers perform to ensure the health and vitality of their communities. We describe the challenges managers face while directing the community and seeking support for their work from the analysis tools provided by state-of-the-art software platforms. Our results describe seven roles that community managers may play, highlighting the versatile and people-centric nature of the community manager's work. Managers find it hard to connect their goals to the questions and metrics that define a community's health and to the effects of their own actions. Our results voice common concerns among community managers and can be used to help them structure their management activity, as well as to provide a theoretical frame for further research on how the health of developer communities could be understood.
The extensive use of information and communication technologies in power grid systems makes them vulnerable to cyber-attacks. One class of cyber-attack is advanced persistent threats, where highly skilled attackers steal user authentication information and then move laterally through the network, from host to host in a hidden manner, until they reach an attractive target. Once the presence of the attacker has been detected in the network, appropriate actions should be taken quickly to prevent the attacker from going deeper. This paper presents a game-theoretic approach to optimizing the defense against an invader attempting to use a set of known vulnerabilities to reach critical nodes in the network. First, the network is modeled as a vulnerability multi-graph, where the nodes represent physical hosts and the edges represent the vulnerabilities that the attacker can exploit to move laterally from one host to another. Second, a two-player zero-sum Markov game is built in which the states of the game are the nodes of the vulnerability multi-graph and the transitions correspond to the edge vulnerabilities that the attacker can exploit. The solution of the game gives the optimal strategy for disconnecting vulnerable services and thus slowing down the attack.
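Such a game can be solved with the classic Shapley value iteration for two-player zero-sum Markov games, sketched below in LaTeX; the reward r and the action sets are placeholders for the paper's concrete cost model over the vulnerability multi-graph.

    % Shapley value iteration for a two-player zero-sum Markov game
    % (generic form; the paper's reward structure is a specific instance).
    V_{t+1}(s) \;=\; \operatorname{val}\Big[\, r(s,a,d)
        + \gamma \sum_{s'} P(s' \mid s, a, d)\, V_t(s') \,\Big]_{a \in A(s),\; d \in D(s)}

Here a ranges over the attacker's exploitable edges at state s, d over the defender's service-disconnection actions, and val[.] denotes the value of the induced matrix game, computable by linear programming; iterating to a fixed point yields the optimal stationary strategies for both players.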
The application of mobile Wireless Sensor Networks (WSNs) with a large number of participants poses many challenges. For instance, high transmission loss rates, caused among other things by collisions, might occur. Additionally, WSNs frequently operate under harsh conditions where a high probability of link or node failures is inherently given, making reliable data maintenance a key issue. Existing approaches developed to keep data dependably in WSNs often either perform well only in highly dynamic or only in completely static scenarios, or require complex calculations. Herein, we present Network Coding based Multicast Growth Codes (MCGC), a solution for reliable data maintenance in large-scale WSNs. MCGC are able to tolerate high fault rates and reconstruct more of the originally collected data in a shorter period of time than existing approaches. Simulation results show performance improvements of up to 75% in comparison to Growth Codes (GC). These results are achieved independently of the system's dynamics and in spite of high fault probabilities.
Lithium-ion batteries usually degrade to an unacceptable capacity level after hundreds or even thousands of cycles. The continuously observed capacity fade data over time, and their internal structure, can be informative for constructing capacity fade models. This paper applies a mean-covariance decomposition modeling method to analyze capacity fade data. The proposed approach directly examines the variances and correlations in the data of interest and expresses the correlation matrix in hyper-spherical coordinates using angles and trigonometric functions. The method is applied to model and predict key battery performance metrics using testing data gathered under various testing conditions.
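For reference, one widely used hyper-spherical parametrization writes the correlation matrix as R = LL^T with the rows of the Cholesky factor L expressed through angles; the paper's decomposition is of this general flavor, though its details may differ.

    % Hyper-spherical parametrization of a correlation matrix R = L L^T,
    % with angles \theta_{ij} \in (0, \pi) (generic form).
    L_{11} = 1, \qquad L_{i1} = \cos\theta_{i1}, \qquad
    L_{ij} = \cos\theta_{ij}\prod_{k=1}^{j-1}\sin\theta_{ik} \;\; (1 < j < i), \qquad
    L_{ii} = \prod_{k=1}^{i-1}\sin\theta_{ik}.

Because every row of L then has unit length, R = LL^T is a valid correlation matrix for any choice of angles, so the angles can be modeled freely without imposing positive-definiteness constraints.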
Edge computing is a scheme for improving the performance, latency, and security guarantees of IoT applications. However, edge deployment of an application also comes with additional management complexity, an increased attack surface for security vulnerabilities, and potentially a more expensive solution. As a result, the conditions under which an edge deployment of an IoT application delivers a better solution are not always obvious. Metrics that can predict whether an IoT application is suitable for edge deployment would provide useful insights into this question. In this paper, we examine the key performance indicators for IoT applications, namely the responsiveness, scalability, and cost models of different types of IoT applications. Our analysis identifies the network centrality of an IoT application as a key characteristic that determines whether the application is a good candidate for edge deployment. We discuss the different measures of network centrality that can be used to characterize applications, and the relative performance of edge deployment compared to centralized deployment for various IoT applications.
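As a small illustration of this kind of screening, the Python sketch below computes betweenness centrality over a toy component-communication graph with networkx; the graph, component names, and the reading of the scores are assumptions for illustration, not the paper's applications or thresholds.

    # Sketch: score application components by network centrality as a proxy
    # for edge-deployment suitability. Graph and interpretation are illustrative.
    import networkx as nx

    # Communication graph: which application components talk to which.
    G = nx.Graph()
    G.add_edges_from([
        ("camera", "gateway"), ("thermostat", "gateway"),
        ("gateway", "analytics"), ("analytics", "dashboard"),
    ])

    centrality = nx.betweenness_centrality(G)
    for component, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
        # Low-centrality leaf components are the usual edge candidates, while
        # a hub like "analytics" may be better kept in a centralized deployment.
        print(f"{component}: {score:.2f}")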
The increasing complexity and ubiquity of user connectivity, computing environments, information content, and software, mobile, and web applications transfer the responsibility of privacy management to individuals, making it extremely difficult for users to maintain the intelligent and targeted level of privacy protection that they need and desire while still being able to function optimally. Thus, there is a critical need to develop intelligent, automated, and adaptable privacy management systems that can assist users in managing and protecting their sensitive data in the increasingly complex situations and environments in which they find themselves. This work is a first step in exploring the development of such a system, specifically how user personality traits and other characteristics can be used to help automate the determination of user sharing preferences for a variety of user data and situations. The Big-Five personality traits of openness, conscientiousness, extroversion, agreeableness, and neuroticism are examined and used as inputs to several popular machine learning algorithms in order to assess their ability to elicit and predict user privacy preferences. Our results show that the Big-Five personality traits can be used to significantly improve the prediction of user privacy preferences in a number of contexts and situations, and that using machine learning approaches to automate the setting of user privacy preferences has the potential to greatly reduce the burden on users while simultaneously improving the accuracy of their privacy preferences and their security.
Microdata are collected by companies in order to enhance the quality of their services as well as the accuracy of their recommendation systems. These data often become publicly available after they have been sanitized. Recent re-identification attacks on publicly available, sanitized datasets illustrate the privacy risks involved in microdata collection. Currently, users have to trust the provider that their data will be safe in case the data are published or a privacy breach occurs. In this work, we empower users by developing a novel, user-centric tool for privacy measurement along with a new lightweight privacy metric. The goal of our tool is to estimate a user's privacy level prior to sharing data with a provider, so that users can consciously decide whether to contribute their data. Our tool estimates an individual's privacy level based on published popularity statistics for the items in the provider's database together with the user's own microdata. We describe the architecture of our tool as well as the novel privacy metric, which is necessary in our setting because we do not have access to the provider's database. Our tool is user-friendly, relying on clear visual results that raise privacy awareness. We evaluate the tool using three real-world datasets collected from major providers and demonstrate strong correlations between the average anonymity set per user and the privacy score produced by our metric. Our results illustrate that the tool, which uses minimal information from the provider, estimates users' privacy levels almost as well as if it had access to the actual database.
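A toy version of such a popularity-based estimate is sketched below: it approximates the expected size of a user's anonymity set from published per-item popularities alone. The independence assumption and the exact score are illustrative stand-ins, not the paper's metric.

    # Toy privacy estimate from published popularity statistics, without
    # access to the provider's database. Illustrative only.
    def anonymity_estimate(user_items, popularity, n_users):
        """popularity[i] = fraction of users holding item i (published stats)."""
        share = 1.0
        for item in user_items:
            share *= popularity[item]     # assumes items occur independently
        return max(1.0, share * n_users)  # expected #users matching the profile

    popularity = {"item_a": 0.60, "item_b": 0.05, "item_c": 0.30}
    print(anonymity_estimate({"item_a", "item_b"}, popularity, n_users=100_000))

A small estimated anonymity set (few users share the same item profile) would then translate into a low privacy score, warning the user before the data are shared.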
With the increasing use of mobile phones in contemporary society, more and more networked computers are connected to each other, which has brought along security issues. To address these issues, both the research and development communities are trying to build more secure software. However, the question remains how secure software is defined and how security can be measured. In this paper, we study this problem by surveying what kinds of security measurement tools (i.e., metrics) are available and what these tools and metrics reveal about the security of software. As a result of the study, we observed that security verification activities fall into two main categories: evaluation and assurance. There exist 34 metrics for measuring security, of which 29 are assurance metrics and 5 are evaluation metrics. Evaluating and studying these metrics led us to the conclusion that their general quality is not at a level that would make them suitable for use in daily engineering workflows. They have both theoretical and practical issues that require further research, and they need to be improved.