Biblio
The Internet of Things (IoT) connects not only computers and mobile devices, but also smart buildings, homes, and cities, as well as electrical grids, gas and water networks, automobiles, airplanes, and so on. However, IoT applications introduce significant security challenges because they enlarge the attack surface. Current security approaches do not handle cybersecurity from a holistic point of view; hence, a systematic cybersecurity mechanism needs to be adopted when designing IoT-based applications. In this work, we present a risk management framework to deploy secure IoT-based applications for smart infrastructures at both design time and runtime. At design time, we propose a risk management method that is appropriate for smart infrastructures. At runtime, our framework relies on the Anomaly Behavior Analysis (ABA) methodology, enabled by the Autonomic Computing paradigm, and on an intrusion detection system to detect any threat that can compromise IoT infrastructures. Our preliminary experimental results show that our framework can be used to detect threats and protect IoT premises and services.
Community question answering (cQA) has become an important issue due to the popularity of cQA archives on the Web. This paper focuses on addressing the lexical gap problem in question retrieval. Question retrieval in cQA archives aims to find existing questions that are semantically equivalent or relevant to the queried questions; however, the lexical gap between queries and archived questions makes this challenging. In this paper, we propose to model and learn distributed word representations with the metadata of category information within cQA pages for question retrieval, using two novel category-powered models: a basic model called MB-NET and an enhanced model called ME-NET, which can better learn the distributed word representations and alleviate the lexical gap problem. To deal with the variable size of word representation vectors, we employ the Fisher kernel framework to transform them into fixed-length vectors. Experimental results on large-scale English and Chinese cQA data sets show that our proposed approaches significantly outperform state-of-the-art retrieval models for question retrieval in cQA. Moreover, large-scale automatic evaluation experiments show that promising and significant performance improvements can be achieved.
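As a rough illustration of the fixed-length transformation step, the sketch below (ours, not the authors' code) aggregates a variable number of word vectors into one fixed-length vector with a GMM-based, Fisher-vector-style encoding; the corpus vectors, dimensions, and component count are arbitrary stand-ins.

```python
# Sketch: aggregate a variable number of word vectors into one
# fixed-length vector with a GMM-based, Fisher-vector-style encoding.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
corpus_vecs = rng.normal(size=(1000, 50))      # word vectors from the corpus
gmm = GaussianMixture(n_components=4, covariance_type="diag").fit(corpus_vecs)

def fisher_encode(word_vecs):
    """Map an (n_words, dim) matrix to a fixed-length vector."""
    probs = gmm.predict_proba(word_vecs)       # (n, K) soft assignments
    parts = []
    for k in range(gmm.n_components):
        diff = (word_vecs - gmm.means_[k]) / np.sqrt(gmm.covariances_[k])
        # gradient w.r.t. the k-th mean, averaged over the words
        parts.append((probs[:, [k]] * diff).mean(axis=0))
    return np.concatenate(parts)               # length K * dim

q1 = fisher_encode(rng.normal(size=(7, 50)))   # a 7-word question
q2 = fisher_encode(rng.normal(size=(12, 50)))  # a 12-word question
print(q1.shape == q2.shape)                    # True: both fixed-length
```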
Machine learning has been widely used and has achieved considerable results in various research areas. On the other hand, machine learning becomes a serious threat when malicious attackers use it for the wrong purposes. One such threat, the self-evolving botnet, has been considered in the past. Self-evolving botnets autonomously predict vulnerabilities by running machine learning on the computing resources of zombie computers; they then evolve based on those vulnerabilities and thus have high infectivity. In this paper, we consider several Markov chain models to counter the spread of self-evolving botnets. Through simulation experiments, we show the behavior of these models.
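As one minimal sketch of such a model (our illustration; the paper's actual chains and parameters may differ), the following simulates a discrete-time Markov chain in which each host is susceptible or infected, and the infection probability grows with botnet size, standing in for "evolution" funded by zombie computing power.

```python
# Sketch: discrete-time Markov chain of a self-evolving botnet.
# States per host: 0 = susceptible, 1 = infected. The infection
# probability beta grows with the botnet's size, modeling evolution.
# All parameters are illustrative.
import random

N, STEPS = 1000, 200
beta0, gamma, evolve = 0.02, 0.05, 0.001   # base infect, cure, evolution rates
hosts = [1] * 10 + [0] * (N - 10)          # 10 initially infected hosts

random.seed(42)
for t in range(STEPS):
    infected = sum(hosts)
    beta = beta0 + evolve * infected       # botnet evolves as it grows
    for i in range(N):
        if hosts[i] == 0 and random.random() < beta * infected / N:
            hosts[i] = 1                   # infection transition
        elif hosts[i] == 1 and random.random() < gamma:
            hosts[i] = 0                   # disinfection transition
    if t % 50 == 0:
        print(f"step {t}: {infected} infected")
```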
At its core, security is a highly contextual and dynamic challenge. However, current security policy approaches are usually static and slow to adapt to ever-changing requirements, let alone to catch up with reality. A 2012 Sophos survey stated that a unique piece of malware is created every half second. This gives a glimpse of the unsustainable nature of a global problem, and any improvement in closing the "time window to adapt" would be a significant step forward. To exacerbate the situation, a simple change in the threat and attack vector, or even the adoption of the so-called "bring-your-own-device" paradigm, greatly changes the frequency with which security requirements change and the solutions required for each new context. Current security policies also typically overlook the direct and indirect costs of policy implementation. As a result, technical teams often lack the ability to justify their budget to management from a business-risk viewpoint. This paper considers both the adaptive and cost-benefit aspects of security and introduces a novel context-aware technique for designing and implementing adaptive, optimized security policies. Our approach leverages stochastic programming models to optimize security policy planning, and our preliminary results demonstrate a promising step towards proactive, context-aware security policies.
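To make the stochastic-programming idea concrete, here is a toy, brute-force sketch (all policies, probabilities, and losses below are hypothetical, not from the paper): choose the policy set that minimizes implementation cost plus expected breach loss over threat scenarios.

```python
# Sketch: scenario-based policy selection. Pick the policy set that
# minimizes implementation cost plus expected breach loss over threat
# scenarios. Policies, probabilities, and losses are all hypothetical.
from itertools import combinations

policies = {"patch_sla": 40, "mfa": 25, "byod_mdm": 30}   # implementation cost
scenarios = [          # (probability, loss, set of policies that mitigate it)
    (0.5, 200, {"patch_sla"}),
    (0.3, 300, {"mfa"}),
    (0.2, 150, {"byod_mdm", "mfa"}),
]

def expected_total_cost(chosen):
    impl = sum(policies[p] for p in chosen)
    loss = sum(pr * L for pr, L, mitigators in scenarios
               if not (mitigators & chosen))   # loss only if unmitigated
    return impl + loss

best = min((frozenset(c) for r in range(len(policies) + 1)
            for c in combinations(policies, r)),
           key=expected_total_cost)
print(sorted(best), expected_total_cost(best))
```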
Defending key network infrastructure, such as Internet backbone links or the communication channels of critical infrastructure, is paramount, yet challenging. The inherently complex nature and sheer quantity of network data impede detecting attacks in real-world settings. In this paper, we utilize features of network flows, characterized by their entropy, together with an extended version of the original Replicator Neural Network (RNN) and deep learning techniques to learn models of normality. This combination allows us to apply anomaly-based intrusion detection to arbitrarily large amounts of data and, consequently, to large networks. Our approach is unsupervised and requires no labeled data. It also accurately detects network-wide anomalies without presuming that the training data are completely free of attacks. The evaluation of our intrusion detection method on real network data indicates that it can accurately detect resource exhaustion attacks and network profiling techniques of varying intensities. The developed method is efficient because a normality model can be learned by training an RNN in only a few seconds.
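A minimal sketch of this pipeline under our own simplifying assumptions (synthetic data; the replicator network approximated by a bottlenecked MLP trained to reproduce its input, rather than the paper's extended RNN):

```python
# Sketch: entropy features per time window plus an autoencoder-style
# replicator network (an MLP trained to reproduce its input). High
# reconstruction error flags an anomalous window. Data is synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

def entropy(counts):
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)
# each window: entropy of src-IP, dst-IP, src-port, dst-port distributions
normal = np.array([[entropy(rng.integers(1, 100, 50)) for _ in range(4)]
                   for _ in range(500)])
rnn = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000,
                   random_state=0).fit(normal, normal)

def anomaly_score(window):
    recon = rnn.predict(window.reshape(1, -1))
    return float(((recon - window) ** 2).mean())

skewed = np.array([0.5, 0.5, entropy(rng.integers(1, 100, 50)), 6.0])
print(anomaly_score(normal[0]), anomaly_score(skewed))
```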
In this paper we use car games as a simulator for real automobiles and generate driving logs that contain vehicle data. This includes values for parameters such as gear used, speed, left turns taken, right turns taken, accelerator, braking, and so on. From these parameters we derive additional parameters and analyze them. As the only input from the driver is routine driving, no explicit feedback is required; hence there is a greater chance of accurately profiling the driver. Experimentation and analysis of the logged data show that driver profiling can be done from vehicle data. Since the profiles are unique, they can further be used for a wide range of applications and can successfully exhibit the typical driving characteristics of each user.
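A small sketch of what such profiling could look like (the field names, simulated logs, and classifier choice are all our assumptions, not the paper's setup): summarize each trip's raw log into per-trip features and fit a classifier that identifies the driver.

```python
# Sketch: derive per-trip features from raw driving logs and fit a
# classifier that identifies the driver. Fields/data are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def trip_features(log):
    """log: array of (speed, accelerator, brake, left_turn, right_turn) rows."""
    speed, accel, brake = log[:, 0], log[:, 1], log[:, 2]
    return [speed.mean(), speed.std(), accel.mean(),
            brake.mean(), log[:, 3].sum(), log[:, 4].sum()]

rng = np.random.default_rng(7)
X, y = [], []
for driver in range(3):                       # 3 simulated drivers
    for _ in range(40):                       # 40 trips each
        log = rng.normal(loc=driver, scale=1.0, size=(300, 5))
        X.append(trip_features(log))
        y.append(driver)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25,
                                      random_state=0, stratify=y)
clf = RandomForestClassifier(random_state=0).fit(Xtr, ytr)
print("held-out accuracy:", clf.score(Xte, yte))
```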
Detecting early trends indicating cognitive decline can allow older adults to better manage their health, but current assessments present barriers precluding the use of such continuous monitoring by consumers. To explore the effects of cognitive status on computer interaction patterns, the authors collected typed-text samples from older adults with and without pre-mild cognitive impairment (PreMCI) and constructed statistical models from keystroke and linguistic features for differentiating between the two groups. Using both feature sets, they obtained a 77.1 percent correct classification rate with 70.6 percent sensitivity, 83.3 percent specificity, and a 0.808 area under the curve (AUC). These results are in line with current assessments for MCI, a more advanced condition, but use an unobtrusive method. This research contributes a combination of features for text and keystroke analysis and enhances understanding of how clinicians or older adults themselves might monitor for PreMCI through patterns in typed text. It has implications for embedded systems that can enable healthcare providers and consumers to proactively and continuously monitor changes in cognitive function.
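For illustration only, here is a sketch of the classification setup with synthetic stand-in features (the paper's actual keystroke and linguistic features, and its modeling choices, are richer than this):

```python
# Sketch: classify PreMCI vs. control from keystroke/linguistic
# features and report AUC. Features and data are synthetic stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)
# columns: mean key-hold time, pause rate, backspace rate, type-token ratio
controls = rng.normal([0.11, 0.8, 0.04, 0.55], 0.05, size=(60, 4))
premci   = rng.normal([0.14, 1.2, 0.07, 0.48], 0.05, size=(60, 4))
X = np.vstack([controls, premci])
y = np.array([0] * 60 + [1] * 60)              # 1 = PreMCI

probs = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                          cv=5, method="predict_proba")[:, 1]
print("AUC:", roc_auc_score(y, probs))
```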
Cloud computing is a service-based computing-resource sourcing model that is changing the way in which companies deploy and operate information and communication technologies (ICT). This model introduces several advantages compared with traditional environments, along with typical outsourcing benefits, and reshapes the ICT services supply chain by creating a more dynamic ICT environment and a broader variety of service offerings. This leads to a higher risk of disruption and brings additional challenges for organisational resilience, defined herein as the ability of organisations to survive and also to thrive when exposed to disruptive incidents. This paper draws on supply chain theory and supply chain resilience concepts to identify a set of coordination mechanisms that positively impact ICT operational resilience processes within cloud supply chains, and packages them into a conceptual model.
This paper proposes an improved mesh simplification algorithm based on quadric error metrics (QEM) to efficiently process the huge amounts of data in 3D image processing. The method makes full use of the geometric information around vertices to prevent model edges from being simplified away and to preserve detail. Meanwhile, the differences between the simplified triangles and equilateral triangles are added as error weights to reduce the occurrence of narrow triangles and thus avoid visual artifacts. Experiments show that our algorithm has clear advantages in time cost and better preserves the visual characteristics of the model, making it suitable for most real-time interactive image processing problems.
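The core QEM machinery the algorithm builds on can be sketched as follows (a textbook-style illustration; the paper's shape-difference weight would be folded into this edge cost):

```python
# Sketch: core of quadric error metrics. Each vertex accumulates a 4x4
# quadric Q = sum of p p^T over its incident triangle planes; the cost
# of contracting an edge to a point v is v^T (Q1 + Q2) v.
import numpy as np

def plane_quadric(a, b, c):
    """Fundamental quadric of the plane through triangle (a, b, c)."""
    n = np.cross(b - a, c - a)
    n = n / np.linalg.norm(n)
    p = np.append(n, -n.dot(a))        # plane coefficients [nx, ny, nz, d]
    return np.outer(p, p)

def edge_cost(Q1, Q2, v):
    """QEM cost of collapsing an edge whose endpoints carry Q1, Q2 to v."""
    Q = Q1 + Q2
    vh = np.append(v, 1.0)             # homogeneous coordinates
    return float(vh @ Q @ vh)

a, b, c = np.array([0., 0, 0]), np.array([1., 0, 0]), np.array([0., 1, 0])
Q = plane_quadric(a, b, c)
print(edge_cost(Q, Q, np.array([0.3, 0.3, 0.0])))   # on-plane point: 0
print(edge_cost(Q, Q, np.array([0.3, 0.3, 0.5])))   # off-plane: positive
```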
Many standard optimization methods for segmentation and reconstruction compute ML model estimates for appearance or geometry of segments, e.g. Zhu-Yuille [23], Torr [20], Chan-Vese [6], GrabCut [18], Delong et al. [8]. We observe that the standard likelihood term in these formulations corresponds to a generalized probabilistic K-means energy. In learning it is well known that this energy has a strong bias to clusters of equal size [11], which we express as a penalty for KL divergence from a uniform distribution of cardinalities. However, this volumetric bias has been mostly ignored in computer vision. We demonstrate significant artifacts in standard segmentation and reconstruction methods due to this bias. Moreover, we propose binary and multi-label optimization techniques that either (a) remove this bias or (b) replace it by a KL divergence term for any given target volume distribution. Our general ideas apply to continuous or discrete energy formulations in segmentation, stereo, and other reconstruction problems.
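To make the bias concrete, here is a schematic rendering in our own notation (not necessarily the paper's exact decomposition): the likelihood term is a generalized probabilistic K-means energy, and the cardinality bias acts like a KL penalty pulling segment volumes toward uniform.

```latex
% Generalized probabilistic K-means energy over segments S_1, ..., S_K
% of image domain \Omega with appearance models \theta_k:
E(S,\theta) \;=\; \sum_{k=1}^{K} \,\sum_{p \in S_k} -\log \Pr(I_p \mid \theta_k)
% Schematically, the volumetric bias behaves like the penalty
\text{bias}(S) \;\approx\; |\Omega| \cdot \mathrm{KL}\big(w \,\big\|\, u\big),
\qquad w_k = \tfrac{|S_k|}{|\Omega|}, \quad u_k = \tfrac{1}{K},
% which is minimized when all segments have equal size |\Omega| / K.
```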
Intro: Computer network defense has models for attacks and for incidents composed of multiple attacks after the fact. However, we lack an evidence-based model of the likelihood and intensity of attacks and incidents. Purpose: We propose a model of global capability advancement, the adversarial capability chain (ACC), to fill this need. The model enables cyber risk analysis to better understand the costs for an adversary to attack a system, which directly influences the cost to defend it. Method: The model is based on four historical studies of adversarial capabilities: the capability to exploit Windows XP, to exploit the Android API, to exploit Apache, and to administer compromised industrial control systems. Result: We propose the ACC with five phases: Discovery, Validation, Escalation, Democratization, and Ubiquity. We use the four case studies as examples of how the ACC can be applied and used to predict attack likelihood and intensity.
Most existing approaches focus on examining which values are dangerous for information flow within inter-suspicious modules of cloud applications (apps) on a host, using malware threat analysis, rather than on the risk posed by suspicious apps connected to the cloud computing server. Accordingly, this paper proposes a taint propagation analysis model that incorporates a weighted spanning tree analysis scheme to track data with taint marking using several taint checking tools. In the proposed model, Android programs perform dynamic taint propagation to analyse the spread of, and risks posed by, suspicious apps connected to the cloud computing server. In determining the risk of taint propagation, risk and defence capability are evaluated for each taint path, assisting a defender in recognising the attack results of network threats caused by malware infection and in estimating the losses of the associated taint sources. Finally, a threat analysis of a typical cyber security attack is presented to demonstrate the proposed approach. Our approach verifies the details of an attack sequence for malware infection by incorporating a finite state machine (FSM) to appropriately reflect real situations under various configuration settings and safeguard deployments. The experimental results show that the threat analysis model allows a defender to convert the spread of taint propagation into loss and to practically estimate the risk of a specific threat using behavioural analysis with real malware infections.
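As a toy illustration of scoring taint paths toward cloud sinks (graph, weights, and node names are hypothetical, and this enumerates paths rather than building the paper's weighted spanning tree):

```python
# Sketch: score taint-propagation paths in an app-to-cloud flow graph.
# Each edge carries a propagation probability; a path's risk is the
# product of probabilities times the loss at the taint sink it reaches.
graph = {
    "tainted_input": [("app_module", 0.9)],
    "app_module":    [("cloud_api", 0.6), ("local_log", 0.3)],
    "cloud_api":     [("cloud_server", 0.8)],
}
loss = {"cloud_server": 500.0, "local_log": 50.0}   # loss at each sink

def path_risks(node, prob=1.0, path=()):
    path = path + (node,)
    if node in loss:                       # taint sink reached
        yield path, prob * loss[node]
    for nxt, p in graph.get(node, []):
        yield from path_risks(nxt, prob * p, path)

for path, risk in sorted(path_risks("tainted_input"), key=lambda x: -x[1]):
    print(" -> ".join(path), f"risk={risk:.1f}")
```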
Language vector space models (VSMs) have recently proven to be effective across a variety of tasks. In VSMs, each word in a corpus is represented as a real-valued vector. These vectors can be used as features in many applications in machine learning and natural language processing. In this paper, we study the effect of vector space representations in cyber security. In particular, we consider a passive traffic analysis attack (website fingerprinting) that threatens users' navigation privacy on the web. By using anonymous communication, Internet users (such as online activists) may wish to hide the destination of the web pages they access for different reasons, such as evading repressive governments. Traditional website fingerprinting studies collect packets from the users' network and extract features that are used by machine learning techniques to reveal the destination of certain web pages. In this work, we propose the packet-to-vector (P2V) approach, in which we model the website fingerprinting attack using word vector representations. We show that the suggested model outperforms previous website fingerprinting work.
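A minimal P2V-style sketch under our own assumptions (an illustrative tokenization of packet sizes and directions, with gensim's word2vec as the embedding model; the paper's pipeline is not reproduced here):

```python
# Sketch: treat each traffic trace as a "sentence" whose "words" are
# discretized packet sizes/directions, train word2vec embeddings, and
# average them into a fixed-length trace vector for a classifier.
from gensim.models import Word2Vec
import numpy as np

traces = [                         # +size = outgoing, -size = incoming
    ["+568", "-1500", "-1500", "+52"],
    ["+568", "-1500", "+52", "-1500"],
    ["+120", "-80", "+120", "-80"],
]
w2v = Word2Vec(traces, vector_size=16, window=2, min_count=1, seed=0)

def trace_vector(trace):
    """Fixed-length representation: mean of the packet-token vectors."""
    return np.mean([w2v.wv[tok] for tok in trace], axis=0)

print(trace_vector(traces[0]).shape)   # (16,) -- input to any classifier
```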
The rate at which cyber-attacks are increasing globally paints a terrifying picture. The main dynamics of such attacks can be studied in terms of the actions of attackers and defenders in a cyber-security game; however, currently little research has investigated such interactions. In this paper we use behavioral game theory to investigate the role of certain actions taken by attackers and defenders in a simulated cyber-attack scenario of defacing a website. We choose a reinforcement learning (RL) model to represent a simulated attacker and defender in a 2×4 cyber-security game in which each of the 2 players could take up to 4 actions. Pairs of model participants were computationally simulated across 1000 simulations, where each pair played at most 30 rounds of the game. The goal of the attacker was to deface the website, and the goal of the defender was to prevent the attacker from doing so. Our results show that the actions taken by both attackers and defenders are a function of the attention these roles pay to their recently obtained outcomes. It was observed that if the attacker pays more attention to recent outcomes, it is more likely to perform attack actions. We discuss the implications of our results for the evolution of dynamics between attackers and defenders in cyber-security games.
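A rough sketch of such a simulation (our own recency-weighted softmax learners and illustrative payoffs, not the paper's exact RL model): each player updates an action value toward its latest payoff, with the update weight playing the role of attention to recent outcomes.

```python
# Sketch: softmax reinforcement learners in a repeated 2x4 game. The
# recency parameter controls attention to recent outcomes, the quantity
# the study relates to attack propensity. Payoffs are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, ROUNDS, TEMP = 4, 30, 1.0

def softmax(q, temp):
    e = np.exp((q - q.max()) / temp)
    return e / e.sum()

def play(recency_att, recency_def):
    q_att, q_def = np.zeros(N_ACTIONS), np.zeros(N_ACTIONS)
    for _ in range(ROUNDS):
        a = rng.choice(N_ACTIONS, p=softmax(q_att, TEMP))
        d = rng.choice(N_ACTIONS, p=softmax(q_def, TEMP))
        r = 1.0 if a != d else -1.0                # attack lands unless matched
        q_att[a] += recency_att * (r - q_att[a])   # recency-weighted update
        q_def[d] += recency_def * (-r - q_def[d])
    return q_att, q_def

q_att, _ = play(recency_att=0.8, recency_def=0.2)
print("attacker action values:", np.round(q_att, 2))
```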
Digital forensics refers to the application of scientific techniques in the investigation of a crime, specifically to identify or validate the involvement of a suspect in an activity leading to that crime. Network forensics particularly deals with the monitoring of network traffic, with the aim of tracing suspected activity in normal traffic or identifying an abnormal pattern in the traffic that may give a clue to an attack. Network forensics, a quite valuable element of the investigation process, presents certain challenges, including problems in accessing the network devices of a cloud architecture, handling large amounts of network traffic, and the rigorous processing required to analyse the huge volume of data, a large proportion of which may later prove irrelevant. Cloud computing technology offers services to its clients remotely from a shared pool of resources, as per each client's customized requirements, any time, from anywhere. Cloud computing has attained tremendous popularity recently, leading to its vast and rapid deployment; however, privacy and security concerns have increased in the same proportion, since data and applications are outsourced to a third party. Security concerns about cloud architecture have emerged as the prime barrier hindering a major shift of industry towards the cloud model, despite the significant advantages of cloud architecture. Cloud computing architecture presents aggravated and specific challenges for network forensics. In this paper, I review the challenges and issues faced in conducting network forensics, particularly in the cloud computing environment. The study covers the limitations that a network forensic expert may confront during an investigation in a cloud environment. I have categorized the challenges presented to network forensics in cloud computing into various groups. Challenges in each group can be handled appropriately by forensic experts, cloud service providers, or forensic tools, whereas the leftover challenges are declared to be beyond control.
Cyber crime investigation is the integration of two technologies: a theoretical methodology and practical tools. The first is the theoretical digital forensic methodology that encompasses the steps to investigate a cyber crime. The second is the practical development of digital forensic tools that sequentially and systematically analyze digital devices to extract the evidence needed to prove the crime. This paper explores the development of a digital forensic framework, combines the advantages of twenty-five past forensic models, and generates an algorithm to create a new digital forensic model. The proposed model provides the following advantages: a standardized method for investigation, a model theory that can be directly converted into a tool, a history lookup facility, cost and time minimization, and applicability to any type of digital crime investigation.
We present Cloud-COVER (Controls and Orderings for Vulnerabilities and ExposuRes), a cloud security threat modelling tool. Cloud-COVER takes input from a user about their deployment, requiring information about the data, instances, connections, their properties, and the importance of various security attributes. This input is used to analyse the relevant threats, and the way they propagate through the system. They are then presented to the user, ordered according to the security attributes they have prioritised, along with the best countermeasures to secure against the dangers listed.
Interface-confinement is a common mechanism that secures untrusted code by executing it inside a sandbox. The sandbox limits (confines) the code's interaction with key system resources to a restricted set of interfaces. This practice is seen in web browsers, hypervisors, and other security-critical systems. Motivated by these systems, we present a program logic, called System M, for modeling and proving safety properties of systems that execute adversary-supplied code via interface-confinement. In addition to using computation types to specify the effects of computations, System M includes a novel invariant type to specify the properties of interface-confined code. The interpretation of the invariant type includes terms whose effects satisfy an invariant. We construct a step-indexed model built over traces and prove the soundness of System M relative to the model. System M is the first program logic that allows proofs of safety for programs that execute adversary-supplied code without forcing the adversarial code to be available for deep static analysis. System M can be used to model and verify protocols as well as system designs. We demonstrate the reasoning principles of System M by verifying the state integrity property of the design of Memoir, a previously proposed trusted computing system.
Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies want to seamlessly reach the consumers behind the devices rather than the devices themselves. However, a consumer's identity becomes fragmented as they switch from one device to another. A naive attempt is to use deterministic features such as user name, telephone number, and email address; however, consumers might refrain from giving away their personal information for privacy and security reasons. The challenge in the ICDM 2015 contest was to develop an accurate probabilistic model for predicting cross-device consumer identity without using deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forests. Our final solution ranks 9th on both the public and private leaderboards with an F0.5 score of 0.855.
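A bare-bones sketch of such an ensemble on synthetic data (the winning system's features, tuning, and blending weights are not reproduced here): average the two models' probabilities and score with the contest's F0.5 metric.

```python
# Sketch: average the probabilities of a GBDT and a random forest,
# then evaluate with the F0.5 score. Data is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import fbeta_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

gbdt = GradientBoostingClassifier(random_state=0).fit(Xtr, ytr)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, ytr)

# simple 50/50 probability blend; contest systems tune these weights
proba = 0.5 * gbdt.predict_proba(Xte)[:, 1] + 0.5 * rf.predict_proba(Xte)[:, 1]
pred = (proba >= 0.5).astype(int)
print("F0.5:", fbeta_score(yte, pred, beta=0.5))
```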
Pervasive computing is one of the latest and most advanced paradigms currently available in the computing arena. Its ability to distribute computational services within the environments where people live, work, or socialize makes issues such as privacy, trust, and identity more challenging than in traditional computing environments. In this work we review these general issues and propose a pervasive computing architecture based on a simple but effective trust model that is better able to cope with them. The proposed architecture combines several artificial intelligence techniques to achieve a close resemblance to human-like decision making. Accordingly, the Apriori algorithm is first used to extract the behavioral patterns adopted by users during their network interactions. A Naïve Bayes classifier is then used for the final decision, expressed in terms of the probability of user trustworthiness. To validate our approach we applied it to some typical ubiquitous computing scenarios. The results demonstrate the usefulness of this approach and its competitiveness against other existing ones.
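A compact sketch of the two-stage idea (hypothetical session events and trust labels; Apriori provided by the mlxtend package): mine frequent interaction patterns, then use the patterns a session exhibits as binary features for a Naive Bayes trustworthiness classifier.

```python
# Sketch: Apriori pattern mining + Naive Bayes trust classification.
# Events, sessions, and labels are hypothetical.
import pandas as pd
from mlxtend.frequent_patterns import apriori
from sklearn.naive_bayes import BernoulliNB

events = ["auth_ok", "odd_hours", "bulk_download", "known_device"]
sessions = pd.DataFrame([          # one row per observed session
    [1, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 0],
    [0, 1, 1, 0], [1, 0, 1, 1], [0, 1, 1, 0],
], columns=events).astype(bool)

patterns = apriori(sessions, min_support=0.3, use_colnames=True)
itemsets = list(patterns["itemsets"])          # frequent behavior patterns

# binary feature: does the session exhibit each frequent pattern?
X = [[int(s.issubset(set(sessions.columns[row.values])))
      for s in itemsets] for _, row in sessions.iterrows()]
y = [1, 1, 0, 0, 1, 0]                         # 1 = trustworthy session

nb = BernoulliNB().fit(X, y)
print("P(trustworthy):", nb.predict_proba(X)[:, 1].round(2))
```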
The data processing capabilities of MapReduce systems, paired with the on-demand scalability of cloud computing, have enabled the Big Data revolution. However, data controllers/owners worry about the privacy and accountability impact of storing their data in cloud infrastructures, as existing cloud computing solutions provide very limited control over the underlying systems. The intuitive approach of encrypting data before uploading it to the cloud is not applicable to MapReduce computation, because data analytics tasks are defined ad hoc in the MapReduce environment using general programming languages (e.g., Java), and homomorphic encryption methods that can scale to big data do not exist. In this paper, we address the challenges of determining and detecting unauthorized access to data stored in MapReduce-based cloud environments. To this end, we introduce alarm-raising honeypots distributed over the data that are not accessed by authorized MapReduce jobs, but only by attackers and/or unauthorized users. Our analysis shows that unauthorized data accesses can be detected with reasonable performance in MapReduce-based cloud environments.
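A toy sketch of the honeytoken idea outside of an actual MapReduce runtime (the record IDs and the authorization flag are made up): decoy records are planted among the data, authorized jobs skip them by construction, and any job that touches one raises an alarm.

```python
# Sketch: honeytoken records planted among the data. Authorized jobs
# filter them out via a shared decoy list; any access that touches a
# honeytoken raises an alarm. Record IDs are hypothetical.
HONEYTOKENS = {"rec-000113", "rec-004242"}     # planted decoy record IDs

def audited_reader(records, job_is_authorized):
    """Wraps record iteration; alarms on honeytoken access."""
    for rec_id, payload in records:
        if rec_id in HONEYTOKENS:
            if job_is_authorized:
                continue                       # authorized jobs skip decoys
            print(f"ALERT: unauthorized access touched {rec_id}")
        yield rec_id, payload

data = [("rec-000112", "x"), ("rec-000113", "decoy"), ("rec-000114", "y")]
list(audited_reader(data, job_is_authorized=False))   # triggers the alert
```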
Information technology experts cite security and privacy concerns as the major challenges in the adoption of cloud computing. On Platform-as-a-Service (PaaS) clouds, customers are faced with the challenges of selecting service providers and evaluating security implementations based on their security needs and requirements. This study aims to enable cloud customers to quantify their security requirements in order to identify critical areas in PaaS cloud architectures where the security provisions offered by CSPs can be assessed. Using an adaptive security mapping matrix, the study takes a quantitative approach and presents numeric findings that show critical architectures within the PaaS environment where security can be evaluated and security controls assessed to meet these requirements. The matrix can be adapted across different types of PaaS cloud models based on the individual security requirements and service level objectives identified by PaaS cloud customers.
Cloud storage is no longer a fresh notion and has by now been accepted by an increasing number of people. It brings cloud users many conveniences, such as relief from local storage and location-independent access. Nevertheless, the correctness, completeness, and privacy of outsourced data are what worry cloud users. As a result, many people are unwilling to store data in the cloud, fearing that sensitive information will be disclosed. Only when people feel worry-free can they accept cloud storage more easily. Many experts have taken this problem into consideration and tried to solve it. In this paper, we survey the solutions to the problems concerning auditing in cloud computing and compare them. The methods and performance, as well as the pros and cons, of the state-of-the-art auditing protocols are discussed.
Privacy analysis is essential in society. Data privacy preservation for access control and guaranteed service in wireless sensor networks are important parts of it. In program verification, we consider not only these kinds of safety and liveness properties but also security policies such as noninterference and observational determinism, which have been proposed as hyperproperties. Fairness is widely applied in verification for concurrent systems, wireless sensor networks, and embedded systems. This paper studies verification and analysis for proving security-relevant properties and hyperproperties by proposing deductive proof rules under fairness requirements (constraints).