Biblio
Taint analysis has been used in numerous scripting languages such as Perl and Ruby to defend against various form of code injection attacks, such as cross-site scripting (XSS) and SQL-injection. However, most taint analysis systems simply fail when tainted information is used in a possibly unsafe manner. In this paper, we explore how precise taint tracking can be used in order to secure web content. Rather than simply crashing, we propose that a library-writer defined sanitization function can instead be used on the tainted portions of a string. With this approach, library writers or framework developers can design their tools to be resilient, even if inexperienced developers misuse these libraries in unsafe ways. In other words, developer mistakes do not have to result in system crashes to guarantee security. We implement both coarse-grained and precise taint tracking in JavaScript, and show how our precise taint tracking API can be used to defend against SQL injection and XSS attacks. We further evaluate the performance of this approach, showing that precise taint tracking involves an overhead of approximately 22%.
To assure cyber security of an enterprise, typically SIEM (Security Information and Event Management) system is in place to normalize security events from different preventive technologies and flag alerts. Analysts in the security operation center (SOC) investigate the alerts to decide if it is truly malicious or not. However, generally the number of alerts is overwhelming with majority of them being false positive and exceeding the SOC's capacity to handle all alerts. Because of this, potential malicious attacks and compromised hosts may be missed. Machine learning is a viable approach to reduce the false positive rate and improve the productivity of SOC analysts. In this paper, we develop a user-centric machine learning framework for the cyber security operation center in real enterprise environment. We discuss the typical data sources in SOC, their work flow, and how to leverage and process these data sets to build an effective machine learning system. The paper is targeted towards two groups of readers. The first group is data scientists or machine learning researchers who do not have cyber security domain knowledge but want to build machine learning systems for security operations center. The second group of audiences are those cyber security practitioners who have deep knowledge and expertise in cyber security, but do not have machine learning experiences and wish to build one by themselves. Throughout the paper, we use the system we built in the Symantec SOC production environment as an example to demonstrate the complete steps from data collection, label creation, feature engineering, machine learning algorithm selection, model performance evaluations, to risk score generation.
Creating and implementing fault-tolerant distributed algorithms is a challenging task in highly safety-critical industries. Using formal methods supports design and development of complex algorithms. However, formal methods are often perceived as an unjustifiable overhead. This paper presents the experience and insights when using TLA+ and PlusCal to model and develop fault-tolerant and safety-critical modules for TAS Control Platform, a platform for railway control applications up to safety integrity level (SIL) 4. We show how formal methods helped us improve the correctness of the algorithms, improved development efficiency and how part of the gap between model and implementation has been closed by translation to C code. Additionally, we describe how we gained trust in the formal model and tools by following a specific design process called property-driven design, which also implicitly addresses software quality metrics such as code coverage metrics.
While the growth of cloud-based technologies has benefited the society tremendously, it has also increased the surface area for cyber attacks. Given that cloud services are prevalent today, it is critical to devise systems that detect intrusions. One form of security breach in the cloud is when cyber-criminals compromise Virtual Machines (VMs) of unwitting users and, then, utilize user resources to run time-consuming, malicious, or illegal applications for their own benefit. This work proposes a method to detect unusual resource usage trends and alert the user and the administrator in real time. We experiment with three categories of methods: simple statistical techniques, unsupervised classification, and regression. So far, our approach successfully detects anomalous resource usage when experimenting with typical trends synthesized from published real-world web server logs and cluster traces. We observe the best results with unsupervised classification, which gives an average F1-score of 0.83 for web server logs and 0.95 for the cluster traces.
Verifying that hardware design implementations adhere to specifications is a time intensive and sometimes intractable problem due to the massive size of the system's state space. Formal methods techniques can be used to prove certain tractable specification properties; however, they are expensive, and often require subject matter experts to develop and solve. Nonetheless, hardware verification is a critical process to ensure security and safety properties are met, and encapsulates problems associated with trust and reliability. For complex designs where coverage of the entire state space is unattainable, prioritizing regions most vulnerable to security or reliability threats would allow efficient allocation of valuable verification resources. Stackelberg security games model interactions between a defender, whose goal is to assign resources to protect a set of targets, and an attacker, who aims to inflict maximum damage on the targets after first observing the defender's strategy. In equilibrium, the defender has an optimal security deployment strategy, given the attacker's best response. We apply this Stackelberg security framework to synthesized hardware implementations using the design's network structure and logic to inform defender valuations and verification costs. The defender's strategy in equilibrium is thus interpreted as a prioritization of the allocation of verification resources in the presence of an adversary. We demonstrate this technique on several open-source synthesized hardware designs.
Friendly jamming is a physical layer security technique that utilizes extra available nodes to jam any eavesdroppers. This paper considers the use of additional available nodes as friendly jammers in order to improve the security performance of a route through a wireless area network. One of the unresolved technical challenges is the combining of security metrics with typical service quality metrics. In this context, this paper considers the problem of routing through a D2D network while jointly minimizing the secrecy outage probability (SOP) and connection outage probability (COP), using friendly jamming to improve the SOP of each link. The jamming powers are determined to place nulls at friendly receivers while maximizing the power to eavesdroppers. Then the route metrics are derived, and the problem is framed as a convex optimization problem. We also consider that not all network users equally value SOP and COP, and so introduce an auxiliary variable to tune the optimization between the two metrics.
Distributed Denial of Service (DDoS) attacks are a popular and inexpensive form of cyber attacks. Application layer DDoS attacks utilize legitimate application layer requests to overwhelm a web server. These attacks are a major threat to Internet applications and web services. The main goal of these attacks is to make the services unavailable to legitimate users by overwhelming the resources on a web server. They look valid in connection and protocol characteristics, which makes them difficult to detect. In this paper, we propose a detection method for the application layer DDoS attacks, which is based on user behavior anomaly detection. We extract instances of user behaviors requesting resources from HTTP web server logs. We apply the Principle Component Analysis (PCA) subspace anomaly detection method for the detection of anomalous behavior instances. Web server logs from a web server hosting a student resource portal were collected as experimental data. We also generated nine different HTTP DDoS attacks through penetration testing. Our performance results on the collected data show that using PCAsubspace anomaly detection on user behavior data can detect application layer DDoS attacks, even if they are trying to mimic a normal user's behavior at some level.
Climate change has affected the cultivation in all countries with extreme drought, flooding, higher temperature, and changes in the season thus leaving behind the uncontrolled production. Consequently, the smart farm has become part of the crucial trend that is needed for application in certain farm areas. The aims of smart farm are to control and to enhance food production and productivity, and to increase farmers' profits. The advantages in applying smart farm will improve the quality of production, supporting the farm workers, and better utilization of resources. This study aims to explore the research trends and identify research clusters on smart farm using bibliometric analysis that has supported farming to improve the quality of farm production. The bibliometric analysis is the method to explore the relationship of the articles from a co-citation network of the articles and then science mapping is used to identify clusters in the relationship. This study examines the selected research articles in the smart farm field. The area of research in smart farm is categorized into two clusters that are soil carbon emission from farming activity, food security and farm management by using a VOSviewer tool with keywords related to research articles on smart farm, agriculture, supply chain, knowledge management, traceability, and product lifecycle management from Web of Science (WOS) and Scopus online database. The major cluster of smart farm research is the soil carbon emission from farming activity which impacts on climate change that affects food production and productivity. The contribution is to identify the trends on smart farm to develop research in the future by means of bibliometric analysis.
The spread of Internet of Things (IoT) botnets like those utilizing the Mirai malware were successful enough to power some of the most powerful DDoS attacks that have been seen thus far on the Internet. Two such attacks occurred on October 21, 2016 and September 20, 2016. Since there are an estimated three billion IoT devices currently connected to the Internet, these attacks highlight the need to understand the spread of IoT worms like Mirai and the vulnerability that they create for the Internet. In this work, we describe the spread of IoT worms using a proposed model known as the IoT Botnet with Attack Information (IoT-BAI), which utilizes a variation of the Susceptible-Exposed-Infected-Recovered-Susceptible (SEIRS) epidemic model [14]. The IoT-BAI model has shown that it may be possible to mitigate the frequency of IoT botnet attacks with improved user information which may positively affect user behavior. Additionally, the IoT-BAI model has shown that increased vulnerability to attack can be caused by new hosts entering the IoT population on a daily basis. Models like IoT-BAI could be used to predict user behavior after significant events in the network like a significant botnet attack.
This paper presents a framework for privacy-preserving video delivery system to fulfill users' privacy demands. The proposed framework leverages the inference channels in sensitive behavior prediction and object tracking in a video surveillance system for the sequence privacy protection. For such a goal, we need to capture different pieces of evidence which are used to infer the identity. The temporal, spatial and context features are extracted from the surveillance video as the observations to perceive the privacy demands and their correlations. Taking advantage of quantifying various evidence and utility, we let users subscribe videos with a viewer-dependent pattern. We implement a prototype system for off-line and on-line requirements in two typical monitoring scenarios to construct extensive experiments. The evaluation results show that our system can efficiently satisfy users' privacy demands while saving over 25% more video information compared to traditional video privacy protection schemes.
Address clustering tries to construct the one-to-many mapping from entities to addresses in the Bitcoin system. Simple heuristics based on the micro-structure of transactions have proved very effective in practice. In this paper we describe the primary reasons behind this effectiveness: address reuse, avoidable merging, super-clusters with high centrality,, the incremental growth of address clusters. We quantify their impact during Bitcoin's first seven years of existence.
Currently, different forms of ransomware are increasingly threatening Internet users. Modern ransomware encrypts important user data, and it is only possible to recover it once a ransom has been paid. In this article we show how software-defined networking can be utilized to improve ransomware mitigation. In more detail, we analyze the behavior of popular ransomware - CryptoWall - and, based on this knowledge, propose two real-time mitigation methods. Then we describe the design of an SDN-based system, implemented using OpenFlow, that facilitates a timely reaction to this threat, and is a crucial factor in the case of crypto ransomware. What is important is that such a design does not significantly affect overall network performance. Experimental results confirm that the proposed approach is feasible and efficient.
Abstract—Lateral movement-based attacks are increasingly leading to compromises in large private and government networks, often resulting in information exfiltration or service disruption. Such attacks are often slow and stealthy and usually evade existing security products. To enable effective detection of such attacks, we present a new approach based on graph-based modeling of the security state of the target system and correlation of diverse indicators of anomalous host behavior. We believe that irrespective of the specific attack vectors used, attackers typically establish a command and control channel to operate, and move in the target system to escalate their privileges and reach sensitive areas. Accordingly, we identify important features of command and control and lateral movement activities and extract them from internal and external communication traffic. Driven by the analysis of the features, we propose the use of multiple anomaly detection techniques to identify compromised hosts. These methods include Principal Component Analysis, k-means clustering, and Median Absolute Deviation-based utlier detection. We evaluate the accuracy of identifying compromised hosts by using injected attack traffic in a real enterprise network dataset, for various attack communication models. Our results show that the proposed approach can detect infected hosts with high accuracy and a low false positive rate.
Voice-controlled intelligent personal assistants, such as Cortana, Google Now, Siri and Alexa, are increasingly becoming a part of users' daily lives, especially on mobile devices. They introduce a significant change in information access, not only by introducing voice control and touch gestures but also by enabling dialogues where the context is preserved. This raises the need for evaluation of their effectiveness in assisting users with their tasks. However, in order to understand which type of user interactions reflect different degrees of user satisfaction we need explicit judgements. In this paper, we describe a user study that was designed to measure user satisfaction over a range of typical scenarios of use: controlling a device, web search, and structured search dialogue. Using this data, we study how user satisfaction varied with different usage scenarios and what signals can be used for modeling satisfaction in the different scenarios. We find that the notion of satisfaction varies across different scenarios, and show that, in some scenarios (e.g. making a phone call), task completion is very important while for others (e.g. planning a night out), the amount of effort spent is key. We also study how the nature and complexity of the task at hand affects user satisfaction, and find that preserving the conversation context is essential and that overall task-level satisfaction cannot be reduced to query-level satisfaction alone. Finally, we shed light on the relative effectiveness and usefulness of voice-controlled intelligent agents, explaining their increasing popularity and uptake relative to the traditional query-response interaction.
This demo dramatically illustrates how replacing 'Classic' TCP congestion control (Reno, Cubic, etc.) with a 'Scalable' alternative like Data Centre TCP (DCTCP) keeps queuing delay ultra-low; not just for a select few light applications like voice or gaming, but even when a variety of interactive applications all heavily load the same (emulated) Internet access. DCTCP has so far been confined to data centres because it is too aggressive–-it starves Classic TCP flows. To allow DCTCP to be exploited on the public Internet, we developed DualQ Coupled Active Queue Management (AQM), which allows the two TCP types to safely co-exist. Visitors can test all these claims. As well as running Web-based apps, they can pan and zoom a panoramic video of a football stadium on a touch-screen, and experience how their personalized HD scene seems to stick to their finger, even though it is encoded on the fly on servers accessed via an emulated delay, representing 'the cloud'. A pair of VR goggles can be used at the same time, making a similar point. The demo provides a dashboard so that visitors can not only experience the interactivity of each application live, but they can also quantify it via a wide range of performance stats, updated live. It also includes controls so visitors can configure different TCP variants, AQMs, network parameters and background loads and immediately test the effect.
We present a task-based domain-decomposition preconditioner for partial differential equations (PDEs) resilient to silent data corruption (SDC) and hard faults. The algorithm exploits a reformulation of the PDE as a sampling problem, followed by a regression-based solution update that is resilient to SDC. We adopt a server-client model implemented using the User Level Fault Mitigation MPI (MPI-ULFM). All state information is held by the servers, while clients only serve as computational units. The task-based nature of the algorithm and the capabilities of ULFM are complemented at the algorithm level to support missing tasks, making the application resilient to hard faults affecting the clients. Weak and strong scaling tests up to \textasciitilde115k cores show an excellent performance of the application with efficiencies above 90%, demonstrating the suitability to run at large scale. We demonstrate the resilience of the application for a 2D elliptic PDE by injecting SDC using a random single bit-flip model, and hard faults in the form of clients crashing. We show that in all cases, the application converges to the right solution. We analyze the overhead caused by the faults, and show that, for the test problem considered, the overhead incurred due to SDC is minimal compared to that from the hard faults.
Interactive systems are developed according to requirements, which may be, for instance, documentation, prototypes, diagrams, etc. The informal nature of system requirements may be a source of problems: it may be the case that a system does not implement the requirements as expected, thus, a way to validate whether an implementation follows the requirements is needed. We propose a novel approach to validating a system using formal models of the system. In this approach, a set of traces generated from the execution of the real interactive system is searched over the state space of the formal model. The scalability of the approach is demonstrated by an application to an industrial system in the nuclear plant domain. The combination of trace analysis and formal methods provides feedback that can bring improvements to both the real interactive system and the formal model.
Spectrum sensing (signal detection) under low signal to noise ratio is a fundamental problem in cognitive radio networks. In this paper, we have analyzed maximum eigenvalue detection (MED) and energy detection (ED) techniques known as semi-blind spectrum sensing techniques. Simulations are performed by using independent and identically distributed (iid) signals to verify the results. Maximum eigenvalue detection algorithm exploits correlation in received signal samples and hence, performs same as energy detection algorithm under high signal to noise ratio. Energy detection performs well under low signal to noise ratio for iid signals and its performance reaches maximum eigenvalue detection under high signal to noise ratio. Both algorithms don't need any prior knowledge of primary user signal for detection and hence can be used in various applications.
Online services are increasingly dependent on user participation. Whether it's online social networks or crowdsourcing services, understanding user behavior is important yet challenging. In this paper, we build an unsupervised system to capture dominating user behaviors from clickstream data (traces of users' click events), and visualize the detected behaviors in an intuitive manner. Our system identifies "clusters" of similar users by partitioning a similarity graph (nodes are users; edges are weighted by clickstream similarity). The partitioning process leverages iterative feature pruning to capture the natural hierarchy within user clusters and produce intuitive features for visualizing and understanding captured user behaviors. For evaluation, we present case studies on two large-scale clickstream traces (142 million events) from real social networks. Our system effectively identifies previously unknown behaviors, e.g., dormant users, hostile chatters. Also, our user study shows people can easily interpret identified behaviors using our visualization tool.
In a wireless system, a signal map shows the signal strength at different locations termed reference points (RPs). As access points (APs) and their transmission power may change over time, keeping an updated signal map is important for applications such as Wi-Fi optimization and indoor localization. Traditionally, the signal map is obtained by a full site survey, which is time-consuming and costly. We address in this paper how to efficiently update a signal map given sparse samples randomly crowdsourced in the space (e.g., by signal monitors, explicit human input, or implicit user participation). We propose Compressive Signal Reconstruction (CSR), a novel learning system employing Bayesian compressive sensing (BCS) for online signal map update. CSR does not rely on any path loss model or line of sight, and is generic enough to serve as a plug-in of any wireless system. Besides signal map update, CSR also computes the estimation error of signals in terms of confidence interval. CSR models the signal correlation with a kernel function. Using it, CSR constructs a sensing matrix based on the newly sampled signals. The sensing matrix is then used to compute the signal change at all the RPs with any BCS algorithm. We have conducted extensive experiments on CSR in our university campus. Our results show that CSR outperforms other state-of-the-art algorithms by a wide margin (reducing signal error by about 30% and sampling points by 20%).