Biblio
In this talk, I will discuss my lab's work in the emerging field of adversarial stylometry and machine learning. Machine learning algorithms are increasingly being used in security and privacy domains, in areas that go beyond intrusion or spam detection. For example, in digital forensics, questions often arise about the authors of documents: their identity, demographic background, and whether they can be linked to other documents. The field of stylometry uses linguistic features and machine learning techniques to answer these questions. We have applied stylometry to difficult domains such as underground hacker forums, open source projects (code), and tweets. I will discuss our Doppelgänger Finder algorithm, which enables us to group Sybil accounts on underground forums and detect blogs from Twitter feeds and Reddit comments. In addition, I will discuss our work attributing unknown source code and binaries.
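As a hedged illustration of the stylometric pipeline described above (character n-gram features with a linear classifier are common choices in the field, not the Doppelgänger Finder algorithm itself), a minimal attribution sketch:

    # Minimal stylometric attribution sketch; illustrative, not the
    # Doppelgänger Finder algorithm. Assumes labeled samples per author.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    docs = ["first writing sample by author A", "a writing sample by author B"]
    authors = ["A", "B"]

    # Character n-grams capture punctuation and spelling habits, a
    # standard stylometric feature family.
    model = make_pipeline(
        TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(docs, authors)
    print(model.predict(["an unknown document to attribute"]))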
Access control offers mechanisms to control and limit the actions or operations that a user can perform on a set of resources in a system. Many access control models exist that support this basic requirement. One of the properties examined in the context of these models is their ability to successfully restrict access to resources. Nevertheless, considering only the restriction of access may not be enough in some environments, such as critical infrastructures. The protection of systems in this type of environment requires a new line of enquiry. It is essential to ensure that appropriate access is always possible, even when users and resources are subjected to challenges of various sorts. Resilience in access control is conceived as the ability of a system not merely to restrict but to ensure access to resources. To demonstrate the application of resilience in access control, we formally define an attribute-based access control (ABAC) model based on guidelines provided by the National Institute of Standards and Technology (NIST). We examine how ABAC-based resilience policies can be specified in temporal logic and how these can be formally verified. The verification of resilience uses an automated model checking technique, which may reduce the overall complexity of verifying resilience policies and serve as a valuable tool for administrators.
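As a hedged illustration (not the paper's exact formalization), a resilience requirement of this kind can be phrased in the temporal logic CTL, read as "on every path, globally, whenever user u is authorized for resource r, some continuation eventually grants access":

    \mathbf{AG}\,\big(\mathit{authorized}(u,r)\ \rightarrow\ \mathbf{EF}\,\mathit{access}(u,r)\big)

Formulas of this shape can be checked automatically against a finite-state model of the ABAC system by standard CTL model checkers.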
As more enterprises deploy cloud file-sharing services, a new channel opens for potential insider threats to company data and intellectual property. In this paper, we introduce a two-stage machine learning system to detect anomalies. In the first stage, we project the access logs of cloud file-sharing services onto relationship graphs and use three complementary graph-based unsupervised learning methods, OddBall, PageRank, and Local Outlier Factor (LOF), to generate outlier indicators. In the second stage, we ensemble the outlier indicators, apply the discrete wavelet transform (DWT) with the Haar wavelet function, and propose a procedure that uses the wavelet coefficients to identify outliers indicative of insider threat. The proposed system has been deployed in a real business environment and has demonstrated its effectiveness in selected case studies.
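A compact sketch of such a two-stage pipeline (toy graph, OddBall omitted, thresholds invented; the paper's feature engineering is not reproduced):

    # Two-stage sketch: graph-based outlier indicators, then a Haar DWT
    # over the ensembled scores. Toy data; thresholds are illustrative.
    import networkx as nx
    import numpy as np
    import pywt
    from sklearn.neighbors import LocalOutlierFactor

    # Stage 1: project access logs onto a user-to-file graph.
    G = nx.DiGraph()
    G.add_edges_from([("u1", "f1"), ("u1", "f2"), ("u2", "f1"), ("u3", "f9")])
    pr = nx.pagerank(G)                          # PageRank indicator

    X = np.array([[len(list(G.successors(u))), pr[u]] for u in ("u1", "u2", "u3")])
    lof = LocalOutlierFactor(n_neighbors=2).fit(X)
    lof_scores = -lof.negative_outlier_factor_   # LOF indicator

    # Stage 2: ensemble the indicators per time step, then a Haar DWT;
    # large detail coefficients flag abrupt changes worth investigating.
    series = lof_scores                          # stand-in for a per-day score series
    approx, detail = pywt.dwt(series, "haar")
    print(np.where(np.abs(detail) > np.abs(detail).mean())[0])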
Logic locking is an intellectual property (IP) protection technique that prevents IP piracy, reverse engineering, and overbuilding attacks by untrusted foundries or end-users. Existing logic locking techniques are all based on locking the functionality: the design/chip is nonfunctional unless the secret key has been loaded. Existing techniques are vulnerable to various attacks, such as sensitization, key-pruning, and signal-skew-analysis-enabled removal attacks. In this paper, we propose a tenacious and traceless logic locking technique, TTLock, that locks functionality and provably withstands all known attacks, such as SAT-based, sensitization, and removal attacks. TTLock protects a secret input pattern; the output of a logic cone is flipped for that pattern, and this flip is restored only when the correct key is applied. Experimental results confirm our theoretical expectation that the computational complexity of attacks launched on TTLock grows exponentially with increasing key size, while the area, power, and delay overheads increase only linearly. In this paper, we also coin "parametric locking," where the design/chip behaves as per its specifications (performance, power, reliability, etc.) only with the secret key in place, and an incorrect key degrades its parametric characteristics. We discuss objectives and challenges in parametric locking.
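The flip-and-restore idea at the heart of TTLock can be sketched in a few lines over a toy Boolean cone (this abstracts away the gate-level construction and key-storage details):

    # TTLock flip-and-restore sketch over a toy one-output cone f.
    # The cone is modified to flip its output for one protected pattern;
    # a restore unit flips again only when the input equals the key, so
    # only key == protected_pattern recovers f on every input.
    def locked(f, x, key, protected_pattern):
        y = f(x) ^ (x == protected_pattern)  # modified cone
        y ^= (x == key)                      # restore unit
        return y

    f = lambda x: x % 2                      # toy logic cone
    K = 0b1011                               # secret key = protected pattern
    assert all(locked(f, x, K, K) == f(x) for x in range(16))
    assert any(locked(f, x, 0b0000, K) != f(x) for x in range(16))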
Recently, cellular operators have started migrating to IPv6 in response to the increasing demand for IP addresses. With the introduction of IPv6, cellular middleboxes, such as firewalls for preventing malicious traffic from the Internet and stateful NAT64 boxes for providing backward compatibility with legacy IPv4 services, have become crucial to maintaining the stability of cellular networks. This paper presents security problems in the currently deployed IPv6 middleboxes of five major operators. To this end, we first investigate several key features of the current IPv6 deployment that can harm the safety of a cellular network as well as its customers. These features, combined with the currently deployed IPv6 middleboxes, allow an adversary to launch six different attacks. First, firewalls in IPv6 cellular networks fail to block incoming packets properly. Thus, an adversary could fingerprint cellular devices by scanning and, further, launch denial-of-service or over-billing attacks. Second, vulnerabilities in the stateful NAT64 box, a middlebox that maps an IPv6 address to an IPv4 address (and vice versa), allow an adversary to launch three further attacks: 1) a NAT overflow attack that exhausts the NAT resources, 2) a NAT wiping attack that removes active NAT mappings by exploiting the firewalls' lack of TCP sequence number verification, and 3) a NAT bricking attack that targets services adopting IP-based blacklisting by preventing the shared external IPv4 address from accessing the service. We confirmed the feasibility of these attacks with an empirical analysis. We also propose effective countermeasures for each attack.
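As a hedged, lab-only illustration of the NAT overflow idea (not the paper's measurement code), each established TCP connection through a stateful NAT64 should occupy one mapping, so holding many flows open exhausts the table; running this against infrastructure you do not own would be illegal:

    # Lab-only NAT-mapping exhaustion sketch: each established TCP flow
    # should occupy one NAT64 mapping, so keep the sockets open. The
    # target is a documentation address; replace with a server you own.
    import socket

    TARGET = ("192.0.2.10", 8080)
    flows = []
    for _ in range(1000):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(2)
        try:
            s.connect(TARGET)
            flows.append(s)      # hold the socket so its mapping stays active
        except OSError:
            break
    print(f"open flows: {len(flows)}")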
Today, the proportion of software in society as a whole is steadily increasing. As software grows in size, the amount of personal information it handles is growing as well. This underscores the importance of verifying software for security weaknesses. However, verifying software security is very difficult when libraries are distributed without source code. To solve this problem, it is necessary to develop a technique for checking binaries for known security weaknesses. To this end, techniques for analyzing security weaknesses using intermediate languages are being actively discussed. In this paper, we propose a system that translates binary code into an intermediate language to effectively analyze security weaknesses in binary code.
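A minimal sketch of the lift-then-scan idea using the Capstone disassembler and a toy three-address form (the paper's actual intermediate language and rule set are not shown):

    # Lift x86-64 bytes to a toy intermediate form with Capstone, then
    # scan for a weakness pattern. Byte string and rule are illustrative.
    from capstone import Cs, CS_ARCH_X86, CS_MODE_64

    code = b"\x48\x89\xd8\x48\xff\xc0\xc3"   # mov rax,rbx; inc rax; ret
    md = Cs(CS_ARCH_X86, CS_MODE_64)

    ir = [(i.address, i.mnemonic, i.op_str) for i in md.disasm(code, 0x1000)]
    for addr, mnem, ops in ir:
        print(hex(addr), mnem, ops)

    # Toy rule: flag call sites to known-dangerous routines by name.
    dangerous = {"strcpy", "gets", "sprintf"}
    hits = [addr for addr, mnem, ops in ir
            if mnem == "call" and any(d in ops for d in dangerous)]
    print("flagged call sites:", hits)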
In recent years, networking scenarios have been evolving hand-in-hand with new and varied applications with heterogeneous Quality of Service (QoS) requirements, which must be delivered efficiently and effectively. Given its static layered structure and almost complete lack of built-in QoS support, the current TCP/IP-based Internet hinders such an evolution. In contrast, the clean-slate Recursive InterNetwork Architecture (RINA) proposes a new recursive and programmable networking model capable of evolving with the network requirements, thereby solving most, if not all, limitations of the TCP/IP protocol stack. Network providers can better deliver communication services across their networks by taking advantage of the RINA architecture and its support for QoS. This support provides complete information about the QoS needs of the supported traffic flows, making it possible to fulfil those needs. In this work, we focus on the importance of path selection for ensuring QoS guarantees in long-haul RINA networks. We propose and evaluate a programmable strategy for path selection based on flow QoS parameters, such as the maximum allowed latency and packet loss, comparing its performance against simple shortest-path, fastest-path, and connection-oriented solutions.
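A hedged sketch of flow-aware path selection of this kind: prune links that violate the flow's loss bound, then take the latency-shortest feasible path (topology and bounds are invented for illustration):

    # QoS-aware path selection sketch: drop links violating the loss
    # bound, then minimize end-to-end latency. Toy topology and bounds.
    import networkx as nx

    G = nx.Graph()
    G.add_edge("A", "B", latency=5, loss=0.001)
    G.add_edge("B", "C", latency=7, loss=0.002)
    G.add_edge("A", "C", latency=20, loss=0.03)

    def qos_path(g, src, dst, max_latency, max_loss):
        feasible = nx.Graph()
        feasible.add_edges_from((u, v, d) for u, v, d in g.edges(data=True)
                                if d["loss"] <= max_loss)
        path = nx.shortest_path(feasible, src, dst, weight="latency")
        total = nx.path_weight(feasible, path, weight="latency")
        return path if total <= max_latency else None

    print(qos_path(G, "A", "C", max_latency=15, max_loss=0.01))  # ['A', 'B', 'C']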
Cloud computing is a developing IT paradigm that faces several issues slowing its evolution and adoption by users across the world, chief among them the lack of security. Organizations and entities need to ensure, inter alia, the integrity and confidentiality of their outsourced sensitive data held on a cloud provider's servers. Solutions have been examined to strengthen security models (strong authentication, encryption and fragmentation before storing, access control policies, and so on). In particular, data remanence is undoubtedly a major threat: how can we be sure that data, once deletion is requested, are truly and appropriately deleted from remote servers? In this paper, we survey this subject and address the problem of residual data in a cloud-computing environment, which is characterized by the use of virtual machines instantiated on remote servers owned by a third party.
The Internet of Things (IoT) is transforming the way we live and work by increasing the connectedness of people and things on a scale that was once unimaginable. However, vulnerabilities in the IoT supply chain have raised serious concerns about the security and trustworthiness of IoT devices and the components within them. Testing for device provenance, detection of counterfeit integrated circuits (ICs) and systems, and traceability of IoT devices are challenging issues to address. In this article, we develop CDTA, a novel radio-frequency identification (RFID)-based system suitable for counterfeit detection, traceability, and authentication in the IoT supply chain. CDTA is composed of different types of on-chip sensors and in-system structures that collect the information necessary to detect multiple counterfeit IC types (recycled, cloned, etc.), track and trace IoT devices, and verify overall system authenticity. Central to CDTA is an RFID tag employed as storage and as a channel to read information from different chips on the printed circuit board (PCB) in both power-on and power-off scenarios. CDTA sensor data can also be sent to a remote server for authentication via an encrypted Ethernet channel when the IoT device is deployed in the field. A novel board ID generator is implemented by combining the outputs of physical unclonable functions (PUFs) embedded in the RFID tag and in different chips on the PCB. A lightweight RFID protocol is proposed to enable mutual authentication between RFID readers and tags. We also implement secure interchip communication on the PCB. Simulations and experimental results using Spartan-3E FPGAs demonstrate the effectiveness of this system. The efficiency of the radio-frequency (RF) communication has also been verified via a PCB prototype with a printed slot antenna.
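The board-ID idea can be sketched by hashing the concatenated PUF responses from each chip into a single identifier (the combination function, chip names, and the omitted error-correction step are assumptions, not the paper's construction):

    # Board ID sketch: combine per-chip PUF responses into one identifier.
    # Real PUF responses are noisy and need error correction (e.g., fuzzy
    # extractors); that step is omitted here.
    import hashlib

    puf_responses = {            # hypothetical chip -> PUF response bytes
        "rfid_tag": b"\x1a\x2b\x3c",
        "mcu":      b"\x4d\x5e\x6f",
        "fpga":     b"\x70\x81\x92",
    }
    board_id = hashlib.sha256(
        b"".join(puf_responses[c] for c in sorted(puf_responses))
    ).hexdigest()
    print(board_id)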
While the clean-slate approach proposed by Software Defined Networking (SDN) promises radical changes to the stagnant state of network management, SDN innovation has not gone beyond the intra-domain level. For the inter-domain ecosystem to benefit from the advantages of SDN, Internet Exchange Points (IXPs) are the ideal place: a central interconnection hub through which a large share of the Internet can be affected. In this demo, we showcase the ENDEAVOUR platform: a new software-defined exchange approach readily deployable in commercial IXPs. We demonstrate our implementations of traffic engineering and Distributed Denial of Service mitigation, as well as how member networks benefit from the advanced SDN features of ENDEAVOUR, which are typically not available in legacy networks.
In recent years, the emerging Internet-of-Things (IoT) has led to rising concerns about the security of networked embedded devices. In this work, we propose the SIPHON architecture: a Scalable high-Interaction Honeypot platform for IoT devices. Our architecture leverages IoT devices that are physically at one location and are connected to the Internet through so-called "wormholes" distributed around the world. The resulting architecture allows exposing a few physical devices over a large number of geographically distributed IP addresses. We demonstrate the proposed architecture in a large-scale experiment with 39 wormhole instances in 16 cities in 9 countries. Based on this setup, five physical IP cameras, one NVR, and one IP printer are presented as 85 real IoT devices on the Internet, attracting a daily traffic of 700 MB over a period of two months. A preliminary analysis of the collected traffic indicates that devices in some cities attracted significantly more traffic than others (ranging from 600,000 incoming TCP connections for the most popular destination to fewer than 50,000 for the least popular). We recorded over 400 brute-force login attempts to the web interface of our devices using a total of 1,826 distinct credentials, of which 11 attempts were successful. Moreover, we noted login attempts to Telnet and SSH ports, some of which used credentials found in the recently disclosed Mirai malware.
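In essence, a wormhole reduces to a TCP relay: a lightweight forwarder listening on a remote IP address pipes connections back to the physical device. A minimal sketch, with placeholder addresses and none of SIPHON's actual forwarding logic:

    # Minimal TCP "wormhole" relay sketch: accept on a public address
    # and pipe bytes to/from one physical IoT device. Addresses are
    # placeholders; SIPHON's real forwarding logic is richer than this.
    import socket
    import threading

    DEVICE = ("10.0.0.5", 80)   # hypothetical physical camera behind the wormhole

    def pipe(src, dst):
        while (data := src.recv(4096)):
            dst.sendall(data)

    server = socket.create_server(("0.0.0.0", 8080))
    while True:
        client, _ = server.accept()
        device = socket.create_connection(DEVICE)
        threading.Thread(target=pipe, args=(client, device), daemon=True).start()
        threading.Thread(target=pipe, args=(device, client), daemon=True).start()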
In distributed systems, there is often a need to combine heterogeneous access control policies to offer more comprehensive services to users at the local or national level. A large-scale healthcare system is usually distributed across a computer network and might require sophisticated access control policies to protect it. Integrating electronic healthcare systems is therefore important for providing comprehensive care to patients while preserving patients' privacy and data security. However, a major impediment in healthcare systems is access control policy implementations that are neither well defined nor flexible, hindering progress towards secure integrated systems. In this paper, we introduce an access control policy combination framework for EHR systems that preserves patients' privacy and ensures data security. We achieve our goal through an access control mechanism that handles multiple access control policies through a similarity analysis phase. In that phase, we evaluate different XACML policies to decide whether or not a policy combination is applicable. We provide a case study to show the applicability of our proposed approach based on XACML. Our results can be applied to electronic health record (EHR) access control policy, fostering interoperability and scalability among healthcare providers while preserving patients' privacy and data security.
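A hedged sketch of the similarity-analysis step: score two (heavily simplified) policies by the Jaccard overlap of their attribute sets and combine only above a threshold (the metric and cut-off are illustrative, not the paper's exact procedure):

    # Policy similarity sketch: Jaccard overlap of attribute sets
    # extracted from two simplified policies; combine only if similar.
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 1.0

    p1 = {("role", "physician"), ("resource", "ehr"), ("action", "read")}
    p2 = {("role", "physician"), ("resource", "ehr"), ("action", "write")}

    score = jaccard(p1, p2)
    COMBINE_THRESHOLD = 0.5          # illustrative cut-off
    if score >= COMBINE_THRESHOLD:
        combined = p1 | p2           # naive union; real XACML combining
        print("combinable:", score)  # algorithms (e.g. permit-overrides)
                                     # would govern the rule merging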
A survey of related work on insider information security (IS) threats is presented. Special attention is paid to works that consider insiders' behavioral models, as these are highly relevant to behavioral intrusion detection. Three key research directions are defined: 1) analysis of the problem in general, including the development of taxonomies for insiders, attacks, and countermeasures; 2) study of a specific IS threat with the development of a forecasting model; and 3) early detection of a potential insider. The models for the second and third directions are analyzed in detail. Within the second direction, works on three IS threats are examined, namely insider espionage, cyber sabotage, and unintentional internal IS violations. A discussion and several directions for future research conclude the paper.
Policies govern choices in the behavior of systems. They are applied to human behavior as well as to the behavior of autonomous systems, but are defined differently in each case. Generally, humans have the ability to interpret the intent behind policies and to bring about their desired effects, even occasionally violating them when the need arises. In contrast, policies for automated systems fully define the prescribed behavior without ambiguity, conflicts, or omissions. The increasing use of AI techniques and machine learning in autonomous systems such as drones promises to blur these boundaries and allows us to conceive, in a similar way, more flexible policies for the spectrum of human-autonomous system collaborations. In coalition environments, this spectrum extends across the boundaries of authority in pursuit of a common coalition goal and covers collaborations between humans and autonomous systems alike. In the social sciences, social exchange theory has been applied successfully to explain human behavior in a variety of contexts. It provides a framework linking expected rewards, costs, satisfaction, and commitment to explain and anticipate the choices that individuals make when confronted with various options. We discuss here how it can be used within coalition environments to explain joint decision making and to help formulate policies, re-framing the concepts where appropriate. Social exchange theory is particularly attractive within this context as it provides a theory with "measurable" components that can be readily integrated into machine reasoning processes.
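In its textbook form (a general summary of the theory, not the authors' formalization), social exchange theory scores an interaction outcome as rewards minus costs and judges it against comparison levels:

    O = R - C, \qquad \text{satisfaction} \sim O - CL, \qquad \text{dependence} \sim O - CL_{alt}

where R and C are the expected rewards and costs, CL is the comparison level built from past experience, and CL_{alt} is the outcome of the best available alternative; it is these measurable components that lend themselves to machine reasoning.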
While most organizations continue to invest in traditional network defences, a formidable security challenge has been brewing within their own boundaries. Malicious insiders with privileged access, in the guise of a trusted source, have carried out many attacks causing far-reaching damage to financial stability, national security, and brand reputation in both public and private sector organizations. The growing exposure and impact of the whistleblower community, and concerns about job security under changing organizational dynamics, have further aggravated this situation. The unpredictability of malicious attackers, as well as the complexity of malicious actions, necessitates the careful analysis of network, system, and user parameters correlated with the insider threat problem. This creates a high-dimensional, heterogeneous data analysis problem for isolating suspicious users. This research work proposes an insider threat detection framework that utilizes attributed graph clustering techniques and an outlier ranking mechanism for enterprise users. Empirical results confirm the effectiveness of the method, achieving a best area-under-curve value of 0.7648 for the receiver operating characteristic curve.
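A compact sketch of the cluster-then-rank idea (toy per-user features and k-means stand in for the paper's attributed graph clustering and ranking mechanism):

    # Cluster users on combined graph/attribute features, then rank
    # outliers by distance to their cluster centre. Toy data only.
    import numpy as np
    from sklearn.cluster import KMeans

    # rows: per-user features, e.g. logon count, after-hours ratio, degree
    X = np.array([[40, 0.05, 12], [38, 0.04, 11], [41, 0.06, 13],
                  [39, 0.05, 12], [12, 0.70, 45]])   # last row: suspicious
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    ranking = np.argsort(-dist)          # most anomalous users first
    print(ranking)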
The Internet of Things (IoT) is a new paradigm in which everyday objects are interconnected with each other and with the Internet. This paradigm is receiving much attention from the scientific community and is applied in many fields. In some applications, it is useful to prove that a number of objects are simultaneously present in a group. For example, an individual might want to authorize an NFC payment with his mobile phone only if k of his devices are present, ensuring that he is the right person. This principle is known as grouping-proofs. However, existing grouping-proof schemes are mostly designed for RFID systems and do not fit the characteristics of the IoT. In this paper, we propose a threshold grouping-proof scheme for IoT applications. Our scheme uses the Key-Policy Attribute-Based Encryption (KP-ABE) protocol to encrypt a message so that it can be decrypted only if at least k objects are simultaneously present in the same location. A security analysis and performance evaluation are conducted to show the effectiveness of our proposed solution.
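The k-of-n threshold requirement itself can be illustrated with Shamir secret sharing, where any k device shares reconstruct a decryption secret (a stand-in for the paper's KP-ABE construction, which is considerably richer):

    # Shamir (k, n) threshold sketch over a prime field: any k device
    # shares reconstruct the secret; fewer reveal nothing. A stand-in
    # for the KP-ABE scheme, not an implementation of it.
    import random

    P = 2**61 - 1                      # Mersenne prime field

    def make_shares(secret, k, n):
        coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
        return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
                for x in range(1, n + 1)]

    def reconstruct(shares):           # Lagrange interpolation at x = 0
        total = 0
        for i, (xi, yi) in enumerate(shares):
            num = den = 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = num * (-xj) % P
                    den = den * (xi - xj) % P
            total = (total + yi * num * pow(den, P - 2, P)) % P
        return total

    shares = make_shares(123456789, k=3, n=5)
    assert reconstruct(shares[:3]) == 123456789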
Innovations in and the development of the web reduce the effort required for many processes. Among the sectors that have profited most are electronic systems, banking, marketing, e-commerce, and so on. These systems mainly involve continuous data exchange from one host to another, and during this transfer there are many points where the confidentiality of data and users can be compromised. Areas with a higher likelihood of attack are commonly known as vulnerable zones. Web application interfaces are one such place, where many users perform tasks according to the privileges assigned to them by the administrator. Here an attacker makes use of open areas, such as login forms or other input fields, to inject malicious scripts into the system. These scripts aim to compromise the security constraints designed for the system. Among the attacks based on user-injected scripts in web communications are SQL injection and cross-site scripting (XSS). Such attacks must be detected and removed before they affect the security and confidentiality of the data. Over the last few years, various solutions have been integrated into systems to resolve such security issues in time. Input validation is one well-known approach, but it suffers from performance drops and limited matching. Other mechanisms, such as sanitization and tainting, produce high false-positive rates, indicating misclassified patterns. At their core, both involve string evaluation and mutation analysis of untrusted sources to fully determine the impact and depth of an attack. This work proposes an enhanced rule-based attack detection scheme over selected message fields to effectively identify malicious scripts. The approach blocks normal access for malicious sources using robust rule matching against a centralized repository that is regularly updated. At the initial level of evaluation, the work appears to provide a solid base for further research.
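A hedged sketch of rule-based screening over selected message fields, with signatures that would, as proposed above, live in a centrally updated repository (the patterns below are common textbook signatures, not the paper's rule set):

    # Rule-based SQLi/XSS screening sketch over selected request fields.
    # Patterns are illustrative textbook signatures; a production rule
    # set would live in a centrally updated repository.
    import re

    RULES = {
        "sqli": re.compile(r"('|--|\bUNION\b|\bOR\b\s+1=1)", re.I),
        "xss":  re.compile(r"(<script\b|onerror\s*=|javascript:)", re.I),
    }

    def screen(fields):
        findings = []
        for name, value in fields.items():
            for rule, pattern in RULES.items():
                if pattern.search(value):
                    findings.append((name, rule))
        return findings

    print(screen({"username": "admin' OR 1=1 --",
                  "comment": "<script>alert(1)</script>"}))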
Collaborative Filtering (CF) is a successful technique that has been implemented in recommender systems, and Privacy Preserving Collaborative Filtering (PPCF) has drawn increasing concern from society. Current solutions mainly focus on cryptographic, obfuscation, perturbation, and differential privacy methods, but these have shortcomings such as unnecessary computational cost, reduced data quality, and difficulty in calibrating the magnitude of noise. This paper proposes a (k, p, l)-anonymity method that improves on the existing k-anonymity method in PPCF. The method works as follows. First, it applies a Latent Factor Model (LFM) to reduce matrix sparsity. Then it improves the Maximum Distance to Average Vector (MDAV) microaggregation algorithm with importance partitioning to increase homogeneity among the records in each group, which retains better data quality, and applies a (p, l)-diversity model, where p is the attacker's prior knowledge about users' ratings and l is the diversity among users in each group, to improve the level of privacy preservation. Theoretical and experimental analyses show that our approach ensures a higher level of privacy preservation with lower information loss.
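A condensed sketch of baseline MDAV microaggregation (without the paper's importance-partitioning refinement or the LFM preprocessing):

    # Baseline MDAV microaggregation sketch: repeatedly build k-groups
    # around the two mutually farthest records. Omits the paper's
    # importance-partitioning refinement and LFM densification.
    import numpy as np

    def mdav(X, k):
        idx, groups = list(range(len(X))), []
        while len(idx) >= 3 * k:
            c = X[idx].mean(axis=0)
            r = max(idx, key=lambda i: np.linalg.norm(X[i] - c))
            s = max(idx, key=lambda i: np.linalg.norm(X[i] - X[r]))
            for anchor in (r, s):
                near = sorted(idx, key=lambda i: np.linalg.norm(X[i] - X[anchor]))[:k]
                groups.append(near)
                idx = [i for i in idx if i not in near]
        groups.append(idx)               # remaining records form one group
        return groups

    X = np.random.rand(20, 4)            # stand-in for LFM-densified ratings
    print(mdav(X, k=3))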
Cryptanalysis (the study of methods to read encrypted information without knowledge of the encryption key) has traditionally been separated into mathematical analysis of weaknesses in cryptographic algorithms, on the one hand, and side-channel attacks, which aim to exploit weaknesses in the implementation of encryption and decryption algorithms, on the other. Mathematical analysis generally makes assumptions about the algorithm with the aim of reconstructing the key relating plain text to cipher text through brute-force methods, and complexity issues tend to dominate the systematic search for keys. To date, there has been very little research on a third cryptanalysis method: learning the key through convergence, based on associations between plain text and cipher text. Recent advances in deep learning using multi-layered artificial neural networks (ANNs) provide an opportunity to reassess the role of deep learning architectures in next-generation cryptanalysis methods based on neurocryptography (NC). In this paper, we explore the capability of deep ANNs to decrypt encrypted messages with minimal knowledge of the algorithm. From the experimental results, we conclude that deep neural networks can encrypt and decrypt to levels of accuracy that fall short of 100% because of the stochastic aspects of ANNs. This aspect may, however, be useful if communication is under cryptanalytic attack, since the attacker will not know for certain that the key K used for encryption and decryption has been found. Also, uncertainty concerning the architecture used for encryption and decryption adds another layer of uncertainty that has no counterpart in traditional cryptanalysis.
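A toy version of this learning setup: a small multi-layer network learns to invert a fixed XOR key from plaintext/ciphertext pairs (nothing like a modern cipher, but it illustrates learning the plaintext-ciphertext association):

    # Toy neurocryptanalysis sketch: learn to invert a fixed XOR key
    # from plaintext/ciphertext pairs. Illustrative only; real ciphers
    # are far beyond this setup.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    key = rng.integers(0, 2, 8)                  # unknown 8-bit key
    plain = rng.integers(0, 2, (2000, 8))
    cipher = plain ^ key                         # "encryption"

    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
    clf.fit(cipher, plain)                       # ciphertext -> plaintext
    test_p = rng.integers(0, 2, (100, 8))
    acc = (clf.predict(test_p ^ key) == test_p).mean()
    print(f"bit accuracy on held-out data: {acc:.3f}")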
Tactical networks are generally simple ad-hoc networks in design; however, this simple design often becomes complicated when heterogeneous wireless technologies have to work together to enable seamless multi-hop communications across multiple sessions. In recent years, there have been significant advances in computational, radio, localization, and networking technologies, which motivate a clean-slate design of the control plane for multi-hop tactical wireless networks. In this paper, we develop a global network optimization framework that characterizes the control plane for multi-hop wireless tactical networks. This framework abstracts the underlying complexity of tactical wireless networks and orchestrates the control plane functions. Specifically, we develop a cross-layer optimization framework that characterizes the interaction between the physical, link, and network layers. By applying the framework to a throughput maximization problem, we show how the proposed framework can be utilized to solve a broad range of wireless multi-hop tactical networking problems.
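A hedged toy of the optimization flavour: maximize the sum of two sessions' rates subject to shared link capacities via a linear program (real cross-layer formulations also couple scheduling and physical-layer variables):

    # Toy throughput maximization: max r1 + r2 subject to link capacities
    # shared by two multi-hop sessions. linprog minimizes, so negate.
    from scipy.optimize import linprog

    # session 1 uses links A,B; session 2 uses links B,C
    c = [-1, -1]                      # maximize r1 + r2
    A_ub = [[1, 0],                   # link A: r1 <= 10
            [1, 1],                   # link B: r1 + r2 <= 8
            [0, 1]]                   # link C: r2 <= 6
    b_ub = [10, 8, 6]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
    print(res.x)                      # optimal session rates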
This paper proposes a context-aware, graph-based approach for identifying anomalous user activities via user profile analysis, which obtains a group of users maximally similar among themselves as well as to the query during test time. The main challenges for the anomaly detection task are: (1) anomalies occur rarely, making exhaustive identification at a reasonable false-alarm rate difficult, and (2) new context-dependent anomaly types continuously evolve, making it difficult to synthesize the activities a priori. Our proposed query-adaptive graph-based optimization approach, solvable using a maximum-flow algorithm, is designed to fully utilize both the mutual similarities among the user models and their respective similarities to the query in order to shortlist the user profiles for a more reliable aggregated detection. Each user activity is represented using inputs from several multi-modal resources, which helps to localize anomalies in time-dependent data efficiently. Experiments on public datasets for insider threats and gesture recognition show impressive results.
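A hedged sketch of the flow-based shortlisting idea: connect a source to user nodes with capacities reflecting query similarity, link mutually similar users, and read off the min-cut partition (the construction is illustrative, not the paper's exact formulation):

    # Flow-based shortlisting sketch: users on the source side of the
    # min cut form the query-adaptive group. Toy similarity scores.
    import networkx as nx

    sim_to_query = {"u1": 0.9, "u2": 0.8, "u3": 0.1}
    mutual = [("u1", "u2", 0.7), ("u2", "u3", 0.2)]

    G = nx.DiGraph()
    for u, s in sim_to_query.items():
        G.add_edge("src", u, capacity=s)          # query affinity
        G.add_edge(u, "sink", capacity=1.0 - s)   # dissimilarity penalty
    for u, v, w in mutual:                        # mutual similarity links
        G.add_edge(u, v, capacity=w)
        G.add_edge(v, u, capacity=w)

    _, (src_side, _) = nx.minimum_cut(G, "src", "sink")
    print(src_side - {"src"})                     # shortlisted profiles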
In previous work, we proposed a solution to facilitate access to computer science courses and learning materials using cloud computing and mobile technologies. The solution was positively evaluated by the participants, but most of them indicated that it lacked support for laboratory activities. Many computer science subjects (e.g., Computer Networks, Information Security, Systems Administration) require a suitable and flexible environment where students can access a set of computers and network devices to complete their hands-on activities. To meet this requirement, we created a cloud-based virtual laboratory on the OpenStack cloud platform to facilitate access to virtual machines both locally and remotely. Cloud-based virtual labs bring many advantages, such as increased manageability, scalability, high availability, and flexibility, to name a few. This arrangement was tested in a case-study exercise with a group of students as part of the Computer Networks and System Administration courses at Kabul Polytechnic University in Afghanistan. To measure success, we introduced a level test completed by participants before and after the experiment. The learners achieved on average 17.1% higher scores in the post-experiment level test after completing the practical exercises. Lastly, we distributed a questionnaire after the experiment, and students provided positive feedback on the effectiveness and usefulness of the proposed solution.
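Provisioning one lab VM of this kind with the OpenStack SDK takes only a few lines (the cloud name, image, flavor, and network below are placeholders for an actual deployment):

    # Launch one lab VM via the OpenStack SDK. 'kpu-lab' and the image/
    # flavor/network names are placeholders for a configured cloud.
    import openstack

    conn = openstack.connect(cloud="kpu-lab")     # credentials from clouds.yaml
    server = conn.create_server(
        name="netlab-student-01",
        image="ubuntu-22.04",
        flavor="m1.small",
        network="lab-net",
        wait=True,
    )
    print(server.status, conn.compute.get_server(server.id).addresses)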