Biblio
Active authentication is the problem of continuously verifying the identity of a person based on behavioral aspects of their interaction with a computing device. In this paper, we collect and analyze behavioral biometrics data from 200 subjects, each using their personal Android mobile device for a period of at least 30 days. This data set is novel in the context of active authentication due to its size, duration, number of modalities, and absence of restrictions on tracked activity. The geographical colocation of the subjects in the study is representative of a large closed-world environment such as an organization, where the unauthorized user of a device is likely to be an insider threat coming from within the organization. We consider four biometric modalities: 1) text entered via soft keyboard, 2) applications used, 3) websites visited, and 4) physical location of the device as determined from GPS (when outdoors) or WiFi (when indoors). We implement and test a classifier for each modality and organize the classifiers as a parallel binary decision fusion architecture. We are able to characterize the performance of the system with respect to intruder detection time and to quantify the contribution of each modality to the overall performance.
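As a rough illustration of the parallel binary decision fusion idea described above, the sketch below combines per-modality accept/reject decisions into a single authentication decision. The modality names and the k-out-of-n voting rule are assumptions chosen for illustration, not the paper's exact fusion architecture.

```python
# Minimal sketch of parallel binary decision fusion, assuming each modality
# classifier has already produced a binary "genuine user" decision for the
# current time window. The k-out-of-n voting rule is an illustrative choice.
from typing import Dict

def fuse_decisions(decisions: Dict[str, bool], k: int = 2) -> bool:
    """Return True (genuine user) if at least k modalities accept."""
    votes = sum(1 for accepted in decisions.values() if accepted)
    return votes >= k

window = {
    "keystrokes": True,   # text entered via soft keyboard
    "apps": False,        # applications used
    "web": True,          # websites visited
    "location": True,     # GPS / WiFi location
}
print(fuse_decisions(window, k=2))  # True -> no alarm raised for this window
```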
Phishing is a major concern on the Internet today, and many users fall victim to criminals' deceitful tactics. Blacklisting is still the most common defence users have against such phishing websites, but it is failing to cope with their increasing number. In recent years, researchers have devised modern ways of detecting such websites using machine learning. One such method is to build machine-learnt models of URL features to classify whether URLs are phishing. However, there are varying opinions on the best choice of features and algorithms. In this paper, the objective is to evaluate the performance of the Random Forest algorithm using a lexical-only dataset. The performance is benchmarked against other machine learning algorithms and against results reported in the literature. Initial results from experiments indicate that the Random Forest algorithm performs best, yielding 86.9% accuracy.
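A minimal sketch of the lexical-URL-features-plus-Random-Forest setup, assuming scikit-learn; the feature choices and the tiny toy dataset are illustrative assumptions, not the paper's actual features or data.

```python
# Hedged sketch: lexical-only URL features fed to a Random Forest classifier.
from urllib.parse import urlparse
from sklearn.ensemble import RandomForestClassifier

def lexical_features(url: str) -> list:
    parsed = urlparse(url)
    host = parsed.netloc
    return [
        len(url),                        # overall URL length
        len(host),                       # hostname length
        host.count('.'),                 # subdomain depth
        url.count('-'),                  # hyphens, common in phishing domains
        url.count('@'),                  # '@' can hide the real host
        sum(c.isdigit() for c in url),   # digit count
        int(parsed.scheme == 'https'),
    ]

# Toy labelled URLs (1 = phishing, 0 = benign), invented for illustration.
urls = [
    ("https://www.example.com/login", 0),
    ("http://secure-paypa1.com.verify-account.xyz/@login", 1),
    ("https://github.com/openai", 0),
    ("http://192.168.0.1-bank-update.info/confirm", 1),
]
X = [lexical_features(u) for u, _ in urls]
y = [label for _, label in urls]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([lexical_features("http://account-verify.paypal.com.evil.biz")]))
```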
Advanced Persistent Threat (APT) attacks have become a major network threat in recent years. Among APT attack techniques, sending a phishing email with malicious documents attached is considered one of the most effective. Although many users have the impression that documents are harmless, a malicious document may in fact contain shellcode that attacks the victim. To cope with this problem, we design and implement a malicious document detector called Forensor to differentiate malicious documents. Forensor integrates several open-source tools and methods. It first introspects the file format to retrieve the objects inside a document, then automatically decrypts simple encryption methods, e.g., XOR, rot and shift, commonly used by malware, to discover potential shellcode. An emulator is used to verify the presence of shellcode; if shellcode is discovered, the file is considered malicious. The experiment used 9,000 benign files and more than 10,000 malware samples from a well-known sample-sharing website. The results show no false negatives and only two false positives.
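The simple-decryption step can be illustrated with a short brute-force sketch. The indicator string, the restriction to single-byte XOR and additive ROT keys, and the toy payload are assumptions for demonstration; Forensor itself additionally verifies candidates with an emulator.

```python
# Sketch of brute-forcing simple obfuscation (single-byte XOR and additive ROT)
# over a document object and looking for an indicator string.

def xor_decode(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)

def rot_decode(data: bytes, key: int) -> bytes:
    return bytes((b + key) % 256 for b in data)

def find_indicator(data: bytes, indicator: bytes = b"http://"):
    """Try all single-byte XOR/ROT keys; report any that reveal the indicator."""
    hits = []
    for key in range(256):
        if indicator in xor_decode(data, key):
            hits.append(("xor", key))
        if indicator in rot_decode(data, key):
            hits.append(("rot", key))
    return hits

# Toy example: a URL obfuscated with XOR key 0x41 inside a document object.
obfuscated = xor_decode(b"GET http://malicious.example/payload", 0x41)
print(find_indicator(obfuscated))  # -> [('xor', 65)]
```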
The prevalence and effectiveness of phishing attacks, despite the presence of a vast array of technical defences, are due largely to the fact that attackers are ruthlessly targeting what is often referred to as the weakest link in the system - the human. This paper reports the results of an investigation into how end users behave when faced with phishing websites and how this behaviour exposes them to attack. Specifically, the paper presents a proof-of-concept computer model for simulating human behaviour with respect to phishing website detection based on the ACT-R cognitive architecture, and draws conclusions as to the applicability of this architecture to human behaviour modelling within a phishing detection scenario. Following the development of a high-level conceptual model of the phishing website detection process, the study draws upon ACT-R to model and simulate the cognitive processes involved in judging the validity of a representative webpage, based primarily on the characteristics of the HTTPS padlock security indicator. The study concludes that despite the low-level nature of the architecture and its very basic user interface support, ACT-R possesses strong capabilities which map well onto the phishing use case, and that further work to more fully represent the range of human security knowledge and behaviours in an ACT-R model could lead to improved insights into how best to combine technical and human defences to reduce the risk to end users from phishing attacks.
In the last decade, numerous fake websites have been developed on the World Wide Web to mimic trusted websites, with the aim of stealing financial assets from users and organizations. This form of online attack is called phishing, and it has cost the online community and the various stakeholders hundreds of millions of dollars. Therefore, effective countermeasures that can accurately detect phishing are needed. Machine learning (ML) is a popular tool for data analysis and has recently shown promising results in combating phishing when contrasted with classic anti-phishing approaches such as awareness workshops, visualization and legal solutions. This article investigates the applicability of ML techniques to detecting phishing attacks and describes their pros and cons. In particular, different types of ML techniques have been investigated to reveal the suitable options that can serve as anti-phishing tools. More importantly, we experimentally compare a large number of ML techniques on real phishing datasets and with respect to different metrics. The purpose of the comparison is to reveal the advantages and disadvantages of ML predictive models and to show their actual performance on phishing attacks. The experimental results show that Covering-approach models are more appropriate as anti-phishing solutions, especially for novice users, because of their simple yet effective knowledge bases in addition to their good phishing detection rate.
Genetic Algorithms are a group of mathematical models in computational science inspired by evolution and widely used in AI techniques today. They encode candidate solutions to a specific problem as chromosome-like data structures and preserve critical information while applying simple recombination operators. Genetic Algorithms are general-purpose optimizers applicable to a broad range of problems; their global search rests on basic principles such as selection, crossover and mutation. Neural networks, inspired by the human brain, provide the data structures and algorithms used for learning and for classifying data. Artificial Intelligence (AI) is a field concerned with tasks performed naturally by humans, which have proved complicated for conventional computing methods. Applying neural network techniques allows a program to build an internal structure of rules by learning from examples and thereby classify different inputs, going beyond classic mining techniques. This paper proposes a phishing website classifier using polynomial neural networks improved by a genetic algorithm.
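A minimal sketch of the genetic-algorithm loop (selection, crossover, mutation) evolving the weights of a degree-2 polynomial scoring unit over URL features. The toy data, fitness function and GA parameters are illustrative assumptions, not the paper's improved polynomial neural network.

```python
# Illustrative GA: evolve 6 weights (3 linear + 3 quadratic terms) to separate
# toy phishing feature vectors from benign ones.
import random

random.seed(0)

# Toy features: [url_length/100, num_dots/10, has_at], label 1 = phishing.
DATA = [([0.2, 0.1, 0], 0), ([0.8, 0.5, 1], 1), ([0.3, 0.2, 0], 0), ([0.9, 0.6, 1], 1)]

def score(weights, x):
    linear = sum(w * xi for w, xi in zip(weights[:3], x))
    quad = sum(w * xi * xi for w, xi in zip(weights[3:], x))
    return linear + quad

def fitness(weights):
    correct = sum(1 for x, y in DATA if (score(weights, x) > 0.5) == bool(y))
    return correct / len(DATA)

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(w, rate=0.2):
    return [wi + random.gauss(0, 0.3) if random.random() < rate else wi for wi in w]

population = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                      # selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]                # crossover + mutation
    population = parents + children

best = max(population, key=fitness)
print("training accuracy of best individual:", fitness(best))
```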
This article presents an introduction to HTTP security headers, a relatively new topic in securing communication over the Internet. It is emphasized that the HTTPS protocol and SSL/TLS certificates alone do not offer a sufficient level of security for communication among people and devices. In the world of web applications and the Internet of Things (IoT), it is vital to bring communication security to a higher level, which can be achieved in a few simple steps. HTTP response headers, used for different purposes in the past, are now an effective way to propagate security policies from servers to clients (from web servers to web browsers). The first improvement is to enforce the HTTPS protocol everywhere it is possible and to promote it as the first and only option for secure connections over the Internet; the plain HTTP protocol is no longer suitable for communication.
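A minimal sketch of a server emitting the kinds of security response headers discussed above, using Python's standard http.server. The header values are common illustrative policies rather than a complete hardening guide, and in practice HSTS is only honored by browsers when the response arrives over HTTPS.

```python
# Sketch: a handler that attaches common security headers to every response.
from http.server import BaseHTTPRequestHandler, HTTPServer

class SecureHeadersHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # Ask browsers to use HTTPS for all future requests to this host
        # (only effective when this response is itself served over HTTPS).
        self.send_header("Strict-Transport-Security", "max-age=31536000; includeSubDomains")
        # Restrict where scripts and other resources may be loaded from.
        self.send_header("Content-Security-Policy", "default-src 'self'")
        self.send_header("X-Content-Type-Options", "nosniff")
        self.send_header("X-Frame-Options", "DENY")
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"<h1>Hello over (ideally) HTTPS</h1>")

if __name__ == "__main__":
    HTTPServer(("localhost", 8443), SecureHeadersHandler).serve_forever()
```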
Attacks against websites are increasing rapidly with the expansion of web services. An increasing number of diversified web services make it difficult to prevent such attacks due to the many known vulnerabilities in websites. To overcome this problem, it is necessary to collect the most recent attacks using decoy web honeypots and to implement countermeasures against malicious threats. Web honeypots collect not only malicious accesses by attackers but also benign accesses such as those by web search crawlers. Thus, it is essential to develop a means of automatically identifying malicious accesses from mixed collected data that includes both malicious and benign accesses. In particular, detecting vulnerability scanning, which is a preliminary step of an attack, is important for preventing attacks. In this study, we focus on classifying accesses as web crawling or vulnerability scanning, since these accesses are too similar to distinguish easily. We propose a feature vector that includes features of collective accesses, e.g., the intervals of request arrivals and the dispersion of source port numbers, obtained with multiple honeypots deployed in different networks. Through evaluation using data collected from 37 honeypots in a real network, we show that features of collective accesses are advantageous for classifying vulnerability scanning and crawling.
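The collective-access features can be sketched as follows; the log format, the toy records, and the choice of mean inter-arrival time plus source-port standard deviation are assumptions made for illustration.

```python
# Sketch: per-source features aggregated across multiple honeypots.
from statistics import mean, pstdev
from collections import defaultdict

# (source_ip, honeypot_id, unix_timestamp, source_port)
accesses = [
    ("203.0.113.7", "hp1", 100.0, 40001),
    ("203.0.113.7", "hp2", 100.4, 40002),
    ("203.0.113.7", "hp1", 100.9, 40003),   # fast, sequential ports -> scanner-like
    ("198.51.100.9", "hp1", 100.0, 51000),
    ("198.51.100.9", "hp3", 160.0, 33842),  # slow, scattered ports -> crawler-like
]

by_source = defaultdict(list)
for ip, hp, ts, port in accesses:
    by_source[ip].append((ts, port))

for ip, events in by_source.items():
    events.sort()
    times = [ts for ts, _ in events]
    ports = [port for _, port in events]
    intervals = [b - a for a, b in zip(times, times[1:])] or [0.0]
    features = {"mean_interval": mean(intervals), "port_dispersion": pstdev(ports)}
    print(ip, features)   # feature vector for a downstream classifier
```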
Software development and web applications have become fundamental in our lives. Millions of users access these applications to communicate, obtain information and perform transactions. However, these users are exposed to many risks, commonly due to developers' lack of experience with security practices. Although there is much research on web security and protection against hacking, there are still plenty of vulnerable websites. This article focuses on analyzing three main hacking techniques, XSS, CSRF and SQL injection, over a representative group of Colombian websites. Our goal is to obtain information about how much relevance Colombian companies and organizations give (or do not give) to security, and how the final user could be affected.
Cybersecurity is a problem of growing relevance that impacts all facets of society. As a result, many researchers have become interested in studying cybercriminals and online hacker communities in order to develop more effective cyber defenses. In particular, analysis of hacker community contents may reveal existing and emerging threats that pose great risk to individuals, businesses, and government. Thus, we are interested in developing an automated methodology for identifying tangible and verifiable evidence of potential threats within hacker forums, IRC channels, and carding shops. To identify threats, we couple machine learning methodology with information retrieval techniques. Our approach allows us to distill potential threats from the entirety of collected hacker contents. We present several examples of identified threats found through our analysis techniques. Results suggest that hacker communities can be analyzed to aid in cyber threat detection, thus providing promising direction for future work.
Language vector space models (VSMs) have recently proven to be effective across a variety of tasks. In VSMs, each word in a corpus is represented as a real-valued vector. These vectors can be used as features in many applications in machine learning and natural language processing. In this paper, we study the effect of vector space representations in cyber security. In particular, we consider a passive traffic analysis attack (Website Fingerprinting) that threatens users' navigation privacy on the web. By using anonymous communication, Internet users (such as online activists) may wish to hide the destination of web pages they access for different reasons, such as avoiding tyrannical governments. Traditional website fingerprinting studies collect packets from the users' network and extract features that are used by machine learning techniques to reveal the destination of certain web pages. In this work, we propose the packet to vector (P2V) approach, where we model the website fingerprinting attack using word vector representations. We show how the suggested model outperforms previous website fingerprinting works.
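A rough sketch of the packet-to-vector idea: treat each direction-signed packet size as a "word" and a traffic trace as a "sentence", then learn embeddings with word2vec (assuming gensim ≥ 4). The toy traces and hyperparameters are assumptions, and the paper's P2V pipeline feeds such features to a downstream classifier.

```python
# Sketch: packet tokens -> word2vec embeddings -> fixed-length trace features.
from gensim.models import Word2Vec
import numpy as np

# Each trace is a sequence of packet tokens, e.g. "+512" = outgoing 512 bytes.
traces = [
    ["+512", "-1500", "-1500", "+64", "-1500"],   # visits to site A
    ["+512", "-1500", "-1500", "+64", "-1500"],
    ["+300", "-700", "+300", "-700", "+300"],     # visits to site B
    ["+300", "-700", "+300", "-700", "+64"],
]

model = Word2Vec(sentences=traces, vector_size=16, window=3, min_count=1, seed=0)

def trace_vector(trace):
    """Average the packet embeddings to get a fixed-length trace feature."""
    return np.mean([model.wv[token] for token in trace], axis=0)

features = np.stack([trace_vector(t) for t in traces])
print(features.shape)  # (4, 16) -> ready for a standard classifier
```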
The rate at which cyber-attacks are increasing globally paints a terrifying picture. The main dynamics of such attacks can be studied in terms of the actions of attackers and defenders in a cyber-security game; however, little research has so far investigated such interactions. In this paper we use behavioral game theory to investigate the role of certain actions taken by attackers and defenders in a simulated cyber-attack scenario of defacing a website. We choose a Reinforcement Learning (RL) model to represent a simulated attacker and defender in a 2×4 cyber-security game where each of the two players can take up to four actions. Pairs of model participants were computationally simulated across 1,000 simulations, with each pair playing at most 30 rounds in the game. The goal of the attacker was to deface the website and the goal of the defender was to prevent the attacker from doing so. Our results show that the actions taken by both attackers and defenders are a function of the attention these roles pay to their recently obtained outcomes: if the attacker pays more attention to recent outcomes, they are more likely to perform attack actions. We discuss the implications of our results for the evolution of dynamics between attackers and defenders in cyber-security games.
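The recency-weighting mechanism can be sketched with a simple value-update rule, where a larger learning rate corresponds to more attention paid to recent outcomes. The payoffs, action set and softmax choice rule below are illustrative assumptions rather than the paper's exact model.

```python
# Sketch: a recency-weighted RL attacker choosing among four actions.
import math
import random

random.seed(1)
ACTIONS = ["attack_a", "attack_b", "attack_c", "do_nothing"]  # hypothetical action set

def softmax_choice(values, temperature=0.25):
    """Pick an action index with probability proportional to exp(value / T)."""
    weights = [math.exp(v / temperature) for v in values]
    total = sum(weights)
    return random.choices(range(len(values)), weights=[w / total for w in weights])[0]

def play(alpha, rounds=30):
    """Simulate one attacker; alpha models attention to recent outcomes."""
    q = [0.0] * len(ACTIONS)
    attacks = 0
    for _ in range(rounds):
        a = softmax_choice(q)
        # Toy payoff: attack actions sometimes succeed (+1) and sometimes fail (-1).
        reward = 0.0 if ACTIONS[a] == "do_nothing" else random.choice([1.0, -1.0])
        q[a] += alpha * (reward - q[a])      # recency-weighted value update
        attacks += ACTIONS[a] != "do_nothing"
    return attacks

print("attack actions, high attention to recent outcomes:", play(alpha=0.9))
print("attack actions, low attention to recent outcomes:", play(alpha=0.1))
```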
The development of the Internet has been accompanied by the growth of another domain: cyber-crime. Because users can be exposed to illegal activity, it has become important to make the technology reliable. Phishing techniques commonly operate in the domain of email messages: socially engineered phishing emails link to a hosted phishing website, where a click on the URL can trigger malicious code that executes some action. Lexically analyzing URLs can enhance detection performance and help differentiate an original email from one containing a phishing URL. As assessed in this study, combining textual analysis of phishing URLs with email classification is successful and results in highly precise anti-phishing.
In recent years, HTML5 has been widely adopted by popular browsers. Unfortunately, as a new Web standard, HTML5 may expand the Cross-Site Scripting (XSS) attack surface even as it improves the interactivity of pages. In this paper, we identify 14 XSS attack vectors related to HTML5 through a systematic analysis of its new tags and attributes. Based on these vectors, an XSS test vector repository is constructed and a dynamic XSS vulnerability detection tool focusing on Webmail systems is implemented. By applying the tool to some popular Webmail systems, seven exploitable XSS vulnerabilities are found. The evaluation results show that our tool can efficiently detect XSS vulnerabilities introduced by HTML5.
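To illustrate how an HTML5-oriented test-vector repository might be exercised, the sketch below runs a few publicly known HTML5-related XSS vectors (not the paper's specific set of 14) through a deliberately naive filter and flags those that survive intact.

```python
# Sketch: check which HTML5 XSS test vectors pass through a weak sanitizer.
import re

HTML5_TEST_VECTORS = [
    '<video><source onerror="alert(1)">',
    '<input autofocus onfocus="alert(1)">',
    '<details open ontoggle="alert(1)">',
    '<button form="f" formaction="javascript:alert(1)">x</button>',
]

def naive_filter(payload: str) -> str:
    """A weak sanitizer that only strips <script> tags."""
    return re.sub(r"</?script[^>]*>", "", payload, flags=re.IGNORECASE)

for vector in HTML5_TEST_VECTORS:
    survived = naive_filter(vector) == vector
    print("NOT FILTERED" if survived else "filtered", ":", vector)
```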
Cross-Site Scripting (XSS) is a common attack technique that lets attackers inject code into the output of a web application that is delivered to the visitor's web browser, where the injected code executes automatically and steals sensitive information. To protect users from XSS attacks, many client-side solutions have been implemented; most of those in use are filters that sanitize malicious input. However, many of these filters do not protect against newer, sophisticated attacks such as injection at multiple points, injection into scripts, etc. This paper proposes and implements an approach based on encoding unfiltered reflections for detecting vulnerable web applications that can be exploited using the above-mentioned sophisticated attacks. Results show that the proposed approach achieves a higher detection rate for exploits. In addition, an implementation that blocks the execution of malicious scripts has been contributed to XSS-Me, an open-source Mozilla Firefox security extension that detects reflected XSS vulnerabilities; this could be an even more effective solution if integrated inside the browser rather than enforced as an extension.
We propose a novel phishing detection architecture based on transparent virtualization technologies and isolation of its own components. The architecture can be deployed as a security extension for virtual machines (VMs) running in the cloud. It uses fine-grained VM introspection (VMI) to extract, filter and scale a color-based fingerprint of web pages that are processed by a browser from the VM's memory. By analyzing the human perceptual similarity between the fingerprints, the architecture can reveal and mitigate phishing attacks that are based on redirection to spoofed web pages, and it can also detect “Man-in-the-Browser” (MitB) attacks. To the best of our knowledge, the architecture is the first anti-phishing solution leveraging virtualization technologies. We explain the design and implementation in detail and show results of an evaluation with real-world data.
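The perceptual-similarity comparison can be sketched with coarse color histograms and histogram intersection; the synthetic "screenshots", bin count and similarity measure are assumptions, and the real system extracts its fingerprint from VM memory via introspection.

```python
# Sketch: coarse color fingerprints of rendered pages, compared by
# histogram intersection.
import numpy as np

def color_fingerprint(img: np.ndarray, bins: int = 8) -> np.ndarray:
    """Coarse RGB histogram of an HxWx3 uint8 image, normalised to sum to 1."""
    hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    return (hist / hist.sum()).ravel()

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Histogram intersection: 1.0 means identical color distributions."""
    return float(np.minimum(a, b).sum())

# Synthetic stand-ins for rendered pages.
legit = np.zeros((128, 128, 3), dtype=np.uint8)
legit[..., 2] = 200                      # mostly blue "login" page
spoof = legit.copy()
spoof[:10] = 255                         # near-identical copy with a small banner change
other = np.zeros((128, 128, 3), dtype=np.uint8)
other[..., 0] = 200                      # unrelated, mostly red page

print(similarity(color_fingerprint(legit), color_fingerprint(spoof)))   # high (~0.92)
print(similarity(color_fingerprint(legit), color_fingerprint(other)))   # low (0.0)
```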
In the early days of the web, content was designed and hosted by a single person, group, or organization. No longer. Webpages are increasingly composed of content from myriad unrelated "third-party" websites in the business of advertising, analytics, social networking, and more. Third-party services have tremendous value: they support free content and facilitate web innovation. But third-party services come at a privacy cost: researchers, civil society organizations, and policymakers have increasingly called attention to how third parties can track a user's browsing activities across websites. This paper surveys the current policy debate surrounding third-party web tracking and explains the relevant technology. It also presents the FourthParty web measurement platform and studies we have conducted with it. Our aim is to inform researchers with essential background and tools for contributing to public understanding and policy debates about web tracking.
We consider the setting of HTTP traffic over encrypted tunnels, as used to conceal the identity of websites visited by a user. It is well known that traffic analysis (TA) attacks can accurately identify the website a user visits despite the use of encryption, and previous work has looked at specific attack/countermeasure pairings. We provide the first comprehensive analysis of general-purpose TA countermeasures. We show that nine known countermeasures are vulnerable to simple attacks that exploit coarse features of traffic (e.g., total time and bandwidth). The considered countermeasures include ones like those standardized by TLS, SSH, and IPsec, and even more complex ones like the traffic morphing scheme of Wright et al. As just one of our results, we show that despite the use of traffic morphing, one can use only total upstream and downstream bandwidth to identify, with 98% accuracy, which of two websites was visited. One implication of what we find is that, in the context of website identification, it is unlikely that bandwidth-efficient, general-purpose TA countermeasures can ever provide the type of security targeted in prior work.
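To illustrate how far such coarse features go, the sketch below classifies which of two sites was visited using only total upstream and downstream bandwidth. The traffic numbers are synthetic placeholders, and the Gaussian naive Bayes model is an assumption for illustration rather than the authors' exact classifier.

```python
# Sketch: two-feature (total up / total down bandwidth) website identification.
from sklearn.naive_bayes import GaussianNB

# (total_upstream_bytes, total_downstream_bytes) per encrypted-tunnel trace
X = [
    [4_200, 310_000], [4_600, 295_000], [4_400, 320_000],          # site A visits
    [9_800, 1_450_000], [10_300, 1_390_000], [9_500, 1_500_000],   # site B visits
]
y = ["siteA", "siteA", "siteA", "siteB", "siteB", "siteB"]

clf = GaussianNB().fit(X, y)
print(clf.predict([[4_500, 305_000], [10_000, 1_420_000]]))  # -> ['siteA' 'siteB']
```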