Biblio
Cyber threat intelligence (CTI) is vital for enabling effective cybersecurity decisions by providing timely, relevant, and actionable information about emerging threats. Monitoring the dark web to generate CTI is one of the upcoming trends in cybersecurity. As a result, developing CTI capabilities with the dark web investigation is a significant focus for cybersecurity companies like Deepwatch, DarkOwl, SixGill, ThreatConnect, CyLance, ZeroFox, and many others. In addition, the dark web marketplace (DWM) monitoring tools are of much interest to law enforcement agencies (LEAs). The fact that darknet market participants operate anonymously and online transactions are pseudo-anonymous makes it challenging to identify and investigate them. Therefore, keeping up with the DWMs poses significant challenges for LEAs today. Nevertheless, the offerings on the DWM give insights into the dark web economy to LEAs. The present work is one such attempt to describe and analyze dark web market data collected for CTI using a dark web crawler. After processing and labeling, authors have 53 DWMs with their product listings and pricing.
The Internet-of-Things (IoT) paradigm at large continues to be compromised, hindering the privacy, dependability, security, and safety of our nations. While the operational security communities (i.e., CERTS, SOCs, CSIRT, etc.) continue to develop capabilities for monitoring cyberspace, tools which are IoT-centric remain at its infancy. To this end, we address this gap by innovating an actionable Cyber Threat Intelligence (CTI) feed related to Internet-scale infected IoT devices. The feed analyzes, in near real-time, 3.6TB of daily streaming passive measurements ( ≈ 1M pps) by applying a custom-developed learning methodology to distinguish between compromised IoT devices and non-IoT nodes, in addition to labeling the type and vendor. The feed is augmented with third party information to provide contextual information. We report on the operation, analysis, and shortcomings of the feed executed during an initial deployment period. We make the CTI feed available for ingestion through a public, authenticated API and a front-end platform.
Bitcoin is gaining traction as an alternative store of value. Its market capitalization transcends all other cryptocurrencies in the market. But its high monetary value also makes it an attractive target to cyber criminal actors. Hacking campaigns usually target the weakest points in an ecosystem. In Bitcoin, these are currently the exchange platforms. As each exchange breach potentially decreases Bitcoin's market value by billions, it is a threat not only to direct victims, but to everyone owning Bitcoin. Based on an extensive analysis of 36 breaches of Bitcoin exchanges, we show the attack patterns used to exploit Bitcoin exchange platforms using an industry standard for reporting intelligence on cyber security breaches. Based on this we are able to provide an overview of the most common attack vectors, showing that all except three hacks were possible due to relatively lax security. We also show that while the security regimen of Bitcoin exchanges is not on par with other financial service providers, the use of stolen credentials, which does not require any hacking, is decreasing. We also show that the amount of BTC taken during a breach is decreasing, as well as the exchanges that terminate after being breached. With exchanges being targeted by nation-state hacking groups, security needs to be a first concern.
The legacy security defense mechanisms cannot resist where emerging sophisticated threats such as zero-day and malware campaigns have profoundly changed the dimensions of cyber-attacks. Recent studies indicate that cyber threat intelligence plays a crucial role in implementing proactive defense operations. It provides a knowledge-sharing platform that not only increases security awareness and readiness but also enables the collaborative defense to diminish the effectiveness of potential attacks. In this paper, we propose a secure distributed model to facilitate cyber threat intelligence sharing among diverse participants. The proposed model uses blockchain technology to assure tamper-proof record-keeping and smart contracts to guarantee immutable logic. We use an open-source permissioned blockchain platform, Hyperledger Fabric, to implement the blockchain application. We also utilize the flexibility and management capabilities of Software-Defined Networking to be integrated with the proposed sharing platform to enhance defense perspectives against threats in the system. In the end, collaborative DDoS attack mitigation is taken as a case study to demonstrate our approach.
In recent years, cyberattack techniques have become more and more sophisticated each day. Even if defense measures are taken against cyberattacks, it is difficult to prevent them completely. It can also be said that people can only fight defensively against cyber criminals. To address this situation, it is necessary to predict cyberattacks and take appropriate measures in advance, and the use of intelligence is important to make this possible. In general, many malicious hackers share information and tools that can be used for attacks on the dark web or in the specific communities. Therefore, we assume that a lot of intelligence, including this illegal content exists in cyber space. By using the threat intelligence, detecting attacks in advance and developing active defense is expected these days. However, such intelligence is currently extracted manually. In order to do this more efficiently, we apply machine learning to various forum posts that exist on the dark web, with the aim of extracting forum posts containing threat information. By doing this, we expect that detecting threat information in cyber space in a timely manner will be possible so that the optimal preventive measures will be taken in advance.
Cyber threat intelligence (CTI) necessitates automated monitoring of dark web platforms (e.g., Dark Net Markets and carding shops) on a large scale. While there are existing methods for collecting data from the surface web, large-scale dark web data collection is commonly hindered by anti-crawling measures. Text-based CAPTCHA serves as the most prohibitive type of these measures. Text-based CAPTCHA requires the user to recognize a combination of hard-to-read characters. Dark web CAPTCHA patterns are intentionally designed to have additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing CAPTCHA breaking methods cannot remedy these challenges and are therefore not applicable to the dark web. In this study, we propose a novel framework for breaking text-based CAPTCHA in the dark web. The proposed framework utilizes Generative Adversarial Network (GAN) to counteract dark web-specific background noise and leverages an enhanced character segmentation algorithm. Our proposed method was evaluated on both benchmark and dark web CAPTCHA testbeds. The proposed method significantly outperformed the state-of-the-art baseline methods on all datasets, achieving over 92.08% success rate on dark web testbeds. Our research enables the CTI community to develop advanced capabilities of large-scale dark web monitoring.
The clear, social, and dark web have lately been identified as rich sources of valuable cyber-security information that -given the appropriate tools and methods-may be identified, crawled and subsequently leveraged to actionable cyber-threat intelligence. In this work, we focus on the information gathering task, and present a novel crawling architecture for transparently harvesting data from security websites in the clear web, security forums in the social web, and hacker forums/marketplaces in the dark web. The proposed architecture adopts a two-phase approach to data harvesting. Initially a machine learning-based crawler is used to direct the harvesting towards websites of interest, while in the second phase state-of-the-art statistical language modelling techniques are used to represent the harvested information in a latent low-dimensional feature space and rank it based on its potential relevance to the task at hand. The proposed architecture is realised using exclusively open-source tools, and a preliminary evaluation with crowdsourced results demonstrates its effectiveness.
In recent years, cyber attack techniques are increasingly sophisticated, and blocking the attack is more and more difficult, even if a kind of counter measure or another is taken. In order for a successful handling of this situation, it is crucial to have a prediction of cyber attacks, appropriate precautions, and effective utilization of cyber intelligence that enables these actions. Malicious hackers share various kinds of information through particular communities such as the dark web, indicating that a great deal of intelligence exists in cyberspace. This paper focuses on forums on the dark web and proposes an approach to extract forums which include important information or intelligence from huge amounts of forums and identify traits of each forum using methodologies such as machine learning, natural language processing and so on. This approach will allow us to grasp the emerging threats in cyberspace and take appropriate measures against malicious activities.
To manage cybersecurity risks in practice, a simple yet effective method to assess suchs risks for individual systems is needed. With time-to-compromise (TTC), McQueen et al. (2005) introduced such a metric that measures the expected time that a system remains uncompromised given a specific threat landscape. Unlike other approaches that require complex system modeling to proceed, TTC combines simplicity with expressiveness and therefore has evolved into one of the most successful cybersecurity metrics in practice. We revisit TTC and identify several mathematical and methodological shortcomings which we address by embedding all aspects of the metric into the continuous domain and the possibility to incorporate information about vulnerability characteristics and other cyber threat intelligence into the model. We propose $\beta$-TTC, a formal extension of TTC which includes information from CVSS vectors as well as a continuous attacker skill based on a $\beta$-distribution. We show that our new metric (1) remains simple enough for practical use and (2) gives more realistic predictions than the original TTC by using data from a modern and productively used vulnerability database of a national CERT.
To manage cybersecurity risks in practice, a simple yet effective method to assess suchs risks for individual systems is needed. With time-to-compromise (TTC), McQueen et al. (2005) introduced such a metric that measures the expected time that a system remains uncompromised given a specific threat landscape. Unlike other approaches that require complex system modeling to proceed, TTC combines simplicity with expressiveness and therefore has evolved into one of the most successful cybersecurity metrics in practice. We revisit TTC and identify several mathematical and methodological shortcomings which we address by embedding all aspects of the metric into the continuous domain and the possibility to incorporate information about vulnerability characteristics and other cyber threat intelligence into the model. We propose β-TTC, a formal extension of TTC which includes information from CVSS vectors as well as a continuous attacker skill based on a β-distribution. We show that our new metric (1) remains simple enough for practical use and (2) gives more realistic predictions than the original TTC by using data from a modern and productively used vulnerability database of a national CERT.
Cyber threats and attacks have significantly increased in complexity and quantity throughout this past year. In this paper, the top fifteen cyber threats and trends are articulated in detail to provide awareness throughout the community and raising awareness. Specific attack vectors, mitigation techniques, kill chain and threat agents addressing Smart Digital Environments (SDE), including Internet of Things (IoT), are discussed. Due to the rising number of IoT and embedded firmware devices within ubiquitous computing environments such as smart homes, smart businesses and smart cities, the top fifteen cyber threats are being used in a comprehensive manner to take advantage of vulnerabilities and launch cyber operations using multiple attack vectors. What began as ubiquitous, or pervasive, computing is now matured to smart environments where the vulnerabilities and threats are widespread.
Tactics Techniques and Procedures (TTPs) in cyber domain is an important threat information that describes the behavior and attack patterns of an adversary. Timely identification of associations between TTPs can lead to effective strategy for diagnosing the Cyber Threat Actors (CTAs) and their attack vectors. This study profiles the prevalence and regularities in the TTPs of CTAs. We developed a machine learning-based framework that takes as input Cyber Threat Intelligence (CTI) documents, selects the most prevalent TTPs with high information gain as features and based on them mine interesting regularities between TTPs using Association Rule Mining (ARM). We evaluated the proposed framework with publicly available TTPbased CTI documents. The results show that there are 28 TTPs more prevalent than the other TTPs. Our system identified 155 interesting association rules among the TTPs of CTAs. A summary of these rules is given to effectively investigate threats in the network.
Traditional security controls, such as firewalls, anti-virus and IDS, are ill-equipped to help IT security and response teams keep pace with the rapid evolution of the cyber threat landscape. Cyber Threat Intelligence (CTI) can help remediate this problem by exploiting non-traditional information sources, such as hacker forums and "dark-web" social platforms. Security and response teams can use the collected intelligence to identify emerging threats. Unfortunately, when manual analysis is used to extract CTI from non-traditional sources, it is a time consuming, error-prone and resource intensive process. We address these issues by using a hybrid Machine Learning model that automatically searches through hacker forum posts, identifies the posts that are most relevant to cyber security and then clusters the relevant posts into estimations of the topics that the hackers are discussing. The first (identification) stage uses Support Vector Machines and the second (clustering) stage uses Latent Dirichlet Allocation. We tested our model, using data from an actual hacker forum, to automatically extract information about various threats such as leaked credentials, malicious proxy servers, malware that evades AV detection, etc. The results demonstrate our method is an effective means for quickly extracting relevant and actionable intelligence that can be integrated with traditional security controls to increase their effectiveness.
Hacker forums and other social platforms may contain vital information about cyber security threats. But using manual analysis to extract relevant threat information from these sources is a time consuming and error-prone process that requires a significant allocation of resources. In this paper, we explore the potential of Machine Learning methods to rapidly sift through hacker forums for relevant threat intelligence. Utilizing text data from a real hacker forum, we compared the text classification performance of Convolutional Neural Network methods against more traditional Machine Learning approaches. We found that traditional machine learning methods, such as Support Vector Machines, can yield high levels of performance that are on par with Convolutional Neural Network algorithms.