Visible to the public Biblio

Filters: Keyword is cyber threat intelligence  [Clear All Filters]
2023-06-02
Dalvi, Ashwini, Patil, Gunjan, Bhirud, S G.  2022.  Dark Web Marketplace Monitoring - The Emerging Business Trend of Cybersecurity. 2022 International Conference on Trends in Quantum Computing and Emerging Business Technologies (TQCEBT). :1—6.

Cyber threat intelligence (CTI) is vital for enabling effective cybersecurity decisions by providing timely, relevant, and actionable information about emerging threats. Monitoring the dark web to generate CTI is one of the upcoming trends in cybersecurity. As a result, developing CTI capabilities with the dark web investigation is a significant focus for cybersecurity companies like Deepwatch, DarkOwl, SixGill, ThreatConnect, CyLance, ZeroFox, and many others. In addition, the dark web marketplace (DWM) monitoring tools are of much interest to law enforcement agencies (LEAs). The fact that darknet market participants operate anonymously and online transactions are pseudo-anonymous makes it challenging to identify and investigate them. Therefore, keeping up with the DWMs poses significant challenges for LEAs today. Nevertheless, the offerings on the DWM give insights into the dark web economy to LEAs. The present work is one such attempt to describe and analyze dark web market data collected for CTI using a dark web crawler. After processing and labeling, authors have 53 DWMs with their product listings and pricing.

2023-04-28
Suryotrisongko, Hatma, Ginardi, Hari, Ciptaningtyas, Henning Titi, Dehqan, Saeed, Musashi, Yasuo.  2022.  Topic Modeling for Cyber Threat Intelligence (CTI). 2022 Seventh International Conference on Informatics and Computing (ICIC). :1–7.
Topic modeling algorithms from the natural language processing (NLP) discipline have been used for various applications. For instance, topic modeling for the product recommendation systems in the e-commerce systems. In this paper, we briefly reviewed topic modeling applications and then described our proposed idea of utilizing topic modeling approaches for cyber threat intelligence (CTI) applications. We improved the previous work by implementing BERTopic and Top2Vec approaches, enabling users to select their preferred pre-trained text/sentence embedding model, and supporting various languages. We implemented our proposed idea as the new topic modeling module for the Open Web Application Security Project (OWASP) Maryam: Open-Source Intelligence (OSINT) framework. We also described our experiment results using a leaked hacker forum dataset (nulled.io) to attract more researchers and open-source communities to participate in the Maryam project of OWASP Foundation.
2023-01-13
Kaiser, Florian K., Andris, Leon J., Tennig, Tim F., Iser, Jonas M., Wiens, Marcus, Schultmann, Frank.  2022.  Cyber threat intelligence enabled automated attack incident response. 2022 3rd International Conference on Next Generation Computing Applications (NextComp). :1—6.
Cyber attacks keep states, companies and individuals at bay, draining precious resources including time, money, and reputation. Attackers thereby seem to have a first mover advantage leading to a dynamic defender attacker game. Automated approaches taking advantage of Cyber Threat Intelligence on past attacks bear the potential to empower security professionals and hence increase cyber security. Consistently, there has been a lot of research on automated approaches in cyber risk management including works on predictive attack algorithms and threat hunting. Combining data on countermeasures from “MITRE Detection, Denial, and Disruption Framework Empowering Network Defense” and adversarial data from “MITRE Adversarial Tactics, Techniques and Common Knowledge” this work aims at developing methods that enable highly precise and efficient automatic incident response. We introduce Attack Incident Responder, a methodology working with simple heuristics to find the most efficient sets of counter-measures for hypothesized attacks. By doing so, the work contributes to narrowing the attackers first mover advantage. Experimental results are promising high average precisions in predicting effiective defenses when using the methodology. In addition, we compare the proposed defense measures against a static set of defensive techniques offering robust security against observed attacks. Furthermore, we combine the approach of automated incidence response to an approach for threat hunting enabling full automation of security operation centers. By this means, we define a threshold in the precision of attack hypothesis generation that must be met for predictive defense algorithms to outperform the baseline. The calculated threshold can be used to evaluate attack hypothesis generation algorithms. The presented methodology for automated incident response may be a valuable support for information security professionals. Last, the work elaborates on the combination of static base defense with adaptive incidence response for generating a bio-inspired artificial immune system for computerized networks.
2022-06-09
Pour, Morteza Safaei, Watson, Dylan, Bou-Harb, Elias.  2021.  Sanitizing the IoT Cyber Security Posture: An Operational CTI Feed Backed up by Internet Measurements. 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :497–506.

The Internet-of-Things (IoT) paradigm at large continues to be compromised, hindering the privacy, dependability, security, and safety of our nations. While the operational security communities (i.e., CERTS, SOCs, CSIRT, etc.) continue to develop capabilities for monitoring cyberspace, tools which are IoT-centric remain at its infancy. To this end, we address this gap by innovating an actionable Cyber Threat Intelligence (CTI) feed related to Internet-scale infected IoT devices. The feed analyzes, in near real-time, 3.6TB of daily streaming passive measurements ( ≈ 1M pps) by applying a custom-developed learning methodology to distinguish between compromised IoT devices and non-IoT nodes, in addition to labeling the type and vendor. The feed is augmented with third party information to provide contextual information. We report on the operation, analysis, and shortcomings of the feed executed during an initial deployment period. We make the CTI feed available for ingestion through a public, authenticated API and a front-end platform.

2022-04-12
Yucel, Cagatay, Chalkias, Ioannis, Mallis, Dimitrios, Cetinkaya, Deniz, Henriksen-Bulmer, Jane, Cooper, Alice.  2021.  Data Sanitisation and Redaction for Cyber Threat Intelligence Sharing Platforms. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :343—347.
The recent technological advances and changes in the daily human activities increased the production and sharing of data. In the ecosystem of interconnected systems, data can be circulated among systems for various reasons. This could lead to exchange of private or sensitive information between entities. Data Sanitisation involves processes and practices that remove sensitive and private information from documents before sharing them with entities that should not have access to this information. This paper presents the design and development of a data sanitisation and redaction solution for a Cyber Threat Intelligence sharing platform. The Data Sanitisation and Redaction Plugin has been designed with the purpose of operating as a plugin for the ECHO Project’s Early Warning System platform and enhancing its operative capabilities during information sharing. This plugin aims to provide automated security and privacy-based controls to the concept of CTI sharing over a ticketing system. The plugin has been successfully tested and the results are presented in this paper.
Evangelatos, Pavlos, Iliou, Christos, Mavropoulos, Thanassis, Apostolou, Konstantinos, Tsikrika, Theodora, Vrochidis, Stefanos, Kompatsiaris, Ioannis.  2021.  Named Entity Recognition in Cyber Threat Intelligence Using Transformer-based Models. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :348—353.
The continuous increase in sophistication of threat actors over the years has made the use of actionable threat intelligence a critical part of the defence against them. Such Cyber Threat Intelligence is published daily on several online sources, including vulnerability databases, CERT feeds, and social media, as well as on forums and web pages from the Surface and the Dark Web. Named Entity Recognition (NER) techniques can be used to extract the aforementioned information in an actionable form from such sources. In this paper we investigate how the latest advances in the NER domain, and in particular transformer-based models, can facilitate this process. To this end, the dataset for NER in Threat Intelligence (DNRTI) containing more than 300 pieces of threat intelligence reports from open source threat intelligence websites is used. Our experimental results demonstrate that transformer-based techniques are very effective in extracting cybersecurity-related named entities, by considerably outperforming the previous state- of-the-art approaches tested with DNRTI.
2022-02-09
Ranade, Priyanka, Piplai, Aritran, Mittal, Sudip, Joshi, Anupam, Finin, Tim.  2021.  Generating Fake Cyber Threat Intelligence Using Transformer-Based Models. 2021 International Joint Conference on Neural Networks (IJCNN). :1–9.
Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these systems. Adversaries can use fake CTI examples as training input to subvert cyber defense systems, forcing their models to learn incorrect inputs to serve the attackers' malicious needs. In this paper, we show how to automatically generate fake CTI text descriptions using transformers. Given an initial prompt sentence, a public language model like GPT-2 with fine-tuning can generate plausible CTI text that can mislead cyber-defense systems. We use the generated fake CTI text to perform a data poisoning attack on a Cybersecurity Knowledge Graph (CKG) and a cybersecurity corpus. The attack introduced adverse impacts such as returning incorrect reasoning outputs, representation poisoning, and corruption of other dependent AI-based cyber defense systems. We evaluate with traditional approaches and conduct a human evaluation study with cyber-security professionals and threat hunters. Based on the study, professional threat hunters were equally likely to consider our fake generated CTI and authentic CTI as true.
2021-12-20
Kriaa, Siwar, Chaabane, Yahia.  2021.  SecKG: Leveraging attack detection and prediction using knowledge graphs. 2021 12th International Conference on Information and Communication Systems (ICICS). :112–119.
Advanced persistent threats targeting sensitive corporations, are becoming today stealthier and more complex, coordinating different attacks steps and lateral movements, and trying to stay undetected for long time. Classical security solutions that rely on signature-based detection can be easily thwarted by malware using obfuscation and encryption techniques. More recent solutions are using machine learning approaches for detecting outliers. Nevertheless, the majority of them reason on tabular unstructured data which can lead to missing obvious conclusions. We propose in this paper a novel approach that leverages a combination of both knowledge graphs and machine learning techniques to detect and predict attacks. Using Cyber Threat Intelligence (CTI), we built a knowledge graph that processes event logs in order to not only detect attack techniques, but also learn how to predict them.
2021-10-04
Xu, Yuanchen, Yang, Yingjie, He, Ying.  2020.  A Representation of Business Oriented Cyber Threat Intelligence and the Objects Assembly. 2020 10th International Conference on Information Science and Technology (ICIST). :105–113.
Cyber threat intelligence (CTI) is an effective approach to improving cyber security of businesses. CTI provides information of business contexts affected by cyber threats and the corresponding countermeasures. If businesses can identify relevant CTI, they can take defensive actions before the threats, described in the relevant CTI, take place. However, businesses still lack knowledge to help identify relevant CTI. Furthermore, information in real-world systems is usually vague, imprecise, inconsistent and incomplete. This paper defines a business object that is a business context surrounded by CTI. A business object models the connection knowledge for CTI onto the business. To assemble the business objects, this paper proposes a novel representation of business oriented CTI and a system used for constructing and extracting the business objects. Generalised grey numbers, fuzzy sets and rough sets are used for the representation, and set approximations are used for the extraction of the business objects. We develop a prototype of the system and use a case study to demonstrate how the system works. We then conclude the paper together with the future research directions.
2021-03-09
Oosthoek, K., Doerr, C..  2020.  From Hodl to Heist: Analysis of Cyber Security Threats to Bitcoin Exchanges. 2020 IEEE International Conference on Blockchain and Cryptocurrency (ICBC). :1—9.

Bitcoin is gaining traction as an alternative store of value. Its market capitalization transcends all other cryptocurrencies in the market. But its high monetary value also makes it an attractive target to cyber criminal actors. Hacking campaigns usually target the weakest points in an ecosystem. In Bitcoin, these are currently the exchange platforms. As each exchange breach potentially decreases Bitcoin's market value by billions, it is a threat not only to direct victims, but to everyone owning Bitcoin. Based on an extensive analysis of 36 breaches of Bitcoin exchanges, we show the attack patterns used to exploit Bitcoin exchange platforms using an industry standard for reporting intelligence on cyber security breaches. Based on this we are able to provide an overview of the most common attack vectors, showing that all except three hacks were possible due to relatively lax security. We also show that while the security regimen of Bitcoin exchanges is not on par with other financial service providers, the use of stolen credentials, which does not require any hacking, is decreasing. We also show that the amount of BTC taken during a breach is decreasing, as well as the exchanges that terminate after being breached. With exchanges being targeted by nation-state hacking groups, security needs to be a first concern.

2021-03-04
Hajizadeh, M., Afraz, N., Ruffini, M., Bauschert, T..  2020.  Collaborative Cyber Attack Defense in SDN Networks using Blockchain Technology. 2020 6th IEEE Conference on Network Softwarization (NetSoft). :487—492.

The legacy security defense mechanisms cannot resist where emerging sophisticated threats such as zero-day and malware campaigns have profoundly changed the dimensions of cyber-attacks. Recent studies indicate that cyber threat intelligence plays a crucial role in implementing proactive defense operations. It provides a knowledge-sharing platform that not only increases security awareness and readiness but also enables the collaborative defense to diminish the effectiveness of potential attacks. In this paper, we propose a secure distributed model to facilitate cyber threat intelligence sharing among diverse participants. The proposed model uses blockchain technology to assure tamper-proof record-keeping and smart contracts to guarantee immutable logic. We use an open-source permissioned blockchain platform, Hyperledger Fabric, to implement the blockchain application. We also utilize the flexibility and management capabilities of Software-Defined Networking to be integrated with the proposed sharing platform to enhance defense perspectives against threats in the system. In the end, collaborative DDoS attack mitigation is taken as a case study to demonstrate our approach.

2021-01-15
Kadoguchi, M., Kobayashi, H., Hayashi, S., Otsuka, A., Hashimoto, M..  2020.  Deep Self-Supervised Clustering of the Dark Web for Cyber Threat Intelligence. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1—6.

In recent years, cyberattack techniques have become more and more sophisticated each day. Even if defense measures are taken against cyberattacks, it is difficult to prevent them completely. It can also be said that people can only fight defensively against cyber criminals. To address this situation, it is necessary to predict cyberattacks and take appropriate measures in advance, and the use of intelligence is important to make this possible. In general, many malicious hackers share information and tools that can be used for attacks on the dark web or in the specific communities. Therefore, we assume that a lot of intelligence, including this illegal content exists in cyber space. By using the threat intelligence, detecting attacks in advance and developing active defense is expected these days. However, such intelligence is currently extracted manually. In order to do this more efficiently, we apply machine learning to various forum posts that exist on the dark web, with the aim of extracting forum posts containing threat information. By doing this, we expect that detecting threat information in cyber space in a timely manner will be possible so that the optimal preventive measures will be taken in advance.

Zhang, N., Ebrahimi, M., Li, W., Chen, H..  2020.  A Generative Adversarial Learning Framework for Breaking Text-Based CAPTCHA in the Dark Web. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1—6.

Cyber threat intelligence (CTI) necessitates automated monitoring of dark web platforms (e.g., Dark Net Markets and carding shops) on a large scale. While there are existing methods for collecting data from the surface web, large-scale dark web data collection is commonly hindered by anti-crawling measures. Text-based CAPTCHA serves as the most prohibitive type of these measures. Text-based CAPTCHA requires the user to recognize a combination of hard-to-read characters. Dark web CAPTCHA patterns are intentionally designed to have additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing CAPTCHA breaking methods cannot remedy these challenges and are therefore not applicable to the dark web. In this study, we propose a novel framework for breaking text-based CAPTCHA in the dark web. The proposed framework utilizes Generative Adversarial Network (GAN) to counteract dark web-specific background noise and leverages an enhanced character segmentation algorithm. Our proposed method was evaluated on both benchmark and dark web CAPTCHA testbeds. The proposed method significantly outperformed the state-of-the-art baseline methods on all datasets, achieving over 92.08% success rate on dark web testbeds. Our research enables the CTI community to develop advanced capabilities of large-scale dark web monitoring.

2020-08-07
Berady, Aimad, Viet Triem Tong, Valerie, Guette, Gilles, Bidan, Christophe, Carat, Guillaume.  2019.  Modeling the Operational Phases of APT Campaigns. 2019 International Conference on Computational Science and Computational Intelligence (CSCI). :96—101.
In the context of Advanced Persistent Threat (APT) attacks, this paper introduces a model, called Nuke, which tries to provide a more operational reading of the attackers' lifecycle in a compromised network. It allows to consider the notions of regression; and repetitiveness of final objectives achievement. By confronting this model with examples of recent attacks (Equifax data breach and TV5Monde sabotage), we emphasize the importance of the attack chronology in the Cyber Threat Intelligence (CTI) reports, as well as the Tactics, Techniques and Procedures (TTP) used by the attacker during his progression.
2020-08-03
Parmar, Manisha, Domingo, Alberto.  2019.  On the Use of Cyber Threat Intelligence (CTI) in Support of Developing the Commander's Understanding of the Adversary. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–6.
Cyber Threat Intelligence (CTI) is a rapidly developing field which has evolved in direct response to exponential growth in cyber related crimes and attacks. CTI supports Communication and Information System (CIS)Security in order to bolster defenses and aids in the development of threat models that inform an organization's decision making process. In a military organization like NATO, CTI additionally supports Cyberspace Operations by providing the Commander with essential intelligence about the adversary, their capabilities and objectives while operating in and through cyberspace. There have been many contributions to the CTI field; a noteworthy contribution is the ATT&CK® framework by the Mitre Corporation. ATT&CK® contains a comprehensive list of adversary tactics and techniques linked to custom or publicly known Advanced Persistent Threats (APT) which aids an analyst in the characterization of Indicators of Compromise (IOCs). The ATT&CK® framework also demonstrates possibility of supporting an organization with linking observed tactics and techniques to specific APT behavior, which may assist with adversary characterization and identification, necessary steps towards attribution. The NATO Allied Command Transformation (ACT) and the NATO Communication and Information Agency (NCI Agency) have been experimenting with the use of deception techniques (including decoys) to increase the collection of adversary related data. The collected data is mapped to the tactics and techniques described in the ATT&CK® framework, in order to derive evidence to support adversary characterization; this intelligence is pivotal for the Commander to support mission planning and determine the best possible multi-domain courses of action. This paper describes the approach, methodology, outcomes and next steps for the conducted experiments.
2020-07-10
Koloveas, Paris, Chantzios, Thanasis, Tryfonopoulos, Christos, Skiadopoulos, Spiros.  2019.  A Crawler Architecture for Harvesting the Clear, Social, and Dark Web for IoT-Related Cyber-Threat Intelligence. 2019 IEEE World Congress on Services (SERVICES). 2642-939X:3—8.

The clear, social, and dark web have lately been identified as rich sources of valuable cyber-security information that -given the appropriate tools and methods-may be identified, crawled and subsequently leveraged to actionable cyber-threat intelligence. In this work, we focus on the information gathering task, and present a novel crawling architecture for transparently harvesting data from security websites in the clear web, security forums in the social web, and hacker forums/marketplaces in the dark web. The proposed architecture adopts a two-phase approach to data harvesting. Initially a machine learning-based crawler is used to direct the harvesting towards websites of interest, while in the second phase state-of-the-art statistical language modelling techniques are used to represent the harvested information in a latent low-dimensional feature space and rank it based on its potential relevance to the task at hand. The proposed architecture is realised using exclusively open-source tools, and a preliminary evaluation with crowdsourced results demonstrates its effectiveness.

2020-01-28
KADOGUCHI, Masashi, HAYASHI, Shota, HASHIMOTO, Masaki, OTSUKA, Akira.  2019.  Exploring the Dark Web for Cyber Threat Intelligence Using Machine Leaning. 2019 IEEE International Conference on Intelligence and Security Informatics (ISI). :200–202.

In recent years, cyber attack techniques are increasingly sophisticated, and blocking the attack is more and more difficult, even if a kind of counter measure or another is taken. In order for a successful handling of this situation, it is crucial to have a prediction of cyber attacks, appropriate precautions, and effective utilization of cyber intelligence that enables these actions. Malicious hackers share various kinds of information through particular communities such as the dark web, indicating that a great deal of intelligence exists in cyberspace. This paper focuses on forums on the dark web and proposes an approach to extract forums which include important information or intelligence from huge amounts of forums and identify traits of each forum using methodologies such as machine learning, natural language processing and so on. This approach will allow us to grasp the emerging threats in cyberspace and take appropriate measures against malicious activities.

2019-10-23
Zieger, Andrej, Freiling, Felix, Kossakowski, Klaus-Peter.  2018.  The $\beta$-Time-to-Compromise Metric for Practical Cyber Security Risk Estimation. 2018 11th International Conference on IT Security Incident Management IT Forensics (IMF). :115-133.

To manage cybersecurity risks in practice, a simple yet effective method to assess suchs risks for individual systems is needed. With time-to-compromise (TTC), McQueen et al. (2005) introduced such a metric that measures the expected time that a system remains uncompromised given a specific threat landscape. Unlike other approaches that require complex system modeling to proceed, TTC combines simplicity with expressiveness and therefore has evolved into one of the most successful cybersecurity metrics in practice. We revisit TTC and identify several mathematical and methodological shortcomings which we address by embedding all aspects of the metric into the continuous domain and the possibility to incorporate information about vulnerability characteristics and other cyber threat intelligence into the model. We propose $\beta$-TTC, a formal extension of TTC which includes information from CVSS vectors as well as a continuous attacker skill based on a $\beta$-distribution. We show that our new metric (1) remains simple enough for practical use and (2) gives more realistic predictions than the original TTC by using data from a modern and productively used vulnerability database of a national CERT.

2019-07-01
Zieger, A., Freiling, F., Kossakowski, K..  2018.  The β-Time-to-Compromise Metric for Practical Cyber Security Risk Estimation. 2018 11th International Conference on IT Security Incident Management IT Forensics (IMF). :115–133.

To manage cybersecurity risks in practice, a simple yet effective method to assess suchs risks for individual systems is needed. With time-to-compromise (TTC), McQueen et al. (2005) introduced such a metric that measures the expected time that a system remains uncompromised given a specific threat landscape. Unlike other approaches that require complex system modeling to proceed, TTC combines simplicity with expressiveness and therefore has evolved into one of the most successful cybersecurity metrics in practice. We revisit TTC and identify several mathematical and methodological shortcomings which we address by embedding all aspects of the metric into the continuous domain and the possibility to incorporate information about vulnerability characteristics and other cyber threat intelligence into the model. We propose β-TTC, a formal extension of TTC which includes information from CVSS vectors as well as a continuous attacker skill based on a β-distribution. We show that our new metric (1) remains simple enough for practical use and (2) gives more realistic predictions than the original TTC by using data from a modern and productively used vulnerability database of a national CERT.

2019-03-15
Kettani, Houssain, Cannistra, Robert M..  2018.  On Cyber Threats to Smart Digital Environments. Proceedings of the 2Nd International Conference on Smart Digital Environment. :183-188.

Cyber threats and attacks have significantly increased in complexity and quantity throughout this past year. In this paper, the top fifteen cyber threats and trends are articulated in detail to provide awareness throughout the community and raising awareness. Specific attack vectors, mitigation techniques, kill chain and threat agents addressing Smart Digital Environments (SDE), including Internet of Things (IoT), are discussed. Due to the rising number of IoT and embedded firmware devices within ubiquitous computing environments such as smart homes, smart businesses and smart cities, the top fifteen cyber threats are being used in a comprehensive manner to take advantage of vulnerabilities and launch cyber operations using multiple attack vectors. What began as ubiquitous, or pervasive, computing is now matured to smart environments where the vulnerabilities and threats are widespread.

Noor, U., Anwar, Z., Noor, U., Anwar, Z., Rashid, Z..  2018.  An Association Rule Mining-Based Framework for Profiling Regularities in Tactics Techniques and Procedures of Cyber Threat Actors. 2018 International Conference on Smart Computing and Electronic Enterprise (ICSCEE). :1-6.

Tactics Techniques and Procedures (TTPs) in cyber domain is an important threat information that describes the behavior and attack patterns of an adversary. Timely identification of associations between TTPs can lead to effective strategy for diagnosing the Cyber Threat Actors (CTAs) and their attack vectors. This study profiles the prevalence and regularities in the TTPs of CTAs. We developed a machine learning-based framework that takes as input Cyber Threat Intelligence (CTI) documents, selects the most prevalent TTPs with high information gain as features and based on them mine interesting regularities between TTPs using Association Rule Mining (ARM). We evaluated the proposed framework with publicly available TTPbased CTI documents. The results show that there are 28 TTPs more prevalent than the other TTPs. Our system identified 155 interesting association rules among the TTPs of CTAs. A summary of these rules is given to effectively investigate threats in the network.

Deliu, I., Leichter, C., Franke, K..  2018.  Collecting Cyber Threat Intelligence from Hacker Forums via a Two-Stage, Hybrid Process Using Support Vector Machines and Latent Dirichlet Allocation. 2018 IEEE International Conference on Big Data (Big Data). :5008-5013.

Traditional security controls, such as firewalls, anti-virus and IDS, are ill-equipped to help IT security and response teams keep pace with the rapid evolution of the cyber threat landscape. Cyber Threat Intelligence (CTI) can help remediate this problem by exploiting non-traditional information sources, such as hacker forums and "dark-web" social platforms. Security and response teams can use the collected intelligence to identify emerging threats. Unfortunately, when manual analysis is used to extract CTI from non-traditional sources, it is a time consuming, error-prone and resource intensive process. We address these issues by using a hybrid Machine Learning model that automatically searches through hacker forum posts, identifies the posts that are most relevant to cyber security and then clusters the relevant posts into estimations of the topics that the hackers are discussing. The first (identification) stage uses Support Vector Machines and the second (clustering) stage uses Latent Dirichlet Allocation. We tested our model, using data from an actual hacker forum, to automatically extract information about various threats such as leaked credentials, malicious proxy servers, malware that evades AV detection, etc. The results demonstrate our method is an effective means for quickly extracting relevant and actionable intelligence that can be integrated with traditional security controls to increase their effectiveness.

2019-03-04
Husari, G., Niu, X., Chu, B., Al-Shaer, E..  2018.  Using Entropy and Mutual Information to Extract Threat Actions from Cyber Threat Intelligence. 2018 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–6.
With the rapid growth of the cyber attacks, cyber threat intelligence (CTI) sharing becomes essential for providing advance threat notice and enabling timely response to cyber attacks. Our goal in this paper is to develop an approach to extract low-level cyber threat actions from publicly available CTI sources in an automated manner to enable timely defense decision making. Specifically, we innovatively and successfully used the metrics of entropy and mutual information from Information Theory to analyze the text in the cybersecurity domain. Combined with some basic NLP techniques, our framework, called ActionMiner has achieved higher precision and recall than the state-of-the-art Stanford typed dependency parser, which usually works well in general English but not cybersecurity texts.
2019-02-08
Mavroeidis, Vasileios, Jøsang, Audun.  2018.  Data-Driven Threat Hunting Using Sysmon. Proceedings of the 2Nd International Conference on Cryptography, Security and Privacy. :82-88.
Threat actors can be persistent, motivated and agile, and they leverage a diversified and extensive set of tactics, techniques, and procedures to attain their goals. In response to that, organizations establish threat intelligence programs to improve their defense capabilities and mitigate risk. Actionable threat intelligence is integrated into security information and event management systems (SIEM) forming a threat intelligence platform. A threat intelligence platform aggregates log data from multiple disparate sources by deploying numerous collection agents and provides centralized analysis and reporting of an organization's security events for identifying malicious activity. Sysmon logs is a data source that has received considerable attention for endpoint visibility. Approaches for threat detection using Sysmon have been proposed mainly focusing on search engines (NoSQL database systems). This paper presents a new automated threat assessment system that relies on the analysis of continuous incoming feeds of Sysmon logs. The system is based on a cyber threat intelligence ontology and analyses Sysmon logs to classify software in different threat levels and augment cyber defensive capabilities through situational awareness, prediction, and automated courses of action.
2018-04-11
Deliu, I., Leichter, C., Franke, K..  2017.  Extracting Cyber Threat Intelligence from Hacker Forums: Support Vector Machines versus Convolutional Neural Networks. 2017 IEEE International Conference on Big Data (Big Data). :3648–3656.

Hacker forums and other social platforms may contain vital information about cyber security threats. But using manual analysis to extract relevant threat information from these sources is a time consuming and error-prone process that requires a significant allocation of resources. In this paper, we explore the potential of Machine Learning methods to rapidly sift through hacker forums for relevant threat intelligence. Utilizing text data from a real hacker forum, we compared the text classification performance of Convolutional Neural Network methods against more traditional Machine Learning approaches. We found that traditional machine learning methods, such as Support Vector Machines, can yield high levels of performance that are on par with Convolutional Neural Network algorithms.