Biblio
In the past air-gapped systems that are isolated from networks have been considered to be very secure. Yet there have been reports of such systems being breached. These breaches have shown to use unconventional means for communication also known as covert channels such as Acoustic, Electromagnetic, Magnetic, Electric, Optical, and Thermal to transfer data. In this paper, a review of various attack methods that can compromise an air-gapped system is presented along with a summary of how efficient and dangerous a particular method could be. The capabilities of each covert channel are listed to better understand the threat it poses and also some countermeasures to safeguard against such attack methods are mentioned. These attack methods have already been proven to work and awareness of such covert channels for data exfiltration is crucial in various industries.
Air-gapped networks are isolated from the Internet, since they store and process sensitive information. It has been shown that attackers can exfiltrate data from air-gapped networks by sending acoustic signals generated by computer speakers, however this type of covert channel relies on the existence of loudspeakers in the air-gapped environment. In this paper, we present CD-LEAK - a novel acoustic covert channel that works in constrained environments where loudspeakers are not available to the attacker. Malware installed on a compromised computer can maliciously generate acoustic signals via the optical CD/DVD drives. Binary information can then be modulated over the acoustic signals and be picked up by a nearby Internet connected receiver (e.g., a workstation, hidden microphone, smartphone, laptop, etc.). We examine CD/DVD drives and discuss their acoustical characteristics. We also present signal generation and detection, and data modulation and demodulation algorithms. Based on our proposed method, we developed a transmitter and receiver for PCs and smartphones, and provide the design and implementation details. We examine the channel and evaluate it on various optical drives. We also provide a set of countermeasures against this threat - which has been overlooked.
Advanced Persistent Threat (APT) is a stealthy, continuous and sophisticated method of network attacks, which can cause serious privacy leakage and millions of dollars losses. In this paper, we introduce a new game-theoretic framework of the interaction between a defender who uses limited Security Resources(SRs) to harden network and an attacker who adopts a multi-stage plan to attack the network. The game model is derived from Stackelberg games called a Multi-stage Maze Network Game (M2NG) in which the characteristics of APT are fully considered. The possible plans of the attacker are compactly represented using attack graphs(AGs), but the compact representation of the attacker's strategies presents a computational challenge and reaching the Nash Equilibrium(NE) is NP-hard. We present a method that first translates AGs into Markov Decision Process(MDP) and then achieves the optimal SRs allocation using the policy hill-climbing(PHC) algorithm. Finally, we present an empirical evaluation of the model and analyze the scalability and sensitivity of the algorithm. Simulation results exhibit that our proposed reinforcement learning-based SRs allocation is feasible and efficient.
Cybercrime is growing dramatically in the technological world nowadays. World Wide Web criminals exploit the personal information of internet users and use them to their advantage. Unethical users leverage the dark web to buy and sell illegal products or services and sometimes they manage to gain access to classified government information. A number of illegal activities that can be found in the dark web include selling or buying hacking tools, stolen data, digital fraud, terrorists activities, drugs, weapons, and more. The aim of this project is to collect evidence of any malicious activity in the dark web by using computer security mechanisms as traps called honeypots.
With increasing monitoring and regulation by platforms, communities with criminal interests are moving to the dark web, which hosts content ranging from whistle-blowing and privacy, to drugs, terrorism, and hacking. Using post discussion data from six dark web forums we construct six interaction graphs and use social network analysis tools to study these underground communities. We observe the structure of each network to highlight structural patterns and identify nodes of importance through network centrality analysis. Our findings suggest that in the majority of the forums some members are highly connected and form hubs, while most members have a lower number of connections. When examining the posting activities of central nodes we found that most of the central nodes post in sub-forums with broader topics, such as general discussions and tutorials. These members play different roles in the different forums, and within each forum we identified diverse user profiles.
The regularity of devastating cyber-attacks has made cybersecurity a grand societal challenge. Many cybersecurity professionals are closely examining the international Dark Web to proactively pinpoint potential cyber threats. Despite its potential, the Dark Web contains hundreds of thousands of non-English posts. While machine translation is the prevailing approach to process non-English text, applying MT on hacker forum text results in mistranslations. In this study, we draw upon Long-Short Term Memory (LSTM), Cross-Lingual Knowledge Transfer (CLKT), and Generative Adversarial Networks (GANs) principles to design a novel Adversarial CLKT (A-CLKT) approach. A-CLKT operates on untranslated text to retain the original semantics of the language and leverages the collective knowledge about cyber threats across languages to create a language invariant representation without any manual feature engineering or external resources. Three experiments demonstrate how A-CLKT outperforms state-of-the-art machine learning, deep learning, and CLKT algorithms in identifying cyber-threats in French and Russian forums.
In recent years, cyberattack techniques have become more and more sophisticated each day. Even if defense measures are taken against cyberattacks, it is difficult to prevent them completely. It can also be said that people can only fight defensively against cyber criminals. To address this situation, it is necessary to predict cyberattacks and take appropriate measures in advance, and the use of intelligence is important to make this possible. In general, many malicious hackers share information and tools that can be used for attacks on the dark web or in the specific communities. Therefore, we assume that a lot of intelligence, including this illegal content exists in cyber space. By using the threat intelligence, detecting attacks in advance and developing active defense is expected these days. However, such intelligence is currently extracted manually. In order to do this more efficiently, we apply machine learning to various forum posts that exist on the dark web, with the aim of extracting forum posts containing threat information. By doing this, we expect that detecting threat information in cyber space in a timely manner will be possible so that the optimal preventive measures will be taken in advance.
In this research, we examine and develop an expert system with a mechanism to automate crime category classification and threat level assessment, using the information collected by crawling the dark web. We have constructed a bag of words from 250 posts on the dark web and developed an expert system which takes the frequency of terms as an input and classifies sample posts into 6 criminal category dealing with drugs, stolen credit card, passwords, counterfeit products, child porn and others, and 3 threat levels (high, middle, low). Contrary to prior expectations, our simple and explainable expert system can perform competitively with other existing systems. For short, our experimental result with 1500 posts on the dark web shows 76.4% of recall rate for 6 criminal category classification and 83% of recall rate for 3 threat level discrimination for 100 random-sampled posts.
Cyber threat intelligence (CTI) necessitates automated monitoring of dark web platforms (e.g., Dark Net Markets and carding shops) on a large scale. While there are existing methods for collecting data from the surface web, large-scale dark web data collection is commonly hindered by anti-crawling measures. Text-based CAPTCHA serves as the most prohibitive type of these measures. Text-based CAPTCHA requires the user to recognize a combination of hard-to-read characters. Dark web CAPTCHA patterns are intentionally designed to have additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing CAPTCHA breaking methods cannot remedy these challenges and are therefore not applicable to the dark web. In this study, we propose a novel framework for breaking text-based CAPTCHA in the dark web. The proposed framework utilizes Generative Adversarial Network (GAN) to counteract dark web-specific background noise and leverages an enhanced character segmentation algorithm. Our proposed method was evaluated on both benchmark and dark web CAPTCHA testbeds. The proposed method significantly outperformed the state-of-the-art baseline methods on all datasets, achieving over 92.08% success rate on dark web testbeds. Our research enables the CTI community to develop advanced capabilities of large-scale dark web monitoring.
Nowadays, there is a flood of data such as naked body photos and child pornography, which is making people bloodless. In addition, people also distribute drugs through unknown dark channels. In particular, most transactions are being made through the Deep Web, the dark path. “Deep Web refers to an encrypted network that is not detected on search engine like Google etc. Users must use Tor to visit sites on the dark web” [4]. In other words, the Dark Web uses Tor's encryption client. Therefore, users can visit multiple sites on the dark Web, but not know the initiator of the site. In this paper, we propose the key idea based on the current status of such crimes and a crime information visual system for Deep Web has been developed. The status of deep web is analyzed and data is visualized using Java. It is expected that the program will help more efficient management and monitoring of crime in unknown web such as deep web, torrent etc.
Personally identifiable information (PII) has become a major target of cyber-attacks, causing severe losses to data breach victims. To protect data breach victims, researchers focus on collecting exposed PII to assess privacy risk and identify at-risk individuals. However, existing studies mostly rely on exposed PII collected from either the dark web or the surface web. Due to the wide exposure of PII on both the dark web and surface web, collecting from only the dark web or the surface web could result in an underestimation of privacy risk. Despite its research and practical value, jointly collecting PII from both sources is a non-trivial task. In this paper, we summarize our effort to systematically identify, collect, and monitor a total of 1,212,004,819 exposed PII records across both the dark web and surface web. Our effort resulted in 5.8 million stolen SSNs, 845,000 stolen credit/debit cards, and 1.2 billion stolen account credentials. From the surface web, we identified and collected over 1.3 million PII records of the victims whose PII is exposed on the dark web. To the best of our knowledge, this is the largest academic collection of exposed PII, which, if properly anonymized, enables various privacy research inquiries, including assessing privacy risk and identifying at-risk populations.
With the arrival of several face-swapping applications such as FaceApp, SnapChat, MixBooth, FaceBlender and many more, the authenticity of digital media content is hanging on a very loose thread. On social media platforms, videos are widely circulated often at a high compression factor. In this work, we analyze several deep learning approaches in the context of deepfakes classification in high compression scenarios and demonstrate that a proposed approach based on metric learning can be very effective in performing such a classification. Using less number of frames per video to assess its realism, the metric learning approach using a triplet network architecture proves to be fruitful. It learns to enhance the feature space distance between the cluster of real and fake videos embedding vectors. We validated our approaches on two datasets to analyze the behavior in different environments. We achieved a state-of-the-art AUC score of 99.2% on the Celeb-DF dataset and accuracy of 90.71% on a highly compressed Neural Texture dataset. Our approach is especially helpful on social media platforms where data compression is inevitable.