Biblio
Tor is a well known and widely used darknet, known for its anonymity. However, while its protocol and relay security have already been extensively studied, to date there is no comprehensive analysis of the structure and privacy of its Web Hidden Service. To fill this gap, we developed a dedicated analysis platform and used it to crawl and analyze over 1.5M URLs hosted in 7257 onion domains. For each page we analyzed its links, resources, and redirections graphs, as well as the language and category distribution. According to our experiments, Tor hidden services are organized in a sparse but highly connected graph, in which around 10% of the onions sites are completely isolated. Our study also measures for the first time the tight connection that exists between Tor hidden services and the Surface Web. In fact, more than 20% of the onion domains we visited imported resources from the Surface Web, and links to the Surface Web are even more prevalent than to other onion domains. Finally, we measured for the first time the prevalence and the nature of web tracking in Tor hidden services, showing that, albeit not as widespread as in the Surface Web, tracking is notably present also in the Dark Web: more than 40% of the scripts are used for this purpose, with the 70% of them being completely new tracking scripts unknown by existing anti-tracking solutions.
The Dark Web is known as the part of the Internet operated by decentralized and anonymous-preserving protocols like Tor. To date, the research community has focused on understanding the size and characteristics of the Dark Web and the services and goods that are offered in its underground markets. However, little is still known about the attacks landscape in the Dark Web. For the traditional Web, it is now well understood how websites are exploited, as well as the important role played by Google Dorks and automated attack bots to form some sort of "background attack noise" to which public websites are exposed. This paper tries to understand if these basic concepts and components have a parallel in the Dark Web. In particular, by deploying a high interaction honeypot in the Tor network for a period of seven months, we conducted a measurement study of the type of attacks and of the attackers behavior that affect this still relatively unknown corner of the Web.
Phishing is a form of online identity theft that deceives unaware users into disclosing their confidential information. While significant effort has been devoted to the mitigation of phishing attacks, much less is known about the entire life-cycle of these attacks in the wild, which constitutes, however, a main step toward devising comprehensive anti-phishing techniques. In this paper, we present a novel approach to sandbox live phishing kits that completely protects the privacy of victims. By using this technique, we perform a comprehensive real-world assessment of phishing attacks, their mechanisms, and the behavior of the criminals, their victims, and the security community involved in the process – based on data collected over a period of five months. Our infrastructure allowed us to draw the first comprehensive picture of a phishing attack, from the time in which the attacker installs and tests the phishing pages on a compromised host, until the last interaction with real victims and with security researchers. Our study presents accurate measurements of the duration and effectiveness of this popular threat, and discusses many new and interesting aspects we observed by monitoring hundreds of phishing campaigns.
Code reuse attacks based on return oriented programming (ROP) are becoming more and more prevalent every year. They started as a way to circumvent operating systems protections against injected code, but they are now also used as a technique to keep the malicious code hidden from detection and analysis systems. This means that while in the past ROP chains were short and simple (and therefore did not require any dedicated tool for their analysis), we recently started to observe very complex algorithms – such as a complete rootkit – implemented entirely as a sequence of ROP gadgets. In this paper, we present a set of techniques to analyze complex code reuse attacks. First, we identify and discuss the main challenges that complicate the reverse engineer of code implemented using ROP. Second, we propose an emulation-based framework to dissect, reconstruct, and simplify ROP chains. Finally, we test our tool on the most complex example available to date: a ROP rootkit containing four separate chains, two of them dynamically generated at runtime.