Biblio
This paper presents an approach to formalizing and enforcing a class of use privacy properties in data-driven systems. In contrast to prior work, we focus on use restrictions on proxies (i.e. strong predictors) of protected information types. Our definition relates proxy use to intermediate computations that occur in a program, and identify two essential properties that characterize this behavior: 1) its result is strongly associated with the protected information type in question, and 2) it is likely to causally affect the final output of the program. For a specific instantiation of this definition, we present a program analysis technique that detects instances of proxy use in a model, and provides a witness that identifies which parts of the corresponding program exhibit the behavior. Recognizing that not all instances of proxy use of a protected information type are inappropriate, we make use of a normative judgment oracle that makes this inappropriateness determination for a given witness. Our repair algorithm uses the witness of an inappropriate proxy use to transform the model into one that provably does not exhibit proxy use, while avoiding changes that unduly affect classification accuracy. Using a corpus of social datasets, our evaluation shows that these algorithms are able to detect proxy use instances that would be difficult to find using existing techniques, and subsequently remove them while maintaining acceptable classification performance.
Internet-of-Things devices often collect and transmit sensitive information like camera footage, health monitoring data, or whether someone is home. These devices protect data in transit with end-to-end encryption, typically using TLS connections between devices and associated cloud services. But these TLS connections also prevent device owners from observing what their own devices are saying about them. Unlike in traditional Internet applications, where the end user controls one end of a connection (e.g., their web browser) and can observe its communication, Internet-of-Things vendors typically control the software in both the device and the cloud. As a result, owners have no way to audit the behavior of their own devices, leaving them little choice but to hope that these devices are transmitting only what they should. This paper presents TLS–Rotate and Release (TLS-RaR), a system that allows device owners (e.g., consumers, security researchers, and consumer watchdogs) to authorize devices, called auditors, to decrypt and verify recent TLS traffic without compromising future traffic. Unlike prior work, TLS-RaR requires no changes to TLS's wire format or cipher suites, and it allows the device's owner to conduct a surprise inspection of recent traffic, without prior notice to the device that its communications will be audited.
Mobile Devices are part of our lives and we store a lot of private information on it as well as use services that handle sensitive information (e.g. mobile health apps). Whenever users install an application on their smartphones they have to decide whether to trust the applications and share private and sensitive data with at least the developer-owned services. But almost all modern apps not only transmit data to the developer owned servers but also send information to advertising-, analyzing and tracking partners. This paper presents an approach for a "privacy- proxy" which enables to filter unwanted data traffic to third party services without installing additional applications on the smartphone. It is based on a firewall using a black list of tracking- and analyzing networks which is automatically updated on a daily basis. The proof of concept has been implemented with open source components on a Raspberry Pi.
Botnets have become one of the most significant cyber threats over the last decade. The diffusion of the "Internet of Things" and its for-profit exploitation, contributed to botnets spread and sophistication, thus providing real, efficient and profitable criminal cyber-services. Recent research on botnet detection focuses on traffic pattern-based detection, and on analyzing the network traffic generated by the infected hosts, in order to find behavioral patterns independent from the specific payloads, architectures and protocols. In this paper we address the periodic behavioral patterns of infected hosts communicating with their Command-and-Control servers. The main novelty introduced is related to the traffic analysis in the frequency domain without using the well-known Fast Fourier Transform. Moreover, the mentioned analysis is performed through the exploitation of the proxy logs, easily deployable on almost every real-world scenario, from enterprise networks to mobile devices.
Botnets are considered one of the most dangerous species of network-based attack today because they involve the use of very large coordinated groups of hosts simultaneously. The behavioral analysis of computer networks is at the basis of the modern botnet detection methods, in order to intercept traffic generated by malwares for which signatures do not exist yet. Defining a pattern of features to be placed at the basis of behavioral analysis, puts the emphasis on the quantity and quality of information to be caught and used to mark data streams as normal or abnormal. The problem is even more evident if we consider extensive computer networks or clouds. With the present paper we intend to show how heuristics applied to large-scale proxy logs, considering a typical phase of the life cycle of botnets such as the search for C&C Servers through AGDs (Algorithmically Generated Domains), may provide effective and extremely rapid results. The present work will introduce some novel paradigms. The first is that some of the elements of the supply chain of botnets could be completed without any interaction with the Internet, mostly in presence of wide computer networks and/or clouds. The second is that behind a large number of workstations there are usually "human beings" and it is unlikely that their behaviors will cause marked changes in the interaction with the Internet in a fairly narrow time frame. Finally, AGDs can highlight, at the moment, common lexical features, detectable quickly and without using any black/white list.