Title | A System for Detecting Third-Party Tracking through the Combination of Dynamic Analysis and Static Analysis |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Sun, Jingxue, Huang, Zhiqiu, Yang, Ting, Wang, Wengjie, Zhang, Yuqing |
Conference Name | IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) |
Date Published | may |
Keywords | API, APIs, Application program interface, application program interfaces, Application Programming Interface (API), composability, compositionality, Conferences, feature extraction, Flash, JavaScript, machine learning, machine learning algorithms, maintenance engineering, privacy, pubcrawl, resilience, Resiliency, static analysis, third-party tracking, web security |
Abstract | With the continuous development of Internet technology, people pay more and more attention to private security. In particular, third-party tracking is a major factor affecting privacy security. So far, the most effective way to prevent third-party tracking is to create a blacklist. However, blacklist generation and maintenance need to be carried out manually which is inefficient and difficult to maintain. In order to generate blacklists more quickly and accurately in this era of big data, this paper proposes a machine learning system MFTrackerDetector against third-party tracking. The system is based on the theory of structural hole and only detects third-party trackers. The system consists of two subsystems, DMTrackerDetector and DFTrackerDetector. DMTrackerDetector is a JavaScript-based subsystem and DFTrackerDetector is a Flash-based subsystem. Because tracking code and non-tracking code often call different APIs, DMTrackerDetector builds a classifier using all the APIs in JavaScript as features and extracts the API features in JavaScript through dynamic analysis. Unlike static analysis method, the dynamic analysis method can effectively avoid code obfuscation. DMTrackerDetector eventually generates a JavaScript-based third-party tracker list named Jlist. DFTrackerDetector constructs a classifier using all the APIs in ActionScript as features and extracts the API features in the flash script through static analysis. DFTrackerDetector finally generates a Flash-based third-party tracker list named Flist. DFTrackerDetector achieved 92.98% accuracy in the Flash test set and DMTrackerDetector achieved 90.79% accuracy in the JavaScript test set. MFTrackerDetector eventually generates a list of third-party trackers, which is a combination of Jlist and Flist. |
DOI | 10.1109/INFOCOMWKSHPS51825.2021.9484564 |
Citation Key | sun_system_2021 |