A System for Detecting Third-Party Tracking through the Combination of Dynamic Analysis and Static Analysis

Title: A System for Detecting Third-Party Tracking through the Combination of Dynamic Analysis and Static Analysis
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Sun, Jingxue; Huang, Zhiqiu; Yang, Ting; Wang, Wengjie; Zhang, Yuqing
Conference Name: IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)
Date Published: May
Keywords: API, APIs, Application program interface, application program interfaces, Application Programming Interface (API), composability, compositionality, Conferences, feature extraction, Flash, JavaScript, machine learning, machine learning algorithms, maintenance engineering, privacy, pubcrawl, resilience, Resiliency, static analysis, third-party tracking, web security
Abstract: As Internet technology continues to develop, people pay increasing attention to privacy and security. Third-party tracking in particular is a major threat to privacy. So far, the most effective way to prevent third-party tracking has been to maintain a blacklist. However, blacklists are generated and maintained manually, which is inefficient and hard to sustain. To generate blacklists more quickly and accurately in the era of big data, this paper proposes MFTrackerDetector, a machine learning system for detecting third-party trackers. The system is based on structural hole theory and detects only third-party trackers. It consists of two subsystems: DMTrackerDetector, which is JavaScript-based, and DFTrackerDetector, which is Flash-based. Because tracking code and non-tracking code often call different APIs, DMTrackerDetector builds a classifier that uses all JavaScript APIs as features and extracts those API features through dynamic analysis. Unlike static analysis, dynamic analysis can effectively sidestep code obfuscation. DMTrackerDetector ultimately produces a list of JavaScript-based third-party trackers named Jlist. DFTrackerDetector builds a classifier that uses all ActionScript APIs as features and extracts those API features from Flash scripts through static analysis. It ultimately produces a list of Flash-based third-party trackers named Flist. DFTrackerDetector achieved 92.98% accuracy on the Flash test set, and DMTrackerDetector achieved 90.79% accuracy on the JavaScript test set. The final output of MFTrackerDetector is a list of third-party trackers that combines Jlist and Flist.
DOI: 10.1109/INFOCOMWKSHPS51825.2021.9484564
Citation Key: sun_system_2021
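
Illustrative sketch (not taken from the paper): the abstract describes classifying scripts by which APIs they call and then combining two tracker lists. The Python below shows one way such a pipeline could look. The abstract does not name the classifier, so the Bernoulli naive Bayes model here is an assumption, and the function names (`train_bernoulli_nb`, `is_tracker`, `merge_lists`), the API vocabulary, and the training samples are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical API vocabulary; the paper uses "all the APIs in JavaScript".
VOCABULARY = frozenset({
    "document.cookie",
    "navigator.userAgent",
    "HTMLCanvasElement.toDataURL",
    "document.getElementById",
    "Math.random",
})

# Made-up training data: (set of APIs a script was observed calling, is_tracker).
SAMPLES = [
    ({"document.cookie", "navigator.userAgent"}, True),
    ({"document.cookie", "HTMLCanvasElement.toDataURL"}, True),
    ({"document.getElementById"}, False),
    ({"document.getElementById", "Math.random"}, False),
]


def train_bernoulli_nb(samples, vocabulary):
    """Fit a Bernoulli naive Bayes model on binary 'API was called' features."""
    class_counts = defaultdict(int)
    api_counts = {True: defaultdict(int), False: defaultdict(int)}
    for apis, label in samples:
        class_counts[label] += 1
        for api in apis & vocabulary:
            api_counts[label][api] += 1

    total = len(samples)
    model = {}
    for label in (True, False):
        n = class_counts[label]
        log_prior = math.log((n + 1) / (total + 2))
        # Laplace-smoothed probability that a script of this class calls each API.
        probs = {api: (api_counts[label][api] + 1) / (n + 2) for api in vocabulary}
        model[label] = (log_prior, probs)
    return model


def is_tracker(model, apis, vocabulary):
    """Return True if the tracker class has the higher log-posterior score."""
    scores = {}
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for api in vocabulary:
            p = probs[api]
            score += math.log(p if api in apis else 1.0 - p)
        scores[label] = score
    return scores[True] > scores[False]


def merge_lists(jlist, flist):
    """Combine the JavaScript and Flash tracker lists into one deduplicated list."""
    return sorted(set(jlist) | set(flist))


MODEL = train_bernoulli_nb(SAMPLES, VOCABULARY)
```

With this toy data, `is_tracker(MODEL, {"document.cookie", "navigator.userAgent"}, VOCABULARY)` classifies the script as a tracker, while a script that only calls `document.getElementById` is classified as a non-tracker. In the system the abstract describes, the API sets would come from dynamic analysis for JavaScript and from static analysis for Flash. Here they are simply written by hand.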