PERT: Payload Encoding Representation from Transformer for Encrypted Traffic Classification

Title: PERT: Payload Encoding Representation from Transformer for Encrypted Traffic Classification
Publication Type: Conference Paper
Year of Publication: 2020
Authors: He, H. Y., Yang, Z. Guo, Chen, X. N.
Conference Name: 2020 ITU Kaleidoscope: Industry-Driven Digital Transformation (ITU K)
Keywords: android encryption, cryptography, Deep Learning, dynamic word embedding, Encrypted traffic classification, feature extraction, Human Behavior, Image coding, Metrics, natural language processing, Payloads, pubcrawl, resilience, Resiliency, Scalability, Task Analysis, telecommunication traffic, Traffic identification
Abstract

Traffic identification is becoming more important yet more challenging as encryption techniques develop rapidly. Unlike recent deep learning methods that apply image-processing techniques to encrypted traffic problems, in this paper we propose a method named Payload Encoding Representation from Transformer (PERT) that performs automatic traffic feature extraction using a state-of-the-art dynamic word embedding technique. Building on this, we provide a traffic classification framework in which unlabeled traffic is used to pre-train an encoding network that learns the contextual distribution of traffic payload bytes; the downstream classification then reuses the pre-trained network to obtain an improved result. Through experiments on a public encrypted traffic data set and our captured Android HTTPS traffic, we show that the proposed method achieves clearly better effectiveness than the compared baselines. To the best of our knowledge, this is the first time encrypted traffic classification has been addressed with dynamic word embedding along with its pre-training strategy.
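
The abstract describes encoding payload bytes with a pre-trained Transformer (a BERT-style dynamic word embedding) and reusing that encoder for a downstream traffic classifier. The sketch below is a minimal, hypothetical illustration of such a pipeline in PyTorch, not the authors' implementation; the byte-level tokenization, special tokens, model sizes, and all class and function names are assumptions made for illustration only.

# Minimal sketch (assumptions, not the paper's code): payload bytes as tokens,
# a Transformer encoder pre-trained with a masked-token objective on unlabeled
# traffic, then reused by a classification head on labeled flows.
import torch
import torch.nn as nn

PAD, MASK, CLS = 256, 257, 258          # special token ids beyond the 256 byte values
VOCAB, MAX_LEN, D_MODEL = 259, 128, 128  # illustrative sizes

class PayloadEncoder(nn.Module):
    """Transformer encoder over payload byte tokens (simplified BERT-style)."""
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        self.pos = nn.Embedding(MAX_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, dim_feedforward=256,
                                           batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, ids):                              # ids: (batch, seq)
        pos = torch.arange(ids.size(1), device=ids.device)
        h = self.tok(ids) + self.pos(pos)
        return self.enc(h, src_key_padding_mask=(ids == PAD))

class MLMHead(nn.Module):
    """Pre-training head: predicts the original byte value at masked positions."""
    def __init__(self, encoder):
        super().__init__()
        self.encoder, self.out = encoder, nn.Linear(D_MODEL, VOCAB)
    def forward(self, ids):
        return self.out(self.encoder(ids))

class TrafficClassifier(nn.Module):
    """Downstream head: reuses the pre-trained encoder, classifies from the CLS position."""
    def __init__(self, encoder, n_classes):
        super().__init__()
        self.encoder, self.head = encoder, nn.Linear(D_MODEL, n_classes)
    def forward(self, ids):
        return self.head(self.encoder(ids)[:, 0])        # first token is CLS

def encode_payload(payload: bytes) -> torch.Tensor:
    """Map raw payload bytes to token ids, prepend CLS, pad/truncate to MAX_LEN."""
    ids = [CLS] + list(payload[:MAX_LEN - 1])
    ids += [PAD] * (MAX_LEN - len(ids))
    return torch.tensor(ids)

# Usage sketch: pre-train MLMHead on unlabeled payloads by randomly replacing a
# fraction of byte tokens with MASK and minimizing cross-entropy at those
# positions, then fine-tune TrafficClassifier on labeled traffic.
encoder = PayloadEncoder()
logits = TrafficClassifier(encoder, n_classes=10)(encode_payload(b"\x16\x03\x01").unsqueeze(0))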

DOI: 10.23919/ITUK50268.2020.9303204
Citation Key: he_pert_2020