PERT: Payload Encoding Representation from Transformer for Encrypted Traffic Classification

Submitted by grigby1 on Thu, 01/28/2021 - 1:34pm

Title	PERT: Payload Encoding Representation from Transformer for Encrypted Traffic Classification
Publication Type	Conference Paper
Year of Publication	2020
Authors	He, H. Y., Yang, Z. Guo, Chen, X. N.
Conference Name	2020 ITU Kaleidoscope: Industry-Driven Digital Transformation (ITU K)
Keywords	android encryption, cryptography, Deep Learning, dynamic word embedding, Encrypted traffic classification, feature extraction, Human Behavior, Image coding, Metrics, natural language processing, Payloads, pubcrawl, resilience, Resiliency, Scalability, Task Analysis, telecommunication traffic, Traffic identification
Abstract	Traffic identification is becoming more important yet more challenging as encryption techniques develop rapidly. In contrast to recent deep learning methods that apply image processing to encrypted traffic problems, in this paper we propose a method named Payload Encoding Representation from Transformer (PERT), which performs automatic traffic feature extraction using a state-of-the-art dynamic word embedding technique. Based on this, we further provide a traffic classification framework in which unlabeled traffic is used to pre-train an encoding network that learns the contextual distribution of traffic payload bytes. The downstream classification then reuses the pre-trained network to obtain an enhanced classification result. Through experiments on a public encrypted traffic data set and our own captured Android HTTPS traffic, we show that the proposed method is clearly more effective than the compared baselines. To the best of our knowledge, this is the first time that encrypted traffic classification has been addressed with dynamic word embedding along with its pre-training strategy.
DOI	10.23919/ITUK50268.2020.9303204
Citation Key	he_pert_2020