Disparity Analysis Between the Assembly and Byte Malware Samples with Deep Autoencoders

Title: Disparity Analysis Between the Assembly and Byte Malware Samples with Deep Autoencoders
Publication Type: Conference Paper
Year of Publication: 2022
Authors: Abdullah, Muhammed Amin; Yu, Yongbin; Cai, Jingye; Imrana, Yakubu; Tettey, Nartey Obed; Addo, Daniel; Sarpong, Kwabena; Agbley, Bless Lord Y.; Appiah, Benjamin
Conference Name: 2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP)
Keywords: Assembly file, codes, Data models, deep autoencoder, Human Behavior, information processing, Malware, malware analysis, Media, Metrics, noise reduction, privacy, pubcrawl, raw binary files, resilience, Resiliency, Training
Abstract: Malware attacks continue to increase despite the efforts of malware analysts to combat them. Recently, malware samples have been presented both as raw binary sequences and as assembly code. However, most researchers focus only on the raw byte sequence in their proposed solutions, ignoring that the assembly code may contain important details that enable rapid malware detection. In this work, we leveraged deep autoencoders to investigate whether feature disparities exist between the assembly and raw binary representations of malware samples. First, we treated the task as outlier detection to investigate whether the autoencoder would identify the features as coming from samples of the same family. Second, we added noise to all samples and used a deep autoencoder to reconstruct the original samples by denoising. Experiments with the Microsoft Malware dataset showed that the features of the byte samples differed from those of the assembly code samples.
DOI: 10.1109/ICCWAMTIP56608.2022.10016485
Citation Key: abdullah_disparity_2022
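
The denoising step described in the abstract (corrupt each sample with noise, then train an autoencoder to recover the clean sample) can be illustrated with a minimal sketch. This is not the authors' architecture or data: the single hidden layer, the layer sizes, the Gaussian noise level, and the two synthetic feature patterns standing in for "assembly" and "byte" views are all assumptions made only for illustration. The sketch trains on one view and compares mean reconstruction error on both, which is one simple way a feature disparity could show up.

```python
import math
import random


def add_noise(x, sigma, rng):
    """Corrupt a feature vector with zero-mean Gaussian noise."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


class TinyDenoisingAutoencoder:
    """Hypothetical toy model: one tanh hidden layer, linear output, plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
        self.w1, self.b1 = init(n_hidden, n_in), [0.0] * n_hidden
        self.w2, self.b2 = init(n_in, n_hidden), [0.0] * n_in

    def encode(self, x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        return [sum(w * v for w, v in zip(row, h)) + b
                for row, b in zip(self.w2, self.b2)]

    def reconstruct(self, x):
        return self.decode(self.encode(x))

    def train_step(self, noisy, clean, lr):
        """One SGD step on 0.5 * ||decode(encode(noisy)) - clean||^2."""
        h = self.encode(noisy)
        y = self.decode(h)
        dy = [yi - ci for yi, ci in zip(y, clean)]
        # Backpropagate through the decoder before its weights are updated.
        dh = [sum(self.w2[i][j] * dy[i] for i in range(len(dy)))
              for j in range(len(h))]
        dz = [dhj * (1.0 - hj * hj) for dhj, hj in zip(dh, h)]
        for i, dyi in enumerate(dy):
            for j, hj in enumerate(h):
                self.w2[i][j] -= lr * dyi * hj
            self.b2[i] -= lr * dyi
        for j, dzj in enumerate(dz):
            for k, xk in enumerate(noisy):
                self.w1[j][k] -= lr * dzj * xk
            self.b1[j] -= lr * dzj


def mean_error(model, samples):
    """Average reconstruction error of clean samples."""
    return sum(mse(model.reconstruct(x), x) for x in samples) / len(samples)


rng = random.Random(0)

# Synthetic stand-ins (assumption): two feature patterns with small jitter.
pattern_asm = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
pattern_byte = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
asm_view = [add_noise(pattern_asm, 0.05, rng) for _ in range(20)]
byte_view = [add_noise(pattern_byte, 0.05, rng) for _ in range(20)]

model = TinyDenoisingAutoencoder(n_in=6, n_hidden=3, rng=rng)
for _ in range(200):
    for clean in asm_view:
        # Denoising objective: noisy input, clean target.
        model.train_step(add_noise(clean, 0.1, rng), clean, lr=0.05)

error_same_view = mean_error(model, asm_view)
error_other_view = mean_error(model, byte_view)
print("same view:", error_same_view, "other view:", error_other_view)
```

In this toy setting, a model trained only on one view reconstructs the other view poorly, so the gap between the two errors acts as a crude disparity signal. Real experiments would replace the synthetic vectors with features extracted from the actual assembly and byte files.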