Biblio

List
Filter

Found 35 results

Filters: Keyword is transfer learning [Clear All Filters]

2023-09-20

Samia, Bougareche, Soraya, Zehani, Malika, Mimi. 2022. Fashion Images Classification using Machine Learning, Deep Learning and Transfer Learning Models. 2022 7th International Conference on Image and Signal Processing and their Applications (ISPA). :1—5.

Fashion is the way we present ourselves which mainly focuses on vision, has attracted great interest from computer vision researchers. It is generally used to search fashion products in online shopping malls to know the descriptive information of the product. The main objectives of our paper is to use deep learning (DL) and machine learning (ML) methods to correctly identify and categorize clothing images. In this work, we used ML algorithms (support vector machines (SVM), K-Nearest Neirghbors (KNN), Decision tree (DT), Random Forest (RF)), DL algorithms (Convolutionnal Neurals Network (CNN), AlexNet, GoogleNet, LeNet, LeNet5) and the transfer learning using a pretrained models (VGG16, MobileNet and RestNet50). We trained and tested our models online using google colaboratory with Tensorflow/Keras and Scikit-Learn libraries that support deep learning and machine learning in Python. The main metric used in our study to evaluate the performance of ML and DL algorithms is the accuracy and matrix confusion. The best result for the ML models is obtained with the use of ANN (88.71%) and for the DL models is obtained for the GoogleNet architecture (93.75%). The results obtained showed that the number of epochs and the depth of the network have an effect in obtaining the best results.

2023-07-21

Sadikoğlu, Fahreddin M., Idle Mohamed, Mohamed. 2022. Facial Expression Recognition Using CNN. 2022 International Conference on Artificial Intelligence in Everything (AIE). :95—99.

Facial is the most dynamic part of the human body that conveys information about emotions. The level of diversity in facial geometry and facial look makes it possible to detect various human expressions. To be able to differentiate among numerous facial expressions of emotion, it is crucial to identify the classes of facial expressions. The methodology used in this article is based on convolutional neural networks (CNN). In this paper Deep Learning CNN is used to examine Alex net architectures. Improvements were achieved by applying the transfer learning approach and modifying the fully connected layer with the Support Vector Machine(SVM) classifier. The system succeeded by achieving satisfactory results on icv-the MEFED dataset. Improved models achieved around 64.29 %of recognition rates for the classification of the selected expressions. The results obtained are acceptable and comparable to the relevant systems in the literature provide ideas a background for further improvements.

Lee, Gwo-Chuan, Li, Zi-Yang, Li, Tsai-Wei. 2022. Ensemble Algorithm of Convolution Neural Networks for Enhancing Facial Expression Recognition. 2022 IEEE 5th International Conference on Knowledge Innovation and Invention (ICKII ). :111—115.

Artificial intelligence (AI) cooperates with multiple industries to improve the overall industry framework. Especially, human emotion recognition plays an indispensable role in supporting medical care, psychological counseling, crime prevention and detection, and crime investigation. The research on emotion recognition includes emotion-specific intonation patterns, literal expressions of emotions, and facial expressions. Recently, the deep learning model of facial emotion recognition aims to capture tiny changes in facial muscles to provide greater recognition accuracy. Hybrid models in facial expression recognition have been constantly proposed to improve the performance of deep learning models in these years. In this study, we proposed an ensemble learning algorithm for the accuracy of the facial emotion recognition model with three deep learning models: VGG16, InceptionResNetV2, and EfficientNetB0. To enhance the performance of these benchmark models, we applied transfer learning, fine-tuning, and data augmentation to implement the training and validation of the Facial Expression Recognition 2013 (FER-2013) Dataset. The developed algorithm finds the best-predicted value by prioritizing the InceptionResNetV2. The experimental results show that the proposed ensemble learning algorithm of priorities edges up 2.81% accuracy of the model identification. The future extension of this study ventures into the Internet of Things (IoT), medical care, and crime detection and prevention.

2023-07-10

Kim, Hyun-Jin, Lee, Jonghoon, Park, Cheolhee, Park, Jong-Geun. 2022. Network Anomaly Detection based on Domain Adaptation for 5G Network Security. 2022 13th International Conference on Information and Communication Technology Convergence (ICTC). :976—980.

Currently, research on 5G communication is focusing increasingly on communication techniques. The previous studies have primarily focused on the prevention of communications disruption. To date, there has not been sufficient research on network anomaly detection as a countermeasure against on security aspect. 5g network data will be more complex and dynamic, intelligent network anomaly detection is necessary solution for protecting the network infrastructure. However, since the AI-based network anomaly detection is dependent on data, it is difficult to collect the actual labeled data in the industrial field. Also, the performance degradation in the application process to real field may occur because of the domain shift. Therefore, in this paper, we research the intelligent network anomaly detection technique based on domain adaptation (DA) in 5G edge network in order to solve the problem caused by data-driven AI. It allows us to train the models in data-rich domains and apply detection techniques in insufficient amount of data. For Our method will contribute to AI-based network anomaly detection for improving the security for 5G edge network.

2023-01-20

Omeroglu, Asli Nur, Mohammed, Hussein M. A., Oral, E. Argun, Yucel Ozbek, I.. 2022. Detection of Moving Target Direction for Ground Surveillance Radar Based on Deep Learning. 2022 30th Signal Processing and Communications Applications Conference (SIU). :1–4.

In defense and security applications, detection of moving target direction is as important as the target detection and/or target classification. In this study, a methodology for the detection of different mobile targets as approaching or receding was proposed for ground surveillance radar data, and convolutional neural networks (CNN) based on transfer learning were employed for this purpose. In order to improve the classification performance, the use of two key concepts, namely Deep Convolutional Generative Adversarial Network (DCGAN) and decision fusion, has been proposed. With DCGAN, the number of limited available data used for training was increased, thus creating a bigger training dataset with identical distribution to the original data for both moving directions. This generated synthetic data was then used along with the original training data to train three different pre-trained deep convolutional networks. Finally, the classification results obtained from these networks were combined with decision fusion approach. In order to evaluate the performance of the proposed method, publicly available RadEch dataset consisting of eight ground target classes was utilized. Based on the experimental results, it was observed that the combined use of the proposed DCGAN and decision fusion methods increased the detection accuracy of moving target for person, vehicle, group of person and all target groups, by 13.63%, 10.01%, 14.82% and 8.62%, respectively.

2022-12-01

Abeyagunasekera, Sudil Hasitha Piyath, Perera, Yuvin, Chamara, Kenneth, Kaushalya, Udari, Sumathipala, Prasanna, Senaweera, Oshada. 2022. LISA : Enhance the explainability of medical images unifying current XAI techniques. 2022 IEEE 7th International conference for Convergence in Technology (I2CT). :1—9.

This work proposed a unified approach to increase the explainability of the predictions made by Convolution Neural Networks (CNNs) on medical images using currently available Explainable Artificial Intelligent (XAI) techniques. This method in-cooperates multiple techniques such as LISA aka Local Interpretable Model Agnostic Explanations (LIME), integrated gradients, Anchors and Shapley Additive Explanations (SHAP) which is Shapley values-based approach to provide explanations for the predictions provided by Blackbox models. This unified method increases the confidence in the black-box model’s decision to be employed in crucial applications under the supervision of human specialists. In this work, a Chest X-ray (CXR) classification model for identifying Covid-19 patients is trained using transfer learning to illustrate the applicability of XAI techniques and the unified method (LISA) to explain model predictions. To derive predictions, an image-net based Inception V2 model is utilized as the transfer learning model.

2022-08-26

Rajan, Mohammad Hasnain, Rebello, Keith, Sood, Yajur, Wankhade, Sunil B.. 2021. Graph-Based Transfer Learning for Conversational Agents. 2021 6th International Conference on Communication and Electronics Systems (ICCES). :1335–1341.

Graphs have proved to be a promising data structure to solve complex problems in various domains. Graphs store data in an associative manner which is analogous to the manner in which humans store memories in the brain. Generathe chatbots lack the ability to recall details revealed by the user in long conversations. To solve this problem, we have used graph-based memory to recall-related conversations from the past. Thus, providing context feature derived from query systems to generative systems such as OpenAI GPT. Using graphs to detect important details from the past reduces the total amount of processing done by the neural network. As there is no need to keep on passingthe entire history of the conversation. Instead, we pass only the last few pairs of utterances and the related details from the graph. This paper deploys this system and also demonstrates the ability to deploy such systems in real-world applications. Through the effective usage of knowledge graphs, the system is able to reduce the time complexity from O(n) to O(1) as compared to similar non-graph based implementations of transfer learning- based conversational agents.

2022-07-01

Cody, Tyler, Beling, Peter A.. 2021. Heterogeneous Transfer in Deep Learning for Spectrogram Classification in Cognitive Communications. 2021 IEEE Cognitive Communications for Aerospace Applications Workshop (CCAAW). :1—5.

Machine learning offers performance improvements and novel functionality, but its life cycle performance is understudied. In areas like cognitive communications, where systems are long-lived, life cycle trade-offs are key to system design. Herein, we consider the use of deep learning to classify spectrograms. We vary the label-space over which the network makes classifications, as may emerge with changes in use over a system’s life cycle, and compare heterogeneous transfer learning performance across label-spaces between model architectures. Our results offer an empirical example of life cycle challenges to using machine learning for cognitive communications. They evidence important trade-offs among performance, training time, and sensitivity to the order in which the label-space is changed. And they show that fine-tuning can be used in the heterogeneous transfer of spectrogram classifiers.

2022-06-08

Chen, Lin, Qiu, Huijun, Kuang, Xiaoyun, Xu, Aidong, Yang, Yiwei. 2021. Intelligent Data Security Threat Discovery Model Based on Grid Data. 2021 6th International Conference on Image, Vision and Computing (ICIVC). :458–463.

With the rapid construction and popularization of smart grid, the security of data in smart grid has become the basis for the safe and stable operation of smart grid. This paper proposes a data security threat discovery model for smart grid. Based on the prediction data analysis method, combined with migration learning technology, it analyzes different data, uses data matching process to classify the losses, and accurately predicts the analysis results, finds the security risks in the data, and prevents the illegal acquisition of data. The reinforcement learning and training process of this method distinguish the effective authentication and illegal access to data.

2022-05-10

Zheng, Wei, Abdallah Semasaba, Abubakar Omari, Wu, Xiaoxue, Agyemang, Samuel Akwasi, Liu, Tao, Ge, Yuan. 2021. Representation vs. Model: What Matters Most for Source Code Vulnerability Detection. 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). :647–653.

Vulnerabilities in the source code of software are critical issues in the realm of software engineering. Coping with vulnerabilities in software source code is becoming more challenging due to several aspects of complexity and volume. Deep learning has gained popularity throughout the years as a means of addressing such issues. In this paper, we propose an evaluation of vulnerability detection performance on source code representations and evaluate how Machine Learning (ML) strategies can improve them. The structure of our experiment consists of 3 Deep Neural Networks (DNNs) in conjunction with five different source code representations; Abstract Syntax Trees (ASTs), Code Gadgets (CGs), Semantics-based Vulnerability Candidates (SeVCs), Lexed Code Representations (LCRs), and Composite Code Representations (CCRs). Experimental results show that employing different ML strategies in conjunction with the base model structure influences the performance results to a varying degree. However, ML-based techniques suffer from poor performance on class imbalance handling when used in conjunction with source code representations for software vulnerability detection.

2022-04-25

Sunil, Ajeet, Sheth, Manav Hiren, E, Shreyas, Mohana. 2021. Usual and Unusual Human Activity Recognition in Video using Deep Learning and Artificial Intelligence for Security Applications. 2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT). :1–6.

The main objective of Human Activity Recognition (HAR) is to detect various activities in video frames. Video surveillance is an import application for various security reasons, therefore it is essential to classify activities as usual and unusual. This paper implements the deep learning model that has the ability to classify and localize the activities detected using a Single Shot Detector (SSD) algorithm with a bounding box, which is explicitly trained to detect usual and unusual activities for security surveillance applications. Further this model can be deployed in public places to improve safety and security of individuals. The SSD model is designed and trained using transfer learning approach. Performance evaluation metrics are visualised using Tensor Board tool. This paper further discusses the challenges in real-time implementation.

2022-03-25

Huang, Jiaheng, Chen, Lei. 2021. Transfer Learning Based Multi-objective Particle Swarm Optimization Algorithm. 2021 17th International Conference on Computational Intelligence and Security (CIS). :382—386.

In Particle Swarm Optimization Algorithm (PSO), the learning factors \$c\_1\$ and \$c\_2\$ are used to update the speed and location of a particle. However, the setting of those two important parameters has great effect on the performance of the PSO algorithm, which has limited its range of applications. To avoid the tedious parameter tuning, we introduce a transfer learning based adaptive parameter setting strategy to PSO in this paper. The proposed transfer learning strategy can adjust the two learning factors more effectively according to the environment change. The performance of the proposed algorithm is tested on sets of widely-used benchmark multi-objective test problems for DTLZ. The results comparing and analysis are conduced by comparing it with the state-of-art evolutionary multi-objective optimization algorithm NSGA-III to verify the effectiveness and efficiency of the proposed method.

2022-03-15

Aghakhani, Hojjat, Meng, Dongyu, Wang, Yu-Xiang, Kruegel, Christopher, Vigna, Giovanni. 2021. Bullseye Polytope: A Scalable Clean-Label Poisoning Attack with Improved Transferability. 2021 IEEE European Symposium on Security and Privacy (EuroS P). :159—178.

A recent source of concern for the security of neural networks is the emergence of clean-label dataset poisoning attacks, wherein correctly labeled poison samples are injected into the training dataset. While these poison samples look legitimate to the human observer, they contain malicious characteristics that trigger a targeted misclassification during inference. We propose a scalable and transferable clean-label poisoning attack against transfer learning, which creates poison images with their center close to the target image in the feature space. Our attack, Bullseye Polytope, improves the attack success rate of the current state-of-the-art by 26.75% in end-to-end transfer learning, while increasing attack speed by a factor of 12. We further extend Bullseye Polytope to a more practical attack model by including multiple images of the same object (e.g., from different angles) when crafting the poison samples. We demonstrate that this extension improves attack transferability by over 16% to unseen images (of the same object) without using extra poison samples.

2022-03-10

Yang, Mengde. 2021. A Survey on Few-Shot Learning in Natural Language Processing. 2021 International Conference on Artificial Intelligence and Electromechanical Automation (AIEA). :294—297.

The annotated dataset is the foundation for Supervised Natural Language Processing. However, the cost of obtaining dataset is high. In recent years, the Few-Shot Learning has gradually attracted the attention of researchers. From the definition, in this paper, we conclude the difference in Few-Shot Learning between Natural Language Processing and Computer Vision. On that basis, the current Few-Shot Learning on Natural Language Processing is summarized, including Transfer Learning, Meta Learning and Knowledge Distillation. Furthermore, we conclude the solutions to Few-Shot Learning in Natural Language Processing, such as the method based on Distant Supervision, Meta Learning and Knowledge Distillation. Finally, we present the challenges facing Few-Shot Learning in Natural Language Processing.

2022-02-24

Moskal, Stephen, Yang, Shanchieh Jay. 2021. Translating Intrusion Alerts to Cyberattack Stages Using Pseudo-Active Transfer Learning (PATRL). 2021 IEEE Conference on Communications and Network Security (CNS). :110–118.

Intrusion alerts continue to grow in volume, variety, and complexity. Its cryptic nature requires substantial time and expertise to interpret the intended consequence of observed malicious actions. To assist security analysts in effectively diagnosing what alerts mean, this work develops a novel machine learning approach that translates alert descriptions to intuitively interpretable Action-Intent-Stages (AIS) with only 1% labeled data. We combine transfer learning, active learning, and pseudo labels and develop the Pseudo-Active Transfer Learning (PATRL) process. The PATRL process begins with an unsupervised-trained language model using MITRE ATT&CK, CVE, and IDS alert descriptions. The language model feeds to an LSTM classifier to train with 1% labeled data and is further enhanced with active learning using pseudo labels predicted by the iteratively improved models. Our results suggest PATRL can predict correctly for 85% (top-1 label) and 99% (top-3 labels) of the remaining 99% unknown data. Recognizing the need to build confidence for the analysts to use the model, the system provides Monte-Carlo Dropout Uncertainty and Pseudo-Label Convergence Score for each of the predicted alerts. These metrics give the analyst insights to determine whether to directly trust the top-1 or top-3 predictions and whether additional pseudo labels are needed. Our approach overcomes a rarely tackled research problem where minimal amounts of labeled data do not reflect the truly unlabeled data's characteristics. Combining the advantages of transfer learning, active learning, and pseudo labels, the PATRL process translates the complex intrusion alert description for the analysts with confidence.

2022-02-22

Lanus, Erin, Freeman, Laura J., Richard Kuhn, D., Kacker, Raghu N.. 2021. Combinatorial Testing Metrics for Machine Learning. 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). :81–84.

This paper defines a set difference metric for comparing machine learning (ML) datasets and proposes the difference between datasets be a function of combinatorial coverage. We illustrate its utility for evaluating and predicting performance of ML models. Identifying and measuring differences between datasets is of significant value for ML problems, where the accuracy of the model is heavily dependent on the degree to which training data are sufficiently representative of data encountered in application. The method is illustrated for transfer learning without retraining, the problem of predicting performance of a model trained on one dataset and applied to another.

2022-02-07

Khetarpal, Anavi, Mallik, Abhishek. 2021. Visual Malware Classification Using Transfer Learning. 2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT). :1–5.

The proliferation of malware attacks causes a hindrance to cybersecurity thus, posing a significant threat to our devices. The variety and number of both known as well as unknown malware makes it difficult to detect it. Research suggests that the ramifications of malware are only becoming worse with time and hence malware analysis becomes crucial. This paper proposes a visual malware classification technique to convert malware executables into their visual representations and obtain grayscale images of malicious files. These grayscale images are then used to classify malicious files into their respective malware families by passing them through deep convolutional neural networks (CNN). As part of deep CNN, we use various ImageNet models and compare their performance.

2021-09-21

Sartoli, Sara, Wei, Yong, Hampton, Shane. 2020. Malware Classification Using Recurrence Plots and Deep Neural Network. 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA). :901–906.

In this paper, we introduce a method for visualizing and classifying malware binaries. A malware binary consists of a series of data points of compiled machine codes that represent programming components. The occurrence and recurrence behavior of these components is determined by the common tasks malware samples in a particular family carry out. Thus, we view a malware binary as a series of emissions generated by an underlying stochastic process and use recurrence plots to transform malware binaries into two-dimensional texture images. We observe that recurrence plot-based malware images have significant visual similarities within the same family and are different from samples in other families. We apply deep CNN classifiers to classify malware samples. The proposed approach does not require creating malware signature or manual feature engineering. Our preliminary experimental results show that the proposed malware representation leads to a higher and more stable accuracy in comparison to directly transforming malware binaries to gray-scale images.

2021-03-30

Li, Y., Ji, X., Li, C., Xu, X., Yan, W., Yan, X., Chen, Y., Xu, W.. 2020. Cross-domain Anomaly Detection for Power Industrial Control System. 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC). :383—386.

In recent years, artificial intelligence has been widely used in the field of network security, which has significantly improved the effect of network security analysis and detection. However, because the power industrial control system is faced with the problem of shortage of attack data, the direct deployment of the network intrusion detection system based on artificial intelligence is faced with the problems of lack of data, low precision, and high false alarm rate. To solve this problem, we propose an anomaly traffic detection method based on cross-domain knowledge transferring. By using the TrAdaBoost algorithm, we achieve a lower error rate than using LSTM alone.

2021-02-16

He, J., Tan, Y., Guo, W., Xian, M.. 2020. A Small Sample DDoS Attack Detection Method Based on Deep Transfer Learning. 2020 International Conference on Computer Communication and Network Security (CCNS). :47—50.

When using deep learning for DDoS attack detection, there is a general degradation in detection performance due to small sample size. This paper proposes a small-sample DDoS attack detection method based on deep transfer learning. First, deep learning techniques are used to train several neural networks that can be used for transfer in DDoS attacks with sufficient samples. Then we design a transferability metric to compare the transfer performance of different networks. With this metric, the network with the best transfer performance can be selected among the four networks. Then for a small sample of DDoS attacks, this paper demonstrates that the deep learning detection technique brings deterioration in performance, with the detection performance dropping from 99.28% to 67%. Finally, we end up with a 20.8% improvement in detection performance by deep transfer of the 8LANN network in the target domain. The experiment shows that the detection method based on deep transfer learning proposed in this paper can well improve the performance deterioration of deep learning techniques for small sample DDoS attack detection.

2020-12-11

Mikołajczyk, A., Grochowski, M.. 2019. Style transfer-based image synthesis as an efficient regularization technique in deep learning. 2019 24th International Conference on Methods and Models in Automation and Robotics (MMAR). :42—47.

These days deep learning is the fastest-growing area in the field of Machine Learning. Convolutional Neural Networks are currently the main tool used for the image analysis and classification purposes. Although great achievements and perspectives, deep neural networks and accompanying learning algorithms have some relevant challenges to tackle. In this paper, we have focused on the most frequently mentioned problem in the field of machine learning, that is relatively poor generalization abilities. Partial remedies for this are regularization techniques e.g. dropout, batch normalization, weight decay, transfer learning, early stopping and data augmentation. In this paper we have focused on data augmentation. We propose to use a method based on a neural style transfer, which allows to generate new unlabeled images of high perceptual quality that combine the content of a base image with the appearance of another one. In a proposed approach, the newly created images are described with pseudo-labels, and then used as a training dataset. Real, labeled images are divided into the validation and test set. We validated proposed method on a challenging skin lesion classification case study. Four representative neural architectures are examined. Obtained results show the strong potential of the proposed approach.

2020-11-04

Chacon, H., Silva, S., Rad, P.. 2019. Deep Learning Poison Data Attack Detection. 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI). :971—978.

Deep neural networks are widely used in many walks of life. Techniques such as transfer learning enable neural networks pre-trained on certain tasks to be retrained for a new duty, often with much less data. Users have access to both pre-trained model parameters and model definitions along with testing data but have either limited access to training data or just a subset of it. This is risky for system-critical applications, where adversarial information can be maliciously included during the training phase to attack the system. Determining the existence and level of attack in a model is challenging. In this paper, we present evidence on how adversarially attacking training data increases the boundary of model parameters using as an example of a CNN model and the MNIST data set as a test. This expansion is due to new characteristics of the poisonous data that are added to the training data. Approaching the problem from the feature space learned by the network provides a relation between them and the possible parameters taken by the model on the training phase. An algorithm is proposed to determine if a given network was attacked in the training by comparing the boundaries of parameters distribution on intermediate layers of the model estimated by using the Maximum Entropy Principle and the Variational inference approach.

2020-11-02

Pan, C., Huang, J., Gong, J., Yuan, X.. 2019. Few-Shot Transfer Learning for Text Classification With Lightweight Word Embedding Based Models. IEEE Access. 7:53296–53304.

Many deep learning architectures have been employed to model the semantic compositionality for text sequences, requiring a huge amount of supervised data for parameters training, making it unfeasible in situations where numerous annotated samples are not available or even do not exist. Different from data-hungry deep models, lightweight word embedding-based models could represent text sequences in a plug-and-play way due to their parameter-free property. In this paper, a modified hierarchical pooling strategy over pre-trained word embeddings is proposed for text classification in a few-shot transfer learning way. The model leverages and transfers knowledge obtained from some source domains to recognize and classify the unseen text sequences with just a handful of support examples in the target problem domain. The extensive experiments on five datasets including both English and Chinese text demonstrate that the simple word embedding-based models (SWEMs) with parameter-free pooling operations are able to abstract and represent the semantic text. The proposed modified hierarchical pooling method exhibits significant classification performance in the few-shot transfer learning tasks compared with other alternative methods.

2020-10-29

Lo, Wai Weng, Yang, Xu, Wang, Yapeng. 2019. An Xception Convolutional Neural Network for Malware Classification with Transfer Learning. 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS). :1—5.

In this work, we applied a deep Convolutional Neural Network (CNN) with Xception model to perform malware image classification. The Xception model is a recently developed special CNN architecture that is more powerful with less over- fitting problems than the current popular CNN models such as VGG16. However only a few use cases of the Xception model can be found in literature, and it has never been used to solve the malware classification problem. The performance of our approach was compared with other methods including KNN, SVM, VGG16 etc. The experiments on two datasets (Malimg and Microsoft Malware Dataset) demonstrated that the Xception model can achieve the highest training accuracy than all other approaches including the champion approach, and highest validation accuracy than all other approaches including VGG16 model which are using image-based malware classification (except the champion solution as this information was not provided). Additionally, we proposed a novel ensemble model to combine the predictions from .bytes files and .asm files, showing that a lower logloss can be achieved. Although the champion on the Microsoft Malware Dataset achieved a bit lower logloss, our approach does not require any features engineering, making it more effective to adapt to any future evolution in malware, and very much less time consuming than the champion's solution.

2020-07-10

Nahmias, Daniel, Cohen, Aviad, Nissim, Nir, Elovici, Yuval. 2019. TrustSign: Trusted Malware Signature Generation in Private Clouds Using Deep Feature Transfer Learning. 2019 International Joint Conference on Neural Networks (IJCNN). :1—8.

This paper presents TrustSign, a novel, trusted automatic malware signature generation method based on high-level deep features transferred from a VGG-19 neural network model pre-trained on the ImageNet dataset. While traditional automatic malware signature generation techniques rely on static or dynamic analysis of the malware's executable, our method overcomes the limitations associated with these techniques by producing signatures based on the presence of the malicious process in the volatile memory. Signatures generated using TrustSign well represent the real malware behavior during runtime. By leveraging the cloud's virtualization technology, TrustSign analyzes the malicious process in a trusted manner, since the malware is unaware and cannot interfere with the inspection procedure. Additionally, by removing the dependency on the malware's executable, our method is capable of signing fileless malware. Thus, we focus our research on in-browser cryptojacking attacks, which current antivirus solutions have difficulty to detect. However, TrustSign is not limited to cryptojacking attacks, as our evaluation included various ransomware samples. TrustSign's signature generation process does not require feature engineering or any additional model training, and it is done in a completely unsupervised manner, obviating the need for a human expert. Therefore, our method has the advantage of dramatically reducing signature generation and distribution time. The results of our experimental evaluation demonstrate TrustSign's ability to generate signatures invariant to the process state over time. By using the signatures generated by TrustSign as input for various supervised classifiers, we achieved 99.5% classification accuracy.