Visible to the public Biblio

Found 232 results

Filters: Keyword is Testing  [Clear All Filters]
2021-03-04
Sun, H., Liu, L., Feng, L., Gu, Y. X..  2014.  Introducing Code Assets of a New White-Box Security Modeling Language. 2014 IEEE 38th International Computer Software and Applications Conference Workshops. :116—121.

This paper argues about a new conceptual modeling language for the White-Box (WB) security analysis. In the WB security domain, an attacker may have access to the inner structure of an application or even the entire binary code. It becomes pretty easy for attackers to inspect, reverse engineer, and tamper the application with the information they steal. The basis of this paper is the 14 patterns developed by a leading provider of software protection technologies and solutions. We provide a part of a new modeling language named i-WBS (White-Box Security) to describe problems of WB security better. The essence of White-Box security problem is code security. We made the new modeling language focus on code more than ever before. In this way, developers who are not security experts can easily understand what they need to really protect.

2021-02-23
Alshamrani, A..  2020.  Reconnaissance Attack in SDN based Environments. 2020 27th International Conference on Telecommunications (ICT). :1—5.
Software Defined Networking (SDN) is a promising network architecture that aims at providing high flexibility through the separation between network logic (control plane) and forwarding functions (data plane). This separation provides logical centralization of controllers, global network overview, ease of programmability, and a range of new SDN-compliant services. In recent years, the adoption of SDN in enterprise networks has been constantly increasing. In the meantime, new challenges arise in different levels such as scalability, management, and security. In this paper, we elaborate on complex security issues in the current SDN architecture. Especially, reconnaissance attack where attackers generate traffic for the goal of exploring existing services, assets, and overall network topology. To eliminate reconnaissance attack in SDN environment, we propose SDN-based solution by utilizing distributed firewall application, security policy, and OpenFlow counters. Distributed firewall application is capable of tracking the flow based on pre-defined states that would monitor the connection to sensitive nodes toward malicious activity. We utilize Mininet to simulate the testing environment. We are able to detect and mitigate this type of attack at early stage and in average around 7 second.
Chen, W., Cao, H., Lv, X., Cao, Y..  2020.  A Hybrid Feature Extraction Network for Intrusion Detection Based on Global Attention Mechanism. 2020 International Conference on Computer Information and Big Data Applications (CIBDA). :481—485.
The widespread application of 5G will make intrusion detection of large-scale network traffic a mere need. However, traditional intrusion detection cannot meet the requirements by manually extracting features, and the existing AI methods are also relatively inefficient. Therefore, when performing intrusion detection tasks, they have significant disadvantages of high false alarm rates and low recognition performance. For this challenge, this paper proposes a novel hybrid network, RULA-IDS, which can perform intrusion detection tasks by great amount statistical data from the network monitoring system. RULA-IDS consists of the fully connected layer, the feature extraction layer, the global attention mechanism layer and the SVM classification layer. In the feature extraction layer, the residual U-Net and LSTM are used to extract the spatial and temporal features of the network traffic attributes. It is worth noting that we modified the structure of U-Net to suit the intrusion detection task. The global attention mechanism layer is then used to selectively retain important information from a large number of features and focus on those. Finally, the SVM is used as a classifier to output results. The experimental results show that our method outperforms existing state-of-the-art intrusion detection methods, and the accuracies of training and testing are improved to 97.01% and 98.19%, respectively, and presents stronger robustness during training and testing.
2021-02-16
Wang, Y., Kjerstad, E., Belisario, B..  2020.  A Dynamic Analysis Security Testing Infrastructure for Internet of Things. 2020 Sixth International Conference on Mobile And Secure Services (MobiSecServ). :1—6.
IoT devices such as Google Home and Amazon Echo provide great convenience to our lives. Many of these IoT devices collect data including Personal Identifiable Information such as names, phone numbers, and addresses and thus IoT security is important. However, conducting security analysis on IoT devices is challenging due to the variety, the volume of the devices, and the special skills required for hardware and software analysis. In this research, we create and demonstrate a dynamic analysis security testing infrastructure for capturing network traffic from IoT devices. The network traffic is automatically mirrored to a server for live traffic monitoring and offline data analysis. Using the dynamic analysis security testing infrastructure, we conduct extensive security analysis on network traffic from Google Home and Amazon Echo. Our testing results indicate that Google Home enforces tighter security controls than Amazon Echo while both Google and Amazon devices provide the desired security level to protect user data in general. The dynamic analysis security testing infrastructure presented in the paper can be utilized to conduct similar security analysis on any IoT devices.
2021-02-08
Van, L. X., Dung, L. H., Hoa, D. V..  2020.  Developing Root Problem Aims to Create a Secure Digital Signature Scheme in Data Transfer. 2020 International Conference on Green and Human Information Technology (ICGHIT). :25–30.
This paper presents the proposed method of building a digital signature algorithm which is based on the difficulty of solving root problem and some expanded root problems on Zp. The expanded root problem is a new form of difficult problem without the solution, also originally proposed and applied to build digital signature algorithms. This proposed method enable to build a high-security digital signature platform for practical applications.
2021-02-01
Jiang, H., Du, M., Whiteside, D., Moursy, O., Yang, Y..  2020.  An Approach to Embedding a Style Transfer Model into a Mobile APP. 2020 International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE). :307–316.
The prevalence of photo processing apps suggests the demands of picture editing. As an implementation of the convolutional neural network, style transfer has been deep investigated and there are supported materials to realize it on PC platform. However, few approaches are mentioned to deploy a style transfer model on the mobile and meet the requirements of mobile users. The traditional style transfer model takes hours to proceed, therefore, based on a Perceptual Losses algorithm [1], we created a feedforward neural network for each style and the proceeding time was reduced to a few seconds. The training data were generated from a pre-trained convolutional neural network model, VGG-19. The algorithm took thousandth time and generated similar output as the original. Furthermore, we optimized the model and deployed the model with TensorFlow Mobile library. We froze the model and adopted a bitmap to scale the inputs to 720×720 and reverted back to the original resolution. The reverting process may create some blur but it can be regarded as a feature of art. The generated images have reliable quality and the waiting time is independent of the content and pattern of input images. The main factor that influences the proceeding time is the input resolution. The average waiting time of our model on the mobile phone, HUAWEI P20 Pro, is less than 2 seconds for 720p images and around 2.8 seconds for 1080p images, which are ten times slower than that on the PC GPU, Tesla T40. The performance difference depends on the architecture of the model.
Bai, Y., Guo, Y., Wei, J., Lu, L., Wang, R., Wang, Y..  2020.  Fake Generated Painting Detection Via Frequency Analysis. 2020 IEEE International Conference on Image Processing (ICIP). :1256–1260.
With the development of deep neural networks, digital fake paintings can be generated by various style transfer algorithms. To detect the fake generated paintings, we analyze the fake generated and real paintings in Fourier frequency domain and observe statistical differences and artifacts. Based on our observations, we propose Fake Generated Painting Detection via Frequency Analysis (FGPD-FA) by extracting three types of features in frequency domain. Besides, we also propose a digital fake painting detection database for assessing the proposed method. Experimental results demonstrate the excellence of the proposed method in different testing conditions.
Wickramasinghe, C. S., Marino, D. L., Grandio, J., Manic, M..  2020.  Trustworthy AI Development Guidelines for Human System Interaction. 2020 13th International Conference on Human System Interaction (HSI). :130–136.
Artificial Intelligence (AI) is influencing almost all areas of human life. Even though these AI-based systems frequently provide state-of-the-art performance, humans still hesitate to develop, deploy, and use AI systems. The main reason for this is the lack of trust in AI systems caused by the deficiency of transparency of existing AI systems. As a solution, “Trustworthy AI” research area merged with the goal of defining guidelines and frameworks for improving user trust in AI systems, allowing humans to use them without fear. While trust in AI is an active area of research, very little work exists where the focus is to build human trust to improve the interactions between human and AI systems. In this paper, we provide a concise survey on concepts of trustworthy AI. Further, we present trustworthy AI development guidelines for improving the user trust to enhance the interactions between AI systems and humans, that happen during the AI system life cycle.
2021-01-11
Khandait, P., Hubballi, N., Mazumdar, B..  2020.  Efficient Keyword Matching for Deep Packet Inspection based Network Traffic Classification. 2020 International Conference on COMmunication Systems NETworkS (COMSNETS). :567–570.
Network traffic classification has a range of applications in network management including QoS and security monitoring. Deep Packet Inspection (DPI) is one of the effective method used for traffic classification. DPI is computationally expensive operation involving string matching between payload and application signatures. Existing traffic classification techniques perform multiple scans of payload to classify the application flows - first scan to extract the words and the second scan to match the words with application signatures. In this paper we propose an approach which can classify network flows with single scan of flow payloads using a heuristic method to achieve a sub-linear search complexity. The idea is to scan few initial bytes of payload and determine potential application signature(s) for subsequent signature matching. We perform experiments with a large dataset containing 171873 network flows and show that it has a good classification accuracy of 98%.
2020-12-28
Wang, A., Yuan, Z., He, B..  2020.  Design and Realization of Smart Home Security System Based on AWS. 2020 International Conference on Information Science, Parallel and Distributed Systems (ISPDS). :291—295.
With the popularization and application of Internet of Things technology, the degree of intelligence of the home system is getting higher and higher. As an important part of the smart home, the security system plays an important role in protecting against accidents such as flammable gas leakage, fire, and burglary that may occur in the home environment. This design focuses on sensor signal acquisition and processing, wireless access, and cloud applications, and integrates Cypress’s new generation of PSoC 6 MCU, CYW4343W Wi-Fi and Bluetooth dual-module chips, and Amazon’s AWS cloud into smart home security System designing. First, through the designed air conditioning and refrigeration module, fire warning processing module, lighting control module, ventilation fan control module, combustible gas and smoke detection and warning module, important parameter information in the home environment is obtained. Then, the hardware system is connected to the AWS cloud platform through Wi-Fi; finally, a WEB interface is built in the AWS cloud to realize remote monitoring of the smart home environment. This design has a good reference for the design of future smart home security systems.
2020-12-01
Sunny, S. M. N. A., Liu, X., Shahriar, M. R..  2018.  Remote Monitoring and Online Testing of Machine Tools for Fault Diagnosis and Maintenance Using MTComm in a Cyber-Physical Manufacturing Cloud. 2018 IEEE 11th International Conference on Cloud Computing (CLOUD). :532—539.

Existing systems allow manufacturers to acquire factory floor data and perform analysis with cloud applications for machine health monitoring, product quality prediction, fault diagnosis and prognosis etc. However, they do not provide capabilities to perform testing of machine tools and associated components remotely, which is often crucial to identify causes of failure. This paper presents a fault diagnosis system in a cyber-physical manufacturing cloud (CPMC) that allows manufacturers to perform diagnosis and maintenance of manufacturing machine tools through remote monitoring and online testing using Machine Tool Communication (MTComm). MTComm is an Internet scale communication method that enables both monitoring and operation of heterogeneous machine tools through RESTful web services over the Internet. It allows manufacturers to perform testing operations from cloud applications at both machine and component level for regular maintenance and fault diagnosis. This paper describes different components of the system and their functionalities in CPMC and techniques used for anomaly detection and remote online testing using MTComm. It also presents the development of a prototype of the proposed system in a CPMC testbed. Experiments were conducted to evaluate its performance to diagnose faults and test machine tools remotely during various manufacturing scenarios. The results demonstrated excellent feasibility to detect anomaly during manufacturing operations and perform testing operations remotely from cloud applications using MTComm.

Kadhim, Y., Mishra, A..  2019.  Radial Basis Function (RBF) Based on Multistage Autoencoders for Intrusion Detection system (IDS). 2019 1st International Informatics and Software Engineering Conference (UBMYK). :1—4.

In this paper, RBF-based multistage auto-encoders are used to detect IDS attacks. RBF has numerous applications in various actual life settings. The planned technique involves a two-part multistage auto-encoder and RBF. The multistage auto-encoder is applied to select top and sensitive features from input data. The selected features from the multistage auto-encoder is wired as input to the RBF and the RBF is trained to categorize the input data into two labels: attack or no attack. The experiment was realized using MATLAB2018 on a dataset comprising 175,341 case, each of which involves 42 features and is authenticated using 82,332 case. The developed approach here has been applied for the first time, to the knowledge of the authors, to detect IDS attacks with 98.80% accuracy when validated using UNSW-NB15 dataset. The experimental results show the proposed method presents satisfactory results when compared with those obtained in this field.

Hendrawan, H., Sukarno, P., Nugroho, M. A..  2019.  Quality of Service (QoS) Comparison Analysis of Snort IDS and Bro IDS Application in Software Define Network (SDN) Architecture. 2019 7th International Conference on Information and Communication Technology (ICoICT). :1—7.

Intrusion Detection system (IDS) was an application which was aimed to monitor network activity or system and it could find if there was a dangerous operation. Implementation of IDS on Software Define Network architecture (SDN) has drawbacks. IDS on SDN architecture might decreasing network Quality of Service (QoS). So the network could not provide services to the existing network traffic. Throughput, delay and packet loss were important parameters of QoS measurement. Snort IDS and bro IDS were tools in the application of IDS on the network. Both had differences, one of which was found in the detection method. Snort IDS used a signature based detection method while bro IDS used an anomaly based detection method. The difference between them had effects in handling the network traffic through it. In this research, we compared both tools. This comparison are done with testing parameters such as throughput, delay, packet loss, CPU usage, and memory usage. From this test, it was found that bro outperform snort IDS for throughput, delay , and packet loss parameters. However, CPU usage and memory usage on bro requires higher resource than snort.

2020-11-17
Radha, P., Selvakumar, N., Sekar, J. Raja, Johnsonselva, J. V..  2018.  Enhancing Internet of Battle Things using Ultrasonic assisted Non-Destructive Testing (Technical solution). 2018 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC). :1—4.

The subsystem of IoMT (Internet of Military of Things) called IoBT (Internet of Battle of Things) is the major resource of the military where the various stack holders of the battlefield and different categories of equipment are tightly integrated through the internet. The proposed architecture mentioned in this paper will be helpful to design IoBT effectively for warfare using irresistible technologies like information technology, embedded technology, and network technology. The role of Machine intelligence is essential in IoBT to create smart things and provide accurate solutions without human intervention. Non-Destructive Testing (NDT) is used in Industries to examine and analyze the invisible defects of equipment. Generally, the ultrasonic waves are used to examine and analyze the internal defects of materials. Hence the proposed architecture of IoBT is enhanced by ultrasonic based NDT to study the properties of the things of the battlefield without causing any damage.

Qian, K., Parizi, R. M., Lo, D..  2018.  OWASP Risk Analysis Driven Security Requirements Specification for Secure Android Mobile Software Development. 2018 IEEE Conference on Dependable and Secure Computing (DSC). :1—2.
The security threats to mobile applications are growing explosively. Mobile apps flaws and security defects open doors for hackers to break in and access sensitive information. Defensive requirements analysis should be an integral part of secure mobile SDLC. Developers need to consider the information confidentiality and data integrity, to verify the security early in the development lifecycle rather than fixing the security holes after attacking and data leaks take place. Early eliminating known security vulnerabilities will help developers increase the security of apps and reduce the likelihood of exploitation. However, many software developers lack the necessary security knowledge and skills at the development stage, and that's why Secure Mobile Software Development education is very necessary for mobile software engineers. In this paper, we propose a guided security requirement analysis based on OWASP Mobile Top ten security risk recommendations for Android mobile software development and its traceability of the developmental controls in SDLC. Building secure apps immune to the OWASP Mobile Top ten risks would be an effective approach to provide very useful mobile security guidelines.
2020-11-04
Ngambeki, I., Nico, P., Dai, J., Bishop, M..  2018.  Concept Inventories in Cybersecurity Education: An Example from Secure Programming. 2018 IEEE Frontiers in Education Conference (FIE). :1—5.

This Innovative Practice Work in Progress paper makes the case for using concept inventories in cybersecurity education and presents an example of the development of a concept inventory in the field of secure programming. The secure programming concept inventory is being developed by a team of researchers from four universities. We used a Delphi study to define the content area to be covered by the concept inventory. Participants in the Delphi study included ten experts from academia, government, and industry. Based on the results, we constructed a concept map of secure programming concepts. We then compared this concept map to the Joint Task Force on Cybersecurity Education Curriculum 2017 guidelines to ensure complete coverage of secure programming concepts. Our mapping indicates a substantial match between the concept map and those guidelines.

Huang, B., Zhang, P..  2018.  Software Runtime Accumulative Testing. 2018 12th International Conference on Reliability, Maintainability, and Safety (ICRMS). :218—222.

The "aging" phenomenon occurs after the long-term running of software, with the fault rate rising and running efficiency dropping. As there is no corresponding testing type for this phenomenon among conventional software tests, "software runtime accumulative testing" is proposed. Through analyzing several examples of software aging causing serious accidents, software is placed in the system environment required for running and the occurrence mechanism of software aging is analyzed. In addition, corresponding testing contents and recommended testing methods are designed with regard to all factors causing software aging, and the testing process and key points of testing requirement analysis for carrying out runtime accumulative testing are summarized, thereby providing a method and guidance for carrying out "software runtime accumulative testing" in software engineering.

Howard, J. J., Blanchard, A. J., Sirotin, Y. B., Hasselgren, J. A., Vemury, A. R..  2018.  An Investigation of High-Throughput Biometric Systems: Results of the 2018 Department of Homeland Security Biometric Technology Rally. 2018 IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS). :1—7.

The 2018 Biometric Technology Rally was an evaluation, sponsored by the U.S. Department of Homeland Security, Science and Technology Directorate (DHS S&T), that challenged industry to provide face or face/iris systems capable of unmanned, traveler identification in a high-throughput security environment. Selected systems were installed at the Maryland Test Facility (MdTF), a DHS S&T affiliated bio-metrics testing laboratory, and evaluated using a population of 363 naive human subjects recruited from the general public. The performance of each system was examined based on measured throughput, capture capability, matching capability, and user satisfaction metrics. This research documents the performance of unmanned face and face/iris systems required to maintain an average total subject interaction time of less than 10 seconds. The results highlight discrepancies between the performance of biometric systems as anticipated by the system designers and the measured performance, indicating an incomplete understanding of the main determinants of system performance. Our research shows that failure-to-acquire errors, unpredicted by system designers, were the main driver of non-identification rates instead of failure-to-match errors, which were better predicted. This outcome indicates the need for a renewed focus on reducing the failure-to-acquire rate in high-throughput, unmanned biometric systems.

2020-11-02
Ping, C., Jun-Zhe, Z..  2019.  Research on Intelligent Evaluation Method of Transient Analysis Software Function Test. 2019 International Conference on Advances in Construction Machinery and Vehicle Engineering (ICACMVE). :58–61.

In transient distributed cloud computing environment, software is vulnerable to attack, which leads to software functional completeness, so it is necessary to carry out functional testing. In order to solve the problem of high overhead and high complexity of unsupervised test methods, an intelligent evaluation method for transient analysis software function testing based on active depth learning algorithm is proposed. Firstly, the active deep learning mathematical model of transient analysis software function test is constructed by using association rule mining method, and the correlation dimension characteristics of software function failure are analyzed. Then the reliability of the software is measured by the spectral density distribution method of software functional completeness. The intelligent evaluation model of transient analysis software function testing is established in the transient distributed cloud computing environment, and the function testing and reliability intelligent evaluation are realized. Finally, the performance of the transient analysis software is verified by the simulation experiment. The results show that the accuracy of the software functional integrity positioning is high and the intelligent evaluation of the transient analysis software function testing has a good self-adaptability by using this method to carry out the function test of the transient analysis software. It ensures the safe and reliable operation of the software.

Zhang, Z., Xie, X..  2019.  On the Investigation of Essential Diversities for Deep Learning Testing Criteria. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). :394–405.

Recent years, more and more testing criteria for deep learning systems has been proposed to ensure system robustness and reliability. These criteria were defined based on different perspectives of diversity. However, there lacks comprehensive investigation on what are the most essential diversities that should be considered by a testing criteria for deep learning systems. Therefore, in this paper, we conduct an empirical study to investigate the relation between test diversities and erroneous behaviors of deep learning models. We define five metrics to reflect diversities in neuron activities, and leverage metamorphic testing to detect erroneous behaviors. We investigate the correlation between metrics and erroneous behaviors. We also go further step to measure the quality of test suites under the guidance of defined metrics. Our results provided comprehensive insights on the essential diversities for testing criteria to exhibit good fault detection ability.

Aman, W., Khan, F..  2019.  Ontology-based Dynamic and Context-aware Security Assessment Automation for Critical Applications. 2019 IEEE 8th Global Conference on Consumer Electronics (GCCE). :644–647.

Several assessment techniques and methodologies exist to analyze the security of an application dynamically. However, they either are focused on a particular product or are mainly concerned about the assessment process rather than the product's security confidence. Most crucially, they tend to assess the security of a target application as a standalone artifact without assessing its host infrastructure. Such attempts can undervalue the overall security posture since the infrastructure becomes crucial when it hosts a critical application. We present an ontology-based security model that aims to provide the necessary knowledge, including network settings, application configurations, testing techniques and tools, and security metrics to evaluate the security aptitude of a critical application in the context of its hosting infrastructure. The objective is to integrate the current good practices and standards in security testing and virtualization to furnish an on-demand and test-ready virtual target infrastructure to execute the critical application and to initiate a context-aware and quantifiable security assessment process in an automated manner. Furthermore, we present a security assessment architecture to reflect on how the ontology can be integrated into a standard process.

Chong, T., Anu, V., Sultana, K. Z..  2019.  Using Software Metrics for Predicting Vulnerable Code-Components: A Study on Java and Python Open Source Projects. 2019 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC). :98–103.

Software vulnerabilities often remain hidden until an attacker exploits the weak/insecure code. Therefore, testing the software from a vulnerability discovery perspective becomes challenging for developers if they do not inspect their code thoroughly (which is time-consuming). We propose that vulnerability prediction using certain software metrics can support the testing process by identifying vulnerable code-components (e.g., functions, classes, etc.). Once a code-component is predicted as vulnerable, the developers can focus their testing efforts on it, thereby avoiding the time/effort required for testing the entire application. The current paper presents a study that compares how software metrics perform as vulnerability predictors for software projects developed in two different languages (Java vs Python). The goal of this research is to analyze the vulnerability prediction performance of software metrics for different programming languages. We designed and conducted experiments on security vulnerabilities reported for three Java projects (Apache Tomcat 6, Tomcat 7, Apache CXF) and two Python projects (Django and Keystone). In this paper, we focus on a specific type of code component: Functions. We apply Machine Learning models for predicting vulnerable functions. Overall results show that software metrics-based vulnerability prediction is more useful for Java projects than Python projects (i.e., software metrics when used as features were able to predict Java vulnerable functions with a higher recall and precision compared to Python vulnerable functions prediction).

2020-10-29
Priyamvada Davuluru, Venkata Salini, Narayanan Narayanan, Barath, Balster, Eric J..  2019.  Convolutional Neural Networks as Classification Tools and Feature Extractors for Distinguishing Malware Programs. 2019 IEEE National Aerospace and Electronics Conference (NAECON). :273—278.

Classifying malware programs is a research area attracting great interest for Anti-Malware industry. In this research, we propose a system that visualizes malware programs as images and distinguishes those using Convolutional Neural Networks (CNNs). We study the performance of several well-established CNN based algorithms such as AlexNet, ResNet and VGG16 using transfer learning approaches. We also propose a computationally efficient CNN-based architecture for classification of malware programs. In addition, we study the performance of these CNNs as feature extractors by using Support Vector Machine (SVM) and K-nearest Neighbors (kNN) for classification purposes. We also propose fusion methods to boost the performance further. We make use of the publicly available database provided by Microsoft Malware Classification Challenge (BIG 2015) for this study. Our overall performance is 99.4% for a set of 2174 test samples comprising 9 different classes thereby setting a new benchmark.

Lo, Wai Weng, Yang, Xu, Wang, Yapeng.  2019.  An Xception Convolutional Neural Network for Malware Classification with Transfer Learning. 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS). :1—5.

In this work, we applied a deep Convolutional Neural Network (CNN) with Xception model to perform malware image classification. The Xception model is a recently developed special CNN architecture that is more powerful with less over- fitting problems than the current popular CNN models such as VGG16. However only a few use cases of the Xception model can be found in literature, and it has never been used to solve the malware classification problem. The performance of our approach was compared with other methods including KNN, SVM, VGG16 etc. The experiments on two datasets (Malimg and Microsoft Malware Dataset) demonstrated that the Xception model can achieve the highest training accuracy than all other approaches including the champion approach, and highest validation accuracy than all other approaches including VGG16 model which are using image-based malware classification (except the champion solution as this information was not provided). Additionally, we proposed a novel ensemble model to combine the predictions from .bytes files and .asm files, showing that a lower logloss can be achieved. Although the champion on the Microsoft Malware Dataset achieved a bit lower logloss, our approach does not require any features engineering, making it more effective to adapt to any future evolution in malware, and very much less time consuming than the champion's solution.

2020-10-26
Sethi, Kamalakanta, Kumar, Rahul, Sethi, Lingaraj, Bera, Padmalochan, Patra, Prashanta Kumar.  2019.  A Novel Machine Learning Based Malware Detection and Classification Framework. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1–4.
As time progresses, new and complex malware types are being generated which causes a serious threat to computer systems. Due to this drastic increase in the number of malware samples, the signature-based malware detection techniques cannot provide accurate results. Different studies have demonstrated the proficiency of machine learning for the detection and classification of malware files. Further, the accuracy of these machine learning models can be improved by using feature selection algorithms to select the most essential features and reducing the size of the dataset which leads to lesser computations. In this paper, we have developed a machine learning based malware analysis framework for efficient and accurate malware detection and classification. We used Cuckoo sandbox for dynamic analysis which executes malware in an isolated environment and generates an analysis report based on the system activities during execution. Further, we propose a feature extraction and selection module which extracts features from the report and selects the most important features for ensuring high accuracy at minimum computation cost. Then, we employ different machine learning algorithms for accurate detection and fine-grained classification. Experimental results show that we got high detection and classification accuracy in comparison to the state-of-the-art approaches.