Visible to the public Biblio

Found 5879 results

Filters: Keyword is composability  [Clear All Filters]
2022-07-28
Ruohonen, Jukka, Hjerppe, Kalle, Rindell, Kalle.  2021.  A Large-Scale Security-Oriented Static Analysis of Python Packages in PyPI. 2021 18th International Conference on Privacy, Security and Trust (PST). :1—10.
Different security issues are a common problem for open source packages archived to and delivered through software ecosystems. These often manifest themselves as software weaknesses that may lead to concrete software vulnerabilities. This paper examines various security issues in Python packages with static analysis. The dataset is based on a snapshot of all packages stored to the Python Package Index (PyPI). In total, over 197 thousand packages and over 749 thousand security issues are covered. Even under the constraints imposed by static analysis, (a) the results indicate prevalence of security issues; at least one issue is present for about 46% of the Python packages. In terms of the issue types, (b) exception handling and different code injections have been the most common issues. The subprocess module stands out in this regard. Reflecting the generally small size of the packages, (c) software size metrics do not predict well the amount of issues revealed through static analysis. With these results and the accompanying discussion, the paper contributes to the field of large-scale empirical studies for better understanding security problems in software ecosystems.
Wang, Jingjing, Huang, Minhuan, Nie, Yuanping, Li, Jin.  2021.  Static Analysis of Source Code Vulnerability Using Machine Learning Techniques: A Survey. 2021 4th International Conference on Artificial Intelligence and Big Data (ICAIBD). :76—86.

With the rapid increase of practical problem complexity and code scale, the threat of software security is increasingly serious. Consequently, it is crucial to pay attention to the analysis of software source code vulnerability in the development stage and take efficient measures to detect the vulnerability as soon as possible. Machine learning techniques have made remarkable achievements in various fields. However, the application of machine learning in the domain of vulnerability static analysis is still in its infancy and the characteristics and performance of diverse methods are quite different. In this survey, we focus on a source code-oriented static vulnerability analysis method using machine learning techniques. We review the studies on source code vulnerability analysis based on machine learning in the past decade. We systematically summarize the development trends and different technical characteristics in this field from the perspectives of the intermediate representation of source code and vulnerability prediction model and put forward several feasible research directions in the future according to the limitations of the current approaches.

Obert, James, Loffredo, Tim.  2021.  Efficient Binary Static Code Data Flow Analysis Using Unsupervised Learning. 2021 4th International Conference on Artificial Intelligence for Industries (AI4I). :89—90.
The ever increasing need to ensure that code is reliably, efficiently and safely constructed has fueled the evolution of popular static binary code analysis tools. In identifying potential coding flaws in binaries, tools such as IDA Pro are used to disassemble the binaries into an opcode/assembly language format in support of manual static code analysis. Because of the highly manual and resource intensive nature involved with analyzing large binaries, the probability of overlooking potential coding irregularities and inefficiencies is quite high. In this paper, a light-weight, unsupervised data flow methodology is described which uses highly-correlated data flow graph (CDFGs) to identify coding irregularities such that analysis time and required computing resources are minimized. Such analysis accuracy and efficiency gains are achieved by using a combination of graph analysis and unsupervised machine learning techniques which allows an analyst to focus on the most statistically significant flow patterns while performing binary static code analysis.
Qian, Tiantian, Yang, Shengchun, Wang, Shenghe, Pan, Dong, Geng, Jian, Wang, Ke.  2021.  Static Security Analysis of Source-Side High Uncertainty Power Grid Based on Deep Learning. 2021 China International Conference on Electricity Distribution (CICED). :973—975.
As a large amount of renewable energy is injected into the power grid, the source side of the power grid becomes extremely uncertain. Traditional static safety analysis methods based on pure physical models can no longer quickly and reliably give analysis results. Therefore, this paper proposes a deep learning-based static security analytical method. First, the static security assessment index of the power grid under the N-1 principle is proposed. Secondly, a neural network model and its input and output data for static safety analysis problems are designed. Finally, the validity of the proposed method was verified by IEEE grid data. Experiments show that the proposed method can quickly and accurately give the static security analysis results of the source-side high uncertainty grid.
ÖZGÜR, Berkecan, Dogru, Ibrahim Alper, Uçtu, Göksel, ALKAN, Mustafa.  2021.  A Suggested Model for Mobile Application Penetration Test Framework. 2021 International Conference on Information Security and Cryptology (ISCTURKEY). :18—21.

Along with technological developments in the mobile environment, mobile devices are used in many areas like banking, social media and communication. The common characteristic of applications in these fields is that they contain personal or financial information of users. These types of applications are developed for Android or IOS operating systems and have become the target of attackers. To detect weakness, security analysts, perform mobile penetration tests using security analysis tools. These analysis tools have advantages and disadvantages to each other. Some tools can prioritize static or dynamic analysis, others not including these types of tests. Within the scope of the current model, we are aim to gather security analysis tools under the penetration testing framework, also contributing analysis results by data fusion algorithm. With the suggested model, security analysts will be able to use these types of analysis tools in addition to using the advantage of fusion algorithms fed by analysis tools outputs.

Ami, Amit Seal, Kafle, Kaushal, Nadkarni, Adwait, Poshyvanyk, Denys, Moran, Kevin.  2021.  µSE: Mutation-Based Evaluation of Security-Focused Static Analysis Tools for Android. 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). :53—56.
This demo paper presents the technical details and usage scenarios of μSE: a mutation-based tool for evaluating security-focused static analysis tools for Android. Mutation testing is generally used by software practitioners to assess the robustness of a given test-suite. However, we leverage this technique to systematically evaluate static analysis tools and uncover and document soundness issues.μSE's analysis has found 25 previously undocumented flaws in static data leak detection tools for Android.μSE offers four mutation schemes, namely Reachability, Complex-reachability, TaintSink, and ScopeSink, which determine the locations of seeded mutants. Furthermore, the user can extend μSE by customizing the API calls targeted by the mutation analysis.μSE is also practical, as it makes use of filtering techniques based on compilation and execution criteria that reduces the number of ineffective mutations.
Iqbal, Younis, Sindhu, Muddassar Azam, Arif, Muhammad Hassan, Javed, Muhammad Amir.  2021.  Enhancement in Buffer Overflow (BOF) Detection Capability of Cppcheck Static Analysis Tool. 2021 International Conference on Cyber Warfare and Security (ICCWS). :112—117.

Buffer overflow (BOF) vulnerability is one of the most dangerous security vulnerability which can be exploited by unwanted users. This vulnerability can be detected by both static and dynamic analysis techniques. For dynamic analysis, execution of the program is required in which the behavior of the program according to specifications is checked while in static analysis the source code is analyzed for security vulnerabilities without execution of code. Despite the fact that many open source and commercial security analysis tools employ static and dynamic methods but there is still a margin for improvement in BOF vulnerability detection capability of these tools. We propose an enhancement in Cppcheck tool for statically detecting BOF vulnerability using data flow analysis in C programs. We have used the Juliet Test Suite to test our approach. We selected two best tools cited in the literature for BOF detection (i.e. Frama-C and Splint) to compare the performance and accuracy of our approach. From the experiments, our proposed approach generated Youden Index of 0.45, Frama-C has only 0.1 Youden's score and Splint generated Youden score of -0.47. These results show that our technique performs better as compared to both Frama-C and Splint static analysis tools.

2022-07-15
Ray, Oliver, Moyle, Steve.  2021.  Towards expert-guided elucidation of cyber attacks through interactive inductive logic programming. 2021 13th International Conference on Knowledge and Systems Engineering (KSE). :1—7.
This paper proposes a logic-based machine learning approach called Acuity which is designed to facilitate user-guided elucidation of novel phenomena from evidence sparsely distributed across large volumes of linked relational data. The work builds on systems from the field of Inductive Logic Programming (ILP) by introducing a suite of new techniques for interacting with domain experts and data sources in a way that allows complex logical reasoning to be strategically exploited on large real-world databases through intuitive hypothesis-shaping and data-caching functionality. We propose two methods for rebutting or shaping candidate hypotheses and two methods for querying or importing relevant data from multiple sources. The benefits of Acuity are illustrated in a proof-of-principle case study involving a retrospective analysis of the CryptoWall ransomware attack using data from a cyber security testbed comprising a small business network and an infected laptop.
Aggarwal, Pranjal, Kumar, Akash, Michael, Kshitiz, Nemade, Jagrut, Sharma, Shubham, C, Pavan Kumar.  2021.  Random Decision Forest approach for Mitigating SQL Injection Attacks. 2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). :1—5.
Structured Query Language (SQL) is extensively used for storing, manipulating and retrieving information in the relational database management system. Using SQL statements, attackers will try to gain unauthorized access to databases and launch attacks to modify/retrieve the stored data, such attacks are called as SQL injection attacks. Such SQL Injection (SQLi) attacks tops the list of web application security risks of all the times. Identifying and mitigating the potential SQL attack statements before their execution can prevent SQLi attacks. Various techniques are proposed in the literature to mitigate SQLi attacks. In this paper, a random decision forest approach is introduced to mitigate SQLi attacks. From the experimental results, we can infer that the proposed approach achieves a precision of 97% and an accuracy of 95%.
Hua, Yi, Li, Zhangbing, Sheng, Hankang, Wang, Baichuan.  2021.  A Method for Finding Quasi-identifier of Single Structured Relational Data. 2021 7th IEEE Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :93—98.
Quasi-identifier is an attribute combined with other attributes to identify specific tuples or partial tuples. Improper selection of quasi-identifiers will lead to the failure of current privacy protection anonymization technology. Therefore, in this paper, we propose a method to solve single structured relational data quasi-identifiers based on functional dependency and determines the attribute classification standard. Firstly, the solution scope of quasi-identifier is determined to be all attributes except identity attributes and critical attributes. Secondly, the real data set is used to evaluate the dependency relationship between the indefinite attribute subset and the identity attribute to solve the quasi-identifiers set. Finally, we propose an algorithm to find all quasi-identifiers and experiment on real data sets of different sizes. The results show that our method can achieve better performance on the same dataset.
Jony, Mehdi Hassan, Johora, Fatema Tuj, Katha, Jannatul Ferdous.  2021.  A Robust and Efficient Numeric Approach for Relational Database Watermarking. 2021 3rd International Conference on Sustainable Technologies for Industry 4.0 (STI). :1—6.
Sharing relational databases on the Internet creates the need to protect these databases. Its output in substantial losses to the data storing systems because of unauthorized access to information that could lose novelty. The research associations use the research databases to mine new information about the research works of the relational databases that are available for free. It is a great challenge to maintain authenticity because these databases are vulnerable to security issues. Watermarking is a candidate solution that fully protects databases shared with the receiver. The protection of relational database ownership that may continue to evolve against the various aquatic mechanisms shared with the recipient that arouses appetite for attacks and must continue to evolve so that they can have database knowledge to support their decision-making system is effective. The relational database based onVirtual private key Watermarking using numeric attribute) involves embedding the same watermark in the same properties in different places in the same place. Therefore, data attackers cannot remove watermarks from data. The proposed strategy is to work by inserting watermark bits in such a way that it causes minimal distortion in the data and the data usability must remain intact after the data is watermarked. The proposed strategy is to work by inserting watermark bits in such a way that it causes minimal distortion in the data and the ability to use the data after watermarking the data must remain intact. The existence of a primary key is the main feature or compulsory item for most of the strategies. Our method provides solutions no primary key feature where the integrating search system of the database remains intact after watermarking distortion.
Figueiredo, Cainã, Lopes, João Gabriel, Azevedo, Rodrigo, Zaverucha, Gerson, Menasché, Daniel Sadoc, Pfleger de Aguiar, Leandro.  2021.  Software Vulnerabilities, Products and Exploits: A Statistical Relational Learning Approach. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :41—46.
Data on software vulnerabilities, products and exploits is typically collected from multiple non-structured sources. Valuable information, e.g., on which products are affected by which exploits, is conveyed by matching data from those sources, i.e., through their relations. In this paper, we leverage this simple albeit unexplored observation to introduce a statistical relational learning (SRL) approach for the analysis of vulnerabilities, products and exploits. In particular, we focus on the problem of determining the existence of an exploit for a given product, given information about the relations between products and vulnerabilities, and vulnerabilities and exploits, focusing on Industrial Control Systems (ICS), the National Vulnerability Database and ExploitDB. Using RDN-Boost, we were able to reach an AUC ROC of 0.83 and an AUC PR of 0.69 for the problem at hand. To reach that performance, we indicate that it is instrumental to include textual features, e.g., extracted from the description of vulnerabilities, as well as structured information, e.g., about product categories. In addition, using interpretable relational regression trees we report simple rules that shed insight on factors impacting the weaponization of ICS products.
Bašić, B., Udovičić, P., Orel, O..  2021.  In-database Auditing Subsystem for Security Enhancement. 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO). :1642—1647.
Many information systems have been around for several decades, and most of them have their underlying databases. The data accumulated in those databases over the years could be a very valuable asset, which must be protected. The first role of database auditing is to ensure and confirm that security measures are set correctly. However, tracing user behavior and collecting a rich audit trail enables us to use that trail in a more proactive ways. As an example, audit trail could be analyzed ad hoc and used to prevent intrusion, or analyzed afterwards, to detect user behavior patterns, forecast workloads, etc. In this paper, we present a simple, secure, configurable, role-separated, and effective in-database auditing subsystem, which can be used as a base for access control, intrusion detection, fraud detection and other security-related analyses and procedures. It consists of a management relations, code and data object generators and several administrative tools. This auditing subsystem, implemented in several information systems, is capable of keeping the entire audit trail (data history) of a database, as well as all the executed SQL statements, which enables different security applications, from ad hoc intrusion prevention to complex a posteriori security analyses.
Sánchez, Ricardo Andrés González, Bernal, Davor Julián Moreno, Parada, Hector Dario Jaimes.  2021.  Security assessment of Nosql Mongodb, Redis and Cassandra database managers. 2021 Congreso Internacional de Innovación y Tendencias en Ingeniería (CONIITI). :1—7.
The advancement of technology in the creation of new tools to solve problems such as information storage generates proportionally developing methods that search for security flaws or breaches that compromise said information. The need to periodically generate security reports on database managers is given by the complexity and number of attacks that can be carried out today. This project seeks to carry out an evaluation of the security of NoSQL database managers. The work methodology is developed according to the order of the objectives, it begins by synthesizing the types of vulnerabilities, attacks and protection schemes limited to MongoDB, Redis and Apache Cassandra. Once established, a prototype of a web system that stores information with a non-relational database will be designed on which a series of attacks defined by a test plan will be applied seeking to add, consult, modify or eliminate information. Finally, a report will be presented that sets out the attacks carried out, the way in which they were applied, the results, possible countermeasures, security advantages and disadvantages for each manager and the conclusions obtained. Thus, it is possible to select which tool is more convenient to use for a person or organization in a particular case. The results showed that MongoDB is more vulnerable to NoSQL injection attacks, Redis is more vulnerable to attacks registered in the CVE and that Cassandra is more complex to use but is less vulnerable.
Tang, Xiao, Cao, Zhenfu, Dong, Xiaolei, Shen, Jiachen.  2021.  PKMark: A Robust Zero-distortion Blind Reversible Scheme for Watermarking Relational Databases. 2021 IEEE 15th International Conference on Big Data Science and Engineering (BigDataSE). :72—79.
In this paper, we propose a zero-distortion blind reversible robust scheme for watermarking relational databases called PKMark. Data owner can declare the copyright of the databases or pursue the infringement by extracting the water-mark information embedded in the database. PKMark is mainly based on the primary key attribute of the tuple. So it does not depend on the type of the attribute, and can provide high-precision numerical attributes. PKMark uses RSA encryption on the watermark before embedding the watermark to ensure the security of the watermark information. Then we use RSA to sign the watermark cipher text so that the owner can verify the ownership of the watermark without disclosing the watermark. The watermark embedding and extraction are based on the hash value of the primary key, so the scheme has blindness and reversibility. In other words, the user can obtain the watermark information or restore the original database without comparing it to the original database. Our scheme also has almost excellent robustness against addition attacks, deletion attacks and alteration attacks. In addition, PKMark is resistant to additive attacks, allowing different users to embed multiple watermarks without interfering with each other, and it can indicate the sequence of watermark embedding so as to indicate the original copyright owner of the database. This watermarking scheme also allows data owners to detect whether the data has been tampered with.
Giesser, Patrick, Stechschulte, Gabriel, Costa Vaz, Anna da, Kaufmann, Michael.  2021.  Implementing Efficient and Scalable In-Database Linear Regression in SQL. 2021 IEEE International Conference on Big Data (Big Data). :5125—5132.
Relational database management systems not only support larger-than-memory data processing and very advanced query optimization, but also offer the benefits of data security, privacy, and consistency. When machine learning on large data sets is processed directly on an existing SQL database server, the data does not need to be exported and transferred to a separate big data processing platform. To achieve this, we implement a linear regression algorithm using SQL code generation such that the computation can be performed server-side and directly in the RDBMs. Our method and its implementation, programmed in Python, solves linear regression (LR) using the ordinary least squares (OLS) method directly in the RDBMS using SQL code generation, leaving most of the processing in the database. Only the matrix of the system of equations, whose size is equal to the number of variables squared, is transferred from the SQL server to the Python client to be solved for OLS regression. For evaluation purposes, our LR implementation was tested with artificially generated datasets and compared to an existing Python library (Scikit Learn). We found that our implementation consistently solves OLS regression faster than Scikit Learn for datasets with more than 10,000 input rows, and if the number of columns is less than 64. Moreover, under the same test conditions where the computation is larger than memory, our implementation showed a fast result, while Scikit returned an out-of-memory error. We conclude that SQL is a promising tool for in-database processing of large-volume, low-dimensional data sets with a particular class of machine learning algorithms, namely those that can be efficiently solved with map-reduce queries such as OLS regression.
Pengwei, Ma, Kai, Wei, Chunyu, Jiang, Junyi, Li, Jiafeng, Tian, Siyuan, Liu, Minjing, Zhong.  2021.  Research on Evaluation System of Relational Cloud Database. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1369—1373.
With the continuous emergence of cloud computing technology, cloud infrastructure software will become the mainstream application model in the future. Among the databases, relational databases occupy the largest market share. Therefore, the relational cloud database will be the main product of the combination of database technology and cloud computing technology, and will become an important branch of the database industry. This article explores the establishment of an evaluation system framework for relational databases, helping enterprises to select relational cloud database products according to a clear goal and path. This article can help enterprises complete the landing of relational cloud database projects.
Lagraa, Sofiane, State, Radu.  2021.  What database do you choose for heterogeneous security log events analysis? 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM). :812—817.
The heterogeneous massive logs incoming from multiple sources pose major challenges to professionals responsible for IT security and system administrator. One of the challenges is to develop a scalable heterogeneous logs database for storage and further analysis. In fact, it is difficult to decide which database is suitable for the needs, the best of a use case, execution time and storage performances. In this paper, we explore, study, and compare the performance of SQL and NoSQL databases on large heterogeneous event logs. We implement the relational database using MySQL, the column-oriented database using Impala on the top of Hadoop, and the graph database using Neo4j. We experiment the databases on a large heterogeneous logs and provide advice, the pros and cons of each SQL and NoSQL database. Our findings that Impala outperforms MySQL and Neo4j databases in terms of loading logs, execution time of simple queries, and storage of logs. However, Neo4j outperforms Impala and MySQL in the execution time of complex queries.
2022-07-14
Rathod, Viraj, Parekh, Chandresh, Dholariya, Dharati.  2021.  AI & ML Based Anamoly Detection and Response Using Ember Dataset. 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO). :1–5.
In the era of rapid technological growth, malicious traffic has drawn increased attention. Most well-known offensive security assessment todays are heavily focused on pre-compromise. The amount of anomalous data in today's context is massive. Analyzing the data using primitive methods would be highly challenging. Solution to it is: If we can detect adversary behaviors in the early stage of compromise, one can prevent and safeguard themselves from various attacks including ransomwares and Zero-day attacks. Integration of new technologies Artificial Intelligence & Machine Learning with manual Anomaly Detection can provide automated machine-based detection which in return can provide the fast, error free, simplify & scalable Threat Detection & Response System. Endpoint Detection & Response (EDR) tools provide a unified view of complex intrusions using known adversarial behaviors to identify intrusion events. We have used the EMBER dataset, which is a labelled benchmark dataset. It is used to train machine learning models to detect malicious portable executable files. This dataset consists of features derived from 1.1 million binary files: 900,000 training samples among which 300,000 were malicious, 300,000 were benevolent, 300,000 un-labelled, and 200,000 evaluation samples among which 100K were malicious, 100K were benign. We have also included open-source code for extracting features from additional binaries, enabling the addition of additional sample features to the dataset.
Taylor, Michael A., Larson, Eric C., Thornton, Mitchell A..  2021.  Rapid Ransomware Detection through Side Channel Exploitation. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :47–54.
A new method for the detection of ransomware in an infected host is described and evaluated. The method utilizes data streams from on-board sensors to fingerprint the initiation of a ransomware infection. These sensor streams, which are common in modern computing systems, are used as a side channel for understanding the state of the system. It is shown that ransomware detection can be achieved in a rapid manner and that the use of slight, yet distinguishable changes in the physical state of a system as derived from a machine learning predictive model is an effective technique. A feature vector, consisting of various sensor outputs, is coupled with a detection criteria to predict the binary state of ransomware present versus normal operation. An advantage of this approach is that previously unknown or zero-day version s of ransomware are vulnerable to this detection method since no apriori knowledge of the malware characteristics are required. Experiments are carried out with a variety of different system loads and with different encryption methods used during a ransomware attack. Two test systems were utilized with one having a relatively low amount of available sensor data and the other having a relatively high amount of available sensor data. The average time for attack detection in the "sensor-rich" system was 7.79 seconds with an average Matthews correlation coefficient of 0.8905 for binary system state predictions regardless of encryption method and system load. The model flagged all attacks tested.
Ayub, Md. Ahsan, Sirai, Ambareen.  2021.  Similarity Analysis of Ransomware based on Portable Executable (PE) File Metadata. 2021 IEEE Symposium Series on Computational Intelligence (SSCI). :1–6.
Threats, posed by ransomware, are rapidly increasing, and its cost on both national and global scales is becoming significantly high as evidenced by the recent events. Ransomware carries out an irreversible process, where it encrypts victims' digital assets to seek financial compensations. Adversaries utilize different means to gain initial access to the target machines, such as phishing emails, vulnerable public-facing software, Remote Desktop Protocol (RDP), brute-force attacks, and stolen accounts. To combat these threats of ransomware, this paper aims to help researchers gain a better understanding of ransomware application profiles through static analysis, where we identify a list of suspicious indicators and similarities among 727 active ran-somware samples. We start with generating portable executable (PE) metadata for all the studied samples. With our domain knowledge and exploratory data analysis tasks, we introduce some of the suspicious indicators of the structure of ransomware files. We reduce the dimensionality of the generated dataset by using the Principal Component Analysis (PCA) technique and discover clusters by applying the KMeans algorithm. This motivates us to utilize the one-class classification algorithms on the generated dataset. As a result, the algorithms learn the common data boundary in the structure of our studied ransomware samples, and thereby, we achieve the data-driven similarities. We use the findings to evaluate the trained classifiers with the test samples and observe that the Local Outlier Factor (LoF) performs better on all the selected feature spaces compared to the One-Class SVM and the Isolation Forest algorithms.
Almousa, May, Osawere, Janet, Anwar, Mohd.  2021.  Identification of Ransomware families by Analyzing Network Traffic Using Machine Learning Techniques. 2021 Third International Conference on Transdisciplinary AI (TransAI). :19–24.
The number of prominent ransomware attacks has increased recently. In this research, we detect ransomware by analyzing network traffic by using machine learning algorithms and comparing their detection performances. We have developed multi-class classification models to detect families of ransomware by using the selected network traffic features, which focus on the Transmission Control Protocol (TCP). Our experiment showed that decision trees performed best for classifying ransomware families with 99.83% accuracy, which is slightly better than the random forest algorithm with 99.61% accuracy. The experimental result without feature selection classified six ransomware families with high accuracy. On the other hand, classifiers with feature selection gave nearly the same result as those without feature selection. However, using feature selection gives the advantage of lower memory usage and reduced processing time, thereby increasing speed. We discovered the following ten important features for detecting ransomware: time delta, frame length, IP length, IP destination, IP source, TCP length, TCP sequence, TCP next sequence, TCP header length, and TCP initial round trip.
Pagán, Alexander, Elleithy, Khaled.  2021.  A Multi-Layered Defense Approach to Safeguard Against Ransomware. 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC). :0942–0947.
There has been a significant rise in ransomware attacks over the last few years. Cyber attackers have made use of tried and true ransomware viruses to target the government, health care, and educational institutions. Ransomware variants can be purchased on the dark web by amateurs giving them the same attack tools used by professional cyber attackers without experience or skill. Traditional antivirus and antimalware products have improved, but they alone fall short when it comes to catching and stopping ransomware attacks. Employee training has become one of the most important aspects of being prepared for attempted cyberattacks. However, training alone only goes so far; human error is still the main entry point for malware and ransomware infections. In this paper, we propose a multi-layered defense approach to safeguard against ransomware. We have come to the startling realization that it is not a matter of “if” your organization will be hit with ransomware, but “when” your organization will be hit with ransomware. If an organization is not adequately prepared for an attack or how to respond to an attack, the effects can be costly and devastating. Our approach proposes having innovative antimalware software on the local machines, properly configured firewalls, active DNS/Web filtering, email security, backups, and staff training. With the implementation of this layered defense, the attempt can be caught and stopped at multiple points in the event of an attempted ransomware attack. If the attack were successful, the layered defense provides the option for recovery of affected data without paying a ransom.
Almousa, May, Basavaraju, Sai, Anwar, Mohd.  2021.  API-Based Ransomware Detection Using Machine Learning-Based Threat Detection Models. 2021 18th International Conference on Privacy, Security and Trust (PST). :1–7.
Ransomware is a major malware attack experienced by large corporations and healthcare services. Ransomware employs the idea of cryptovirology, which uses cryptography to design malware. The goal of ransomware is to extort ransom by threatening the victim with the destruction of their data. Ransomware typically involves a 3-step process: analyzing the victim’s network traffic, identifying a vulnerability, and then exploiting it. Thus, the detection of ransomware has become an important undertaking that involves various sophisticated solutions for improving security. To further enhance ransomware detection capabilities, this paper focuses on an Application Programming Interface (API)-based ransomware detection approach in combination with machine learning (ML) techniques. The focus of this research is (i) understanding the life cycle of ransomware on the Windows platform, (ii) dynamic analysis of ransomware samples to extract various features of malicious code patterns, and (iii) developing and validating machine learning-based ransomware detection models on different ransomware and benign samples. Data were collected from publicly available repositories and subjected to sandbox analysis for sampling. The sampled datasets were applied to build machine learning models. The grid search hyperparameter optimization algorithm was employed to obtain the best fit model; the results were cross-validated with the testing datasets. This analysis yielded a high ransomware detection accuracy of 99.18% for Windows-based platforms and shows the potential for achieving high-accuracy ransomware detection capabilities when using a combination of API calls and an ML model. This approach can be further utilized with existing multilayer security solutions to protect critical data from ransomware attacks.
Urooj, Umara, Maarof, Mohd Aizaini Bin, Al-rimy, Bander Ali Saleh.  2021.  A proposed Adaptive Pre-Encryption Crypto-Ransomware Early Detection Model. 2021 3rd International Cyber Resilience Conference (CRC). :1–6.
Crypto-ransomware is a malware that uses the system’s cryptography functions to encrypt user data. The irreversible effect of crypto-ransomware makes it challenging to survive the attack compared to other malware categories. When a crypto-ransomware attack encrypts user files, it becomes difficult to access these files without having the decryption key. Due to the availability of ransomware development tool kits like Ransomware as a Service (RaaS), many ransomware variants are being developed. This contributes to the rise of ransomware attacks witnessed nowadays. However, the conventional approaches employed by malware detection solutions are not suitable to detect ransomware. This is because ransomware needs to be detected as early as before the encryption takes place. These attacks can effectively be handled only if detected during the pre-encryption phase. Early detection of ransomware attacks is challenging due to the limited amount of data available before encryption. An adaptive pre-encryption model is proposed in this paper which is expected to deal with the population concept drift of crypto-ransomware given the limited amount of data collected during the pre-encryption phase of the attack lifecycle. With such adaptability, the model can maintain up-to-date knowledge about the attack behavior and identify the polymorphic ransomware that continuously changes its behavior.