Ray, Oliver, Moyle, Steve.
2021.
Towards expert-guided elucidation of cyber attacks through interactive inductive logic programming. 2021 13th International Conference on Knowledge and Systems Engineering (KSE). :1—7.
This paper proposes a logic-based machine learning approach called Acuity which is designed to facilitate user-guided elucidation of novel phenomena from evidence sparsely distributed across large volumes of linked relational data. The work builds on systems from the field of Inductive Logic Programming (ILP) by introducing a suite of new techniques for interacting with domain experts and data sources in a way that allows complex logical reasoning to be strategically exploited on large real-world databases through intuitive hypothesis-shaping and data-caching functionality. We propose two methods for rebutting or shaping candidate hypotheses and two methods for querying or importing relevant data from multiple sources. The benefits of Acuity are illustrated in a proof-of-principle case study involving a retrospective analysis of the CryptoWall ransomware attack using data from a cyber security testbed comprising a small business network and an infected laptop.
Aggarwal, Pranjal, Kumar, Akash, Michael, Kshitiz, Nemade, Jagrut, Sharma, Shubham, C, Pavan Kumar.
2021.
Random Decision Forest approach for Mitigating SQL Injection Attacks. 2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). :1—5.
Structured Query Language (SQL) is extensively used for storing, manipulating and retrieving information in the relational database management system. Using SQL statements, attackers will try to gain unauthorized access to databases and launch attacks to modify/retrieve the stored data, such attacks are called as SQL injection attacks. Such SQL Injection (SQLi) attacks tops the list of web application security risks of all the times. Identifying and mitigating the potential SQL attack statements before their execution can prevent SQLi attacks. Various techniques are proposed in the literature to mitigate SQLi attacks. In this paper, a random decision forest approach is introduced to mitigate SQLi attacks. From the experimental results, we can infer that the proposed approach achieves a precision of 97% and an accuracy of 95%.
Hua, Yi, Li, Zhangbing, Sheng, Hankang, Wang, Baichuan.
2021.
A Method for Finding Quasi-identifier of Single Structured Relational Data. 2021 7th IEEE Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :93—98.
Quasi-identifier is an attribute combined with other attributes to identify specific tuples or partial tuples. Improper selection of quasi-identifiers will lead to the failure of current privacy protection anonymization technology. Therefore, in this paper, we propose a method to solve single structured relational data quasi-identifiers based on functional dependency and determines the attribute classification standard. Firstly, the solution scope of quasi-identifier is determined to be all attributes except identity attributes and critical attributes. Secondly, the real data set is used to evaluate the dependency relationship between the indefinite attribute subset and the identity attribute to solve the quasi-identifiers set. Finally, we propose an algorithm to find all quasi-identifiers and experiment on real data sets of different sizes. The results show that our method can achieve better performance on the same dataset.
Jony, Mehdi Hassan, Johora, Fatema Tuj, Katha, Jannatul Ferdous.
2021.
A Robust and Efficient Numeric Approach for Relational Database Watermarking. 2021 3rd International Conference on Sustainable Technologies for Industry 4.0 (STI). :1—6.
Sharing relational databases on the Internet creates the need to protect these databases. Its output in substantial losses to the data storing systems because of unauthorized access to information that could lose novelty. The research associations use the research databases to mine new information about the research works of the relational databases that are available for free. It is a great challenge to maintain authenticity because these databases are vulnerable to security issues. Watermarking is a candidate solution that fully protects databases shared with the receiver. The protection of relational database ownership that may continue to evolve against the various aquatic mechanisms shared with the recipient that arouses appetite for attacks and must continue to evolve so that they can have database knowledge to support their decision-making system is effective. The relational database based onVirtual private key Watermarking using numeric attribute) involves embedding the same watermark in the same properties in different places in the same place. Therefore, data attackers cannot remove watermarks from data. The proposed strategy is to work by inserting watermark bits in such a way that it causes minimal distortion in the data and the data usability must remain intact after the data is watermarked. The proposed strategy is to work by inserting watermark bits in such a way that it causes minimal distortion in the data and the ability to use the data after watermarking the data must remain intact. The existence of a primary key is the main feature or compulsory item for most of the strategies. Our method provides solutions no primary key feature where the integrating search system of the database remains intact after watermarking distortion.
Figueiredo, Cainã, Lopes, João Gabriel, Azevedo, Rodrigo, Zaverucha, Gerson, Menasché, Daniel Sadoc, Pfleger de Aguiar, Leandro.
2021.
Software Vulnerabilities, Products and Exploits: A Statistical Relational Learning Approach. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :41—46.
Data on software vulnerabilities, products and exploits is typically collected from multiple non-structured sources. Valuable information, e.g., on which products are affected by which exploits, is conveyed by matching data from those sources, i.e., through their relations. In this paper, we leverage this simple albeit unexplored observation to introduce a statistical relational learning (SRL) approach for the analysis of vulnerabilities, products and exploits. In particular, we focus on the problem of determining the existence of an exploit for a given product, given information about the relations between products and vulnerabilities, and vulnerabilities and exploits, focusing on Industrial Control Systems (ICS), the National Vulnerability Database and ExploitDB. Using RDN-Boost, we were able to reach an AUC ROC of 0.83 and an AUC PR of 0.69 for the problem at hand. To reach that performance, we indicate that it is instrumental to include textual features, e.g., extracted from the description of vulnerabilities, as well as structured information, e.g., about product categories. In addition, using interpretable relational regression trees we report simple rules that shed insight on factors impacting the weaponization of ICS products.
Bašić, B., Udovičić, P., Orel, O..
2021.
In-database Auditing Subsystem for Security Enhancement. 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO). :1642—1647.
Many information systems have been around for several decades, and most of them have their underlying databases. The data accumulated in those databases over the years could be a very valuable asset, which must be protected. The first role of database auditing is to ensure and confirm that security measures are set correctly. However, tracing user behavior and collecting a rich audit trail enables us to use that trail in a more proactive ways. As an example, audit trail could be analyzed ad hoc and used to prevent intrusion, or analyzed afterwards, to detect user behavior patterns, forecast workloads, etc. In this paper, we present a simple, secure, configurable, role-separated, and effective in-database auditing subsystem, which can be used as a base for access control, intrusion detection, fraud detection and other security-related analyses and procedures. It consists of a management relations, code and data object generators and several administrative tools. This auditing subsystem, implemented in several information systems, is capable of keeping the entire audit trail (data history) of a database, as well as all the executed SQL statements, which enables different security applications, from ad hoc intrusion prevention to complex a posteriori security analyses.
Sánchez, Ricardo Andrés González, Bernal, Davor Julián Moreno, Parada, Hector Dario Jaimes.
2021.
Security assessment of Nosql Mongodb, Redis and Cassandra database managers. 2021 Congreso Internacional de Innovación y Tendencias en Ingeniería (CONIITI). :1—7.
The advancement of technology in the creation of new tools to solve problems such as information storage generates proportionally developing methods that search for security flaws or breaches that compromise said information. The need to periodically generate security reports on database managers is given by the complexity and number of attacks that can be carried out today. This project seeks to carry out an evaluation of the security of NoSQL database managers. The work methodology is developed according to the order of the objectives, it begins by synthesizing the types of vulnerabilities, attacks and protection schemes limited to MongoDB, Redis and Apache Cassandra. Once established, a prototype of a web system that stores information with a non-relational database will be designed on which a series of attacks defined by a test plan will be applied seeking to add, consult, modify or eliminate information. Finally, a report will be presented that sets out the attacks carried out, the way in which they were applied, the results, possible countermeasures, security advantages and disadvantages for each manager and the conclusions obtained. Thus, it is possible to select which tool is more convenient to use for a person or organization in a particular case. The results showed that MongoDB is more vulnerable to NoSQL injection attacks, Redis is more vulnerable to attacks registered in the CVE and that Cassandra is more complex to use but is less vulnerable.
Tang, Xiao, Cao, Zhenfu, Dong, Xiaolei, Shen, Jiachen.
2021.
PKMark: A Robust Zero-distortion Blind Reversible Scheme for Watermarking Relational Databases. 2021 IEEE 15th International Conference on Big Data Science and Engineering (BigDataSE). :72—79.
In this paper, we propose a zero-distortion blind reversible robust scheme for watermarking relational databases called PKMark. Data owner can declare the copyright of the databases or pursue the infringement by extracting the water-mark information embedded in the database. PKMark is mainly based on the primary key attribute of the tuple. So it does not depend on the type of the attribute, and can provide high-precision numerical attributes. PKMark uses RSA encryption on the watermark before embedding the watermark to ensure the security of the watermark information. Then we use RSA to sign the watermark cipher text so that the owner can verify the ownership of the watermark without disclosing the watermark. The watermark embedding and extraction are based on the hash value of the primary key, so the scheme has blindness and reversibility. In other words, the user can obtain the watermark information or restore the original database without comparing it to the original database. Our scheme also has almost excellent robustness against addition attacks, deletion attacks and alteration attacks. In addition, PKMark is resistant to additive attacks, allowing different users to embed multiple watermarks without interfering with each other, and it can indicate the sequence of watermark embedding so as to indicate the original copyright owner of the database. This watermarking scheme also allows data owners to detect whether the data has been tampered with.
Giesser, Patrick, Stechschulte, Gabriel, Costa Vaz, Anna da, Kaufmann, Michael.
2021.
Implementing Efficient and Scalable In-Database Linear Regression in SQL. 2021 IEEE International Conference on Big Data (Big Data). :5125—5132.
Relational database management systems not only support larger-than-memory data processing and very advanced query optimization, but also offer the benefits of data security, privacy, and consistency. When machine learning on large data sets is processed directly on an existing SQL database server, the data does not need to be exported and transferred to a separate big data processing platform. To achieve this, we implement a linear regression algorithm using SQL code generation such that the computation can be performed server-side and directly in the RDBMs. Our method and its implementation, programmed in Python, solves linear regression (LR) using the ordinary least squares (OLS) method directly in the RDBMS using SQL code generation, leaving most of the processing in the database. Only the matrix of the system of equations, whose size is equal to the number of variables squared, is transferred from the SQL server to the Python client to be solved for OLS regression. For evaluation purposes, our LR implementation was tested with artificially generated datasets and compared to an existing Python library (Scikit Learn). We found that our implementation consistently solves OLS regression faster than Scikit Learn for datasets with more than 10,000 input rows, and if the number of columns is less than 64. Moreover, under the same test conditions where the computation is larger than memory, our implementation showed a fast result, while Scikit returned an out-of-memory error. We conclude that SQL is a promising tool for in-database processing of large-volume, low-dimensional data sets with a particular class of machine learning algorithms, namely those that can be efficiently solved with map-reduce queries such as OLS regression.
Pengwei, Ma, Kai, Wei, Chunyu, Jiang, Junyi, Li, Jiafeng, Tian, Siyuan, Liu, Minjing, Zhong.
2021.
Research on Evaluation System of Relational Cloud Database. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1369—1373.
With the continuous emergence of cloud computing technology, cloud infrastructure software will become the mainstream application model in the future. Among the databases, relational databases occupy the largest market share. Therefore, the relational cloud database will be the main product of the combination of database technology and cloud computing technology, and will become an important branch of the database industry. This article explores the establishment of an evaluation system framework for relational databases, helping enterprises to select relational cloud database products according to a clear goal and path. This article can help enterprises complete the landing of relational cloud database projects.
Lagraa, Sofiane, State, Radu.
2021.
What database do you choose for heterogeneous security log events analysis? 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM). :812—817.
The heterogeneous massive logs incoming from multiple sources pose major challenges to professionals responsible for IT security and system administrator. One of the challenges is to develop a scalable heterogeneous logs database for storage and further analysis. In fact, it is difficult to decide which database is suitable for the needs, the best of a use case, execution time and storage performances. In this paper, we explore, study, and compare the performance of SQL and NoSQL databases on large heterogeneous event logs. We implement the relational database using MySQL, the column-oriented database using Impala on the top of Hadoop, and the graph database using Neo4j. We experiment the databases on a large heterogeneous logs and provide advice, the pros and cons of each SQL and NoSQL database. Our findings that Impala outperforms MySQL and Neo4j databases in terms of loading logs, execution time of simple queries, and storage of logs. However, Neo4j outperforms Impala and MySQL in the execution time of complex queries.