Visible to the public Biblio

Filters: Keyword is NoSQL  [Clear All Filters]
2021-10-12
Kashliev, Andrii.  2020.  Storage and Querying of Large Provenance Graphs Using NoSQL DSE. 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :260–262.
Provenance metadata captures history of derivation of an entity, such as a dataset obtained through numerous data transformations. It is of great importance for science, among other fields, as it enables reproducibility and greater intelligibility of research results. With the avalanche of provenance produced by today's society, there is a pressing need for storing and low-latency querying of large provenance graphs. To address this need, in this paper we present a scalable approach to storing and querying provenance graphs using a popular NoSQL column family database system called DataStax Enterprise (DSE). Specifically, we i) propose a storage scheme, including two novel indices that enable efficient traversal of provenance graphs along causality lines, ii) present an algorithm for building our proposed indices for a given provenance graph, iii) implement our algorithm and conduct a performance study in which we store and query a provenance graph with over five million vertices using a DSE cluster running in AWS cloud. Our performance study results further validate scalability and performance efficiency of our approach.
2018-06-11
Rafique, Ansar, Van Landuyt, Dimitri, Reniers, Vincent, Joosen, Wouter.  2017.  Towards Scalable and Dynamic Data Encryption for Multi-tenant SaaS. Proceedings of the Symposium on Applied Computing. :411–416.
Application-level data management middleware solutions are becoming increasingly compelling to deal with the complexity of a multi-cloud or federated cloud storage and multitenant storage architecture. However, these systems typically support traditional data mapping strategies that are created under the assumption of a fixed and rigorous database schema, and mapping data objects while supporting varying data confidentiality requirements therefore leads to fragmentation of data over distributed storage nodes. This introduces performance over-head at the level of individual database transactions and negatively affects the overall scalability. This paper discusses these challenges and highlights the potential of leveraging the data schema flexibility of NoSQL databases to accomplish dynamic and fine-grained data encryption in a more efficient and scalable manner. We illustrate these ideas in the context of an industrial multi-tenant SaaS application.
2018-03-05
Hauger, W. K., Olivier, M. S..  2017.  Forensic Attribution in NoSQL Databases. 2017 Information Security for South Africa (ISSA). :74–82.

NoSQL databases have gained a lot of popularity over the last few years. They are now used in many new system implementations that work with vast amounts of data. This data will typically also include sensitive information that needs to be secured. NoSQL databases are also underlying a number of cloud implementations which are increasingly being used to store sensitive information by various organisations. This has made NoSQL databases a new target for hackers and other state sponsored actors. Forensic examinations of compromised systems will need to be conducted to determine what exactly transpired and who was responsible. This paper examines specifically if NoSQL databases have security features that leave relevant traces so that accurate forensic attribution can be conducted. The seeming lack of default security measures such as access control and logging has prompted this examination. A survey into the top ranked NoSQL databases was conducted to establish what authentication and authorisation features are available. Additionally the provided logging mechanisms were also examined since access control without any auditing would not aid forensic attribution tremendously. Some of the surveyed NoSQL databases do not provide adequate access control mechanisms and logging features that leave relevant traces to allow forensic attribution to be done using those. The other surveyed NoSQL databases did provide adequate mechanisms and logging traces for forensic attribution, but they are not enabled or configured by default. This means that in many cases they might not be available, leading to insufficient information to perform accurate forensic attribution even on those databases.

2017-12-28
Shih, M. H., Chang, J. M..  2017.  Design and analysis of high performance crypt-NoSQL. 2017 IEEE Conference on Dependable and Secure Computing. :52–59.

NoSQL databases have become popular with enterprises due to their scalable and flexible storage management of big data. Nevertheless, their popularity also brings up security concerns. Most NoSQL databases lacked secure data encryption, relying on developers to implement cryptographic methods at application level or middleware layer as a wrapper around the database. While this approach protects the integrity of data, it increases the difficulty of executing queries. We were motivated to design a system that not only provides NoSQL databases with the necessary data security, but also supports the execution of query over encrypted data. Furthermore, how to exploit the distributed fashion of NoSQL databases to deliver high performance and scalability with massive client accesses is another important challenge. In this research, we introduce Crypt-NoSQL, the first prototype to support execution of query over encrypted data on NoSQL databases with high performance. Three different models of Crypt-NoSQL were proposed and performance was evaluated with Yahoo! Cloud Service Benchmark (YCSB) considering an enormous number of clients. Our experimental results show that Crypt-NoSQL can process queries over encrypted data with high performance and scalability. A guidance of establishing service level agreement (SLA) for Crypt-NoSQL as a cloud service is also proposed.

2017-08-02
Nguyen, Trong-Dat, Lee, Sang-Won.  2016.  I/O Characteristics of MongoDB and Trim-based Optimization in Flash SSDs. Proceedings of the Sixth International Conference on Emerging Databases: Technologies, Applications, and Theory. :139–144.

NoSQL solutions become emerging for large scaled, high performance, schema-flexible applications. WiredTiger is cost effective, non-locking, no-overwrite storage used as default storage engine in MongoDB. Understanding I/O characteristics of storage engine is important not only for choosing suitable solution with an application but also opening opportunities for researchers optimizing current working system, especially building more flash-awareness NoSQL DBMS. This paper explores background of MongoDB internals then analyze I/O characteristics of WiredTiger storage engine in detail. We also exploit space management mechanism in WiredTiger by using TRIM command.

2017-05-30
Ruohonen, Jukka, Leppänen, Ville.  2016.  On the Design of a Simple Network Resolver for DNS Mining. Proceedings of the 17th International Conference on Computer Systems and Technologies 2016. :105–112.

The domain name system (DNS) offers an ideal distributed database for big data mining related to different cyber security questions. Besides infrastructural problems, scalability issues, and security challenges related to the protocol itself, information from DNS is often required also for more nuanced cyber security questions. Against this backdrop, this paper discusses the fundamental characteristics of DNS in relation to cyber security and different research prototypes designed for passive but continuous DNS-based monitoring of domains and addresses. With this discussion, the paper also illustrates a few general software design aspects.

2015-05-06
Xingbang Tian, Baohua Huang, Min Wu.  2014.  A transparent middleware for encrypting data in MongoDB. Electronics, Computer and Applications, 2014 IEEE Workshop on. :906-909.

Due to the development of cloud computing and NoSQL database, more and more sensitive information are stored in NoSQL databases, which exposes quite a lot security vulnerabilities. This paper discusses security features of MongoDB database and proposes a transparent middleware implementation. The analysis of experiment results show that this transparent middleware can efficiently encrypt sensitive data specified by users on a dataset level. Existing application systems do not need too many modifications in order to apply this middleware.

2015-05-05
Lomotey, R.K., Deters, R..  2014.  Terms Mining in Document-Based NoSQL: Response to Unstructured Data. Big Data (BigData Congress), 2014 IEEE International Congress on. :661-668.

Unstructured data mining has become topical recently due to the availability of high-dimensional and voluminous digital content (known as "Big Data") across the enterprise spectrum. The Relational Database Management Systems (RDBMS) have been employed over the past decades for content storage and management, but, the ever-growing heterogeneity in today's data calls for a new storage approach. Thus, the NoSQL database has emerged as the preferred storage facility nowadays since the facility supports unstructured data storage. This creates the need to explore efficient data mining techniques from such NoSQL systems since the available tools and frameworks which are designed for RDBMS are often not directly applicable. In this paper, we focused on topics and terms mining, based on clustering, in document-based NoSQL. This is achieved by adapting the architectural design of an analytics-as-a-service framework and the proposal of the Viterbi algorithm to enhance the accuracy of the terms classification in the system. The results from the pilot testing of our work show higher accuracy in comparison to some previously proposed techniques such as the parallel search.

2015-05-01
Zahid, A., Masood, R., Shibli, M.A..  2014.  Security of sharded NoSQL databases: A comparative analysis. Information Assurance and Cyber Security (CIACS), 2014 Conference on. :1-8.

NoSQL databases are easy to scale-out because of their flexible schema and support for BASE (Basically Available, Soft State and Eventually Consistent) properties. The process of scaling-out in most of these databases is supported by sharding which is considered as the key feature in providing faster reads and writes to the database. However, securing the data sharded over various servers is a challenging problem because of the data being distributedly processed and transmitted over the unsecured network. Though, extensive research has been performed on NoSQL sharding mechanisms but no specific criterion has been defined to analyze the security of sharded architecture. This paper proposes an assessment criterion comprising various security features for the analysis of sharded NoSQL databases. It presents a detailed view of the security features offered by NoSQL databases and analyzes them with respect to proposed assessment criteria. The presented analysis helps various organizations in the selection of appropriate and reliable database in accordance with their preferences and security requirements.