Biblio

Filters: Keyword is relational databases
2023-03-17
Al-Kateb, Mohammed, Eltabakh, Mohamed Y., Al-Omari, Awny, Brown, Paul G..  2022.  Analytics at Scale: Evolution at Infrastructure and Algorithmic Levels. 2022 IEEE 38th International Conference on Data Engineering (ICDE). :3217–3220.
Data Analytics is at the core of almost all modern applications ranging from science and finance to healthcare and web applications. The evolution of data analytics over the last decade has been dramatic - new methods, new tools and new platforms - with no slowdown in sight. This rapid evolution has pushed the boundaries of data analytics along several axes, including scalability, especially with the rise of distributed infrastructures and the Big Data era, and interoperability with diverse data management systems such as relational databases, Hadoop and Spark. However, many analytic application developers struggle with the challenge of production deployment. Recent experience suggests that it is difficult to deliver modern data analytics with the level of reliability, security and manageability that has been a feature of traditional SQL DBMSs. In this tutorial, we discuss the advances and innovations introduced at both the infrastructure and algorithmic levels, directed at making analytic workloads scale, while paying close attention to the kind of quality of service guarantees different technologies provide. We start with an overview of the classical centralized analytical techniques, describing the shift towards distributed analytics over non-SQL infrastructures. We contrast such approaches with systems that integrate analytic functionality inside, above or adjacent to SQL engines. We also explore how Cloud platforms' virtualization capabilities make it easier - and cheaper - for end users to apply these new analytic techniques to their data. Finally, we conclude with the lessons learned and a vision for the near future.
ISSN: 2375-026X
Gharpure, Nisha, Rai, Aradhana.  2022.  Vulnerabilities and Threat Management in Relational Database Management Systems. 2022 5th International Conference on Advances in Science and Technology (ICAST). :369–374.
Databases are at the heart of modern applications, and any threat to them can seriously endanger the safety and functionality of applications relying on the services offered by a DBMS. It is therefore pertinent to identify key risks to the secure operation of a database system. This paper identifies the key risks, namely SQL injection, weak audit trails, access management issues and issues with encryption. A malicious actor can exploit any of these issues to compromise the integrity, availability and confidentiality of the data present in database systems. The paper also identifies various ways to defend against these issues and remedy them. It then identifies, from the literature, potential solutions to ameliorate the threat from these vulnerabilities. It proposes the usage of encryption to protect the data from being breached and leveraging encrypted databases such as CryptoDB. Better access control norms are suggested to prevent unauthorized access, modification and deletion of the data. The paper also recommends ways to prevent SQL injection attacks through techniques such as prepared statements.
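The prepared-statement defense named in the abstract above can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module; the `users` table, its columns and the helper names are hypothetical and are not taken from the paper.

```python
import sqlite3

def make_db():
    # In-memory database with a hypothetical users table (illustration only).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_unsafe(conn, name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_safe(conn, name):
    # Parameterized: the driver binds the input as a value, never as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = make_db()
payload = "x' OR '1'='1"
leaked = lookup_unsafe(conn, payload)   # the injected OR clause matches every row
blocked = lookup_safe(conn, payload)    # no user has this literal name
```

With the concatenated query the payload rewrites the WHERE clause, while the parameterized query treats the same payload as an ordinary string value.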
Lv, Xiaonan, Huang, Zongwei, Sun, Liangyu, Wu, Miaomiao, Huang, Li, Li, Yehong.  2022.  Research and design of web-based capital transaction data dynamic multi-mode visual analysis tool. 2022 IEEE 7th International Conference on Smart Cloud (SmartCloud). :165–170.
For the cleaning and visual display of multi-source, heterogeneous, complex data, we propose a dynamic multi-mode visual analysis tool. The tool handles the different types of data that users define in accordance with the data model and is built with visualization technology tools. It is designed with CQRS, so the domain model and data query are completely separated; the external interface uses a RESTful architecture, and the underlying data store adopts HBase, ES and a relational database. Drools is adopted in the data flow engine. According to the internal algorithm, three kinds of graphs can be output, namely a transaction relationship network analysis graph, a capital flow analysis graph and a transaction timing analysis graph, which can reduce the difficulty of analysis and help users to analyze data in a more friendly way.
Raj, Ankit, Somani, Sunil B..  2022.  Predicting Terror Attacks Using Neo4j Sandbox and Machine Learning Algorithms. 2022 6th International Conference On Computing, Communication, Control And Automation (ICCUBEA). :1–6.
Terrorism and radicalization are major economic, political, and social issues faced by the world today. The challenges that governments and citizens face in combating terrorism are growing by the day. Artificial intelligence, including machine learning and deep learning, has shown promising results in predicting terrorist attacks. In this paper, we attempted to build a machine learning model to predict terror activities using a global terrorism database in both relational and graphical forms. Using the Neo4j Sandbox, you can create a graph database from a relational database. We used the node2vec algorithm from Neo4j Sandbox's graph data science library to convert the high-dimensional graph to a low-dimensional vector form. In order to predict terror activities, seven machine learning models were used, and the performance parameters that were calculated were accuracy, precision, recall, and F1 score. According to our findings, the Logistic Regression model was the best performing model, classifying the dataset with an accuracy of 0.90, a recall of 0.94, a precision of 0.93, and an F1 score of 0.93.
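The four performance parameters named in the abstract above follow from confusion-matrix counts. The sketch below shows the standard definitions; the counts are made up for illustration and are not the paper's data, and the function assumes no denominator is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # share of predicted positives that are right
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

# Made-up counts for illustration only (not the paper's results).
acc, prec, rec, f1 = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

As a consistency check on the reported figures, a precision of 0.93 and a recall of 0.94 give an F1 score of about 0.935, which agrees with the reported 0.93.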
ISSN: 2771-1358
Kharitonov, Valerij A., Krivogina, Darya N., Salamatina, Anna S., Guselnikova, Elina D., Spirina, Varvara S., Markvirer, Vladlena D..  2022.  Intelligent Technologies for Projective Thinking and Research Management in the Knowledge Representation System. 2022 International Conference on Quality Management, Transport and Information Security, Information Technologies (IT&QM&IS). :292–295.
It is proposed to address existing methodological issues in the educational process with the development of intellectual technologies and knowledge representation systems to improve the efficiency of higher education institutions. For this purpose, a relational database structure is proposed that will store information about defended dissertations in the form of a set of attributes (heuristics) representing the mandatory qualification attributes of theses. An inference algorithm is proposed to process the information. This algorithm represents an artificial intelligence whose work is aimed at generating queries based on the applicant's preferences. The result of the algorithm's work will be a set of choices, presented in ranked order. These technologies will allow applicants to quickly become familiar with known scientific results and serve as a starting point for new research. The demand for co-researcher practice in solving the problem of updating the projective thinking methodology and managing the scientific research process has been justified. This article pays attention to the existing parallels between the concepts of technical and human sciences in the framework of their convergence. The concepts of being (economic good and economic utility) and the concepts of consciousness (humanitarian economic good and humanitarian economic utility) are used to form projective thinking. They form direct and inverse correspondences of technology and humanitarian practice in the techno-humanitarian mathematical space. It is proposed to place processed information from the language of context-free formal grammar dissertation abstracts in this space. The principle of data manipulation based on formal languages with context-free grammar makes it possible to create new structures of subject areas in terms of applicants' preferences. It is believed that the success of applicants' work depends directly on the cognitive training of applicants, which needs to be practiced psychologically. This practice is based on deepening the objectivity and adequacy of the information obtained through heuristic methods. It requires increased attention and development of intelligence. The paper finds that the use of heuristic methods by applicants to find new research directions leads to several promising results. These results can be perceived as potential options in future research. This contributes to an increase in the level of retention of higher education professionals.
Al-Zahrani, Basmah, Alshehri, Suhair, Cherif, Asma, Imine, Abdessamad.  2022.  Property Graph Access Control Using View-Based and Query-Rewriting Approaches. 2022 IEEE/ACS 19th International Conference on Computer Systems and Applications (AICCSA). :1–2.
Managing and storing big data is non-trivial for traditional relational databases (RDBMS). Therefore, the NoSQL (Not Only SQL) database management system emerged. It is capable of handling the vast amount and the heterogeneity of data. In this research, we are interested in one of its trending types, the graph database, namely, the Directed Property Graph (DPG). This type of database is powerful in dealing with complex relationships (e.g., social networks). However, its sensitive and private data must be protected against unauthorized access. This research proposes a security model that aims at exploiting and combining the benefits of Access Control, View-Based, and Query-Rewriting approaches. This is a novel combination for securing DPG.
ISSN: 2161-5330
Wang, Wenchao, Liu, Chuanyi, Wang, Zhaoguo, Liang, Tiancai.  2022.  FBIPT: A New Robust Reversible Database Watermarking Technique Based on Position Tuples. 2022 4th International Conference on Data Intelligence and Security (ICDIS). :67–74.
Nowadays, data is essential in several fields, such as science, finance, medicine, and transportation, which means its value continues to rise. Relational databases are vulnerable to copyright threats when transmitted and shared as a carrier of data. The watermarking technique is seen as a partial solution to the problem of securing copyright ownership. However, most existing techniques are currently restricted to numerical attributes in relational databases, limiting their versatility. Furthermore, they modify the source data to a large extent, failing to keep the characteristics of the original database, and they are susceptible to strong malicious attacks. This paper proposes a new robust reversible watermarking technique, the Fields Based Inserting Position Tuples algorithm (FBIPT), for relational databases. FBIPT does not modify the original database directly; instead, it inserts some position tuples based on three fields: Group Field, Feature Field, and Control Field. Field information can be calculated from numeric attributes and from any attribute that can be transformed into binary bits. The FBIPT technique retains all the characteristics of the source database, and experimental results prove the effectiveness of FBIPT and show its highly robust performance compared to state-of-the-art watermarking schemes.
Chakraborty, Partha Sarathi, Kumar, Puspesh, Chandrawanshi, Mangesh Shivaji, Tripathy, Somanath.  2022.  BASDB: Blockchain assisted Secure Outsourced Database Search. 2022 IEEE International Conference on Blockchain and Distributed Systems Security (ICBDS). :1–6.
The outsourcing of databases is very popular among IT companies and industries. It acts as a solution for businesses to ensure availability of the data for their users. The solution for outsourcing the database is to encrypt the data in a form where the database service provider can perform relational operations over the encrypted database. At the same time, the associated security risk of data leakage prevents many potential industries from deploying it. In this paper, we present a secure outsourcing database search scheme (BASDB) that uses a smart contract for search operations over the index of the encrypted database and stores the encrypted relational database in the cloud. Our proposed scheme BASDB is a simple and practical solution for effective search on encrypted relations and is well resistant to information leakage against attacks like search and access pattern leakage.
Huamán, Cesar Humberto Ortiz, Fuster, Nilcer Fernandez, Luyo, Ademir Cuadros, Armas-Aguirre, Jimmy.  2022.  Critical Data Security Model: Gap Security Identification and Risk Analysis In Financial Sector. 2022 17th Iberian Conference on Information Systems and Technologies (CISTI). :1–6.
In this paper, we propose a data security model for a big data analytical environment in the financial sector. Big Data can be seen as a trend in the advancement of technology that has opened the door to a new approach to understanding and decision making; the term describes the vast amount of data (structured, unstructured and semi-structured) that is too time consuming and costly to load into a relational database for analysis. The increase in cybercriminal attacks on organizations' assets has led organizations to invest in and care more about their cybersecurity points and controls. The management of business-critical data is an important point for which robust cybersecurity controls should be considered. The proposed model is applied in a data lake and allows the identification of security gaps in an analytical repository, a cybersecurity risk analysis, the design of security components and an assessment of inherent risks on high criticality data in a repository of a regulated financial institution. The proposal was validated in financial entities in Lima, Peru. Proofs of concept of the model were carried out to measure the level of maturity focused on: leadership and commitment, risk management, protection control, event detection and risk management. Preliminary results allowed placing the entities in level 3 of the model, knowing their greatest weaknesses, strengths and how these can affect the fulfillment of business objectives.
ISSN: 2166-0727
2022-07-15
Aggarwal, Pranjal, Kumar, Akash, Michael, Kshitiz, Nemade, Jagrut, Sharma, Shubham, C, Pavan Kumar.  2021.  Random Decision Forest approach for Mitigating SQL Injection Attacks. 2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). :1–5.
Structured Query Language (SQL) is extensively used for storing, manipulating and retrieving information in relational database management systems. Using SQL statements, attackers try to gain unauthorized access to databases and launch attacks to modify or retrieve the stored data; such attacks are called SQL injection attacks. SQL Injection (SQLi) attacks consistently top the list of web application security risks. Identifying and mitigating the potential SQL attack statements before their execution can prevent SQLi attacks. Various techniques are proposed in the literature to mitigate SQLi attacks. In this paper, a random decision forest approach is introduced to mitigate SQLi attacks. From the experimental results, we can infer that the proposed approach achieves a precision of 97% and an accuracy of 95%.
Jony, Mehdi Hassan, Johora, Fatema Tuj, Katha, Jannatul Ferdous.  2021.  A Robust and Efficient Numeric Approach for Relational Database Watermarking. 2021 3rd International Conference on Sustainable Technologies for Industry 4.0 (STI). :1–6.
Sharing relational databases on the Internet creates the need to protect these databases. Unauthorized access to information can cause substantial losses to data storage systems, and the information can lose its novelty. Research associations mine freely available relational databases for new information about research works. It is a great challenge to maintain authenticity because these databases are vulnerable to security issues. Watermarking is a candidate solution that fully protects databases shared with the receiver. Protection of relational database ownership must continue to evolve, because the watermarking mechanisms shared with the recipient attract attacks, while recipients still need database knowledge to support their decision-making systems. The relational database approach based on virtual private key watermarking using numeric attributes embeds the same watermark in the same attributes at different places. Therefore, data attackers cannot remove watermarks from the data. The proposed strategy inserts watermark bits in such a way that it causes minimal distortion in the data, and the data must remain usable after it is watermarked. The existence of a primary key is a compulsory feature for most strategies. Our method provides a solution that needs no primary key, and the database's integrated search system remains intact after watermarking distortion.
Bašić, B., Udovičić, P., Orel, O..  2021.  In-database Auditing Subsystem for Security Enhancement. 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO). :1642–1647.
Many information systems have been around for several decades, and most of them have their underlying databases. The data accumulated in those databases over the years could be a very valuable asset, which must be protected. The first role of database auditing is to ensure and confirm that security measures are set correctly. However, tracing user behavior and collecting a rich audit trail enables us to use that trail in more proactive ways. As an example, the audit trail could be analyzed ad hoc and used to prevent intrusion, or analyzed afterwards to detect user behavior patterns, forecast workloads, etc. In this paper, we present a simple, secure, configurable, role-separated, and effective in-database auditing subsystem, which can be used as a base for access control, intrusion detection, fraud detection and other security-related analyses and procedures. It consists of management relations, code and data object generators and several administrative tools. This auditing subsystem, implemented in several information systems, is capable of keeping the entire audit trail (data history) of a database, as well as all the executed SQL statements, which enables different security applications, from ad hoc intrusion prevention to complex a posteriori security analyses.
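The idea of keeping a data history inside the database itself can be sketched with triggers. The example below is a minimal illustration in SQLite via Python's standard sqlite3 module; the `account` and `account_audit` relations and the trigger names are hypothetical, and the sketch does not reproduce the paper's subsystem (it records row changes only, not the executed SQL statements).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
-- Hypothetical audit relation: one row per change to account.
CREATE TABLE account_audit (
    audit_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    changed_at  TEXT DEFAULT CURRENT_TIMESTAMP,
    operation   TEXT,
    account_id  INTEGER,
    old_balance INTEGER,
    new_balance INTEGER
);
CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_audit (operation, account_id, old_balance, new_balance)
    VALUES ('INSERT', NEW.id, NULL, NEW.balance);
END;
CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_audit (operation, account_id, old_balance, new_balance)
    VALUES ('UPDATE', OLD.id, OLD.balance, NEW.balance);
END;
CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_audit (operation, account_id, old_balance, new_balance)
    VALUES ('DELETE', OLD.id, OLD.balance, NULL);
END;
""")

# Ordinary application statements; the triggers fill the trail as a side effect.
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 70 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

trail = conn.execute(
    "SELECT operation, account_id, old_balance, new_balance "
    "FROM account_audit ORDER BY audit_id"
).fetchall()
```

Because the trail is written by the database rather than by the application, it survives even when the audited row itself is later deleted.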
Tang, Xiao, Cao, Zhenfu, Dong, Xiaolei, Shen, Jiachen.  2021.  PKMark: A Robust Zero-distortion Blind Reversible Scheme for Watermarking Relational Databases. 2021 IEEE 15th International Conference on Big Data Science and Engineering (BigDataSE). :72–79.
In this paper, we propose a zero-distortion blind reversible robust scheme for watermarking relational databases called PKMark. The data owner can declare the copyright of the databases or pursue infringement by extracting the watermark information embedded in the database. PKMark is mainly based on the primary key attribute of the tuple, so it does not depend on the type of the attribute and can provide high-precision numerical attributes. PKMark applies RSA encryption to the watermark before embedding it to ensure the security of the watermark information. Then we use RSA to sign the watermark ciphertext so that the owner can verify the ownership of the watermark without disclosing the watermark. The watermark embedding and extraction are based on the hash value of the primary key, so the scheme has blindness and reversibility. In other words, the user can obtain the watermark information or restore the original database without comparing it to the original database. Our scheme is also highly robust against addition attacks, deletion attacks and alteration attacks. In addition, PKMark is resistant to additive attacks, allowing different users to embed multiple watermarks without interfering with each other, and it can indicate the sequence of watermark embedding so as to indicate the original copyright owner of the database. This watermarking scheme also allows data owners to detect whether the data has been tampered with.
Giesser, Patrick, Stechschulte, Gabriel, Costa Vaz, Anna da, Kaufmann, Michael.  2021.  Implementing Efficient and Scalable In-Database Linear Regression in SQL. 2021 IEEE International Conference on Big Data (Big Data). :5125–5132.
Relational database management systems not only support larger-than-memory data processing and very advanced query optimization, but also offer the benefits of data security, privacy, and consistency. When machine learning on large data sets is processed directly on an existing SQL database server, the data does not need to be exported and transferred to a separate big data processing platform. To achieve this, we implement a linear regression algorithm using SQL code generation such that the computation can be performed server-side and directly in the RDBMS. Our method and its implementation, programmed in Python, solve linear regression (LR) using the ordinary least squares (OLS) method directly in the RDBMS using SQL code generation, leaving most of the processing in the database. Only the matrix of the system of equations, whose size is equal to the number of variables squared, is transferred from the SQL server to the Python client to be solved for OLS regression. For evaluation purposes, our LR implementation was tested with artificially generated datasets and compared to an existing Python library (Scikit Learn). We found that our implementation consistently solves OLS regression faster than Scikit Learn for datasets with more than 10,000 input rows, provided the number of columns is less than 64. Moreover, under the same test conditions where the computation is larger than memory, our implementation showed a fast result, while Scikit returned an out-of-memory error. We conclude that SQL is a promising tool for in-database processing of large-volume, low-dimensional data sets with a particular class of machine learning algorithms, namely those that can be efficiently solved with map-reduce queries such as OLS regression.
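The division of labor described above (aggregate in the database, solve a small system on the client) can be sketched for the one-variable case. This is a minimal illustration using SQLite through Python's standard sqlite3 module, not the authors' code generator; the `samples` table and its synthetic contents are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (x REAL, y REAL)")
# Synthetic data lying exactly on y = 1 + 2x (illustration only).
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(float(x), 1.0 + 2.0 * x) for x in range(10)],
)

# One aggregate query computes every entry of X^T X and X^T y in the database.
n, sx, sxx, sy, sxy = conn.execute(
    "SELECT COUNT(*), SUM(x), SUM(x * x), SUM(y), SUM(x * y) FROM samples"
).fetchone()

# Only these five numbers reach the client. Solve the 2x2 normal equations
#   [n  sx ] [b0]   [sy ]
#   [sx sxx] [b1] = [sxy]
# with Cramer's rule.
det = n * sxx - sx * sx
intercept = (sy * sxx - sx * sxy) / det
slope = (n * sxy - sx * sy) / det
```

The amount of data transferred depends only on the number of variables, not on the number of rows, which is why the approach suits large-volume, low-dimensional data sets.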
Pengwei, Ma, Kai, Wei, Chunyu, Jiang, Junyi, Li, Jiafeng, Tian, Siyuan, Liu, Minjing, Zhong.  2021.  Research on Evaluation System of Relational Cloud Database. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1369–1373.
With the continuous emergence of cloud computing technology, cloud infrastructure software will become the mainstream application model in the future. Among databases, relational databases occupy the largest market share. Therefore, the relational cloud database will be the main product of the combination of database technology and cloud computing technology, and will become an important branch of the database industry. This article explores the establishment of an evaluation system framework for relational databases, helping enterprises to select relational cloud database products according to a clear goal and path. This article can help enterprises complete the deployment of relational cloud database projects.
Lagraa, Sofiane, State, Radu.  2021.  What database do you choose for heterogeneous security log events analysis? 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM). :812–817.
The massive heterogeneous logs arriving from multiple sources pose major challenges to professionals responsible for IT security and system administration. One of the challenges is to develop a scalable heterogeneous log database for storage and further analysis. In fact, it is difficult to decide which database best suits the needs of a use case in terms of execution time and storage performance. In this paper, we explore, study, and compare the performance of SQL and NoSQL databases on large heterogeneous event logs. We implement the relational database using MySQL, the column-oriented database using Impala on top of Hadoop, and the graph database using Neo4j. We run experiments on large heterogeneous logs and provide advice on the pros and cons of each SQL and NoSQL database. Our findings show that Impala outperforms MySQL and Neo4j in terms of loading logs, execution time of simple queries, and storage of logs. However, Neo4j outperforms Impala and MySQL in the execution time of complex queries.
2021-08-31
Zhang, Zehao, Yu, Zhen, Weng, Wei, Guan, Cheng.  2020.  Study on the Digitalization Method of Intelligent Emergency Plan of Power System. 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE). :179–182.
This paper puts forward a formalized method of emergency plan based on ontology, sums up the main concepts such as system, event, rule, measure, constraint and resource, and analyzes the logical relationship among concepts. A digital intelligent emergency plan storage scheme based on relational database model is proposed. In this paper, full-text search, data search and knowledge search are comprehensively used to adapt to the information needs and characteristics of different users' query plans. Finally, an example of emergency plan made by a power supply company is given to illustrate the effectiveness of the method.
Bartol, Janez, Souvent, Andrej, Suljanović, Nermin, Zajc, Matej.  2020.  Secure data exchange between IoT endpoints for energy balancing using distributed ledger. 2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe). :56–60.
This paper investigates a secure data exchange between many small distributed consumers/prosumers and the aggregator in the process of energy balancing. It addresses the challenges of ensuring data exchange in a simple, scalable, and affordable way. The communication platform for data exchange uses Ethereum Blockchain technology. It provides a distributed ledger database across a distributed network, supports simple connectivity for new stakeholders, and enables many small entities to contribute their flexible energy to the system balancing. The architecture of a simulation/emulation environment provides a direct connection of a relational database to the Ethereum network, thus enabling dynamic data management. In addition, it extends the security of the environment with the security mechanisms of relational databases. A proof-of-concept setup with the simulation of system balancing processes confirms the suitability of the solution for secure data exchange in the market, operation, and measurement area. For the most intensive and space-consuming measurement data exchange, we have investigated data aggregation to ensure performance optimisation of required computation and space usage.
Churi, Akshata A., Shinde, Vinayak D..  2020.  Alphanumeric Database Security through Digital Watermarking. 2020 International Conference on Convergence to Digital World - Quo Vadis (ICCDW). :1–4.
As the demand for online data availability increases for data sharing and business analytics, the security of the available data becomes an important issue. Data needs to be protected from unauthorized access, and it must also carry proof that it was received from a trusted owner. Digital watermarking has long been used to establish the owner's identity for multimedia data. This paper proposes a technique that supports watermarking of databases, as most of the data available today is in database format. The characters to be entered as the watermark are converted into binary values; these binary values are hidden in the database using the space character. Each bit is hidden in a randomly chosen tuple. An ant colony optimization algorithm is proposed to select the tuples where watermark bits are inserted. The proposed system is more secure due to the use of ant colony optimization, and it is resilient because even if some bits are modified the hidden text remains almost the same.
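The bit-hiding step described above can be sketched as follows. This is a loose illustration, not the paper's algorithm: tuple selection uses a keyed pseudo-random sample as a plainly labeled stand-in for ant colony optimization, a trailing space is assumed to encode a 1 bit, and the row values, key and function names are hypothetical. The sketch also assumes that no original value ends with a space.

```python
import random

def text_to_bits(text):
    # 8 bits per character, most significant bit first (ASCII assumed).
    return [int(b) for ch in text for b in format(ord(ch), "08b")]

def bits_to_text(bits):
    chars = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return "".join(chr(int("".join(map(str, c)), 2)) for c in chars)

def select_positions(n_rows, n_bits, key):
    # Stand-in for the paper's ant colony optimization: a keyed
    # pseudo-random choice of distinct row indices.
    return random.Random(key).sample(range(n_rows), n_bits)

def embed(rows, watermark, key):
    bits = text_to_bits(watermark)
    marked = list(rows)
    for pos, bit in zip(select_positions(len(rows), len(bits), key), bits):
        # A trailing space encodes 1; no trailing space encodes 0.
        marked[pos] = rows[pos] + " " if bit else rows[pos]
    return marked

def extract(marked, n_chars, key):
    positions = select_positions(len(marked), n_chars * 8, key)
    return bits_to_text([1 if marked[p].endswith(" ") else 0 for p in positions])

rows = [f"customer-{i}" for i in range(100)]  # hypothetical text attribute
marked = embed(rows, "OK", key=42)
recovered = extract(marked, n_chars=2, key=42)
```

Only a holder of the key can locate the marked tuples, and stripping the trailing spaces restores the original values.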
Siledar, Seema, Tamane, Sharvari.  2020.  A distortion-free watermarking approach for verifying integrity of relational databases. 2020 International Conference on Smart Innovations in Design, Environment, Management, Planning and Computing (ICSIDEMPC). :192–195.
Due to the high availability and easy accessibility of information, it has become quite difficult to assure the security of data. Even though watermarking seems to be an effective solution to protect data, it is still challenging to use with relational databases. Moreover, inserting a watermark in a database may lead to distortion. As a result, the contents of the database can no longer remain useful. Our proposed distortion-free watermarking approach ensures that the integrity of the database can be preserved by generating an image watermark from its contents. This image is registered with a Certification Authority (CA) before the database is distributed for use. If the owner suspects any kind of tampering in the database, an image watermark is generated and compared with the registered image watermark. If the two do not match, it can be concluded that the integrity of the database has been compromised. Experiments are conducted on the Forest Cover Type data set to localize tampering to the finest granularity. Results show that our approach can detect all types of attack with 100% accuracy.
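The register-then-compare workflow above can be sketched without touching the data. The example below swaps the paper's image watermark for per-tuple SHA-256 digests, which is a different and much simpler technique chosen only to show the workflow; the rows, the function names and the assumption that the first attribute is a unique identifier are all hypothetical.

```python
import hashlib

def row_digest(row):
    # Hash a canonical text form of one tuple; the data itself is never modified.
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    return hashlib.sha256("\x1f".join(map(str, row)).encode("utf-8")).hexdigest()

def make_watermark(table):
    # Stand-in for the paper's image watermark: one digest per tuple,
    # keyed by the tuple's first attribute (assumed to be a unique id).
    return {row[0]: row_digest(row) for row in table}

def find_tampering(table, registered):
    current = make_watermark(table)
    # Ids whose digest changed, that appeared, or that disappeared.
    return sorted(
        key for key in current.keys() | registered.keys()
        if current.get(key) != registered.get(key)
    )

original = [(1, "spruce", 2596), (2, "pine", 2590), (3, "aspen", 2804)]
registered = make_watermark(original)          # what the owner would register

tampered = [(1, "spruce", 2596), (2, "pine", 9999), (4, "fir", 2700)]
changed = find_tampering(tampered, registered)
```

Here the comparison flags the modified tuple, the deleted tuple and the inserted tuple by identifier, which loosely mirrors the goal of localizing tampering.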
2021-04-08
Yaseen, Q., Panda, B..  2012.  Tackling Insider Threat in Cloud Relational Databases. 2012 IEEE Fifth International Conference on Utility and Cloud Computing. :215–218.
Cloud security is one of the major issues that worry individuals and organizations about cloud computing. Therefore, defending cloud systems against attacks such as insiders' attacks has become a key demand. This paper investigates insider threat in cloud relational database systems (cloud RDMS). It discusses some vulnerabilities in cloud computing structures that may enable insiders to launch attacks, and shows how load balancing across multiple availability zones may facilitate insider threat. To prevent such a threat, the paper suggests three models, which are Peer-to-Peer model, Centralized model and Mobile-Knowledgebase model, and addresses the conditions under which they work well.
2021-02-22
Bhagat, V., J, B. R..  2020.  Natural Language Processing on Diverse Data Layers Through Microservice Architecture. 2020 IEEE International Conference for Innovation in Technology (INOCON). :1–6.
With the rapid growth in Natural Language Processing (NLP), all types of industries find a need for analyzing a massive amount of data. Sentiment analysis is becoming a more exciting area for businessmen and researchers in text mining and NLP. This process includes the calculation of various sentiments with the help of text mining. In addition, the world is connected through Information Technology, and businesses are moving toward the next step of development to make their systems more intelligent. Microservices have fulfilled the need for development platforms which help developers to use various development tools (languages and applications) efficiently. With the consideration of data analysis for business growth, data security becomes a major concern for developers. This paper gives a solution to keep the data secured by providing the required access to data scientists without disturbing the base system software. This paper discusses data storage and exchange policies of microservices through a common JavaScript Object Notation (JSON) response, which performs the sentiment analysis of customers' data fetched from various microservices through secured APIs.
2020-12-28
Yu, Y., Li, H., Fu, Y., Wu, X..  2020.  A Dynamic Updating Method for Release of Privacy Protected Data Based on Privacy Differences in Relational Data. 2020 International Conference on Computer Information and Big Data Applications (CIBDA). :23–27.

To improve dynamic updating of privacy protected data release caused by multidimensional sensitivity attribute privacy differences in relational data, we propose a dynamic updating method for privacy protection data release based on the multidimensional privacy differences. By adopting the multi-sensitive bucketization technology (MSB), this method performs quantitative classification of the multidimensional sensitive privacy difference and the recorded value, provides the basic updating operation unit, and thereby realizes dynamic updating of privacy protection data release based on the privacy difference among relational data. The experiment confirms that the method can secure the data updating efficiency while ensuring the quality of data release.

2020-10-12
Rudd-Orthner, Richard N M, Mihaylova, Lyudmilla.  2019.  An Algebraic Expert System with Neural Network Concepts for Cyber, Big Data and Data Migration. 2019 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). :1–6.

This paper describes a machine assistance approach to grading decisions for values that might be missing or need validation, using a mathematical algebraic form of an Expert System, instead of the traditional textual or logic forms and builds a neural network computational graph structure. This Experts System approach is also structured into a neural network like format of: input, hidden and output layers that provide a structured approach to the knowledge-base organization, this provides a useful abstraction for reuse for data migration applications in big data, Cyber and relational databases. The approach is further enhanced with a Bayesian probability tree approach to grade the confidences of value probabilities, instead of the traditional grading of the rule probabilities, and estimates the most probable value in light of all evidence presented. This is ground work for a Machine Learning (ML) experts system approach in a form that is closer to a Neural Network node structure.

2020-05-22
Devarakonda, Ranjeet, Giansiracusa, Michael, Kumar, Jitendra.  2018.  Machine Learning and Social Media to Mine and Disseminate Big Scientific Data. 2018 IEEE International Conference on Big Data (Big Data). :5312–5315.

One of the challenges in supplying the communities with wider access to scientific databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it might not be practical to make the public aware of the structure of databases. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide a more intuitive method for generating database queries and delivering responses. Through social media, which makes it possible to interact with a wide section of the population, and with the help of natural language processing, researchers at the Atmospheric Radiation Measurement (ARM) Data Center at Oak Ridge National Laboratory (ORNL) have developed a concept to enable easy search and retrieval of data from several environmental data centers for the scientific community through social media. Using a machine learning framework that maps natural language text to thousands of datasets, instruments, variables, and data streams, the prototype system would allow users to request data through Twitter and receive a link (via tweet) to applicable data results on the project's search catalog tailored to their key words. This automated identification of relevant data from various petascale archives at ORNL could increase convenience, access, and use of the project's data by the broader community. In this paper we discuss how some data-intensive projects at ORNL are using innovative ways to help in data discovery.