Biblio

Filters: Keyword is Hadoop distributed file system
2020-12-28
Marichamy, V. S., Natarajan, V.  2020.  A Study of Big Data Security on a Partitional Clustering Algorithm with Perturbation Technique. 2020 International Conference on Smart Electronics and Communication (ICOSEC). :482–486.

The main idea of the proposed work is to secure big data on the Hadoop Distributed File System by applying a Partitional Clustering Algorithm (PCA) together with a perturbation technique. Numerous clustering methods are available for categorizing information in big data; PCA discovers clusters based on an initial partition of the data. In this approach, privacy safeguards can be applied to the data without impeding the required computation and communication. Performance was analyzed on a healthcare database using parameters such as precision, accuracy, and F-score. The results demonstrate that the method reduces the complexity of preserving privacy and achieves better accuracy than existing techniques.
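As a rough illustration of the general idea (not the authors' implementation), the sketch below adds random noise to numeric attributes before running a partitional clustering algorithm (k-means via scikit-learn), so the clustering step never sees the raw values. The synthetic healthcare-style data, noise scale, and cluster count are all illustrative assumptions.

```python
# Minimal sketch: additive perturbation followed by partitional clustering.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(seed=42)

# Hypothetical records: rows = patients, columns = numeric attributes
# (e.g., age, systolic pressure, heart rate) -- purely synthetic.
records = rng.normal(loc=[60, 120, 80], scale=[15, 20, 10], size=(1000, 3))

# Additive perturbation: zero-mean Gaussian noise masks the exact values.
noise = rng.normal(loc=0.0, scale=2.0, size=records.shape)
perturbed = records + noise

# Partitional clustering on the perturbed data only.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(perturbed)

print("cluster sizes:", np.bincount(labels))
```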

2020-08-28
Yau, Yiu Chung, Khethavath, Praveen, Figueroa, Jose A.  2019.  Secure Pattern-Based Data Sensitivity Framework for Big Data in Healthcare. 2019 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering (BCD). :65–70.
With the exponential growth in the use of electronic medical records (EMR), the amount of data generated by the healthcare industry has likewise grown exponentially. These large volumes of data, known as “Big Data”, are mostly unstructured. Special big data analytics methods are required to process this data and retrieve meaningful information. As patient information in hospitals and other healthcare facilities becomes increasingly electronic, Big Data technologies are needed now more than ever to manage and understand this data. In addition, this information tends to be quite sensitive and needs a highly secure environment. However, current security algorithms are hard to implement because they would require a huge amount of time and resources. Security protocols for Big Data are also inadequate for protecting sensitive healthcare information. As a result, healthcare data is both heterogeneous and insecure. As a solution, we propose the Secure Pattern-Based Data Sensitivity Framework (PBDSF), which uses machine learning mechanisms to identify common sets of patient-data attributes, data frequencies, and the various patterns of codes used to identify specific conditions, in order to secure sensitive information. The framework uses Hadoop and is built on the Hadoop Distributed File System (HDFS) as the basis for clusters of machines that process Big Data and perform tasks such as identifying sensitive information in huge amounts of data and encrypting the data identified as sensitive.
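A minimal sketch of a pattern-based sensitivity check in the spirit of the description above, but not the paper's actual PBDSF rules: fields whose values match a “sensitive” regular expression are encrypted before the record is written out (for example, to HDFS), while the rest pass through unchanged. The patterns, field names, and the use of Fernet encryption are illustrative assumptions.

```python
# Sketch: flag sensitive fields by pattern and encrypt only those values.
import re
from cryptography.fernet import Fernet

# Illustrative patterns; a real framework would learn/curate these rules.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),             # SSN-like identifier
    "icd10": re.compile(r"^[A-TV-Z]\d{2}(\.\d{1,4})?$"),   # ICD-10-style code
}

key = Fernet.generate_key()        # key management is out of scope here
cipher = Fernet(key)

def protect(record: dict) -> dict:
    """Encrypt any field whose value matches a sensitive pattern."""
    out = {}
    for field, value in record.items():
        if any(p.match(str(value)) for p in SENSITIVE_PATTERNS.values()):
            out[field] = cipher.encrypt(str(value).encode()).decode()
        else:
            out[field] = value
    return out

print(protect({"name": "patient-001", "ssn": "123-45-6789", "diagnosis": "E11.9"}))
```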
2020-08-24
Al-Odat, Zeyad A., Khan, Samee U.  2019.  Anonymous Privacy-Preserving Scheme for Big Data Over the Cloud. 2019 IEEE International Conference on Big Data (Big Data). :5711–5717.
This paper introduces an anonymous privacy-preserving scheme for big data over the cloud. The proposed design improves the encryption/decryption time of big data by utilizing the MapReduce framework. The Hadoop Distributed File System and the Secure Hash Algorithm are employed to meet the anonymity, security, and efficiency requirements of the proposed scheme. The experimental results show a significant improvement in the computational time of data encryption and decryption.
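A hedged sketch of how such a scheme might look as a Hadoop Streaming mapper, assuming a simple "record_id,payload" line format (not the authors' implementation): the record identifier is replaced by its SHA-256 digest for anonymity and the payload is encrypted, with MapReduce distributing the work over HDFS splits. Key management is out of scope; a fresh key is generated here purely for illustration.

```python
#!/usr/bin/env python3
# Sketch of a streaming mapper: anonymize the id, encrypt the payload.
import sys
import hashlib
from cryptography.fernet import Fernet

# Illustrative only: a real deployment would fetch a shared key from a key
# service so that reducers / downstream jobs can decrypt consistently.
CIPHER = Fernet(Fernet.generate_key())

def map_line(line: str) -> str:
    record_id, payload = line.rstrip("\n").split(",", 1)
    anon_id = hashlib.sha256(record_id.encode()).hexdigest()   # anonymity
    return f"{anon_id}\t{CIPHER.encrypt(payload.encode()).decode()}"

if __name__ == "__main__":
    for line in sys.stdin:
        if line.strip():
            print(map_line(line))
```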
2020-05-11
Ouiazzane, Said, Addou, Malika, Barramou, Fatimazahra.  2019.  A Multi-Agent Model for Network Intrusion Detection. 2019 1st International Conference on Smart Systems and Data Science (ICSSD). :1–5.
The objective of this paper is to propose a distributed intrusion detection model based on a multi-agent system. Multi-Agent Systems (MAS) are well suited to intrusion detection because they meet the characteristics required by network and Big Data environments. The MAS agents cooperate and communicate with each other to detect network intrusions effectively without the intervention of an expert, as is required in classical intrusion detection systems that rely on signature matching to detect known attacks. The proposed model detects known and unknown attacks within large computing infrastructures by meeting the network requirements for distribution, autonomy, responsiveness, and communication, and it achieves effective real-time intrusion detection using the multi-agent paradigm and the Hadoop Distributed File System (HDFS).
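A self-contained sketch of the multi-agent idea (not the authors' model): each detection agent autonomously inspects its own partition of traffic records, say one HDFS block of flow logs, and reports suspected intrusions to a coordinator agent that aggregates the alerts. The record fields and the connection-count threshold are illustrative assumptions.

```python
# Sketch: autonomous detection agents plus a coordinating aggregator.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Alert:
    agent_id: str
    src_ip: str
    reason: str

@dataclass
class DetectionAgent:
    agent_id: str
    conn_threshold: int = 100  # connections per source before alerting

    def analyze(self, records: List[Dict]) -> List[Alert]:
        counts: Dict[str, int] = {}
        for r in records:
            counts[r["src_ip"]] = counts.get(r["src_ip"], 0) + 1
        return [Alert(self.agent_id, ip, f"{n} connections in window")
                for ip, n in counts.items() if n > self.conn_threshold]

@dataclass
class CoordinatorAgent:
    alerts: List[Alert] = field(default_factory=list)

    def receive(self, alerts: List[Alert]) -> None:
        self.alerts.extend(alerts)

# Each agent works on its own partition of the traffic (e.g., one HDFS block).
partitions = [
    [{"src_ip": "10.0.0.5"}] * 150 + [{"src_ip": "10.0.0.9"}] * 20,
    [{"src_ip": "10.0.0.7"}] * 30,
]
coordinator = CoordinatorAgent()
for i, part in enumerate(partitions):
    coordinator.receive(DetectionAgent(agent_id=f"agent-{i}").analyze(part))
print([(a.agent_id, a.src_ip, a.reason) for a in coordinator.alerts])
```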
2017-12-28
Luo, S., Wang, Y., Huang, W., Yu, H.  2016.  Backup and Disaster Recovery System for HDFS. 2016 International Conference on Information Science and Security (ICISS). :1–4.

HDFS has been widely used for storing data at massive scale, which makes it vulnerable to site disasters. File system backup is an important strategy for data retention. In this paper, we present an efficient, easy-to-use Backup and Disaster Recovery System for HDFS. The system includes a client based on HDFS with an additional remote backup feature, and a remote server with an HDFS cluster that keeps the backup data. It supports full backups and regular incremental backups to the server at very low cost and with high throughput. In our experiments, the average speed of backup and recovery reaches 95 MB/s, approaching the theoretical maximum speed of gigabit Ethernet.
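For comparison, a minimal sketch of incremental backup between two HDFS clusters using the standard DistCp tool rather than the paper's custom client: `hadoop distcp -update` copies only files that are missing or have changed on the backup cluster. The cluster URIs and paths are illustrative assumptions.

```python
# Sketch: scheduled incremental backup between HDFS clusters via DistCp.
import subprocess

SRC = "hdfs://prod-nn:8020/data/warehouse"      # source (production) cluster
DST = "hdfs://backup-nn:8020/backup/warehouse"  # remote backup cluster

def incremental_backup(src: str = SRC, dst: str = DST) -> None:
    """Run DistCp in update mode so unchanged files are skipped."""
    subprocess.run(["hadoop", "distcp", "-update", src, dst], check=True)

if __name__ == "__main__":
    incremental_backup()
```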