Biblio
The objective of this paper is to propose a model of a distributed intrusion detection system based on the multi-agent paradigm and the distributed file system (HDFS). Multi-agent systems (MAS) are very suitable to intrusion detection systems as they can address the issue of geographic data security in terms of autonomy, distribution and performance. The proposed system is based on a set of autonomous agents that cooperate and collaborate with each other to effectively detect intrusions and suspicious activities that may impact geographic information systems. Our system allows the detection of known and unknown computer attacks without any human intervention (Security Experts) unlike traditional intrusion detection systems that rely on knowledge bases as a mechanism to detect known attacks. The proposed model allows a real time detection of known and unknown attacks within large networks hosting geographic data.
In the era of big data, many users and companies start to move their data to cloud storage to simplify data management and reduce data maintenance cost. However, security and privacy issues become major concerns because third-party cloud service providers are not always trusty. Although data contents can be protected by encryption, the access patterns that contain important information are still exposed to clouds or malicious attackers. In this paper, we apply the ORAM algorithm to enable privacy-preserving access to big data that are deployed in distributed file systems built upon hundreds or thousands of servers in a single or multiple geo-distributed cloud sites. Since the ORAM algorithm would lead to serious access load unbalance among storage servers, we study a data placement problem to achieve a load balanced storage system with improved availability and responsiveness. Due to the NP-hardness of this problem, we propose a low-complexity algorithm that can deal with large-scale problem size with respect to big data. Extensive simulations are conducted to show that our proposed algorithm finds results close to the optimal solution, and significantly outperforms a random data placement algorithm.