Biblio
The need to protect big data, particularly those relating to information security (IS) maintenance (ISM) of an enterprise's IT infrastructure, is shown. A worldwide experience of addressing big data ISM issues is briefly summarized and a big data protection problem statement is formulated. An infrastructure for big data ISM is proposed. New applications areas for big data IT after addressing ISM issues are listed in conclusion.
Acoustic microscopy is characterized by relatively long scanning time, which is required for the motion of the transducer over the entire scanning area. This time may be reduced by using a multi-channel acoustical system which has several identical transducers arranged as an array and is mounted on a mechanical scanner so that each transducer scans only a fraction of the total area. The resulting image is formed as a combination of all acquired partial data sets. The mechanical instability of the scanner, as well as the difference in parameters of the individual transducers causes a misalignment of the image fractures. This distortion may be partially compensated for by the introduction of constant or dynamical signal leveling and data shift procedures. However, a reduction of the random instability component requires more advanced algorithms, including auto-adjustment of processing parameters. The described procedure was implemented into the prototype of an ultrasonic fingerprint reading system. The specialized cylindrical scanner provides a helical spiral lens trajectory which eliminates repeatable acceleration, reduces vibration and allows constant data flow on maximal rate. It is equipped with an array of four spherically focused 50 MHz acoustic lenses operating in pulse-echo mode. Each transducer is connected to a separate channel including pulser, receiver and digitizer. The output 3D data volume contains interlaced B-scans coming from each channel. Afterward, data processing includes pre-determined procedures of constant layer shift in order to compensate for the transducer displacement, phase shift and amplitude leveling for compensation of variation in transducer characteristics. Analysis of statistical parameters of individual scans allows adaptive eliminating of the axial misalignment and mechanical vibrations. Further 2D correlation of overlapping partial C-scans will realize an interpolative adjustment which essentially improves the output image. Implementation of this adaptive algorithm into a data processing sequence allows us to significantly reduce misreading due to hardware noise and finger motion during scanning. The system provides a high quality acoustic image of the fingerprint including different levels of information: fingerprint pattern, sweat porous locations, internal dermis structures. These additional features can effectively facilitate fingerprint based identification. The developed principles and algorithm implementations allow improved quality, stability and reliability of acoustical data obtained with the mechanical scanner, accommodating several transducers. General principles developed during this work can be applied to other configurations of advanced ultrasonic systems designed for various biomedical and NDE applications. The data processing algorithm, developed for a specific biometric task, can be adapted for the compensation of mechanical imperfections of the other devices.
Multiple string matching plays a fundamental role in network intrusion detection systems. Automata-based multiple string matching algorithms like AC, SBDM and SBOM are widely used in practice, but the huge memory usage of automata prevents them from being applied to a large-scale pattern set. Meanwhile, poor cache locality of huge automata degrades the matching speed of algorithms. Here we propose a space-efficient multiple string matching algorithm BVM, which makes use of bit-vector and succinct hash table to replace the automata used in factor-searching-based algorithms. Space complexity of the proposed algorithm is O(rm2 + ΣpϵP |p|), that is more space-efficient than the classic automata-based algorithms. Experiments on datasets including Snort, ClamAV, URL blacklist and synthetic rules show that the proposed algorithm significantly reduces memory usage and still runs at a fast matching speed. Above all, BVM costs less than 0.75% of the memory usage of AC, and is capable of matching millions of patterns efficiently.
Although RAID is a well-known technique to protect data against disk errors, it is vulnerable to silent data corruptions that cannot be detected by disk drives. Existing integrity protection schemes designed for RAID arrays often introduce high I/O overhead. Our key insight is that by properly designing an integrity protection scheme that adapts to the read/write characteristics of storage workloads, the I/O overhead can be significantly mitigated. In view of this, this paper presents a systematic study on I/O-efficient integrity protection against silent data corruptions in RAID arrays. We formalize an integrity checking model, and justify that a large proportion of disk reads can be checked with simpler and more I/O-efficient integrity checking mechanisms. Based on this integrity checking model, we construct two integrity protection schemes that provide complementary performance advantages for storage workloads with different user write sizes. We further propose a quantitative method for choosing between the two schemes in real-world scenarios. Our trace-driven simulation results show that with the appropriate integrity protection scheme, we can reduce the I/O overhead to below 15%.
Multiple string matching plays a fundamental role in network intrusion detection systems. Automata-based multiple string matching algorithms like AC, SBDM and SBOM are widely used in practice, but the huge memory usage of automata prevents them from being applied to a large-scale pattern set. Meanwhile, poor cache locality of huge automata degrades the matching speed of algorithms. Here we propose a space-efficient multiple string matching algorithm BVM, which makes use of bit-vector and succinct hash table to replace the automata used in factor-searching-based algorithms. Space complexity of the proposed algorithm is O(rm2 + ΣpϵP |p|), that is more space-efficient than the classic automata-based algorithms. Experiments on datasets including Snort, ClamAV, URL blacklist and synthetic rules show that the proposed algorithm significantly reduces memory usage and still runs at a fast matching speed. Above all, BVM costs less than 0.75% of the memory usage of AC, and is capable of matching millions of patterns efficiently.
Cloud computing allows users to delegate data and computation to cloud service providers, at the cost of giving up physical control of their computing infrastructure. An attacker (e.g., insider) with physical access to the computing platform can perform various physical attacks, including probing memory buses and cold-boot style attacks. Previous work on secure (co-)processors provides hardware support for memory encryption and prevents direct leakage of sensitive data over the memory bus. However, an adversary snooping on the bus can still infer sensitive information from the memory access traces. Existing work on Oblivious RAM (ORAM) provides a solution for users to put all data in an ORAM; and accesses to an ORAM are obfuscated such that no information leaks through memory access traces. This method, however, incurs significant memory access overhead. This work is the first to leverage programming language techniques to offer efficient memory-trace oblivious program execution, while providing formal security guarantees. We formally define the notion of memory-trace obliviousness, and provide a type system for verifying that a program satisfies this property. We also describe a compiler that transforms a program into a structurally similar one that satisfies memory trace obliviousness. To achieve optimal efficiency, our compiler partitions variables into several small ORAM banks rather than one large one, without risking security. We use several example programs to demonstrate the efficiency gains our compiler achieves in comparison with the naive method of placing all variables in the same ORAM.