Biblio

Filters: Keyword is Noise measurement
2020-08-03
Al-Emadi, Sara, Al-Ali, Abdulla, Mohammad, Amr, Al-Ali, Abdulaziz.  2019.  Audio Based Drone Detection and Identification using Deep Learning. 2019 15th International Wireless Communications Mobile Computing Conference (IWCMC). :459–464.
In recent years, unmanned aerial vehicles (UAVs) have become increasingly accessible to the public due to their high availability at affordable prices while being equipped with better technology. However, this raises great concern from both the cyber and physical security perspectives, since UAVs can be used for malicious activities that exploit vulnerabilities, such as spying on private property or critical areas, or carrying dangerous objects such as explosives, which makes them a great threat to society. Drone identification is considered the first step in a multi-procedural process of securing physical infrastructure against this threat. In this paper, we present drone detection and identification methods using deep learning techniques such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and Convolutional Recurrent Neural Network (CRNN). These algorithms are used to exploit the unique acoustic fingerprints of flying drones in order to detect and identify them. We compare the performance of different neural networks on our dataset, which features recorded audio samples of drone activities. The major contribution of our work is to validate the use of these drone detection and identification methodologies in real-life scenarios and to provide a robust comparison of the performance of different deep neural network algorithms for this application. In addition, we are releasing the dataset of drone audio clips to the research community for further analysis.
2020-06-15
Puteaux, Pauline, Puech, William.  2018.  Noisy Encrypted Image Correction based on Shannon Entropy Measurement in Pixel Blocks of Very Small Size. 2018 26th European Signal Processing Conference (EUSIPCO). :161–165.
Many techniques have been presented to protect image content confidentiality. The owner of an image encrypts it using a key and transmits the encrypted image across a network. If the recipient is authorized to access the original content of the image, he can reconstruct it losslessly. However, if the encrypted image is corrupted by noise during transmission, some parts of the image cannot be deciphered. In order to localize and correct these errors, we propose an approach based on the local Shannon entropy measurement. We first analyze this measure as a function of the block size. We then provide a full description of our blind error localization and removal process. Experimental results show that the proposed approach, based on local entropy, can be used in practice to correct noisy encrypted images, even with blocks of very small size.
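As an illustration of the measure this abstract relies on, here is a minimal sketch of local Shannon entropy computed over small, non-overlapping pixel blocks. The function names, the tiling scheme, and the idea of representing the image as a list of rows are assumptions made for the example; the abstract does not state how the authors turn these entropy values into an error localization rule, so none is shown.

```python
import math
from collections import Counter


def block_entropy(pixels):
    """Shannon entropy, in bits per pixel, of a flat list of pixel values."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def local_entropies(image, block):
    """Entropy of each non-overlapping block x block tile of a 2-D list.

    Hypothetical helper: rows or columns that do not fill a whole tile
    are ignored.
    """
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(0, rows - block + 1, block):
        row_vals = []
        for c in range(0, cols - block + 1, block):
            tile = [image[r + i][c + j]
                    for i in range(block) for j in range(block)]
            row_vals.append(block_entropy(tile))
        result.append(row_vals)
    return result
```

Note that a block of only four pixels can reach at most log2(4) = 2 bits of entropy, which is why the behavior of the measure as a function of block size matters.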
Kin-Cleaves, Christy, Ker, Andrew D..  2018.  Adaptive Steganography in the Noisy Channel with Dual-Syndrome Trellis Codes. 2018 IEEE International Workshop on Information Forensics and Security (WIFS). :1–7.
Adaptive steganography aims to reduce distortion in the embedding process, typically using Syndrome Trellis Codes (STCs). However, in the case of non-adversarial noise, these are a bad choice: syndrome codes are fragile by design, amplifying the channel error rate into unacceptably-high payload error rates. In this paper we examine the fragility of STCs in the noisy channel, and consider how this can be mitigated if their use cannot be avoided altogether. We also propose an extension called Dual-Syndrome Trellis Codes, that combines error correction and embedding in the same Viterbi process, which slightly outperforms a straight-forward combination of standard forward error correction and STCs.
2020-06-02
Ostrev, Dimiter.  2019.  Composable, Unconditionally Secure Message Authentication without any Secret Key. 2019 IEEE International Symposium on Information Theory (ISIT). :622–626.

We consider a setup in which the channel from Alice to Bob is less noisy than the channel from Eve to Bob. We show that there exist encoding and decoding which accomplish error correction and authentication simultaneously; that is, Bob is able to correctly decode a message coming from Alice and reject a message coming from Eve with high probability. The system does not require any secret key shared between Alice and Bob, provides information theoretic security, and can safely be composed with other protocols in an arbitrary context.

2020-04-20
Wang, Chong Xiao, Song, Yang, Tay, Wee Peng.  2018.  Preserving Parameter Privacy in Sensor Networks. 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP). :1316–1320.
We consider the problem of preserving the privacy of a set of private parameters while allowing inference of a set of public parameters based on observations from sensors in a network. We assume that the public and private parameters are correlated with the sensor observations via a linear model. We define the utility loss and privacy gain functions based on the Cramér-Rao lower bounds for estimating the public and private parameters, respectively. Our goal is to minimize the utility loss while ensuring that the privacy gain is no less than a predefined privacy gain threshold, by allowing each sensor to perturb its own observation before sending it to the fusion center. We propose methods to determine the amount of noise each sensor needs to add to its observation under the cases where prior information is available or unavailable.
2020-03-16
Ren, Wenyu, Yu, Tuo, Yardley, Timothy, Nahrstedt, Klara.  2019.  CAPTAR: Causal-Polytree-based Anomaly Reasoning for SCADA Networks. 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). :1–7.
The Supervisory Control and Data Acquisition (SCADA) system is the most commonly used industrial control system but is subject to a wide range of serious threats. Intrusion detection systems are deployed to promote the security of SCADA systems, but they continuously generate a tremendous number of alerts without further comprehending them. There is a need for an efficient system to correlate alerts and discover attack strategies to provide explainable situational awareness to SCADA operators. In this paper, we present a causal-polytree-based anomaly reasoning framework for SCADA networks, named CAPTAR. CAPTAR takes the meta-alerts from our previous anomaly detection framework EDMAND, correlates them using a naive Bayes classifier, and matches them to predefined causal polytrees. Utilizing Bayesian inference on the causal polytrees, CAPTAR can produce a high-level view of the security state of the protected SCADA network. Experiments on a prototype of CAPTAR prove its anomaly reasoning ability and its capability of satisfying the real-time reasoning requirement.
2020-03-04
Puteaux, Pauline, Puech, William.  2019.  Image Analysis and Processing in the Encrypted Domain. 2019 IEEE International Conference on Image Processing (ICIP). :3020–3022.

In this research project, we are interested in finding solutions to the problem of image analysis and processing in the encrypted domain. For security reasons, more and more digital data are transferred or stored in the encrypted domain. However, during the transmission or the archiving of encrypted images, it is often necessary to analyze or process them, without knowing the original content or the secret key used during the encryption phase. We propose to work on this problem, by associating theoretical aspects with numerous applications. Our main contributions concern: data hiding in encrypted images, correction of noisy encrypted images, recompression of crypto-compressed images and secret image sharing.

2019-12-30
Tabakhpour, Adel, Abdelaziz, Morad M. A..  2019.  Neural Network Model for False Data Detection in Power System State Estimation. 2019 IEEE Canadian Conference of Electrical and Computer Engineering (CCECE). :1–5.

False data injection is an ongoing concern facing power system state estimation. In this work, a neural network is trained to detect the existence of false data in measurements. The proposed approach can make use of historical data, if available, by using them in the training sets of the proposed neural network model. However, the inputs of the perceptron model in this work are the residual elements from the state estimation, which are highly correlated. Therefore, their dimension can be reduced by preserving the most informative features of the inputs. To this end, principal component analysis (a data preprocessing technique) is used. This technique is especially efficient for highly correlated data sets, which is the case for power system measurements. The results of the different perceptron models proposed for detection are compared to a simple perceptron that produces results identical to those of the outlier detection scheme. To generate the training sets, state estimation was run for different false data on different measurements in the 13-bus IEEE test system, and the residuals were saved as inputs of the training sets. The testing results of the trained network show its good performance in detecting false data in measurements.

2019-12-05
Akhtar, Nabeel, Matta, Ibrahim, Raza, Ali, Wang, Yuefeng.  2018.  EL-SEC: ELastic Management of Security Applications on Virtualized Infrastructure. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). :778–783.

The concept of Virtualized Network Functions (VNFs) aims to move Network Functions (NFs) out of dedicated hardware devices into software that runs on commodity hardware. A single NF consists of multiple VNF instances, usually running on virtual machines in a cloud infrastructure. The elastic management of an NF refers to load management across the VNF instances and the autonomic scaling of the number of VNF instances as the load on the NF changes. In this paper, we present EL-SEC, an autonomic framework to elastically manage security NFs on a virtualized infrastructure. As a use case, we deploy the Snort Intrusion Detection System as the NF on the GENI testbed. Concepts from control theory are used to create an Elastic Manager, which implements various controllers - in this paper, Proportional Integral (PI) and Proportional Integral Derivative (PID) - to direct traffic across the VNF Snort instances by monitoring the current load. RINA (a clean-slate Recursive InterNetwork Architecture) is used to build a distributed application that monitors load and collects Snort alerts, which are processed by the Elastic Manager and an Attack Analyzer, respectively. Software Defined Networking (SDN) is used to steer traffic through the VNF instances, and to block attack traffic. Our results show that virtualized security NFs can be easily deployed using our EL-SEC framework. With the help of real-time graphs, we show that PI and PID controllers can be used to easily scale the system, which leads to quicker detection of attacks.
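To make the control-theoretic idea in this abstract concrete, here is a minimal sketch of a discrete Proportional Integral (PI) controller reacting to measured load. The class name, the gains, the setpoint, and the interpretation of the output as "how much traffic to shift away from an instance" are illustrative assumptions; the abstract does not give EL-SEC's actual controller parameters or interfaces.

```python
class PIController:
    """Discrete PI controller: output = kp * error + ki * accumulated error."""

    def __init__(self, kp, ki, setpoint):
        self.kp = kp              # proportional gain (assumed value)
        self.ki = ki              # integral gain (assumed value)
        self.setpoint = setpoint  # target load, e.g. 0.6 = 60% utilization
        self.integral = 0.0       # running sum of error * dt

    def update(self, measured, dt=1.0):
        """Return the control signal for one monitoring interval."""
        error = measured - self.setpoint  # positive when overloaded
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


# Hypothetical usage: an instance is measured at 80% load, target is 60%.
controller = PIController(kp=0.5, ki=0.1, setpoint=0.6)
shift = controller.update(0.8)  # positive value: move traffic elsewhere
```

The integral term keeps pushing while a sustained overload persists, which is what lets a PI controller remove steady-state error that a purely proportional controller would leave behind. A PID controller would add a third term proportional to the rate of change of the error.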

Hayashi, Masahito.  2018.  Secure Physical Layer Network Coding versus Secure Network Coding. 2018 IEEE Information Theory Workshop (ITW). :1–5.

Secure network coding realizes the secrecy of the message when the message is transmitted via a noiseless network and a part of the edges or a part of the intermediate nodes is eavesdropped. In this framework, if the channels of the network have noise, we apply error correction to the noisy channels before applying the secure network coding. In contrast, secure physical layer network coding is a method to securely transmit a message by a combination of coding operations on nodes when the network is given as a set of noisy channels. In this paper, we give several examples of networks in which secure physical layer network coding realizes a performance that cannot be realized by secure network coding.

2019-09-30
Hohlfeld, J., Czoschke, P., Asselin, P., Benakli, M..  2019.  Improving Our Understanding of Measured Jitter (in HAMR). IEEE Transactions on Magnetics. 55:1–11.

The understanding of measured jitter is improved in three ways. First, it is shown that the measured jitter is not only governed by written-in jitter and the reader resolution along the cross-track direction but by remanence noise in the vicinity of transitions and the down-track reader resolution as well. Second, a novel data analysis scheme is introduced that allows for an unambiguous separation of these two contributions. Third, based on data analyses involving the first two learnings and micro-magnetic simulations, we identify and explain the root causes for variations of jitter with write current (WC) (write field), WC overshoot amplitude (write-field rise time), and linear disk velocity measured for heat-assisted magnetic recording.

2019-09-05
Nasseralfoghara, M., Hamidi, H..  2019.  Web Covert Timing Channels Detection Based on Entropy. 2019 5th International Conference on Web Research (ICWR). :12–15.

Today, analyzing web weaknesses and vulnerabilities in order to find security attacks has become more urgent. When a communication takes place contrary to the system security policies, a covert channel has been created. The attacker can easily disclose information from the victim's system with just one public access permission. Covert timing channels, unlike covert storage channels, do not have memory storage and they draw less attention. Different methods have been proposed for their identification, which generally benefit from the shape of traffic and the channel's regularity. In this article, an entropy-based detection method is designed and implemented. The attacker can adjust the amount of channel entropy by controlling measures such as changing the channel's level or creating noise on the channel to evade the analyst's detection. As a result, the entropy threshold for detection is not always constant. By comparing the entropy from different levels of the channel and the analyst, we conclude that the analyst must investigate traffic at all possible levels.
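The kind of measurement an entropy-based detector works with can be sketched as follows: bin the inter-arrival times of a traffic trace and compute the Shannon entropy of the bin distribution. The function name, the fixed-width binning, and the sample timestamps are assumptions made for illustration; the abstract does not specify the paper's binning, its entropy estimator, or its thresholds.

```python
import math
from collections import Counter


def timing_entropy(arrival_times, bin_width):
    """Shannon entropy (bits) of binned inter-arrival times.

    arrival_times: increasing timestamps, e.g. in milliseconds.
    bin_width: width of each histogram bin (an assumed tuning knob).
    """
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    bins = Counter(int(g // bin_width) for g in gaps)
    n = len(gaps)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


# Hypothetical traces: perfectly regular gaps versus gaps that alternate
# between two values, as a two-symbol timing channel might produce.
regular = timing_entropy([0, 10, 20, 30, 40], bin_width=5)
two_level = timing_entropy([0, 10, 30, 40, 60], bin_width=5)
```

Because the sender can add noise or change the encoding to move this value, a single fixed threshold on it is fragile, which matches the abstract's observation that the detection threshold is not always constant.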

2019-06-24
Wang, J., Zhang, X., Zhang, H., Lin, H., Tode, H., Pan, M., Han, Z..  2018.  Data-Driven Optimization for Utility Providers with Differential Privacy of Users' Energy Profile. 2018 IEEE Global Communications Conference (GLOBECOM). :1–6.

Smart meters migrate the conventional electricity grid into a digitally enabled Smart Grid (SG), which is more reliable and efficient. Fine-grained energy consumption data collected by smart meters help utility providers accurately predict users' demands and significantly reduce power generation cost, but they also impose severe privacy risks on consumers and may discourage them from using those “espionage meters”. To enjoy the benefits of smart meter measured data without compromising the users' privacy, in this paper, we try to integrate distributed differential privacy (DDP) techniques into data-driven optimization, and propose a novel scheme that not only minimizes the cost for utility providers but also preserves the DDP of users' energy profiles. Briefly, we add differentially private noise to the users' energy consumption data before the smart meters send it to the utility provider. Due to the uncertainty of the users' demand distribution, the utility provider aggregates a given set of historical users' differentially private data, estimates the users' demands, and formulates the data-driven cost minimization based on the collected noisy data. We also develop algorithms for feasible solutions, and verify the effectiveness of the proposed scheme through simulations using the simulated energy consumption data generated from the utility company's real data analysis.
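The step of adding noise at the meter before reporting can be illustrated with the standard Laplace mechanism. This is a swapped-in textbook technique, not necessarily the distributed scheme the paper uses: the abstract does not name the noise distribution or how noise is split across meters. The function name and the sensitivity and epsilon values are assumptions.

```python
import random


def privatize(reading, sensitivity, epsilon, rng):
    """Return reading plus Laplace noise with scale sensitivity / epsilon.

    The difference of two independent exponential variates with the same
    rate is Laplace-distributed, which avoids hand-rolling an inverse CDF.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return reading + noise


# Hypothetical usage: a meter perturbs a 5.0 kWh reading before sending it.
rng = random.Random(0)  # seeded only to make the example repeatable
reported = privatize(5.0, sensitivity=1.0, epsilon=0.5, rng=rng)
```

A smaller epsilon gives a larger noise scale and stronger privacy, at the price of noisier demand estimates on the utility provider's side, which is the trade-off the cost minimization has to absorb.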

2019-04-01
Rathour, N., Kaur, K., Bansal, S., Bhargava, C..  2018.  A Cross Correlation Approach for Breaking of Text CAPTCHA. 2018 International Conference on Intelligent Circuits and Systems (ICICS). :6–10.
Online web service providers generally protect themselves through CAPTCHA. A CAPTCHA is a type of challenge-response test used in computing as an attempt to ensure that the response is generated by a person. CAPTCHAs are mainly implemented as distorted text which the user must correctly transcribe. Numerous schemes have been proposed to date in order to prevent attacks by bots. This paper presents a cross-correlation-based approach to breaking the text CAPTCHAs of two well-known service providers: PayPal.com and IRCTC.co.in, India's most visited website. The procedure can be broken down into 3 tightly coupled tasks: pre-processing, segmentation, and classification. The pre-processing of the image is performed to remove all the background noise of the image. The noise in the CAPTCHA consists of unwanted "on" pixels in the background. The segmentation is performed by scanning the image for "on" pixels. The classification is performed by using the correlation values of the inputs and templates. Two types of templates have been used for classification. One is the standard templates, which give a 30% success rate, and the other is the noisy templates made from the CAPTCHA images, with which the success rate achieved is 100%.
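The template-matching step described in this abstract can be sketched as scoring a segmented character against each stored template by normalized cross-correlation and picking the best match. The function names, the flat-list image representation, and the tiny 3x3 templates are assumptions for illustration; the paper's actual templates, image sizes, and correlation formula are not given in the abstract.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # flat images carry no shape information


def classify(glyph, templates):
    """Return the label of the template most correlated with the glyph."""
    return max(templates,
               key=lambda label: normalized_correlation(glyph, templates[label]))


# Hypothetical 3x3 binary templates (1 = "on" pixel), flattened row by row.
TEMPLATES = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}
noisy_i = [0, 1, 0,
           0, 1, 0,
           0, 1, 1]  # an "I" with one stray "on" pixel
label = classify(noisy_i, TEMPLATES)
```

Templates built from real CAPTCHA samples share the distortions of the inputs, so their correlation scores are higher than those of clean standard templates, which is consistent with the gap in success rates the abstract reports.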
2019-01-21
Wen, Y., Lao, Y..  2018.  PUF Modeling Attack using Active Learning. 2018 IEEE International Symposium on Circuits and Systems (ISCAS). :1–5.

Along with the rapid development of hardware security techniques, the revolutionary growth of countermeasures or attacking methods developed by intelligent and adaptive adversaries has significantly complicated the ability to create secure hardware systems. Thus, there is a critical need to (re)evaluate existing or new hardware security techniques against these state-of-the-art attacking methods. With this in mind, this paper presents a novel framework for incorporating active learning techniques into the hardware security field. We demonstrate that active learning, which samples the least confident and most informative challenge-response pair (CRP) for training in each iteration, can significantly improve the learning efficiency of a physical unclonable function (PUF) modeling attack. For example, our experimental results show that in order to obtain a prediction error below 4%, 2790 CRPs are required in passive learning, while only 811 CRPs are required in active learning. The sampling strategies and detailed applications of the PUF modeling attack under various environmental conditions are also discussed. When the environment is very noisy, active learning may sample a large number of mislabeled CRPs and hence result in high prediction error. We present two methods to mitigate the contradiction between informative and noisy CRPs.
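The "least confident" sampling rule mentioned in this abstract can be sketched for a binary response bit: query the challenge whose predicted probability is closest to 0.5. The function names, the tuple encoding of challenges, and the lookup table standing in for a trained model are hypothetical; the abstract does not describe the authors' model or their two noise-mitigation methods.

```python
def least_confident(pool, predict_proba):
    """Return the challenge whose predicted response bit is least certain."""
    return min(pool, key=lambda ch: abs(predict_proba(ch) - 0.5))


def select_batch(pool, predict_proba, k):
    """Return the k least confident challenges, least certain first."""
    ranked = sorted(pool, key=lambda ch: abs(predict_proba(ch) - 0.5))
    return ranked[:k]


# Hypothetical model output: probability that the PUF response is 1.
SCORES = {
    (0, 1, 1): 0.95,
    (1, 0, 1): 0.52,
    (1, 1, 0): 0.10,
    (0, 0, 1): 0.40,
}
next_query = least_confident(SCORES, SCORES.get)
batch = select_batch(SCORES, SCORES.get, k=2)
```

Challenges near the decision boundary are also the ones whose measured responses flip most easily under noise, which is the tension between informative and noisy CRPs that the abstract points out.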

2018-10-26
Wang, G., Qin, Yanyuan, Chang, Chengjuan.  2017.  Communication with partial noisy feedback. 2017 IEEE Symposium on Computers and Communications (ISCC). :602–607.

This paper introduces the notion of one-way communication schemes with partial noisy feedback. To support this communication, the schemes suppose that Alice and Bob wish to communicate: Alice sends a sequence of alphabets over a channel to Bob, while Alice receives feedback bits from Bob for δ fraction of the transmissions. An adversary is allowed to tamper with up to a constant fraction of these transmissions for both forward rounds and feedback rounds separately. This paper intends to determine the Maximum Error Rate (MER), as a function of δ (0 ≤ δ ≤ 1), under the MER rate, so that Alice can successfully communicate the messages to Bob via some protocols with δ fraction of noisy feedback. To provide a reasonable solution for the above problem, we need to explore a new kind of coding scheme for the interactive communication. In this paper, we use the notion of “non-malleable codes” (NMC) which relaxes the notions of error-correction and error-detection to some extent in communication. Informally, a code is non-malleable if the message contained in a modified codeword is either the original message or a completely unrelated value. This property largely enforces the way to detect the transmission errors. Based on the above knowledge, we provide an alphabet-based encoding scheme, including a pair of (Enc, Dec). Suppose the message needing to be transmitted is m; if m is corrupted unintentionally, then the encoding scheme Dec(Enc(m)) outputs a symbol ‘⊥’ to denote that some potential corruptions happened during transmission. In this work, based on the previous results, we show that for any δ ∈ (0; 1), there exists a deterministic communication scheme with noiseless full feedback (δ = 1), such that the maximal tolerable error fraction γ (on Alice's transmissions) can be up to 1/2, theoretically.
Moreover, we show that for any δ ∈ (0; 1), there exists a communication scheme with noisy feedback, denoting the forward and backward rounds noised with error fractions of γ0 and γ1, respectively, such that the maximal tolerable error fraction γ0 (on forward rounds) can be up to 1/2, as well as the γ1 (on feedback rounds) up to 1.

2018-09-28
Qu, X., Mu, L..  2017.  An augmented cubature Kalman filter for nonlinear dynamical systems with random parameters. 2017 36th Chinese Control Conference (CCC). :1114–1118.

In this paper, we investigate the Bayesian filtering problem for discrete nonlinear dynamical systems which contain random parameters. An augmented cubature Kalman filter (CKF) is developed to deal with the random parameters, where the state vector is enlarged by incorporating the random parameters. The corresponding number of cubature points is increased, so the augmented CKF method requires more computational complexity. However, the estimation accuracy is improved in comparison with that of the classical CKF method which uses the nominal values of the random parameters. An application to the mobile source localization with time difference of arrival (TDOA) measurements and random sensor positions is provided where the simulation results illustrate that the augmented CKF method leads to a superior performance in comparison with the classical CKF method.
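The remark about the number of cubature points can be made concrete with the third-degree spherical-radial rule commonly used in cubature Kalman filters, which uses 2n equally weighted points for an n-dimensional state. The sketch below generates the unit points only; the function name and the example dimensions (3 states, 2 random parameters) are assumptions, and whether the paper uses exactly this rule is not stated in the abstract.

```python
import math


def unit_cubature_points(n):
    """Return the 2n points sqrt(n) * (+/- e_i), each with weight 1 / (2n).

    These are for a zero-mean, identity-covariance state; a filter would
    scale them by a square root of the covariance and shift by the mean.
    """
    radius = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for i in range(n):
            points.append([sign * radius if j == i else 0.0
                           for j in range(n)])
    return points


# Hypothetical sizes: 3 state variables, augmented with 2 random parameters.
classical = unit_cubature_points(3)      # 6 points
augmented = unit_cubature_points(3 + 2)  # 10 points
```

Each point is propagated through the nonlinear model, so the per-step cost grows with the augmented dimension, which is the extra computational burden the abstract mentions.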

2018-08-23
Xu, W., Yan, Z., Tian, Y., Cui, Y., Lin, J..  2017.  Detection with compressive measurements corrupted by sparse errors. 2017 9th International Conference on Wireless Communications and Signal Processing (WCSP). :1–5.

Compressed sensing (CS) can represent a sparse signal with a small number of measurements compared to Nyquist-rate samples. Considering the high complexity of reconstruction algorithms in CS, compressive detection has recently been proposed, which performs detection directly in the compressive domain without reconstruction. Different from existing work that generally considers measurements corrupted by dense noise, this paper studies the compressive detection problem when the measurements are corrupted by both dense noise and sparse errors. Sparse errors exist in many practical systems, such as those affected by impulse noise or narrowband interference. We derive the theoretical performance of compressive detection when the sparse error is either deterministic or random. The theoretical results are further verified by simulations.

2018-05-16
Hernández, S., Lu, P. L., Granz, S., Krivosik, P., Huang, P. W., Eppler, W., Rausch, T., Gage, E..  2017.  Using Ensemble Waveform Analysis to Compare Heat Assisted Magnetic Recording Characteristics of Modeled and Measured Signals. IEEE Transactions on Magnetics. 53:1–6.

Ensemble waveform analysis is used to calculate signal to noise ratio (SNR) and other recording characteristics from micromagnetically modeled heat assisted magnetic recording waveforms and waveforms measured at both drive and spin-stand level. Using windowing functions provides the breakdown between transition and remanence SNRs. In addition, channel bit density (CBD) can be extracted from the ensemble waveforms using the di-bit extraction method. Trends in transition SNR, remanence SNR, and CBD as a function of ambient temperature at constant track width showed good agreement between model and measurement. Both model and drive-level measurement show degradation in SNR at higher ambient temperatures, which may be due to changes in the down-track profile at the track edges compared with track center. CBD as a function of cross-track position is also calculated for both modeling and spin-stand measurements. The CBD widening at high cross-track offset, which is observed in both measurement and model, was directly related to the radius of curvature of the written transitions observed in the model and the thermal profiles used.

2018-04-04
Jin, Y., Eriksson, J..  2017.  Fully Automatic, Real-Time Vehicle Tracking for Surveillance Video. 2017 14th Conference on Computer and Robot Vision (CRV). :147–154.

We present an object tracking framework which fuses multiple unstable video-based methods and supports automatic tracker initialization and termination. To evaluate our system, we collected a large dataset of hand-annotated 5-minute traffic surveillance videos, which we are releasing to the community. To the best of our knowledge, this is the first publicly available dataset of such long videos, providing a diverse range of real-world object variation, scale change, interaction, different resolutions and illumination conditions. In our comprehensive evaluation using this dataset, we show that our automatic object tracking system often outperforms state-of-the-art trackers, even when these are provided with proper manual initialization. We also demonstrate tracking throughput improvements of 5× or more vs. the competition.

2018-02-21
Du, Y., Zhang, H..  2017.  Estimating the eavesdropping distance for radiated emission and conducted emission from information technology equipment. 2017 IEEE 5th International Symposium on Electromagnetic Compatibility (EMC-Beijing). :1–7.

The display image on a visual display unit (VDU) can be retrieved from the radiated and conducted emissions at some distance without leaving a trace. In this paper, the maximum eavesdropping distance for the unintentional radiated and conducted electromagnetic (EM) signals which contain information is estimated in theory by considering some realistic parameters. First, the maximum eavesdropping distance for the unintentional EM radiation is estimated based on the reception capacity of a log-periodic antenna connected to a receiver, the experimental data, the attenuation in free space and the additional attenuation in the propagation path. Then, based on a multi-conductor transmission model and some experimental results, the maximum eavesdropping distance for the conducted emission is theoretically derived. The estimates demonstrate that information technology equipment (ITE) may still pose a threat of information leakage even if it meets the current EMC requirements.

Liu, M., Yan, Y. J., Li, W..  2017.  Implementation and optimization of A5-1 algorithm on coarse-grained reconfigurable cryptographic logic array. 2017 IEEE 12th International Conference on ASIC (ASICON). :279–282.

The A5-1 algorithm is a stream cipher used to encrypt voice data in GSM, which needs to be realized with high performance due to real-time requirements. Traditional implementation on FPGA or ASIC can't obtain a trade-off among performance, cost and flexibility. To this aim, this paper introduces CGRCA to implement A5-1, and in order to optimize the performance and resource consumption, this paper proposes a resource-based path seeking (RPS) algorithm to develop an advanced implementation. Experimental results show that the final optimal throughput of A5-1 implemented on CGRCA is 162.87 Mbps when the frequency is 162.87 MHz, and the set-up time is merely 87 cycles, which is optimal among similar works.

2018-02-15
Sheppard, J. W., Strasser, S..  2017.  A factored evolutionary optimization approach to Bayesian abductive inference for multiple-fault diagnosis. 2017 IEEE AUTOTESTCON. :1–10.

When supporting commercial or defense systems, a perennial challenge is providing effective test and diagnosis strategies to minimize downtime, thereby maximizing system availability. Potentially one of the most effective ways to minimize downtime is to be able to detect and isolate as many faults in a system at one time as possible. This is referred to as the "multiple-fault diagnosis" problem. While several tools have been developed over the years to assist in performing multiple-fault diagnosis, considerable work remains to provide the best diagnosis possible. Recently, a new model for evolutionary computation has been developed called the "Factored Evolutionary Algorithm" (FEA). In this paper, we combine our prior work in deriving diagnostic Bayesian networks from static fault isolation manuals (FIMs) and fault trees with the FEA strategy to perform abductive inference as a way of addressing the multiple-fault diagnosis problem. We demonstrate the effectiveness of this approach on several networks derived from existing, real-world FIMs.

2018-02-06
Xylogiannopoulos, K., Karampelas, P., Alhajj, R..  2017.  Text Mining in Unclean, Noisy or Scrambled Datasets for Digital Forensics Analytics. 2017 European Intelligence and Security Informatics Conference (EISIC). :76–83.

In our era, most of the communication between people is realized in the form of electronic messages, especially through smart mobile devices. As such, the written text exchanged suffers from poor punctuation, misspelled words, continuous chunks of several words without spaces, tables, internet addresses etc., which make traditional text analytics methods difficult or impossible to apply without serious effort to clean the dataset. The method proposed in this paper can work on massive noisy and scrambled texts with minimal preprocessing, by removing special characters and spaces in order to create a continuous string and then detecting all the repeated patterns very efficiently using the Longest Expected Repeated Pattern Reduced Suffix Array (LERP-RSA) data structure and a variant of the All Repeated Patterns Detection (ARPaD) algorithm. Meta-analyses of the results can further assist a digital forensics investigator in detecting important information in the chunk of text analyzed.
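The overall idea of collapsing a noisy text into one continuous string and then looking for repeated patterns can be sketched with a plain sorted-suffix approach. This is a naive stand-in, not the LERP-RSA data structure or the ARPaD algorithm named in the abstract, and it only reports the single longest repeat. The function names are hypothetical, and lowercasing during normalization is an added assumption.

```python
import re


def normalize(text):
    """Drop spaces and special characters to form one continuous string.

    Lowercasing is an assumption of this sketch, not stated in the abstract.
    """
    return re.sub(r"[^a-z0-9]", "", text.lower())


def longest_repeated_substring(s):
    """Return the longest substring occurring at least twice in s.

    Sorts all suffix start positions lexicographically, then compares
    neighbours: the longest repeat is a common prefix of two adjacent
    suffixes. Simple but slow on large inputs; for illustration only.
    """
    suffixes = sorted(range(len(s)), key=lambda i: s[i:])
    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        k = 0
        while a + k < len(s) and b + k < len(s) and s[a + k] == s[b + k]:
            k += 1
        if k > len(best):
            best = s[a:a + k]
    return best


# Hypothetical scrambled message with inconsistent spacing and punctuation.
cleaned = normalize("Meet @ dock 9! meet@dock9?")
pattern = longest_repeated_substring(cleaned)
```

Working on the continuous string means a phrase is found even when the two occurrences were split differently by spaces or punctuation, which is the property that makes this style of analysis tolerant of unclean input.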