Biblio

Filters: Keyword is Mutual information
2022-07-01
Rangi, Anshuka, Franceschetti, Massimo.  2021.  Channel Coding Theorems in Non-stochastic Information Theory. 2021 IEEE International Symposium on Information Theory (ISIT). :1790–1795.
Recently, the δ-mutual information between uncertain variables has been introduced as a generalization of Nair's non-stochastic mutual information functional [1], [2]. Within this framework, we introduce four different notions of capacity and present corresponding coding theorems. Our definitions include an analogue of Shannon's capacity in a non-stochastic setting, and a generalization of the zero-error capacity. The associated coding theorems hold for stationary, memoryless, non-stochastic uncertain channels. These results establish the relationship between the δ-mutual information and our operational definitions, providing a step towards the development of a complete non-stochastic information theory.
2021-05-05
Hallaji, Ehsan, Razavi-Far, Roozbeh, Saif, Mehrdad.  2020.  Detection of Malicious SCADA Communications via Multi-Subspace Feature Selection. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.
Security maintenance of Supervisory Control and Data Acquisition (SCADA) systems has been a point of interest during recent years. Numerous research works have been dedicated to the design of intrusion detection systems for securing SCADA communications. Nevertheless, these data-driven techniques are usually dependent on the quality of the monitored data. In this work, we propose a novel feature selection approach, called MSFS, to tackle undesirable quality of data caused by feature redundancy. In contrast to most feature selection techniques, the proposed method models each class in a different subspace, where it is optimally discriminated. This has been accomplished by resorting to ensemble learning, which enables the usage of multiple feature sets in the same feature space. The proposed method is then utilized to perform intrusion detection in smaller subspaces, which brings about efficiency and accuracy. Moreover, a comparative study is performed on a number of advanced feature selection algorithms. Furthermore, a dataset obtained from the SCADA system of a gas pipeline is employed to enable a realistic simulation. The results indicate the proposed approach extensively improves the detection performance in terms of classification accuracy and standard deviation.
2021-04-08
Bloch, M., Laneman, J. N..  2009.  Information-spectrum methods for information-theoretic security. 2009 Information Theory and Applications Workshop. :23–28.
We investigate the potential of an information-spectrum approach to information-theoretic security. We show how this approach provides conceptually simple yet powerful results that can be used to investigate complex communication scenarios. In particular, we illustrate the usefulness of information-spectrum methods by analyzing the effect of channel state information (CSI) on the secure rates achievable over wiretap channels. We establish a formula for secrecy capacity, which we then specialize to compute achievable rates for ergodic fading channels in the presence of imperfect CSI. Our results confirm the importance of having some knowledge about the eavesdropper's channel, but also show that imperfect CSI does not necessarily preclude security.
2020-11-20
Chin, J., Zufferey, T., Shyti, E., Hug, G..  2019.  Load Forecasting of Privacy-Aware Consumers. 2019 IEEE Milan PowerTech. :1–6.

The roll-out of smart meters (SMs) in the electric grid has enabled data-driven grid management and planning techniques. SM data can be used together with short-term load forecasts (STLFs) to overcome polling frequency constraints for better grid management. However, the use of SMs that report consumption data at high spatial and temporal resolutions entails consumer privacy risks, motivating work in protecting consumer privacy. The impact of privacy protection schemes on STLF accuracy is not well studied, especially for smaller aggregations of consumers, whose load profiles are subject to more volatility and are, thus, harder to predict. In this paper, we analyse the impact of two user demand shaping privacy protection schemes, model-distribution predictive control (MDPC) and load-levelling, on STLF accuracy. Support vector regression is used to predict the load profiles at different consumer aggregation levels. Results indicate that, while the MDPC algorithm marginally affects forecast accuracy for smaller consumer aggregations, this diminishes at higher aggregation levels. More importantly, the load-levelling scheme significantly improves STLF accuracy as it smoothens out the grid visible consumer load profile.

2020-09-21
Arrieta, Miguel, Esnaola, Iñaki, Effros, Michelle.  2019.  Universal Privacy Guarantees for Smart Meters. 2019 IEEE International Symposium on Information Theory (ISIT). :2154–2158.
Smart meters enable improvements in electricity distribution system efficiency at some cost in customer privacy. Users with home batteries can mitigate this privacy loss by applying charging policies that mask their underlying energy use. A battery charging policy is proposed and shown to provide universal privacy guarantees subject to a constraint on energy cost. The guarantee bounds our strategy's maximal information leakage from the user to the utility provider under general stochastic models of user energy consumption. The policy construction adapts coding strategies for non-probabilistic permuting channels to this privacy problem.
Pedram, Ali Reza, Tanaka, Takashi, Hale, Matthew.  2019.  Bidirectional Information Flow and the Roles of Privacy Masks in Cloud-Based Control. 2019 IEEE Information Theory Workshop (ITW). :1–5.
We consider a cloud-based control architecture for a linear plant with Gaussian process noise, where the state of the plant contains a client's sensitive information. We assume that the cloud tries to estimate the state while executing a designated control algorithm. The mutual information between the client's actual state and the cloud's estimate is adopted as a measure of privacy loss. We discuss the necessity of uplink and downlink privacy masks. After observing that privacy is not necessarily a monotone function of the noise levels of privacy masks, we discuss the joint design procedure for uplink and downlink privacy masks. Finally, the trade-off between privacy and control performance is explored.
2020-09-08
Mufassa, Fauzil Halim, Anwar, Khoirul.  2019.  Extrinsic Information Transfer (EXIT) Analysis for Short Polar Codes. 2019 Symposium on Future Telecommunication Technologies (SOFTT). 1:1–6.

Polar codes polarize the quality of channels into either completely noisy or noiseless channels. This paper presents extrinsic information transfer (EXIT) analysis for iterative decoding of Polar codes to reveal the mechanism of channel transformation. The purposes of understanding the transformation process are to comprehend the placement of information bits and frozen bits and to comprehend the security standard of Polar codes. Mutual information is derived based on the concept of the EXIT chart for check nodes and variable nodes of low density parity check (LDPC) codes and applied to Polar codes. This paper explores the quality of the polarized channels at finite block length. The finite block length is of interest because the block length is limited in fifth-generation (5G) telecommunications. This paper reveals the EXIT curve changes of Polar codes and explores the polarization characteristics; thus, high values of mutual information for frozen bits are needed to be detectable. Otherwise, the error correction capability of Polar codes would drastically decrease. These results are expected to be a reference for developments of Polar codes for 5G technologies and beyond.

2020-09-04
Osia, Seyed Ali, Rassouli, Borzoo, Haddadi, Hamed, Rabiee, Hamid R., Gündüz, Deniz.  2019.  Privacy Against Brute-Force Inference Attacks. 2019 IEEE International Symposium on Information Theory (ISIT). :637–641.
Privacy-preserving data release is about disclosing information about useful data while retaining the privacy of sensitive data. Assuming that the sensitive data is threatened by a brute-force adversary, we define Guessing Leakage as a measure of privacy, based on the concept of guessing. After investigating the properties of this measure, we derive the optimal utility-privacy trade-off via a linear program with any f-information adopted as the utility measure, and show that the optimal utility is a concave and piece-wise linear function of the privacy-leakage budget.
2020-04-20
Xiao, Tianrui, Khisti, Ashish.  2019.  Maximal Information Leakage based Privacy Preserving Data Disclosure Mechanisms. 2019 16th Canadian Workshop on Information Theory (CWIT). :1–6.
It is often necessary to disclose training data to the public domain, while protecting privacy of certain sensitive labels. We use information theoretic measures to develop such privacy preserving data disclosure mechanisms. Our mechanism involves perturbing the data vectors to strike a balance in the privacy-utility trade-off. We use maximal information leakage between the output data vector and the confidential label as our privacy metric. We first study the theoretical Bernoulli-Gaussian model and study the privacy-utility trade-off when only the mean of the Gaussian distributions can be perturbed. We show that the optimal solution is the same as the case when the utility is measured using probability of error at the adversary. We then consider an application of this framework to a data driven setting and provide an empirical approximation to the Sibson mutual information. By performing experiments on the MNIST and FERG data sets, we show that our proposed framework achieves equivalent or better privacy than previous methods based on mutual information.
2020-01-28
Monaco, John V..  2019.  Feasibility of a Keystroke Timing Attack on Search Engines with Autocomplete. 2019 IEEE Security and Privacy Workshops (SPW). :212–217.
Many websites induce the browser to send network traffic in response to user input events. This includes websites with autocomplete, a popular feature on search engines that anticipates the user's query while they are typing. Websites with this functionality require HTTP requests to be made as the query input field changes, such as when the user presses a key. The browser responds to input events by generating network traffic to retrieve the search predictions. The traffic emitted by the client can expose the timings of keyboard input events which may lead to a keylogging side channel attack whereby the query is revealed through packet inter-arrival times. We investigate the feasibility of such an attack on several popular search engines by characterizing the behavior of each website and measuring information leakage at the network level. Three out of the five search engines we measure preserve the mutual information between keystrokes and timings to within 1% of what it is on the host. We describe the ways in which two search engines mitigate this vulnerability with minimal effects on usability.
2020-01-20
Elisa, Noe, Yang, Longzhi, Fu, Xin, Naik, Nitin.  2019.  Dendritic Cell Algorithm Enhancement Using Fuzzy Inference System for Network Intrusion Detection. 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1–6.

The dendritic cell algorithm (DCA) is an immune-inspired classification algorithm developed for anomaly detection in computer networks. The DCA uses a weighted function in its context detection phase to process three categories of input signals (safe, danger and pathogenic associated molecular pattern) into three output context values termed co-stimulatory, mature and semi-mature, which are then used to perform classification. The weighted function used by the DCA requires either manually pre-defined weights, usually provided by immunologists, or weights derived empirically from the training dataset. Neither of these is sufficiently flexible to work with different datasets and produce optimum classification results. To address this limitation, this work proposes an approach for computing the three output context values of the DCA by employing the recently proposed TSK+ fuzzy inference system, such that the weights are always optimal for the provided data set regarding a specific application. The proposed approach was validated and evaluated by applying it to the two popular datasets KDD99 and UNSW NB15. The results from the experiments demonstrate that the proposed approach outperforms the conventional DCA in terms of classification accuracy.

2019-03-04
Husari, G., Niu, X., Chu, B., Al-Shaer, E..  2018.  Using Entropy and Mutual Information to Extract Threat Actions from Cyber Threat Intelligence. 2018 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–6.
With the rapid growth of cyber attacks, cyber threat intelligence (CTI) sharing becomes essential for providing advance threat notice and enabling timely response to cyber attacks. Our goal in this paper is to develop an approach to extract low-level cyber threat actions from publicly available CTI sources in an automated manner to enable timely defense decision making. Specifically, we innovatively and successfully used the metrics of entropy and mutual information from Information Theory to analyze the text in the cybersecurity domain. Combined with some basic NLP techniques, our framework, called ActionMiner, has achieved higher precision and recall than the state-of-the-art Stanford typed dependency parser, which usually works well in general English but not cybersecurity texts.
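As a rough illustration of the two metrics named in this abstract, the following sketch estimates entropy and mutual information from co-occurrence counts. It is not ActionMiner's code; the function names and the toy (verb, object) pairs are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def mutual_information(pairs):
    """Empirical I(X;Y) in bits from a list of (x, y) observations."""
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    marginal_y = Counter(y for _, y in pairs)
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(marginal_x) + entropy(marginal_y) - entropy(joint)


# Hypothetical toy data: (verb, object) pairs taken from threat reports.
# Here each verb always appears with the same object, so the verb fully
# determines the object.
dependent_pairs = [("delete", "file"), ("send", "packet")] * 2
# Here every verb appears equally often with every object.
independent_pairs = [("delete", "file"), ("delete", "packet"),
                     ("send", "file"), ("send", "packet")]

mi_dependent = mutual_information(dependent_pairs)
mi_independent = mutual_information(independent_pairs)
```

A word pair with high mutual information co-occurs far more often than chance would predict, which is the kind of signal a text-mining pipeline can use to rank candidate action/object pairs.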
2018-11-19
Sun, K., Esnaola, I., Perlaza, S. M., Poor, H. V..  2017.  Information-Theoretic Attacks in the Smart Grid. 2017 IEEE International Conference on Smart Grid Communications (SmartGridComm). :455–460.

Gaussian random attacks that jointly minimize the amount of information obtained by the operator from the grid and the probability of attack detection are presented. The construction of the attack is posed as an optimization problem with a utility function that captures two effects: firstly, minimizing the mutual information between the measurements and the state variables; secondly, minimizing the probability of attack detection via the Kullback-Leibler (KL) divergence between the distribution of the measurements with an attack and the distribution of the measurements without an attack. Additionally, a lower bound on the utility function achieved by the attacks constructed with imperfect knowledge of the second order statistics of the state variables is obtained. The performance of the attack construction using the sample covariance matrix of the state variables is numerically evaluated. The above results are tested in the IEEE 30-Bus test system.
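The trade-off described in this abstract can be illustrated with a scalar toy model. The sketch below is not the authors' construction, which works with vector measurements and covariance matrices. It assumes a hypothetical one-dimensional measurement Y = X + Z + A with independent zero-mean Gaussian state X, noise Z, and attack A, and uses the standard closed forms for Gaussian mutual information and KL divergence.

```python
import math


def kl_gaussian(mean_p, var_p, mean_q, var_q):
    """KL(P || Q) in nats for scalar Gaussians P = N(mean_p, var_p), Q = N(mean_q, var_q)."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q
                  - 1.0)


def gaussian_mutual_information(var_signal, var_noise):
    """I(X;Y) in nats for Y = X + N with independent zero-mean Gaussian X and N."""
    return 0.5 * math.log(1.0 + var_signal / var_noise)


def attack_tradeoff(var_state, var_noise, var_attack):
    """Return (information left to the operator, detectability) for the toy model.

    The attack acts as extra noise, so the mutual information between state and
    measurement drops, while the attacked measurement distribution drifts away
    from the attack-free one, as measured by KL(attacked || attack-free).
    """
    mi = gaussian_mutual_information(var_state, var_noise + var_attack)
    kl = kl_gaussian(0.0, var_state + var_noise + var_attack,
                     0.0, var_state + var_noise)
    return mi, kl


# Hypothetical variances, chosen only for illustration.
weak_mi, weak_kl = attack_tradeoff(var_state=1.0, var_noise=1.0, var_attack=0.5)
strong_mi, strong_kl = attack_tradeoff(var_state=1.0, var_noise=1.0, var_attack=2.0)
```

In this toy setting a stronger attack hides more information from the operator but is easier to detect, which is why the two terms are combined in a single utility function.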

2018-03-05
Cohen, A., Cohen, A., Médard, M., Gurewitz, O..  2017.  Individually-Secure Multi-Source Multicast. 2017 IEEE International Symposium on Information Theory (ISIT). :3105–3109.

The principal mission of Multi-Source Multicast (MSM) is to disseminate all messages from all sources in a network to all destinations. MSM is utilized in numerous applications. In many of them, securing the messages disseminated is critical. A common secure model is to consider a network where there is an eavesdropper which is able to observe a subset of the network links, and seek a code which keeps the eavesdropper ignorant regarding all the messages. While this is solved when all messages are located at a single source, Secure MSM (SMSM) is an open problem, and the rates required are hard to characterize in general. In this paper, we consider Individual Security, which promises that the eavesdropper has zero mutual information with each message individually. We completely characterize the rate region for SMSM under individual security, and show that such a security level is achievable at the full capacity of the network, that is, the cut-set bound is the matching converse, similar to non-secure MSM. Moreover, we show that the field size is similar to non-secure MSM and does not have to be larger due to the security constraint.
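The notion of individual security can be illustrated with a two-bit toy example. This is a sketch, not the coding scheme of the paper: an eavesdropper who observes only the XOR of two independent uniform message bits learns nothing about either bit on its own, yet does learn one bit about the pair. All names below are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def entropy(values):
    """Shannon entropy in bits of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) in bits, computed as H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Enumerate all equally likely message pairs (M1, M2).
messages = list(product([0, 1], repeat=2))
m1 = [a for a, _ in messages]
m2 = [b for _, b in messages]
# The eavesdropper observes a single link carrying M1 XOR M2.
observed = [a ^ b for a, b in messages]

leak_m1 = mutual_information(observed, m1)           # leakage about M1 alone
leak_m2 = mutual_information(observed, m2)           # leakage about M2 alone
leak_joint = mutual_information(observed, messages)  # leakage about (M1, M2)
```

The zero individual leakage together with nonzero joint leakage is what distinguishes individual security from the stronger requirement that the eavesdropper be ignorant of all messages jointly.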

2018-02-15
Chanyaswad, T., Al, M., Chang, J. M., Kung, S. Y..  2017.  Differential mutual information forward search for multi-kernel discriminant-component selection with an application to privacy-preserving classification. 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP). :1–6.

In machine learning, feature engineering has been a pivotal stage in building a high-quality predictor. Particularly, this work explores the multiple Kernel Discriminant Component Analysis (mKDCA) feature-map and its variants. However, seeking the right subset of kernels for the mKDCA feature-map can be challenging. Therefore, we consider the problem of kernel selection, and propose an algorithm based on Differential Mutual Information (DMI) and incremental forward search. DMI serves as an effective metric for selecting kernels, as is theoretically supported by mutual information and Fisher's discriminant analysis. On the other hand, incremental forward search plays a role in removing redundancy among kernels. Finally, we illustrate the potential of the method via an application in privacy-aware classification, and show on three mobile-sensing datasets that selecting an effective set of kernels for mKDCA feature-maps can enhance the utility classification performance while successfully preserving data privacy. Specifically, the results show that the proposed DMI forward search method can perform better than the state-of-the-art, and, with much smaller computational cost, can perform as well as the optimal, yet computationally expensive, exhaustive search.

2018-01-10
Bhattacharjee, S. Das, Talukder, A., Al-Shaer, E., Doshi, P..  2017.  Prioritized active learning for malicious URL detection using weighted text-based features. 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :107–112.

Data analytics is being increasingly used in cyber-security problems, and has been found useful in cases where data volumes and heterogeneity make manual assessment by security experts cumbersome. In practical cyber-security scenarios involving data-driven analytics, obtaining data with annotations (i.e. ground-truth labels) is a challenging and known limiting factor for many supervised security analytics tasks. Significant portions of the large datasets typically remain unlabelled, as the task of annotation is extensively manual and requires a huge amount of expert intervention. In this paper, we propose an effective active learning approach that can efficiently address this limitation in a practical cyber-security problem of Phishing categorization, whereby we use a human-machine collaborative approach to design a semi-supervised solution. An initial classifier is learnt on a small amount of the annotated data and is then gradually updated, in an iterative manner, by shortlisting from the large pool of unlabelled data only those relevant samples that are most likely to influence the classifier performance quickly. Prioritized Active Learning shows significant promise of faster convergence in classification performance in a batch learning framework, thus requiring even less effort for human annotation. A useful feature weight update technique combined with active learning shows promising classification performance for categorizing Phishing/malicious URLs without requiring a large amount of annotated training samples to be available during training. In experiments with several collections of PhishMonger's Targeted Brand dataset, the proposed method shows significant improvement over the baseline by as much as 12%.

2017-12-28
Liu, H., Ditzler, G..  2017.  A fast information-theoretic approximation of joint mutual information feature selection. 2017 International Joint Conference on Neural Networks (IJCNN). :4610–4617.

Feature selection is an important step in data analysis to address the curse of dimensionality. Such dimensionality reduction techniques are particularly important when a classification is required and the model scales in polynomial time with the number of features (e.g., applications include genomics, life sciences, cyber-security, etc.). Feature selection is the process of finding the minimum subset of features that allows for the maximum predictive power. Many of the state-of-the-art information-theoretic feature selection approaches use a greedy forward search; however, there are concerns about the efficiency and optimality of the search. A unified framework was recently presented for information-theoretic feature selection that tied together many of the works of the past twenty years. The work showed that joint mutual information maximization (JMI) is generally the best option; however, the complexity of greedy search for JMI scales quadratically, and it is infeasible on high dimensional datasets. In this contribution, we propose a fast approximation of JMI based on information theory. Our approach takes advantage of decomposing the calculations within JMI to speed up a typical greedy search. We benchmarked the proposed approach against JMI on several UCI datasets, and we demonstrate that the proposed approach returns feature sets that are highly consistent with JMI, while decreasing the run time required to perform feature selection.
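For orientation, the baseline that this abstract sets out to accelerate can be sketched as follows. This is plain greedy forward search under the JMI criterion for discrete features, not the authors' fast approximation. The function names and the toy data are hypothetical, and the sketch assumes k is at least 1 and that at least one feature is supplied.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) in bits, computed as H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def jmi_forward_select(features, labels, k):
    """Greedy forward selection of up to k feature names using the JMI score.

    features maps a feature name to a list of discrete values, one per sample.
    """
    remaining = list(features)
    # Start with the single most informative feature.
    best = max(remaining, key=lambda f: mutual_information(features[f], labels))
    selected = [best]
    remaining.remove(best)
    while remaining and len(selected) < k:
        def score(f):
            # JMI: sum over selected s of I((X_f, X_s); Y).
            return sum(
                mutual_information(list(zip(features[f], features[s])), labels)
                for s in selected
            )
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: the label is determined by features "a" and "c";
# "b" is an exact copy of "a" and therefore redundant.
toy_features = {"a": [0, 0, 1, 1], "b": [0, 0, 1, 1], "c": [0, 1, 0, 1]}
toy_labels = [0, 1, 2, 3]
chosen = jmi_forward_select(toy_features, toy_labels, k=2)
```

Each step rescores every remaining feature against every selected one, so the number of pairwise mutual information evaluations grows quadratically with the number of selected features, which is the cost the abstract refers to.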

2017-12-27
Kar, N., Aman, M. A. A. A., Mandal, K., Bhattacharya, B..  2017.  Chaos-based video steganography. 2017 8th International Conference on Information Technology (ICIT). :482–487.

In this paper, a novel data hiding method is proposed, based on a Non-Linear Feedback Shift Register (NLFSR) and the Tinkerbell 2D chaotic map. So far, the major work in steganography using chaotic maps has been confined to image steganography, where significant restrictions limit any increase in payload. In our work, a 2D chaotic map and an NLFSR are used to develop a video steganography mechanism in which data is embedded in the segregated frames. This will increase the data hiding limit exponentially. Also, the embedding position in each frame will differ from that in other frames, which will increase the overall security of the proposed mechanism. We achieve these randomized data hiding points by using a chaotic map. Chaos theory, a branch of non-linear dynamics, is now used in cryptography and steganography because small changes in the initial condition make the output totally different. So, it is very hard to obtain the embedding positions of the data without knowing the initial value of the chaotic map.

Ye, Z., Yin, H., Ye, Y..  2017.  Information security analysis of deterministic encryption and chaotic encryption in spatial domain and frequency domain. 2017 14th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE). :1–6.

Information security is crucial to data storage and transmission, which is necessary to protect information under various hostile environments. Cryptography serves as a major element to ensure confidentiality in both communication and information technology, where the encryption and decryption schemes are implemented to scramble the pure plaintext and descramble the secret ciphertext using security keys. There are two dominating types of encryption schemes: deterministic encryption and chaotic encryption. Encryption and decryption can be conducted in either spatial domain or frequency domain. To ensure secure transmission of digital information, comparisons on merits and drawbacks of two practical encryption schemes are conducted, where case studies on the true color digital image encryption are presented. Both deterministic encryption in spatial domain and chaotic encryption in frequency domain are analyzed in context, as well as the information integrity after decryption.

2017-12-20
Li, S., Wang, B..  2017.  A Method for Hybrid Bayesian Network Structure Learning from Massive Data Using MapReduce. 2017 IEEE 3rd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS). :272–276.
The Bayesian network is a popular and important data mining model for representing uncertain knowledge. For large-scale data, it is often too costly to learn the accurate structure. To resolve this problem, much work has been done on migrating the structure learning algorithms to the MapReduce framework. In this paper, we introduce a distributed hybrid structure learning algorithm that combines the advantages of constraint-based and score-and-search-based algorithms. By reusing the intermediate results of MapReduce, the algorithm greatly simplifies the computing work and achieves good results in both efficiency and accuracy.
2015-05-05
Sarikaya, Y., Ercetin, O., Koksal, C.E..  2014.  Confidentiality-Preserving Control of Uplink Cellular Wireless Networks Using Hybrid ARQ. Networking, IEEE/ACM Transactions on. PP:1-1.

We consider the problem of cross-layer resource allocation with information-theoretic secrecy for uplink transmissions in time-varying cellular wireless networks. Particularly, each node in an uplink cellular network injects two types of traffic, confidential and open, at rates chosen in order to maximize a global utility function while keeping the data queues stable and meeting a constraint on the secrecy outage probability. The transmitting node only knows the distribution of channel gains. Our scheme is based on Hybrid Automatic Repeat Request (HARQ) transmission with incremental redundancy. We prove that our scheme achieves a utility arbitrarily close to the maximum achievable. Numerical experiments are performed to verify the analytical results and to show the efficacy of the dynamic control algorithm.

2015-05-04
Naini, R., Moulin, P..  2014.  Fingerprint information maximization for content identification. Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. :3809-3813.

This paper presents a novel design of content fingerprints based on maximization of the mutual information across the distortion channel. We use the information bottleneck method to optimize the filters and quantizers that generate these fingerprints. A greedy optimization scheme is used to select filters from a dictionary and allocate fingerprint bits. We test the performance of this method for audio fingerprinting and show substantial improvements over existing learning based fingerprints.