Biblio
The roll-out of smart meters (SMs) in the electric grid has enabled data-driven grid management and planning techniques. SM data can be used together with short-term load forecasts (STLFs) to overcome polling-frequency constraints for better grid management. However, the use of SMs that report consumption data at high spatial and temporal resolution entails consumer privacy risks, motivating work on protecting consumer privacy. The impact of privacy-protection schemes on STLF accuracy is not well studied, especially for smaller aggregations of consumers, whose load profiles are subject to more volatility and are thus harder to predict. In this paper, we analyse the impact of two user-demand-shaping privacy-protection schemes, model-distribution predictive control (MDPC) and load-levelling, on STLF accuracy. Support vector regression is used to predict the load profiles at different consumer-aggregation levels. Results indicate that, while the MDPC algorithm marginally affects forecast accuracy for smaller consumer aggregations, this effect diminishes at higher aggregation levels. More importantly, the load-levelling scheme significantly improves STLF accuracy, as it smooths out the grid-visible consumer load profile.
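As a rough intuition for why levelling helps forecasting, the following sketch (a toy illustration, not the paper's MDPC or load-levelling algorithm; the profile values and shifting budget are invented, and the toy ignores the energy balance a real storage scheme must respect) flattens a load profile toward its mean and checks that its variance, a proxy for volatility, drops:

```python
# Toy load-levelling: pull each reading toward the profile mean, bounded
# by a shifting budget (kW).  Not the paper's algorithm; values invented.
def level_load(profile, budget):
    mean = sum(profile) / len(profile)
    # Move each reading toward the mean by at most `budget`; a reading
    # closer to the mean than the budget lands exactly on the mean.
    return [x - max(-budget, min(budget, x - mean)) for x in profile]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

raw = [1.2, 0.8, 0.5, 2.9, 3.4, 1.1, 0.6, 2.5]   # toy aggregate load (kW)
flat = level_load(raw, budget=0.7)
# The levelled profile is smoother, hence easier for an STLF model to fit.
assert variance(flat) < variance(raw)
```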
Polar codes polarize the quality of channels into either completely noisy or completely noiseless channels. This paper presents extrinsic information transfer (EXIT) analysis for iterative decoding of Polar codes to reveal the mechanism of this channel transformation. The purpose of understanding the transformation process is to comprehend the placement of information bits and frozen bits, and to comprehend the security standard of Polar codes. Mutual information is derived based on the EXIT-chart concept for the check nodes and variable nodes of low-density parity-check (LDPC) codes and applied to Polar codes. This paper explores the quality of the polarized channels at finite block lengths, which are of particular interest since block lengths in the fifth generation of telecommunications (5G) are limited. This paper reveals how the EXIT curves of Polar codes change and explores their polarization characteristics, showing that high mutual-information values are needed for the frozen bits to be detectable; otherwise, the error-correction capability of Polar codes decreases drastically. These results are expected to serve as a reference for the development of Polar codes for 5G technologies and beyond.
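The channel transformation can be illustrated with the standard binary-erasure-channel (BEC) polarization recursion (a textbook sketch, not the EXIT-chart derivation used in the paper): one step turns a BEC with erasure probability e into a worse channel (2e - e^2) and a better one (e^2), and recursing drives the bit-channels toward mutual information 0 (frozen-bit candidates) or 1 (information-bit candidates).

```python
# Sketch of BEC channel polarization (standard recursion, not the paper's
# EXIT-chart analysis).  Each level doubles the number of bit-channels.
def polarize(eps, levels):
    chans = [eps]
    for _ in range(levels):
        chans = [e2 for e in chans for e2 in (2*e - e*e, e*e)]
    return chans

eps = 0.5
mi = [1 - e for e in polarize(eps, 6)]   # mutual information of 64 bit-channels
good = sum(m > 0.99 for m in mi)         # nearly noiseless: information bits
bad = sum(m < 0.01 for m in mi)          # nearly useless: frozen bits
```

Note that each step conserves total mutual information, (1 - (2e - e^2)) + (1 - e^2) = 2(1 - e), so polarization only redistributes capacity between the split channels.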
The dendritic cell algorithm (DCA) is an immune-inspired classification algorithm developed for the purpose of anomaly detection in computer networks. The DCA uses a weighted function in its context-detection phase to process three categories of input signal (safe, danger, and pathogen-associated molecular pattern) into three output context values, termed co-stimulatory, mature, and semi-mature, which are then used to perform classification. The weighted function used by the DCA requires either manually pre-defined weights, usually provided by immunologists, or weights derived empirically from the training dataset. Neither of these is sufficiently flexible to work with different datasets and produce optimal classification results. To address this limitation, this work proposes an approach for computing the three output context values of the DCA by employing the recently proposed TSK+ fuzzy inference system, such that the weights are always optimal for the provided dataset in a specific application. The proposed approach was validated and evaluated by applying it to the two popular datasets KDD99 and UNSW-NB15. The experimental results demonstrate that the proposed approach outperforms the conventional DCA in terms of classification accuracy.
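The fixed-weight context-detection step that the paper replaces with TSK+ inference can be sketched as follows; the weight values are illustrative assumptions (they vary across the DCA literature), not the paper's:

```python
# Sketch of the DCA signal-processing step.  The weight matrix is
# illustrative only; the paper's contribution is to replace this
# fixed-weight computation with a TSK+ fuzzy inference system.
WEIGHTS = {                     # (w_pamp, w_danger, w_safe) -- assumed values
    "csm":         (2.0, 1.0,  2.0),
    "semi_mature": (0.0, 0.0,  3.0),
    "mature":      (2.0, 1.0, -3.0),
}

def output_contexts(pamp, danger, safe):
    signals = (pamp, danger, safe)
    return {ctx: sum(w * s for w, s in zip(ws, signals))
            for ctx, ws in WEIGHTS.items()}

ctx = output_contexts(pamp=0.8, danger=0.6, safe=0.1)
# A cell whose accumulated mature value exceeds its semi-mature value
# presents its antigens in an anomalous ("mature") context.
anomalous = ctx["mature"] > ctx["semi_mature"]
```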
Gaussian random attacks that jointly minimize the amount of information obtained by the operator from the grid and the probability of attack detection are presented. The construction of the attack is posed as an optimization problem with a utility function that captures two effects: first, minimizing the mutual information between the measurements and the state variables; second, minimizing the probability of attack detection via the Kullback-Leibler (KL) divergence between the distribution of the measurements with an attack and the distribution of the measurements without an attack. Additionally, a lower bound is obtained on the utility function achieved by attacks constructed with imperfect knowledge of the second-order statistics of the state variables. The performance of the attack construction using the sample covariance matrix of the state variables is numerically evaluated. The above results are tested on the IEEE 30-bus test system.
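The detection term of the utility can be illustrated in the scalar case with the closed-form KL divergence between two zero-mean Gaussians (a sketch under the assumption of scalar measurements; the paper works with the full multivariate measurement model):

```python
# KL divergence between zero-mean Gaussian measurements under attack
# (variance sigma2_a) and without attack (variance sigma2), scalar case.
import math

def kl_gaussian(sigma2_a, sigma2):
    r = sigma2_a / sigma2
    return 0.5 * (r - 1.0 - math.log(r))

# A small additive Gaussian attack perturbs the measurement variance only
# slightly, so the KL divergence -- and hence detectability -- stays small.
assert kl_gaussian(1.0, 1.0) == 0.0
assert kl_gaussian(1.05, 1.0) < kl_gaussian(2.0, 1.0)
```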
The principal mission of Multi-Source Multicast (MSM) is to disseminate all messages from all sources in a network to all destinations. MSM is utilized in numerous applications, and in many of them securing the disseminated messages is critical. A common security model considers a network in which an eavesdropper is able to observe a subset of the network links, and seeks a code that keeps the eavesdropper ignorant of all the messages. While this problem is solved when all messages are located at a single source, Secure MSM (SMSM) is an open problem, and the required rates are hard to characterize in general. In this paper, we consider Individual Security, which promises that the eavesdropper has zero mutual information with each message individually. We completely characterize the rate region for SMSM under individual security, and show that this security level is achievable at the full capacity of the network, that is, the cut-set bound is the matching converse, as in non-secure MSM. Moreover, we show that the required field size is similar to non-secure MSM and does not have to be larger due to the security constraint.
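A toy example of individual security: a coded packet that XORs two uniform messages reveals nothing about either message on its own (an intuition-builder only, not the paper's network code), because conditioned on any observed coded value, every value of each individual message remains equally likely.

```python
# Toy illustration of individual security: the coded symbol m1 XOR m2
# gives an eavesdropper zero information about m1 alone, since the
# posterior over m1 given the observation stays uniform.
from itertools import product
from collections import Counter

def conditional_counts(bits=2):
    # For each observable coded value, count how often each m1 occurs.
    table = {}
    for m1, m2 in product(range(2**bits), repeat=2):
        table.setdefault(m1 ^ m2, Counter())[m1] += 1
    return table

for obs, counts in conditional_counts().items():
    # Uniform posterior over m1  =>  I(observation; m1) = 0.
    assert len(set(counts.values())) == 1
```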
In machine learning, feature engineering has been a pivotal stage in building a high-quality predictor. In particular, this work explores the multiple Kernel Discriminant Component Analysis (mKDCA) feature-map and its variants. However, seeking the right subset of kernels for the mKDCA feature-map can be challenging. We therefore consider the problem of kernel selection and propose an algorithm based on Differential Mutual Information (DMI) and incremental forward search. DMI serves as an effective metric for selecting kernels, as it is theoretically supported by mutual information and Fisher's discriminant analysis, while incremental forward search plays a role in removing redundancy among kernels. Finally, we illustrate the potential of the method via an application in privacy-aware classification, and show on three mobile-sensing datasets that selecting an effective set of kernels for mKDCA feature-maps can enhance utility-classification performance while successfully preserving data privacy. Specifically, the results show that the proposed DMI forward-search method can outperform the state-of-the-art and, at much smaller computational cost, can perform as well as the optimal, yet computationally expensive, exhaustive search.
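The incremental forward search can be sketched generically; here `score` is a stand-in for the paper's DMI criterion (whose exact form is not reproduced here), and the kernel dictionary with its "informative components" is invented for illustration:

```python
# Generic incremental forward search over a kernel dictionary.  `score`
# is a placeholder for the DMI criterion; any set-function can be used.
def forward_search(candidates, score, k):
    selected = []
    while len(selected) < k:
        remaining = [c for c in candidates if c not in selected]
        best = max(remaining, key=lambda c: score(selected + [c]))
        selected.append(best)
    return selected

# Toy criterion: reward coverage of distinct "informative components",
# which penalises redundant kernels -- mirroring how DMI plus forward
# search removes redundancy among kernels.
kernels = {"rbf": {1, 2}, "poly": {2, 3}, "lin": {1}, "cos": {4}}
score = lambda sel: len(set().union(*(kernels[k] for k in sel)))
picked = forward_search(list(kernels), score, k=2)
```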
Data analytics is being increasingly used in cyber-security problems, and has been found useful in cases where data volumes and heterogeneity make manual assessment by security experts cumbersome. In practical cyber-security scenarios involving data-driven analytics, obtaining data with annotations (i.e. ground-truth labels) is a challenging and well-known limiting factor for many supervised security-analytics tasks. Significant portions of the large datasets typically remain unlabelled, as annotation is an extensively manual task requiring a large amount of expert intervention. In this paper, we propose an effective active learning approach that can efficiently address this limitation in the practical cyber-security problem of phishing categorization, whereby we use a human-machine collaborative approach to design a semi-supervised solution. An initial classifier is learnt on a small amount of the annotated data and is then gradually updated, in an iterative manner, by shortlisting from the large pool of unlabelled data only the relevant samples most likely to improve classifier performance quickly. This prioritized active learning shows significant promise for achieving faster convergence of classification performance in a batch learning framework, and thus requires even less human annotation effort. A useful feature-weight update technique, combined with active learning, shows promising classification performance for categorizing phishing/malicious URLs without requiring a large amount of annotated training samples during training. In experiments with several collections of PhishMonger's Targeted Brand dataset, the proposed method improves over the baseline by as much as 12%.
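One round of the shortlisting step can be sketched as uncertainty sampling (an assumed prioritization criterion for illustration; the pool, stub model, and scores below are invented):

```python
# Sketch of one prioritised-active-learning round: send to the human
# annotator only the unlabelled samples the current classifier is least
# certain about.  `predict_proba` stands in for any probabilistic model.
def shortlist(unlabelled, predict_proba, batch):
    # Uncertainty = closeness of P(phishing) to 0.5.
    by_uncertainty = sorted(unlabelled,
                            key=lambda x: abs(predict_proba(x) - 0.5))
    return by_uncertainty[:batch]

# Toy pool with stub model scores; URLs near P=0.5 get annotated first.
pool = ["u1", "u2", "u3", "u4"]
proba = {"u1": 0.95, "u2": 0.52, "u3": 0.10, "u4": 0.47}.get
batch = shortlist(pool, proba, batch=2)
```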
Feature selection is an important step in data analysis that addresses the curse of dimensionality. Such dimensionality-reduction techniques are particularly important when classification is required and the model scales polynomially with the number of features (applications include genomics, the life sciences, and cyber-security, among others). Feature selection is the process of finding the minimum subset of features that provides the maximum predictive power. Many state-of-the-art information-theoretic feature selection approaches use a greedy forward search; however, there are concerns about the efficiency and optimality of this search. A unified framework was recently presented for information-theoretic feature selection that tied together much of the work of the past twenty years. That work showed that joint mutual information maximization (JMI) is generally the best option; however, the complexity of the greedy search for JMI scales quadratically, making it infeasible on high-dimensional datasets. In this contribution, we propose a fast, information-theoretic approximation of JMI. Our approach takes advantage of decomposing the calculations within JMI to speed up a typical greedy search. We benchmarked the proposed approach against JMI on several UCI datasets and demonstrate that it returns feature sets highly consistent with JMI while decreasing the run time required to perform feature selection.
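Greedy JMI selection can be sketched on discrete toy data with a plain plug-in MI estimator (this shows the criterion being approximated, not the paper's fast approximation): a candidate X_k is scored by the sum over already-selected X_j of I(X_k, X_j; Y).

```python
# Plug-in mutual information and one greedy JMI step on discrete data.
from collections import Counter
from math import log2

def mi(pairs):
    """I(A; B) estimated from a list of (a, b) samples."""
    n = len(pairs)
    pab = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum(c / n * log2(c * n / (pa[a] * pb[b]))
               for (a, b), c in pab.items())

def jmi_step(X, y, selected, candidates):
    def score(k):
        if not selected:                      # first pick: plain I(Xk; Y)
            return mi(list(zip(X[k], y)))
        return sum(mi(list(zip(zip(X[k], X[j]), y))) for j in selected)
    return max(candidates, key=score)

# Toy data: y equals feature 0; features 1 and 2 carry no information.
X = {0: [0, 0, 1, 1], 1: [0, 1, 0, 1], 2: [0, 1, 1, 0]}
y = [0, 0, 1, 1]
first = jmi_step(X, y, [], [0, 1, 2])
```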
In this paper, a novel data-hiding method is proposed based on a non-linear feedback shift register (NLFSR) and the Tinkerbell 2D chaotic map. So far, most steganographic work using chaotic maps has been confined to image steganography, where significant restrictions limit the payload. In our work, a 2D chaotic map and an NLFSR are used to develop a video steganography mechanism in which data are embedded in the segregated frames, substantially increasing the data-hiding capacity. Moreover, the embedding positions differ from frame to frame, which increases the overall security of the proposed mechanism. We achieve these randomized data-hiding points by using a chaotic map. Chaos theory, a branch of non-linear dynamics, is now widely applied in cryptography and steganography because a small change in the initial conditions makes the output completely different. Consequently, it is very hard to recover the embedding positions of the data without knowing the initial value of the chaotic map.
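The idea of key-dependent embedding positions can be sketched with the Tinkerbell map; the parameters below are the commonly used chaotic regime, and the mapping from orbit points to pixel positions is an illustrative assumption, not the paper's exact construction:

```python
# Tinkerbell 2D map with commonly used chaotic parameters.
def tinkerbell(x, y, n, a=0.9, b=-0.6013, c=2.0, d=0.5):
    orbit = []
    for _ in range(n):
        x, y = x*x - y*y + a*x + b*y, 2*x*y + c*x + d*y
        orbit.append((x, y))
    return orbit

def embed_positions(key, n, width=640, height=480):
    # Illustrative mapping from orbit points to frame pixel coordinates.
    x0, y0 = key
    return [(int(abs(x) * 1e6) % width, int(abs(y) * 1e6) % height)
            for x, y in tinkerbell(x0, y0, n)]

# Key sensitivity: a tiny change in the initial condition yields a
# completely different sequence of embedding positions.
p1 = embed_positions((-0.72, -0.64), 50)
p2 = embed_positions((-0.7200001, -0.64), 50)
assert p1 != p2
```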
Information security is crucial to data storage and transmission, which must be protected in various hostile environments. Cryptography serves as a major element in ensuring confidentiality in both communication and information technology, where encryption and decryption schemes are implemented to scramble the plaintext and descramble the ciphertext using security keys. There are two dominant types of encryption scheme: deterministic encryption and chaotic encryption. Encryption and decryption can be conducted in either the spatial domain or the frequency domain. To ensure secure transmission of digital information, the merits and drawbacks of two practical encryption schemes are compared, with case studies on true-color digital image encryption. Both deterministic encryption in the spatial domain and chaotic encryption in the frequency domain are analyzed in this context, as is the information integrity after decryption.
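A minimal member of the chaotic-encryption family is a logistic-map stream cipher in the spatial domain (a sketch only; practical chaotic image ciphers add permutation and diffusion stages, and the key values here are invented):

```python
# Chaotic stream cipher sketch: quantise a logistic-map orbit into key
# bytes and XOR them with the data.  Illustrative only.
def keystream(x0, r, n):
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)           # logistic map, chaotic for r ~ 3.99
        out.append(int(x * 256) % 256)  # quantise orbit point to a key byte
    return out

def crypt(data, key=(0.3456, 3.99)):
    ks = keystream(*key, n=len(data))   # XOR is its own inverse,
    return bytes(b ^ k for b, k in zip(data, ks))  # so crypt == decrypt

msg = b"true color image bytes"
ct = crypt(msg)
pt = crypt(ct)
assert pt == msg and ct != msg
```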
We consider the problem of cross-layer resource allocation with information-theoretic secrecy for uplink transmissions in time-varying cellular wireless networks. In particular, each node in an uplink cellular network injects two types of traffic, confidential and open, at rates chosen to maximize a global utility function while keeping the data queues stable and meeting a constraint on the secrecy outage probability. The transmitting node knows only the distribution of the channel gains. Our scheme is based on Hybrid Automatic Repeat Request (HARQ) transmission with incremental redundancy. We prove that our scheme achieves a utility arbitrarily close to the maximum achievable. Numerical experiments are performed to verify the analytical results and to show the efficacy of the dynamic control algorithm.
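The incremental-redundancy HARQ abstraction underlying such schemes can be sketched as accumulating mutual information across rounds until the target rate is met (the rate and SNR values below are invented for illustration, not the paper's model):

```python
# IR-HARQ abstraction: a transmission at rate R (bits/symbol) succeeds
# once the mutual information accumulated over HARQ rounds reaches R;
# if the round limit is hit first, an outage is declared.
from math import log2

def harq_rounds(rate, snrs, max_rounds):
    acc = 0.0
    for k, snr in enumerate(snrs[:max_rounds], start=1):
        acc += log2(1.0 + snr)        # information accumulated this round
        if acc >= rate:
            return k                  # decoded after k rounds
    return None                       # outage

# Rate 4 bits/symbol over fading rounds with linear SNRs 1, 3, 7:
rounds = harq_rounds(4.0, snrs=[1.0, 3.0, 7.0], max_rounds=3)
```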
This paper presents a novel design of content fingerprints based on maximization of the mutual information across the distortion channel. We use the information bottleneck method to optimize the filters and quantizers that generate these fingerprints. A greedy optimization scheme is used to select filters from a dictionary and allocate fingerprint bits. We test the performance of this method for audio fingerprinting and show substantial improvements over existing learning based fingerprints.
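Greedy fingerprint-bit allocation by mutual information can be sketched as follows; the "filters" and bit patterns are toy stand-ins for the paper's dictionary and distortion channel, and the plug-in estimator is not the information bottleneck machinery itself:

```python
# Sketch of greedy bit allocation: for each candidate filter, estimate
# the mutual information between its 1-bit output on clean content and
# on distorted content, then keep the most informative bits.
from math import log2
from collections import Counter

def binary_mi(pairs):
    n = len(pairs)
    pj = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum(c / n * log2(c * n / (pa[a] * pb[b]))
               for (a, b), c in pj.items())

def allocate_bits(clean_bits, noisy_bits, budget):
    scores = {f: binary_mi(list(zip(clean_bits[f], noisy_bits[f])))
              for f in clean_bits}
    return sorted(scores, key=scores.get, reverse=True)[:budget]

clean = {"f1": [0, 1, 0, 1], "f2": [0, 0, 1, 1]}
noisy = {"f1": [1, 0, 1, 0],   # deterministic flip: still 1 bit of MI
         "f2": [0, 1, 1, 0]}   # independent of its clean value: 0 bits
chosen = allocate_bits(clean, noisy, budget=1)
```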