A fast information-theoretic approximation of joint mutual information feature selection
Title | A fast information-theoretic approximation of joint mutual information feature selection |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Liu, H., Ditzler, G. |
Conference Name | 2017 International Joint Conference on Neural Networks (IJCNN) |
Date Published | May 2017
ISBN Number | 978-1-5090-6182-2 |
Keywords | approximation theory, classification, Collaboration, Complexity theory, composability, computational complexity, Data analysis, data reduction, dimensionality reduction, Entropy, fast information-theoretic approximation, feature selection, feature size, filtering theory, greedy algorithms, greedy forward search, greedy search complexity, high dimensional datasets, Human Behavior, human factors, Information theory, JMI maximization, JMI scales, joint mutual information feature selection, joint mutual information maximization, Metrics, minimum features subset, Mutual information, pattern classification, Policy-Governed Secure Collaboration, polynomial time, pubcrawl, Redundancy, Resiliency, Scalability, science of security, search problems, UCI datasets, unified framework, Upper bound |
Abstract | Feature selection is an important step in data analysis to address the curse of dimensionality. Such dimensionality reduction techniques are particularly important when a classification model is required and the model scales in polynomial time with the size of the feature set (e.g., in applications such as genomics, the life sciences, and cyber-security). Feature selection is the process of finding the minimum subset of features that allows for the maximum predictive power. Many of the state-of-the-art information-theoretic feature selection approaches use a greedy forward search; however, there are concerns with the search with regard to its efficiency and optimality. A unified framework was recently presented for information-theoretic feature selection that tied together many of the works of the past twenty years. That work showed that joint mutual information maximization (JMI) is generally the best option; however, the complexity of the greedy search for JMI scales quadratically, making it infeasible on high-dimensional datasets. In this contribution, we propose a fast approximation of JMI based on information theory. Our approach speeds up a typical greedy search by decomposing the calculations within JMI. We benchmarked the proposed approach against JMI on several UCI datasets and demonstrate that it returns feature sets that are highly consistent with JMI while decreasing the run time required to perform feature selection.
URL | https://ieeexplore.ieee.org/document/7966441/ |
DOI | 10.1109/IJCNN.2017.7966441 |
Citation Key | liu_fast_2017 |
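
For context on the baseline the paper accelerates, the sketch below is a minimal, hypothetical illustration of standard JMI greedy forward selection on discrete data, where each candidate feature Xk is scored by J(Xk) = sum over selected Xj of I(Xk, Xj; Y). This is not the authors' fast approximation; all function names and the plug-in entropy estimator are our own assumptions for illustration.

```python
import numpy as np

def entropy(*cols):
    """Empirical joint entropy (in nats) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mi(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def joint_mi(xk, xj, y):
    """I(Xk, Xj; Y) = H(Xk, Xj) + H(Y) - H(Xk, Xj, Y)."""
    return entropy(xk, xj) + entropy(y) - entropy(xk, xj, y)

def jmi_forward_select(X, y, k):
    """Greedy forward search under the JMI criterion.
    X: (n_samples, n_features) matrix of discrete features; y: discrete labels.
    Returns the indices of the k selected features."""
    n_features = X.shape[1]
    # Seed with the single most relevant feature, argmax of I(X; Y).
    selected = [max(range(n_features), key=lambda f: mi(X[:, f], y))]
    while len(selected) < k:
        remaining = [f for f in range(n_features) if f not in selected]
        # Each step evaluates |S| joint-MI terms per candidate, so the
        # search cost grows quadratically with the number of selected
        # features -- the scaling the paper's approximation targets.
        scores = {f: sum(joint_mi(X[:, f], X[:, j], y) for j in selected)
                  for f in remaining}
        selected.append(max(scores, key=scores.get))
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(200, 10))   # synthetic discrete features
    y = (X[:, 0] + X[:, 3]) % 2              # labels depend on features 0 and 3
    print(jmi_forward_select(X, y, 3))
```

In this naive form, the per-step recomputation of every I(Xk, Xj; Y) term is what dominates the run time on high-dimensional data; the paper's contribution is an information-theoretic decomposition of those calculations that avoids paying that full quadratic cost.
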