Title | A Novel Support Vector Machine Algorithm for Missing Data |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Zhu, Mengeheng, Shi, Hong |
Conference Name | Proceedings of the 2nd International Conference on Innovation in Artificial Intelligence |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-6345-7 |
Keywords | classification, composability, distance calculation, Metrics, missing data, pubcrawl, Resiliency, support vector machine, Support vector machines |
Abstract | The missing data problem often arises in data analysis. The most common remedy is imputation. However, imputation methods are suitable only for a low proportion of missing data, and only under the assumption that the missing data are MCAR (Missing Completely at Random) or MAR (Missing at Random). In this paper, considering the reasons for missing data, we propose a novel support vector machine method that uses a new kernel function to handle a relatively large proportion of missing data. This method makes full use of the observed data to reduce the error caused by filling in a large number of missing values. We validate our method on four data sets from the UCI Machine Learning Repository. Accuracy, F-score, the Kappa statistic, and recall are used to evaluate performance. Experimental results show that our method achieves a significant improvement in classification results compared with common imputation methods, even when the proportion of missing data is high. |
URL | http://doi.acm.org/10.1145/3194206.3194214 |
DOI | 10.1145/3194206.3194214 |
Citation Key | zhu_novel_2018 |
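
The abstract does not give the form of the new kernel function, so the following is only an illustrative sketch of the general idea of computing a kernel from observed values instead of imputing. It assumes a partial-distance RBF kernel: the squared distance is taken over the features observed in both samples and rescaled to the full dimension. The function names, the use of `None` for missing entries, the `gamma` parameter, and the fallback value of 0 for pairs with no shared features are all assumptions for illustration, not the authors' method. A kernel built this way is also not guaranteed to be positive semidefinite.

```python
import math


def partial_sq_distance(x, y):
    """Squared Euclidean distance over features observed in both x and y.

    Missing entries are represented by None (an assumption of this sketch).
    The sum is rescaled by len(x) / (number of shared features) so that
    pairs with few shared features remain comparable to complete pairs.
    Returns None when no feature is observed in both samples.
    """
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return None
    total = sum((a - b) ** 2 for a, b in shared)
    return total * len(x) / len(shared)


def partial_rbf_kernel(x, y, gamma=0.5):
    """RBF kernel evaluated on the partial distance (hypothetical helper)."""
    d2 = partial_sq_distance(x, y)
    if d2 is None:
        # Assumption: samples with no shared observed features are
        # treated as maximally dissimilar.
        return 0.0
    return math.exp(-gamma * d2)


def gram_matrix(rows, gamma=0.5):
    """Pairwise kernel values for a list of samples."""
    return [[partial_rbf_kernel(r, s, gamma) for s in rows] for r in rows]
```

A Gram matrix produced this way could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.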