Biblio

Filters: Keyword is GBM
2022-02-07
Han, Sung-Hwa.  2021.  Analysis of Data Transforming Technology for Malware Detection. 2021 21st ACIS International Winter Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD-Winter). :224–229.
As AI technology advances and its use grows, efforts to apply machine learning to malware detection are increasing. However, training a malware detector requires a standardized data set. Because malware is unstructured data, it cannot be learned from directly. To solve this problem, many studies have attempted to convert unstructured data into structured data. This study investigates the conversion methods proposed in those studies and analyzes the features and limitations of each. As a result, most of the data conversion techniques propose conversion mechanisms, but the scope of each technique has not been determined. The resulting data set is not suitable for use as training data because it has infinite properties.
2017-02-23
G. Kejela, C. Rong.  2015.  Cross-Device Consumer Identification. 2015 IEEE International Conference on Data Mining Workshop (ICDMW). :1687–1689.

Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies always want to seamlessly reach the consumers behind the devices rather than the devices themselves. However, the identity of consumers becomes fragmented as they switch from one device to another. A naive approach is to use deterministic features such as user name, telephone number and email address. However, consumers might refrain from giving away their personal information for privacy and security reasons. The challenge in the ICDM 2015 contest is to develop an accurate probabilistic model for predicting cross-device consumer identity without using the deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forest. Our final solution ranks 9th on both the public and private leaderboards (LB) with an F0.5 score of 0.855.
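For readers unfamiliar with the F0.5 metric quoted in the abstract above, the following is a minimal sketch of the general F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. With beta = 0.5, precision is weighted more heavily than recall. The function name `f_beta` and the example counts are hypothetical and chosen only for illustration; this is not the contest's scoring code, and the contest may have aggregated the score differently (for example, per device).

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Return the F-beta score from true-positive, false-positive,
    and false-negative counts (hypothetical helper for illustration)."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts (not from the paper): precision 0.8, recall 0.5.
score_f05 = f_beta(tp=8, fp=2, fn=8, beta=0.5)
score_f1 = f_beta(tp=8, fp=2, fn=8, beta=1.0)
```

Because precision exceeds recall in this made-up example, the F0.5 value comes out higher than the F1 value, which reflects the metric's preference for precise matches over exhaustive ones.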