Visible to the public Biblio

Filters: Keyword is Xgboost  [Clear All Filters]
2023-08-23
Liang, Chenjun, Deng, Li, Zhu, Jincan, Cao, Zhen, Li, Chao.  2022.  Cloud Storage I/O Load Prediction Based on XB-IOPS Feature Engineering. 2022 IEEE 8th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :54—60.
With the popularization of cloud computing and the deepening of its application, more and more cloud block storage systems have been put into use. The performance optimization of cloud block storage systems has become an important challenge facing today, which is manifested in the reduction of system performance caused by the unbalanced resource load of cloud block storage systems. Accurately predicting the I/O load status of the cloud block storage system can effectively avoid the load imbalance problem. However, the cloud block storage system has the characteristics of frequent random reads and writes, and a large amount of I/O requests, which makes prediction difficult. Therefore, we propose a novel I/O load prediction method for XB-IOPS feature engineering. The feature engineering is designed according to the I/O request pattern, I/O size and I/O interference, and realizes the prediction of the actual load value at a certain moment in the future and the average load value in the continuous time interval in the future. Validated on a real dataset of Alibaba Cloud block storage system, the results show that the XB-IOPS feature engineering prediction model in this paper has better performance in Alibaba Cloud block storage devices where random I/O and small I/O dominate. The prediction performance is better, and the prediction time is shorter than other prediction models.
2022-05-09
Ma, Zhuoran, Ma, Jianfeng, Miao, Yinbin, Liu, Ximeng, Choo, Kim-Kwang Raymond, Yang, Ruikang, Wang, Xiangyu.  2021.  Lightweight Privacy-preserving Medical Diagnosis in Edge Computing. 2021 IEEE World Congress on Services (SERVICES). :9–9.
In the era of machine learning, mobile users are able to submit their symptoms to doctors at any time, anywhere for personal diagnosis. It is prevalent to exploit edge computing for real-time diagnosis services in order to reduce transmission latency. Although data-driven machine learning is powerful, it inevitably compromises privacy by relying on vast amounts of medical data to build a diagnostic model. Therefore, it is necessary to protect data privacy without accessing local data. However, the blossom has also been accompanied by various problems, i.e., the limitation of training data, vulnerabilities, and privacy concern. As a solution to these above challenges, in this paper, we design a lightweight privacy-preserving medical diagnosis mechanism on edge. Our method redesigns the extreme gradient boosting (XGBoost) model based on the edge-cloud model, which adopts encrypted model parameters instead of local data to reduce amounts of ciphertext computation to plaintext computation, thus realizing lightweight privacy preservation on resource-limited edges. Additionally, the proposed scheme is able to provide a secure diagnosis on edge while maintaining privacy to ensure an accurate and timely diagnosis. The proposed system with secure computation could securely construct the XGBoost model with lightweight overhead, and efficiently provide a medical diagnosis without privacy leakage. Our security analysis and experimental evaluation indicate the security, effectiveness, and efficiency of the proposed system.
2022-01-31
Gurjar, Neelam Singh, S R, Sudheendra S, Kumar, Chejarla Santosh, K. S, Krishnaveni.  2021.  WebSecAsst - A Machine Learning based Chrome Extension. 2021 6th International Conference on Communication and Electronics Systems (ICCES). :1631—1635.
A browser extension, also known as a plugin or an addon, is a small software application that adds functionality to a web browser. However, security threats are always linked with such software where data can be compromised and ultimately trust is broken. The proposed research work jas developed a security model named WebSecAsst, which is a chrome plugin relying on the Machine Learning model XGBoost and VirusTotal to detect malicious websites visited by the user and to detect whether the files downloaded from the internet are Malicious or Safe. During this detection, the proposed model preserves the privacy of the user's data to a greater extent than the existing commercial chrome extensions.
2020-05-15
Jeyasudha, J., Usha, G..  2018.  Detection of Spammers in the Reconnaissance Phase by machine learning techniques. 2018 3rd International Conference on Inventive Computation Technologies (ICICT). :216—220.

Reconnaissance phase is where attackers identify their targets and how to collect information from professional social networks which can be used to select and exploit targeted employees to penetrate in an organization. Here, a framework is proposed for the early detection of attackers in the reconnaissance phase, highlighting the common characteristic behavior among attackers in professional social networks. And to create artificial honeypot profiles within the organizational social network which can be used to detect a potential incoming threat. By analyzing the dataset of social Network profiles in combination of machine learning techniques, A DspamRPfast model is proposed for the creation of a classifier system to predict the probabilities of the profiles being fake or malicious and to filter them out using XGBoost and for the faster classification and greater accuracy of 84.8%.

2017-12-28
Vu, Q. H., Ruta, D., Cen, L..  2017.  An ensemble model with hierarchical decomposition and aggregation for highly scalable and robust classification. 2017 Federated Conference on Computer Science and Information Systems (FedCSIS). :149–152.

This paper introduces an ensemble model that solves the binary classification problem by incorporating the basic Logistic Regression with the two recent advanced paradigms: extreme gradient boosted decision trees (xgboost) and deep learning. To obtain the best result when integrating sub-models, we introduce a solution to split and select sets of features for the sub-model training. In addition to the ensemble model, we propose a flexible robust and highly scalable new scheme for building a composite classifier that tries to simultaneously implement multiple layers of model decomposition and outputs aggregation to maximally reduce both bias and variance (spread) components of classification errors. We demonstrate the power of our ensemble model to solve the problem of predicting the outcome of Hearthstone, a turn-based computer game, based on game state information. Excellent predictive performance of our model has been acknowledged by the second place scored in the final ranking among 188 competing teams.

2017-02-23
G. Kejela, C. Rong.  2015.  "Cross-Device Consumer Identification". 2015 IEEE International Conference on Data Mining Workshop (ICDMW). :1687-1689.

Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies always want to seamlessly reach consumers behind devices instead of the device itself. However, the identity of consumers becomes fragmented as they switch from one device to another. A naive attempt is to use deterministic features such as user name, telephone number and email address. However consumers might refrain from giving away their personal information because of privacy and security reasons. The challenge in ICDM2015 contest is to develop an accurate probabilistic model for predicting cross-device consumer identity without using the deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forest. Our final solution ranks 9th both on the public and private LB with F0.5 score of 0.855.