Biblio
Reconnaissance phase is where attackers identify their targets and how to collect information from professional social networks which can be used to select and exploit targeted employees to penetrate in an organization. Here, a framework is proposed for the early detection of attackers in the reconnaissance phase, highlighting the common characteristic behavior among attackers in professional social networks. And to create artificial honeypot profiles within the organizational social network which can be used to detect a potential incoming threat. By analyzing the dataset of social Network profiles in combination of machine learning techniques, A DspamRPfast model is proposed for the creation of a classifier system to predict the probabilities of the profiles being fake or malicious and to filter them out using XGBoost and for the faster classification and greater accuracy of 84.8%.
This paper introduces an ensemble model that solves the binary classification problem by incorporating the basic Logistic Regression with the two recent advanced paradigms: extreme gradient boosted decision trees (xgboost) and deep learning. To obtain the best result when integrating sub-models, we introduce a solution to split and select sets of features for the sub-model training. In addition to the ensemble model, we propose a flexible robust and highly scalable new scheme for building a composite classifier that tries to simultaneously implement multiple layers of model decomposition and outputs aggregation to maximally reduce both bias and variance (spread) components of classification errors. We demonstrate the power of our ensemble model to solve the problem of predicting the outcome of Hearthstone, a turn-based computer game, based on game state information. Excellent predictive performance of our model has been acknowledged by the second place scored in the final ranking among 188 competing teams.
Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies always want to seamlessly reach consumers behind devices instead of the device itself. However, the identity of consumers becomes fragmented as they switch from one device to another. A naive attempt is to use deterministic features such as user name, telephone number and email address. However consumers might refrain from giving away their personal information because of privacy and security reasons. The challenge in ICDM2015 contest is to develop an accurate probabilistic model for predicting cross-device consumer identity without using the deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forest. Our final solution ranks 9th both on the public and private LB with F0.5 score of 0.855.