Biblio

Found 289 results

Filters: Keyword is Optimization
2018-09-05
Wang, J., Shi, D., Li, Y., Chen, J., Duan, X..  2017.  Realistic measurement protection schemes against false data injection attacks on state estimators. 2017 IEEE Power Energy Society General Meeting. :1–5.

False data injection attacks (FDIA) on state estimators are a kind of imminent cyber-physical security issue. Fortunately, it has been proved that if a set of measurements is strategically selected and protected, no FDIA will remain undetectable. In this paper, the metric Return on Investment (ROI) is introduced to evaluate the overall returns of the alternative measurement protection schemes (MPS). By setting maximum total ROI as the optimization objective, the previously ignored cost-benefit issue is taken into account to derive a realistic MPS for power utilities. The optimization problem is transformed into the Steiner tree problem in graph theory, where a tree pruning based algorithm is used to reduce the computational complexity and find a quasi-optimal solution with acceptable approximations. The correctness and efficiency of the algorithm are verified by case studies.

Jia, R., Dong, R., Ganesh, P., Sastry, S., Spanos, C..  2017.  Towards a theory of free-lunch privacy in cyber-physical systems. 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton). :902–910.

Emerging cyber-physical systems (CPS) often require collecting end users' data to support data-informed decision making processes. There has been a long-standing argument as to the tradeoff between privacy and data utility. In this paper, we adopt a multiparametric programming approach to rigorously study conditions under which data utility has to be sacrificed to protect privacy and situations where free-lunch privacy can be achieved, i.e., data can be concealed without hurting the optimality of the decision making underlying the CPS. We formalize the concept of free-lunch privacy, and establish various results on its existence, geometry, as well as efficient computation methods. We propose the free-lunch privacy mechanism, which is a pragmatic mechanism that exploits free-lunch privacy if it exists with the constant guarantee of optimal usage of data. We study the resilience of this mechanism against attacks that attempt to infer the parameter of a user's data generating process. We close the paper by a case study on occupancy-adaptive smart home temperature control to demonstrate the efficacy of the mechanism.

2018-08-23
Xi, X., Zhang, F., Lian, Z..  2017.  Implicit Trust Relation Extraction Based on Hellinger Distance. 2017 13th International Conference on Semantics, Knowledge and Grids (SKG). :223–227.

Recent studies have shown that adding explicit social trust information to social recommendation significantly improves the prediction accuracy of ratings, but it is difficult to obtain clear trust data among users in real life. Scholars have studied and proposed some trust measurement methods to calculate and predict the interaction and trust between users. In this article, a method of social trust relationship extraction based on Hellinger distance is proposed, and user similarity is calculated by describing the f-divergence of one side node in user-item bipartite networks. Then, a new matrix factorization model based on implicit social relationships is proposed by adding the extracted implicit social relations into the improved matrix factorization. The experimental results show that recommending with implicit social trust performs almost as well as using actual explicit user trust ratings, and when explicit trust data cannot be extracted, our method outperforms the other traditional algorithms.
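As a rough illustration of the distance named in the abstract above, the following sketch computes the Hellinger distance between two discrete distributions. The rating histograms, the `normalize` helper, and the use of `1 - distance` as a similarity score are assumptions made for this example, not details taken from the paper.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Sum of squared differences between the square-rooted probabilities.
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)

def normalize(counts):
    """Turn non-negative counts (e.g. a rating histogram) into probabilities."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize an all-zero histogram")
    return [c / total for c in counts]

# Hypothetical histograms of 1..5 star ratings for two users.
user_a = normalize([0, 1, 2, 5, 2])
user_b = normalize([0, 0, 3, 5, 2])

# One possible similarity score (an assumption): 1 minus the distance.
similarity = 1.0 - hellinger(user_a, user_b)
```

The distance is 0 for identical distributions and 1 for distributions with disjoint support, so the derived score stays in the range 0 to 1.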

Ming, X., Shu, T., Xianzhong, X..  2017.  An energy-efficient wireless image transmission method based on adaptive block compressive sensing and softcast. 2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC). :712–717.

With the rapid and radical evolution of information and communication technology, energy consumption for wireless communication is growing at a staggering rate, especially for wireless multimedia communication. Recently, reducing energy consumption in wireless multimedia communication has attracted increasing attention. In this paper, we propose an energy-efficient wireless image transmission scheme based on adaptive block compressive sensing (ABCS) and SoftCast, which is called ABCS-SoftCast. In ABCS-SoftCast, the compression distortion and transmission distortion are considered in a joint manner, and the energy-distortion model is formulated for each image block. Then, the sampling rate (SR) and power allocation factors of each image block are optimized simultaneously. Compared with the conventional SoftCast scheme, experimental results demonstrate that the energy consumption can be greatly reduced even when the received image qualities are approximately the same.

2018-06-07
Wu, Xi, Li, Fengan, Kumar, Arun, Chaudhuri, Kamalika, Jha, Somesh, Naughton, Jeffrey.  2017.  Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics. Proceedings of the 2017 ACM International Conference on Management of Data. :1307–1322.

While significant progress has been made separately on analytics systems for scalable stochastic gradient descent (SGD) and private SGD, none of the major scalable analytics frameworks have incorporated differentially private SGD. There are two inter-related issues for this disconnect between research and practice: (1) low model accuracy due to added noise to guarantee privacy, and (2) high development and runtime overhead of the private algorithms. This paper takes a first step to remedy this disconnect and proposes a private SGD algorithm to address both issues in an integrated manner. In contrast to the white-box approach adopted by previous work, we revisit and use the classical technique of output perturbation to devise a novel “bolt-on” approach to private SGD. While our approach trivially addresses (2), it makes (1) even more challenging. We address this challenge by providing a novel analysis of the L2-sensitivity of SGD, which allows, under the same privacy guarantees, better convergence of SGD when only a constant number of passes can be made over the data. We integrate our algorithm, as well as other state-of-the-art differentially private SGD, into Bismarck, a popular scalable SGD-based analytics system on top of an RDBMS. Extensive experiments show that our algorithm can be easily integrated, incurs virtually no overhead, scales well, and most importantly, yields substantially better (up to 4X) test accuracy than the state-of-the-art algorithms on many real datasets.
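Output perturbation, the classical technique the abstract above builds on, can be sketched in a few lines: train as usual, then add noise to the final weights. The sketch below uses the textbook Gaussian mechanism and treats the L2-sensitivity as a given input. The function names and all numeric values are assumptions for illustration, and the paper's own sensitivity analysis and noise calibration are not reproduced here.

```python
import math
import random

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (stated for 0 < epsilon < 1)."""
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def perturb_output(weights, l2_sensitivity, epsilon, delta, rng=None):
    """Add i.i.d. Gaussian noise to an already-trained weight vector."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [w + rng.gauss(0.0, sigma) for w in weights]

# Hypothetical weights produced by an ordinary, non-private SGD run.
trained = [0.8, -1.2, 0.05]

# The sensitivity value is assumed here; in practice it must come from an
# analysis of the training procedure.
private = perturb_output(trained, l2_sensitivity=0.01, epsilon=0.5,
                         delta=1e-5, rng=random.Random(0))
```

Because the noise is added only once, after training, the training loop itself stays untouched, which is what makes this style of approach easy to attach to an existing system.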

Rullo, A., Serra, E., Bertino, E., Lobo, J..  2017.  Shortfall-Based Optimal Security Provisioning for Internet of Things. 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). :2585–2586.

We present a formal method for computing the best security provisioning for Internet of Things (IoT) scenarios characterized by a high degree of mobility. The security infrastructure is intended as a security resource allocation plan, computed as the solution of an optimization problem that minimizes the risk of having IoT devices not monitored by any resource. We employ the shortfall as a risk measure, a concept mostly used in economics, and adapt it to our scenario. We show how to compute and evaluate an allocation plan, and how such security solutions address the continuous topology changes that affect an IoT environment.
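For readers unfamiliar with shortfall as a risk measure, a minimal sample-based estimator of expected shortfall (the mean of the worst tail of losses) might look like the sketch below. The loss samples and the `alpha` level are hypothetical, and the way the paper adapts the measure to IoT monitoring is not reproduced.

```python
import math

def expected_shortfall(losses, alpha=0.95):
    """Mean of the worst (1 - alpha) fraction of the loss samples."""
    if not losses:
        raise ValueError("need at least one loss sample")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)  # largest losses first
    # Size of the tail; rounding guards against floating-point noise.
    k = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:k]
    return sum(tail) / len(tail)

# Hypothetical losses, e.g. number of unmonitored devices in ten scenarios.
samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
risk = expected_shortfall(samples, alpha=0.8)  # mean of the two worst samples
```

Unlike a plain average, this measure focuses on the bad scenarios, which is why it suits problems where rare but severe outcomes matter.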

Hinojosa, V., Gonzalez-Longatt, F..  2017.  Stochastic security-constrained generation expansion planning methodology based on a generalized line outage distribution factors. 2017 IEEE Manchester PowerTech. :1–6.

This study proposes an efficient formulation for solving the stochastic security-constrained generation capacity expansion planning (SC-GCEP) problem. The main idea is to directly compute the line outage distribution factors (LODF), which can be applied to model the N-m post-contingency analysis. In addition, the post-contingency power flows are modeled based on the LODF and the partial transmission distribution factors (PTDF). The post-contingency constraints have been reformulated using linear distribution factors (PTDF and LODF) so that both the pre- and post-contingency constraints are modeled simultaneously in the SC-GCEP problem using these factors. In the stochastic formulation, the load uncertainty is incorporated using a two-stage multi-period framework, and a K-means clustering technique is implemented to decrease the number of load scenarios. The main advantage of this methodology is the ability to quickly compute the post-contingency factors, especially with multiple-line outages (N-m). This concept would improve security-constrained analysis by quickly modeling the outage of m transmission lines in the stochastic SC-GCEP problem. Several experiments are carried out on two electrical power systems in order to validate the performance of the proposed formulation.
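The link between PTDF and LODF mentioned in the abstract above can be illustrated for the simplest case, a single-line outage, using the textbook DC power flow relation. This is only a sketch: the function names and numbers are assumptions, `ptdf_lk` is taken to mean the change in flow on line l per unit of power transferred between the terminal buses of line k, and the generalized N-m formulation of the paper is not reproduced.

```python
def lodf(ptdf_lk, ptdf_kk):
    """Fraction of line k's pre-outage flow that shifts onto line l when k trips."""
    if abs(1.0 - ptdf_kk) < 1e-12:
        # A self-factor of 1 means the outage of k would island part of the grid.
        raise ValueError("outage of line k disconnects the network")
    return ptdf_lk / (1.0 - ptdf_kk)

def post_contingency_flow(flow_l, flow_k, ptdf_lk, ptdf_kk):
    """Flow on line l after line k is removed (single outage, DC approximation)."""
    return flow_l + lodf(ptdf_lk, ptdf_kk) * flow_k

# Hypothetical pre-contingency flows in MW and hypothetical factors.
new_flow = post_contingency_flow(flow_l=50.0, flow_k=40.0,
                                 ptdf_lk=0.2, ptdf_kk=0.6)
```

Because the relation is linear in the pre-contingency flows, post-contingency limits can be written as linear constraints, which is what allows them to be included directly in an optimization model.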

Hinojosa, V..  2017.  A generalized stochastic N-m security-constrained generation expansion planning methodology using partial transmission distribution factors. 2017 IEEE Power Energy Society General Meeting. :1–5.

This study proposes to apply an efficient formulation to solve the stochastic security-constrained generation capacity expansion planning (GCEP) problem using an improved method to directly compute the generalized generation distribution factors (GGDF) and the line outage distribution factors (LODF) in order to model the pre- and the post-contingency constraints based solely on the partial transmission distribution factors (PTDF). The classical DC-based formulation has been reformulated in order to include the security criteria, solving both pre- and post-contingency constraints simultaneously. The methodology also takes into account the load uncertainty in the optimization problem using a two-stage multi-period model, and a clustering technique is used as well to reduce load scenarios (stochastic problem). The main advantage of this methodology is the ability to quickly compute the LODF, especially with multiple-line outages (N-m). This idea could speed up contingency analyses and significantly improve the security-constrained analyses applied to GCEP problems. It is worth mentioning that this approach is carried out without sacrificing optimality.

Matt, J., Waibel, P., Schulte, S..  2017.  Cost- and Latency-Efficient Redundant Data Storage in the Cloud. 2017 IEEE 10th Conference on Service-Oriented Computing and Applications (SOCA). :164–172.

With the steady increase in cloud storage offerings, these services have become a popular alternative to local storage systems. Besides the several benefits that cloud storage services can offer, they also have some downsides, such as potential vendor lock-in or unavailability. Different pricing models, storage technologies, and changing storage requirements further complicate the selection of the best fitting storage solution. In this work, we present a heuristic optimization approach that optimizes the placement of data on cloud-based storage services in a redundant, cost- and latency-efficient way while considering user-defined Quality of Service requirements. The presented approach uses monitored data access patterns to find the best fitting storage solution. Through extensive evaluations, we show that our approach saves up to 30% of the storage cost and reduces the upload and download times by up to 48% and 69% in comparison to a baseline that follows a state-of-the-art approach.

2018-05-30
Alamaniotis, M., Tsoukalas, L. H., Bourbakis, N..  2017.  Anticipatory Driven Nodal Electricity Load Morphing in Smart Cities Enhancing Consumption Privacy. 2017 IEEE Manchester PowerTech. :1–6.

Integration of information technologies with the current power infrastructure promises something further than a smart grid: implementation of smart cities. Power efficient cities will be a significant step toward greener cities and a cleaner environment. However, the extensive use of information technologies in smart cities comes at a cost of reduced privacy. In particular, consumers' power profiles will be accessible by third parties seeking information over consumers' personal habits. In this paper, a methodology for enhancing privacy of electricity consumption patterns is proposed and tested. The proposed method exploits digital connectivity and predictive tools offered via smart grids to morph consumption patterns by grouping consumers via an optimization scheme. To that end, load anticipation, correlation and Theil coefficients are utilized synergistically with genetic algorithms to find an optimal assembly of consumers whose aggregated pattern hides individual consumption features. Results highlight the efficiency of the proposed method in enhancing privacy in the environment of smart cities.

2018-05-09
Livshitz, I., Lontsikh, P., Eliseev, S..  2017.  The optimization method of the integrated management system security audit. 2017 20th Conference of Open Innovations Association (FRUCT). :248–253.

Nowadays, the application of integrated management systems (IMS) attracts the attention of top management in various organizations. However, running security audits in an IMS and carrying out complex, full-scale checks against different ISO standards remain an important problem when available resources are substantially reduced.

2018-05-02
Rjoub, G., Bentahar, J..  2017.  Cloud Task Scheduling Based on Swarm Intelligence and Machine Learning. 2017 IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud). :272–279.

Cloud computing is an extension of parallel and distributed computing. Cloud computing technology is becoming more and more widely used, and one of the fundamental issues in this environment is task scheduling. However, scheduling in cloud environments is a difficult problem since it is basically NP-complete. Thus, many variants based on approximation techniques, especially those inspired by Swarm Intelligence (SI), have been proposed. This paper proposes a machine learning algorithm that guides the cloud in choosing the scheduling technique by using multi-criteria decision making to optimize performance. The main contribution of our work is to minimize the makespan of a given task set. The new strategy is simulated using the CloudSim toolkit package, where the impact of the algorithm is checked with different numbers of VMs varying from 2 to 50, and different task sizes between 30 bytes and 2700 bytes. Experimental results show that the proposed algorithm reduces the execution time and the makespan by between 7% and 75%, and improves the performance of the load balancing scheduling.
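The makespan objective named in the abstract above is the finishing time of the most loaded machine. The sketch below computes it and pairs it with a simple longest-processing-time-first greedy heuristic as a baseline. That heuristic is a swapped-in classical technique, not the paper's learning-guided method, and the task durations are hypothetical.

```python
import heapq

def makespan(assignment):
    """Completion time of the busiest VM, given per-VM lists of task durations."""
    return max(sum(tasks) for tasks in assignment)

def lpt_schedule(durations, n_vms):
    """Greedy baseline: give the longest remaining task to the least loaded VM."""
    heap = [(0.0, vm) for vm in range(n_vms)]  # (current load, VM index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_vms)]
    for d in sorted(durations, reverse=True):
        load, vm = heapq.heappop(heap)
        assignment[vm].append(d)
        heapq.heappush(heap, (load + d, vm))
    return assignment

# Hypothetical task durations scheduled on two VMs.
plan = lpt_schedule([4, 3, 3, 2, 2], n_vms=2)
print(makespan(plan))
```

On this input the greedy plan is not optimal (a split of 4+3 and 3+2+2 finishes earlier), which shows why heuristics and learned policies are compared on makespan in the first place.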

Tan, R. K., Bora, Ş.  2017.  Parameter tuning in modeling and simulations by using swarm intelligence optimization algorithms. 2017 9th International Conference on Computational Intelligence and Communication Networks (CICN). :148–152.

Modeling and simulation of real-world environments have been widely used in recent times. Modeling is particularly useful for environments that are difficult to examine directly, since examination via the model becomes easier. The number of parameters of the modeled systems and the range of values they can take are quite large, and manual tuning is tedious and requires a lot of effort, while it is often almost impossible to get the desired results. For this reason, there is a need to tune the parameter space. A review of studies conducted in recent years shows that few address the parameter tuning problem in modeling and simulation. This study seeks a solution to the parameter tuning problem using two swarm intelligence optimization algorithms: particle swarm optimization and the firefly algorithm. The performance of these algorithms in the parameter tuning process has been tested on 2 different agent-based model studies. The performance of the algorithms has been observed by manually entering the parameters found for the model. According to the obtained results, the particle swarm optimization algorithm works faster, whereas the firefly algorithm finds better parameter values. With this study, the parameter tuning problem of models in different fields was solved.

2018-05-01
Xie, T., Zhou, Q., Hu, J., Shu, L., Jiang, P..  2017.  A Sequential Multi-Objective Robust Optimization Approach under Interval Uncertainty Based on Support Vector Machines. 2017 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM). :2088–2092.

Interval uncertainty can cause uncontrollable variations in the objective and constraint values, which could seriously deteriorate the performance or even change the feasibility of the optimal solutions. Robust optimization aims to obtain solutions that are optimal and minimally sensitive to uncertainty. In this paper, a sequential multi-objective robust optimization (MORO) approach based on support vector machines (SVM) is proposed. Firstly, a sequential optimization structure is adopted to ease the computational burden. Secondly, SVM is used to construct a classification model to classify design alternatives as feasible or infeasible. The proposed approach is tested on a numerical example and an engineering case. Results illustrate that the proposed approach can reasonably approximate solutions obtained from the existing sequential MORO approach (SMORO), while the computational costs are significantly reduced compared with those of SMORO.

Tran, D. T., Waris, M. A., Gabbouj, M., Iosifidis, A..  2017.  Sample-Based Regularization for Support Vector Machine Classification. 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA). :1–6.

In this paper, we propose a new regularization scheme for the well-known Support Vector Machine (SVM) classifier that operates on the training sample level. The proposed approach is motivated by the fact that Maximum Margin-based classification defines decision functions as a linear combination of the selected training data and, thus, the variations on training sample selection directly affect generalization performance. We show that the exploitation of the proposed regularization scheme is well motivated and intuitive. Experimental results show that the proposed regularization scheme outperforms standard SVM in human action recognition tasks as well as classical recognition problems.

2018-04-04
Liang, J., Sankar, L., Kosut, O..  2017.  Vulnerability analysis and consequences of false data injection attack on power system state estimation. 2017 IEEE Power Energy Society General Meeting. :1–1.

An unobservable false data injection (FDI) attack on AC state estimation (SE) is introduced and its consequences on the physical system are studied. With a focus on understanding the physical consequences of FDI attacks, a bi-level optimization problem is introduced whose objective is to maximize the physical line flows subsequent to an FDI attack on DC SE. The maximization is subject to constraints on both attacker resources (size of attack) and attack detection (limiting load shifts) as well as those required by DC optimal power flow (OPF) following SE. The resulting attacks are tested on a more realistic non-linear system model using AC state estimation and ACOPF, and it is shown that, with an appropriately chosen sub-network, the attacker can overload transmission lines with moderate shifts of load.

Parchami, M., Bashbaghi, S., Granger, E..  2017.  CNNs with cross-correlation matching for face recognition in video surveillance using a single training sample per person. 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). :1–6.

In video surveillance, face recognition (FR) systems seek to detect individuals of interest appearing over a distributed network of cameras. Still-to-video FR systems match faces captured in videos under challenging conditions against facial models, often designed using one reference still per individual. Although CNNs can achieve among the highest levels of accuracy in many real-world FR applications, state-of-the-art CNNs that are suitable for still-to-video FR, like trunk-branch ensemble (TBE) CNNs, represent complex solutions for real-time applications. In this paper, an efficient CNN architecture is proposed for accurate still-to-video FR from a single reference still. The CCM-CNN is based on new cross-correlation matching (CCM) and triplet-loss optimization methods that provide discriminant face representations. The matching pipeline exploits a matrix Hadamard product followed by a fully connected layer inspired by adaptive weighted cross-correlation. A triplet-based training approach is proposed to optimize the CCM-CNN parameters such that the inter-class variations are increased, while enhancing robustness to intra-class variations. To further improve robustness, the network is fine-tuned using synthetically-generated faces based on still and videos of non-target individuals. Experiments on videos from the COX Face and Chokepoint datasets indicate that the CCM-CNN can achieve a high level of accuracy that is comparable to TBE-CNN and HaarNet, but with a significantly lower time and memory complexity. It may therefore represent the better trade-off between accuracy and complexity for real-time video surveillance applications.
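The triplet-based training objective mentioned in the abstract above can be illustrated with the standard triplet loss on plain vectors. This is a generic sketch: the squared Euclidean distance, the margin value, and the toy embeddings are assumptions, and the CCM-CNN matching pipeline itself is not reproduced.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Zero once the negative is farther from the anchor than the positive by the margin."""
    return max(0.0,
               squared_distance(anchor, positive)
               - squared_distance(anchor, negative)
               + margin)

# Hypothetical 2-D face embeddings: same identity (positive), other identity (negative).
anchor, positive = [0.0, 0.0], [0.0, 1.0]
easy = triplet_loss(anchor, positive, negative=[3.0, 4.0])  # negative is far away
hard = triplet_loss(anchor, positive, negative=[1.0, 0.0])  # negative is as close as the positive
```

Minimizing this loss pulls embeddings of the same identity together and pushes different identities apart, which matches the stated goal of increasing inter-class variation while staying robust to intra-class variation.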

2018-04-02
Muthumanickam, K., Ilavarasan, E..  2017.  Optimizing Detection of Malware Attacks through Graph-Based Approach. 2017 International Conference on Technical Advancements in Computers and Communications (ICTACC). :87–91.

Today, advances in communication technology permit malware authors to introduce code obfuscation techniques, for example Application Programming Interface (API) hooks, to make detecting the footprints of their code more difficult. A signature-based model such as antivirus software is not effective against such attacks. In this paper, an API graph-based model is proposed with the objective of detecting hook attacks during malicious code execution. The proposed model incorporates techniques such as graph generation, graph partition, and graph comparison to distinguish a legitimate system call from a malicious system call. The simulation results confirm that the proposed model outperforms existing approaches.

He, X., Islam, M. M., Jin, R., Dai, H..  2017.  Foresighted Deception in Dynamic Security Games. 2017 IEEE International Conference on Communications (ICC). :1–6.

Deception has been widely considered in literature as an effective means of enhancing security protection when the defender holds some private information about the ongoing rivalry unknown to the attacker. However, most of the existing works on deception assume static environments and thus consider only myopic deception, while practical security games between the defender and the attacker may happen in dynamic scenarios. To better exploit the defender's private information in dynamic environments and improve security performance, a stochastic deception game (SDG) framework is developed in this work to enable the defender to conduct foresighted deception. To solve the proposed SDG, a new iterative algorithm that is provably convergent is developed. A corresponding learning algorithm is developed as well to facilitate the defender in conducting foresighted deception in unknown dynamic environments. Numerical results show that the proposed foresighted deception can offer a substantial performance improvement as compared to the conventional myopic deception.

Wu, D., Zhang, Y., Liu, Y..  2017.  Dummy Location Selection Scheme for K-Anonymity in Location Based Services. 2017 IEEE Trustcom/BigDataSE/ICESS. :441–448.

Location-based services (LBS) are becoming increasingly important in our daily life. However, the localization information in the air is vulnerable to various attacks, which results in serious privacy concerns. To overcome this problem, we formulate a multi-objective optimization problem that considers both the query probability and the practical dummy location region. A low-complexity dummy location selection scheme is proposed. We first find several candidate dummy locations with similar query probabilities. Among these selected candidates, a cloaking-area-based algorithm is then offered to find K-1 dummy locations to achieve K-anonymity. The intersected area between two dummy locations is also derived to help determine the total cloaking area. Security analysis verifies the effectiveness of our scheme against passive and active adversaries. Compared with other methods, simulation results show that the proposed dummy location scheme can improve the privacy level and enlarge the cloaking area simultaneously.

Gao, Y., Luo, T., Li, J., Wang, C..  2017.  Research on K Anonymity Algorithm Based on Association Analysis of Data Utility. 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). :426–432.

More and more medical data are shared, which leads to disclosure of personal privacy information. Therefore, the construction of medical data privacy preserving publishing model is of great value: not only to make a non-correspondence between the released information and personal identity, but also to maintain the data utility after anonymity. However, there is an inherent contradiction between the anonymity and the data utility. In this paper, a Principal Component Analysis-Grey Relational Analysis (PCA-GRA) K anonymous algorithm is proposed to improve the data utility effectively under the premise of anonymity, in which the association between quasi-identifiers and the sensitive information is reckoned as a criterion to control the generalization hierarchy. Compared with the previous anonymity algorithms, results show that the proposed PCA-GRA K anonymous algorithm has achieved significant improvement in data utility from three aspects, namely information loss, feature maintenance and classification evaluation performance.
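As background for the anonymity property discussed above, a table is K-anonymous when every combination of quasi-identifier values is shared by at least K records. The check below is a minimal sketch of that definition only. The column names and records are hypothetical, and the PCA-GRA generalization procedure from the paper is not reproduced.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if each quasi-identifier combination occurs in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical, already generalized medical records.
records = [
    {"age": "30-39", "zip": "476**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "476**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "479**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "479**", "diagnosis": "diabetes"},
]

ok_for_2 = is_k_anonymous(records, ["age", "zip"], k=2)
ok_for_3 = is_k_anonymous(records, ["age", "zip"], k=3)
```

Coarser generalization makes the check easier to pass but discards detail, which is the tension between anonymity and data utility that the abstract describes.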

2018-03-26
Hosseinpourpia, M., Oskoei, M. A..  2017.  GA Based Parameter Estimation for Multi-Faceted Trust Model of Recommender Systems. 2017 5th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS). :160–165.

A recommender system suggests items that might be of interest to users in social networks. Collaborative filtering is an approach that works based on similarity and recommends items liked by other similar users. A trust model adopts users' trust network in place of similarity. A multi-faceted trust model considers multiple, heterogeneous trust relationships among users and recommends items based on the ratings that exist in the network of trustees for a specific facet. This paper applies a genetic algorithm to estimate the parameters of a multi-faceted trust model, in which the trust weights are calculated based on the ratings and the trust network for each facet separately. The model was built on the Epinions data set, which includes consumers' opinions, ratings for items, and the web-of-trust network. It was used to predict users' ratings for items in different facets, and the root mean squared prediction error (RMSE) was used as the measure of performance. Empirical evaluations demonstrated that multi-facet models improve the performance of the recommender system.
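The performance measure used in the abstract above, RMSE, is straightforward to compute. The sketch below uses hypothetical predicted and actual ratings, which are not taken from the Epinions data set.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predicted and actual ratings on a 1..5 scale.
error = rmse([1, 2, 3], [2, 2, 5])
```

A fitness function for a genetic algorithm could simply return this value for a candidate parameter set, so that lower is better.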

2018-03-19
Soltan, S., Zussman, G..  2017.  Power Grid State Estimation after a Cyber-Physical Attack under the AC Power Flow Model. 2017 IEEE Power Energy Society General Meeting. :1–5.

In this paper, we present an algorithm for estimating the state of the power grid following a cyber-physical attack. We assume that an adversary attacks an area by: (i) disconnecting some lines within that area (failed lines), and (ii) obstructing the information from within the area to reach the control center. Given the phase angles of the buses outside the attacked area under the AC power flow model (before and after the attack), the algorithm estimates the phase angles of the buses and detects the failed lines inside the attacked area. The novelty of our approach is the transformation of the line failures detection problem, which is combinatorial in nature, to a convex optimization problem. As a result, our algorithm can detect any number of line failures in a running time that is independent of the number of failures and is solely dependent on the size of the network. To the best of our knowledge, this is the first convex relaxation for the problem of line failures detection using phase angle measurements under the AC power flow model. We evaluate the performance of our algorithm in the IEEE 118- and 300-bus systems, and show that it estimates the phase angles of the buses with less than 1% error, and can detect the line failures with 80% accuracy for single, double, and triple line failures.

Ditzler, G., Prater, A..  2017.  Fine Tuning Lasso in an Adversarial Environment against Gradient Attacks. 2017 IEEE Symposium Series on Computational Intelligence (SSCI). :1–7.

Machine learning and data mining algorithms typically assume that the training and testing data are sampled from the same fixed probability distribution; however, this assumption is often violated in practice. The field of domain adaptation addresses the situation where this assumption of a fixed probability between the two domains is violated; however, the difference between the two domains (training/source and testing/target) may not be known a priori. There has been a recent thrust in addressing the problem of learning in the presence of an adversary, which we formulate as a problem of domain adaptation to build a more robust classifier. This is because the overall security of classifiers and their preprocessing stages have been called into question with the recent findings of adversaries in a learning setting. Adversarial training (and testing) data pose a serious threat to scenarios where an attacker has the opportunity to "poison" the training or "evade" on the testing data set(s) in order to achieve something that is not in the best interest of the classifier. Recent work has begun to show the impact of adversarial data on several classifiers; however, the impact of the adversary on aspects related to preprocessing of data (i.e., dimensionality reduction or feature selection) has widely been ignored in the revamp of adversarial learning research. Furthermore, variable selection, which is a vital component to any data analysis, has been shown to be particularly susceptible under an attacker that has knowledge of the task. In this work, we explore avenues for learning resilient classification models in the adversarial learning setting by considering the effects of adversarial data and how to mitigate its effects through optimization. Our model forms a single convex optimization problem that uses the labeled training data from the source domain and known weaknesses of the model for an adversarial component. We benchmark the proposed approach on synthetic data and show the trade-off between classification accuracy and skew-insensitive statistics.

2018-03-05
Das, A., Shen, M. Y., Wang, J..  2017.  Modeling User Communities for Identifying Security Risks in an Organization. 2017 IEEE International Conference on Big Data (Big Data). :4481–4486.

In this paper, we address the problem of peer grouping employees in an organization for identifying security risks. Our motivation for studying peer grouping is its importance for a clear understanding of user and entity behavior analytics (UEBA), which is the primary tool for identifying insider threats by detecting anomalies in network traffic. We show that, using the Louvain method of community detection, it is possible to automate peer group creation with feature-based weight assignments. Depending on the number of employees and their features, we show that it is also possible to give each group a meaningful description. We present three new algorithms: one that allows an addition of new employees to already generated peer groups, another that allows for incorporating user feedback, and lastly one that provides the user with recommended nodes to be reassigned. We use Niara's data to validate our claims. The novelty of our method is its robustness, simplicity, scalability, and ease of deployment in a production environment.