Biblio

Found 350 results

Filters: Keyword is data mining  [Clear All Filters]
2020-12-11
Kumar, S., Vasthimal, D. K..  2019.  Raw Cardinality Information Discovery for Big Datasets. 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :200—205.
Real-time discovery of all different types of unique attributes within unstructured data is a challenging problem to solve when dealing with multiple petabytes of unstructured data volume everyday. Popular discovery solutions such as the creation of offline jobs to uniquely identify attributes or running aggregation queries on raw data sets limits real time discovery use-cases and often results into poor resource utilization. The discovery information must be treated as a parallel problem to just storing raw data sets efficiently onto back-end big data systems. Solving the discovery problem by creating a parallel discovery data store infrastructure has multiple benefits as it allows such to channel the actual search queries against the raw data set in much more funneled manner instead of being widespread across the entire data sets. Such focused search queries and data separation are far more performant and requires less compute and memory footprint.
2020-12-07
Qian, Y..  2019.  Research on Trusted Authentication Model and Mechanism of Data Fusion. 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS). :479–482.
Firstly, this paper analyses the technical foundation of single sign-on solution of unified authentication platform, and analyses the advantages and disadvantages of each solution. Secondly, from the point of view of software engineering, such as function requirement, performance requirement, development mode, architecture scheme, technology development framework and system configuration environment of the unified authentication platform, the unified authentication platform is analyzed and designed, and the database design and system design framework of the system are put forward according to the system requirements. Thirdly, the idea and technology of unified authentication platform based on JA-SIG CAS are discussed, and the design and implementation of each module of unified authentication platform based on JA-SIG CAS are analyzed, which has been applied in ship cluster platform.
2020-10-05
Murino, Giuseppina, Armando, Alessandro, Tacchella, Armando.  2019.  Resilience of Cyber-Physical Systems: an Experimental Appraisal of Quantitative Measures. 2019 11th International Conference on Cyber Conflict (CyCon). 900:1–19.
Cyber-Physical Systems (CPSs) interconnect the physical world with digital computers and networks in order to automate production and distribution processes. Nowadays, most CPSs do not work in isolation, but their digital part is connected to the Internet in order to enable remote monitoring, control and configuration. Such a connection may offer entry-points enabling attackers to gain control silently and exploit access to the physical world at the right time to cause service disruption and possibly damage to the surrounding environment. Prevention and monitoring measures can reduce the risk brought by cyber attacks, but the residual risk can still be unacceptably high in critical infrastructures or services. Resilience - i.e., the ability of a system to withstand adverse events while maintaining an acceptable functionality - is therefore a key property for such systems. In our research, we seek a model-free, quantitative, and general-purpose evaluation methodology to extract resilience indexes from, e.g., system logs and process data. While a number of resilience metrics have already been put forward, little experimental evidence is available when it comes to the cyber security of CPSs. By using the model of a real wastewater treatment plant, and simulating attacks that tamper with a critical feedback control loop, we provide a comparison between four resilience indexes selected through a thorough literature review involving over 40 papers. Our results show that the selected indexes differ in terms of behavior and sensitivity with respect to specific attacks, but they can all summarize and extract meaningful information from bulky system logs. Our evaluation includes an approach for extracting performance indicators from observed variables which does not require knowledge of system dynamics; and a discussion about combining resilience indexes into a single system-wide measure is included. 11The authors wish to thank Leonardo S.p.A. for its financial support. The research herein presented is partially supported by project NEFERIS awarded by the Italian Ministry of Defense to Leonardo S.p.A. in partnership with the University of Genoa. This work received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No 830892 for project SPARTA.
2020-07-30
Srisopha, Kamonphop, Phonsom, Chukiat, Lin, Keng, Boehm, Barry.  2019.  Same App, Different Countries: A Preliminary User Reviews Study on Most Downloaded iOS Apps. 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME). :76—80.
Prior work on mobile app reviews has demonstrated that user reviews contain a wealth of information and are seen as a potential source of requirements. However, most of the studies done in this area mainly focused on mining and analyzing user reviews from the US App Store, leaving reviews of users from other countries unexplored. In this paper, we seek to understand if the perception of the same apps between users from other countries and that from the US differs through analyzing user reviews. We retrieve 300,643 user reviews of the 15 most downloaded iOS apps of 2018, published directly by Apple, from nine English-speaking countries over the course of 5 months. We manually classify 3,358 reviews into several software quality and improvement factors. We leverage a random forest based algorithm to identify factors that can be used to differentiate reviews between the US and other countries. Our preliminary results show that all countries have some factors that are proportionally inconsistent with the US.
2020-04-13
Chowdhury, Nahida Sultana, Raje, Rajeev R..  2019.  SERS: A Security-Related and Evidence-Based Ranking Scheme for Mobile Apps. 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :130–139.
In recent years, the number of smart mobile devices has rapidly increased worldwide. This explosion of continuously connected mobile devices has resulted in an exponential growth in the number of publically available mobile Apps. To facilitate the selection of mobile Apps, from various available choices, the App distribution platforms typically rank/recommend Apps based on average star ratings, the number of downloads, and associated reviews - the external aspect of an App. However, these ranking schemes typically tend to ignore critical internal aspects (e.g., security vulnerabilities) of the Apps. Such an omission of internal aspects is certainly not desirable, especially when many of the users do not possess the necessary skills to evaluate the internal aspects and choose an App based on the default ranking scheme which uses the external aspect. In this paper, we build upon our earlier efforts by focusing specifically on the security-related internal aspect of an App and its combination with the external aspect computed from the user reviews by identifying security-related comments.We use this combination to rank-order similar Apps. We evaluate our approach on publicly available Apps from the Google PlayStore and compare our ranking with prevalent ranking techniques such as the average star ratings. The experimental results indicate the effectiveness of our proposed approach.
2019-12-16
Wu, Jimmy Ming-Tai, Chun-Wei Lin, Jerry, Djenouri, Youcef, Fournier-Viger, Philippe, Zhang, Yuyu.  2019.  A Swarm-based Data Sanitization Algorithm in Privacy-Preserving Data Mining. 2019 IEEE Congress on Evolutionary Computation (CEC). :1461–1467.
In recent decades, data protection (PPDM), which not only hides information, but also provides information that is useful to make decisions, has become a critical concern. We present a sanitization algorithm with the consideration of four side effects based on multi-objective PSO and hierarchical clustering methods to find optimized solutions for PPDM. Experiments showed that compared to existing approaches, the designed sanitization algorithm based on the hierarchical clustering method achieves satisfactory performance in terms of hiding failure, missing cost, and artificial cost.
2019-08-26
Gonzalez, D., Alhenaki, F., Mirakhorli, M..  2019.  Architectural Security Weaknesses in Industrial Control Systems (ICS) an Empirical Study Based on Disclosed Software Vulnerabilities. 2019 IEEE International Conference on Software Architecture (ICSA). :31–40.

Industrial control systems (ICS) are systems used in critical infrastructures for supervisory control, data acquisition, and industrial automation. ICS systems have complex, component-based architectures with many different hardware, software, and human factors interacting in real time. Despite the importance of security concerns in industrial control systems, there has not been a comprehensive study that examined common security architectural weaknesses in this domain. Therefore, this paper presents the first in-depth analysis of 988 vulnerability advisory reports for Industrial Control Systems developed by 277 vendors. We performed a detailed analysis of the vulnerability reports to measure which components of ICS have been affected the most by known vulnerabilities, which security tactics were affected most often in ICS and what are the common architectural security weaknesses in these systems. Our key findings were: (1) Human-Machine Interfaces, SCADA configurations, and PLCs were the most affected components, (2) 62.86% of vulnerability disclosures in ICS had an architectural root cause, (3) the most common architectural weaknesses were “Improper Input Validation”, followed by “Im-proper Neutralization of Input During Web Page Generation” and “Improper Authentication”, and (4) most tactic-related vulnerabilities were related to the tactics “Validate Inputs”, “Authenticate Actors” and “Authorize Actors”.

2020-01-20
Sivanantham, S., Abirami, R., Gowsalya, R..  2019.  Comparing the Performance of Adaptive Boosted Classifiers in Anomaly based Intrusion Detection System for Networks. 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN). :1–5.

The computer network is used by billions of people worldwide for variety of purposes. This has made the security increasingly important in networks. It is essential to use Intrusion Detection Systems (IDS) and devices whose main function is to detect anomalies in networks. Mostly all the intrusion detection approaches focuses on the issues of boosting techniques since results are inaccurate and results in lengthy detection process. The major pitfall in network based intrusion detection is the wide-ranging volume of data gathered from the network. In this paper, we put forward a hybrid anomaly based intrusion detection system which uses Classification and Boosting technique. The Paper is organized in such a way it compares the performance three different Classifiers along with boosting. Boosting process maximizes classification accuracy. Results of proposed scheme will analyzed over different datasets like Intrusion Detection Kaggle Dataset and NSL KDD. Out of vast analysis it is found Random tree provides best average Accuracy rate of around 99.98%, Detection rate of 98.79% and a minimum False Alarm rate.

2020-08-28
Hasanin, Tawfiq, Khoshgoftaar, Taghi M., Leevy, Joffrey L..  2019.  A Comparison of Performance Metrics with Severely Imbalanced Network Security Big Data. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :83—88.

Severe class imbalance between the majority and minority classes in large datasets can prejudice Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.

2020-02-10
Zubov, Ilya G., Lysenko, Nikolai V., Labkov, Gleb M..  2019.  Detection of the Information Hidden in Image by Convolutional Neural Networks. 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :393–394.

This article shows the possibility of detection of the hidden information in images. This is the approach to steganalysis than the basic data about the image and the information about the hiding method of the information are unknown. The architecture of the convolutional neural network makes it possible to detect small changes in the image with high probability.

2020-01-20
Ishaque, Mohammed, Hudec, Ladislav.  2019.  Feature extraction using Deep Learning for Intrusion Detection System. 2019 2nd International Conference on Computer Applications Information Security (ICCAIS). :1–5.

Deep Learning is an area of Machine Learning research, which can be used to manipulate large amount of information in an intelligent way by using the functionality of computational intelligence. A deep learning system is a fully trainable system beginning from raw input to the final output of recognized objects. Feature selection is an important aspect of deep learning which can be applied for dimensionality reduction or attribute reduction and making the information more explicit and usable. Deep learning can build various learning models which can abstract unknown information by selecting a subset of relevant features. This property of deep learning makes it useful in analysis of highly complex information one which is present in intrusive data or information flowing with in a web system or a network which needs to be analyzed to detect anomalies. Our approach combines the intelligent ability of Deep Learning to build a smart Intrusion detection system.

2020-06-03
Cedillo, Priscila, Camacho, Jessica, Campos, Karina, Bermeo, Alexandra.  2019.  A Forensics Activity Logger to Extract User Activity from Mobile Devices. 2019 Sixth International Conference on eDemocracy eGovernment (ICEDEG). :286—290.

Nowadays, mobile devices have become one of the most popular instruments used by a person on its regular life, mainly due to the importance of their applications. In that context, mobile devices store user's personal information and even more data, becoming a personal tracker for daily activities that provides important information about the user. Derived from this gathering of information, many tools are available to use on mobile devices, with the restrain that each tool only provides isolated information about a specific application or activity. Therefore, the present work proposes a tool that allows investigators to obtain a complete report and timeline of the activities that were performed on the device. This report incorporates the information provided by many sources into a unique set of data. Also, by means of an example, it is presented the operation of the solution, which shows the feasibility in the use of this tool and shows the way in which investigators have to apply the tool.

2020-02-10
Ishtiaq, Asra, Islam, Muhammad Arshad, Azhar Iqbal, Muhammad, Aleem, Muhammad, Ahmed, Usman.  2019.  Graph Centrality Based Spam SMS Detection. 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST). :629–633.

Short messages usage has been tremendously increased such as SMS, tweets and status updates. Due to its popularity and ease of use, many companies use it for advertisement purpose. Hackers also use SMS to defraud users and steal personal information. In this paper, the use of Graphs centrality metrics is proposed for spam SMS detection. The graph centrality measures: degree, closeness, and eccentricity are used for classification of SMS. Graphs for each class are created using labeled SMS and then unlabeled SMS is classified using the centrality scores of the token available in the unclassified SMS. Our results show that highest precision and recall is achieved by using degree centrality. Degree centrality achieved the highest precision i.e. 0.81 and recall i.e., 0.76 for spam messages.

2020-04-10
Chapla, Happy, Kotak, Riddhi, Joiser, Mittal.  2019.  A Machine Learning Approach for URL Based Web Phishing Using Fuzzy Logic as Classifier. 2019 International Conference on Communication and Electronics Systems (ICCES). :383—388.

Phishing is the major problem of the internet era. In this era of internet the security of our data in web is gaining an increasing importance. Phishing is one of the most harmful ways to unknowingly access the credential information like username, password or account number from the users. Users are not aware of this type of attack and later they will also become a part of the phishing attacks. It may be the losses of financial found, personal information, reputation of brand name or trust of brand. So the detection of phishing site is necessary. In this paper we design a framework of phishing detection using URL.

2020-01-20
Halimaa A., Anish, Sundarakantham, K..  2019.  Machine Learning Based Intrusion Detection System. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI). :916–920.

In order to examine malicious activity that occurs in a network or a system, intrusion detection system is used. Intrusion Detection is software or a device that scans a system or a network for a distrustful activity. Due to the growing connectivity between computers, intrusion detection becomes vital to perform network security. Various machine learning techniques and statistical methodologies have been used to build different types of Intrusion Detection Systems to protect the networks. Performance of an Intrusion Detection is mainly depends on accuracy. Accuracy for Intrusion detection must be enhanced to reduce false alarms and to increase the detection rate. In order to improve the performance, different techniques have been used in recent works. Analyzing huge network traffic data is the main work of intrusion detection system. A well-organized classification methodology is required to overcome this issue. This issue is taken in proposed approach. Machine learning techniques like Support Vector Machine (SVM) and Naïve Bayes are applied. These techniques are well-known to solve the classification problems. For evaluation of intrusion detection system, NSL- KDD knowledge discovery Dataset is taken. The outcomes show that SVM works better than Naïve Bayes. To perform comparative analysis, effective classification methods like Support Vector Machine and Naive Bayes are taken, their accuracy and misclassification rate get calculated.

2019-11-12
Vizarreta, Petra, Sakic, Ermin, Kellerer, Wolfgang, Machuca, Carmen Mas.  2019.  Mining Software Repositories for Predictive Modelling of Defects in SDN Controller. 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM). :80-88.

In Software Defined Networking (SDN) control plane of forwarding devices is concentrated in the SDN controller, which assumes the role of a network operating system. Big share of today's commercial SDN controllers are based on OpenDaylight, an open source SDN controller platform, whose bug repository is publicly available. In this article we provide a first insight into 8k+ bugs reported in the period over five years between March 2013 and September 2018. We first present the functional components in OpenDaylight architecture, localize the most vulnerable modules and measure their contribution to the total bug content. We provide high fidelity models that can accurately reproduce the stochastic behaviour of bug manifestation and bug removal rates, and discuss how these can be used to optimize the planning of the test effort, and to improve the software release management. Finally, we study the correlation between the code internals, derived from the Git version control system, and software defect metrics, derived from Jira issue tracker. To the best of our knowledge, this is the first study to provide a comprehensive analysis of bug characteristics in a production grade SDN controller.

2019-06-28
Hazari, S. S., Mahmoud, Q. H..  2019.  A Parallel Proof of Work to Improve Transaction Speed and Scalability in Blockchain Systems. 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC). :0916-0921.

A blockchain is a distributed ledger forming a distributed consensus on a history of transactions, and is the underlying technology for the Bitcoin cryptocurrency. However, its applications are far beyond the financial sector. The transaction verification process for cryptocurrencies is much slower than traditional digital transaction systems. One approach to increase transaction speed and scalability is to identify a solution that offers faster Proof of Work. In this paper, we propose a method for accelerating the process of Proof of Work based on parallel mining rather than solo mining. The goal is to ensure that no more than two or more miners put the same effort into solving a specific block. The proposed method includes a process for selection of a manager, distribution of work and a reward system. This method has been implemented in a test environment that contains all the characteristics needed to perform Proof of Work for Bitcoin and has been tested, using a variety of case scenarios, by varying the difficulty level and number of validators. Preliminary results show improvement in the scalability of Proof of Work up to 34% compared to the current system.

2020-06-26
Jaiswal, Prajwal Kumar, Das, Sayari, Panigrahi, Bijaya Ketan.  2019.  PMU Based Data Driven Approach For Online Dynamic Security Assessment in Power Systems. 2019 20th International Conference on Intelligent System Application to Power Systems (ISAP). :1—7.

This paper presents a methodology for utilizing Phasor Measurement units (PMUs) for procuring real time synchronized measurements for assessing the security of the power system dynamically. The concept of wide-area dynamic security assessment considers transient instability in the proposed methodology. Intelligent framework based approach for online dynamic security assessment has been suggested wherein the database consisting of critical features associated with the system is generated for a wide range of contingencies, which is utilized to build the data mining model. This data mining model along with the synchronized phasor measurements is expected to assist the system operator in assessing the security of the system pertaining to a particular contingency, thereby also creating possibility of incorporating control and preventive measures in order to avoid any unforeseen instability in the system. The proposed technique has been implemented on IEEE 39 bus system for accurately indicating the security of the system and is found to be quite robust in the case of noise in the measurement data obtained from the PMUs.

2020-12-01
SAADI, C., kandrouch, i, CHAOUI, H..  2019.  Proposed security by IDS-AM in Android system. 2019 5th International Conference on Optimization and Applications (ICOA). :1—7.

Mobile systems are always growing, automatically they need enough resources to secure them. Indeed, traditional techniques for protecting the mobile environment are no longer effective. We need to look for new mechanisms to protect the mobile environment from malicious behavior. In this paper, we examine one of the most popular systems, Android OS. Next, we will propose a distributed architecture based on IDS-AM to detect intrusions by mobile agents (IDS-AM).

2020-03-23
Bahrani, Ala, Bidgly, Amir Jalaly.  2019.  Ransomware detection using process mining and classification algorithms. 2019 16th International ISC (Iranian Society of Cryptology) Conference on Information Security and Cryptology (ISCISC). :73–77.

The fast growing of ransomware attacks has become a serious threat for companies, governments and internet users, in recent years. The increasing of computing power, memory and etc. and the advance in cryptography has caused the complicating the ransomware attacks. Therefore, effective methods are required to deal with ransomwares. Although, there are many methods proposed for ransomware detection, but these methods are inefficient in detection ransomwares, and more researches are still required in this field. In this paper, we have proposed a novel method for identify ransomware from benign software using process mining methods. The proposed method uses process mining to discover the process model from the events logs, and then extracts features from this process model and using these features and classification algorithms to classify ransomwares. This paper shows that the use of classification algorithms along with the process mining can be suitable to identify ransomware. The accuracy and performance of our proposed method is evaluated using a study of 21 ransomware families and some benign samples. The results show j48 and random forest algorithms have the best accuracy in our method and can achieve to 95% accuracy in detecting ransomwares.

2020-11-02
Ping, C., Jun-Zhe, Z..  2019.  Research on Intelligent Evaluation Method of Transient Analysis Software Function Test. 2019 International Conference on Advances in Construction Machinery and Vehicle Engineering (ICACMVE). :58–61.

In transient distributed cloud computing environment, software is vulnerable to attack, which leads to software functional completeness, so it is necessary to carry out functional testing. In order to solve the problem of high overhead and high complexity of unsupervised test methods, an intelligent evaluation method for transient analysis software function testing based on active depth learning algorithm is proposed. Firstly, the active deep learning mathematical model of transient analysis software function test is constructed by using association rule mining method, and the correlation dimension characteristics of software function failure are analyzed. Then the reliability of the software is measured by the spectral density distribution method of software functional completeness. The intelligent evaluation model of transient analysis software function testing is established in the transient distributed cloud computing environment, and the function testing and reliability intelligent evaluation are realized. Finally, the performance of the transient analysis software is verified by the simulation experiment. The results show that the accuracy of the software functional integrity positioning is high and the intelligent evaluation of the transient analysis software function testing has a good self-adaptability by using this method to carry out the function test of the transient analysis software. It ensures the safe and reliable operation of the software.

2020-03-23
Tu, Qingqing, Jing, Yulin, Zhu, Weiwei.  2019.  Research on Privacy Security Risk Evaluation of Intelligent Recommendation Mobile Applications Based on a Hierarchical Risk Factor Set. 2019 4th International Conference on Mechanical, Control and Computer Engineering (ICMCCE). :638–6384.

Intelligent recommendation applications based on data mining have appeared as prospective solution for consumer's demand recognition in large-scale data, and it has contained a great deal of consumer data, which become the most valuable wealth of application providers. However, the increasing threat to consumer privacy security in intelligent recommendation mobile application (IR App) makes it necessary to have a risk evaluation to narrow the gap between consumers' need for convenience with efficiency and need for privacy security. For the previous risk evaluation researches mainly focus on the network security or information security for a single work, few of which consider the whole data lifecycle oriented privacy security risk evaluation, especially for IR App. In this paper, we analyze the IR App's features based on the survey on both algorithm research and market prospect, then provide a hierarchical factor set based privacy security risk evaluation method, which includes whole data lifecycle factors in different layers.

2019-06-24
Ijaz, M., Durad, M. H., Ismail, M..  2019.  Static and Dynamic Malware Analysis Using Machine Learning. 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST). :687–691.

Malware detection is an indispensable factor in security of internet oriented machines. The combinations of different features are used for dynamic malware analysis. The different combinations are generated from APIs, Summary Information, DLLs and Registry Keys Changed. Cuckoo sandbox is used for dynamic malware analysis, which is customizable, and provide good accuracy. More than 2300 features are extracted from dynamic analysis of malware and 92 features are extracted statically from binary malware using PEFILE. Static features are extracted from 39000 malicious binaries and 10000 benign files. Dynamically 800 benign files and 2200 malware files are analyzed in Cuckoo Sandbox and 2300 features are extracted. The accuracy of dynamic malware analysis is 94.64% while static analysis accuracy is 99.36%. The dynamic malware analysis is not effective due to tricky and intelligent behaviours of malwares. The dynamic analysis has some limitations due to controlled network behavior and it cannot be analyzed completely due to limited access of network.

2019-11-25
Pham, Dinh-Lam, Ahn, Hyun, Kim, Kwanghoon.  2019.  A Temporal Work Transference Event Log Trace Classification Algorithm and Its Experimental Analysis. 2019 21st International Conference on Advanced Communication Technology (ICACT). :692–696.

In the field of process mining, a lot of information about what happened inside the information system has been exploited and has yielded significant results. However, information related to the relationship between performers and performers is only utilized and evaluated in certain aspects. In this paper, we propose an algorithm to classify the temporal work transference from workflow enactment event log. This result may be used to reduce system memory, increase the computation speed. Furthermore, it can be used as one of the factors to evaluate the performer, active role of resources in the information system.

2020-01-21
Aldairi, Maryam, Karimi, Leila, Joshi, James.  2019.  A Trust Aware Unsupervised Learning Approach for Insider Threat Detection. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :89–98.

With the rapidly increasing connectivity in cyberspace, Insider Threat is becoming a huge concern. Insider threat detection from system logs poses a tremendous challenge for human analysts. Analyzing log files of an organization is a key component of an insider threat detection and mitigation program. Emerging machine learning approaches show tremendous potential for performing complex and challenging data analysis tasks that would benefit the next generation of insider threat detection systems. However, with huge sets of heterogeneous data to analyze, applying machine learning techniques effectively and efficiently to such a complex problem is not straightforward. In this paper, we extract a concise set of features from the system logs while trying to prevent loss of meaningful information and providing accurate and actionable intelligence. We investigate two unsupervised anomaly detection algorithms for insider threat detection and draw a comparison between different structures of the system logs including daily dataset and periodically aggregated one. We use the generated anomaly score from the previous cycle as the trust score of each user fed to the next period's model and show its importance and impact in detecting insiders. Furthermore, we consider the psychometric score of users in our model and check its effectiveness in predicting insiders. As far as we know, our model is the first one to take the psychometric score of users into consideration for insider threat detection. Finally, we evaluate our proposed approach on CERT insider threat dataset (v4.2) and show how it outperforms previous approaches.