Visible to the public Biblio

Filters: Keyword is data mining techniques  [Clear All Filters]
2020-08-24
Raghavan, Pradheepan, Gayar, Neamat El.  2019.  Fraud Detection using Machine Learning and Deep Learning. 2019 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE). :334–339.
Frauds are known to be dynamic and have no patterns, hence they are not easy to identify. Fraudsters use recent technological advancements to their advantage. They somehow bypass security checks, leading to the loss of millions of dollars. Analyzing and detecting unusual activities using data mining techniques is one way of tracing fraudulent transactions. transactions. This paper aims to benchmark multiple machine learning methods such as k-nearest neighbor (KNN), random forest and support vector machines (SVM), while the deep learning methods such as autoencoders, convolutional neural networks (CNN), restricted boltzmann machine (RBM) and deep belief networks (DBN). The datasets which will be used are the European (EU) Australian and German dataset. The Area Under the ROC Curve (AUC), Matthews Correlation Coefficient (MCC) and Cost of failure are the 3-evaluation metrics that would be used.
2019-02-08
Sisiaridis, D., Markowitch, O..  2018.  Reducing Data Complexity in Feature Extraction and Feature Selection for Big Data Security Analytics. 2018 1st International Conference on Data Intelligence and Security (ICDIS). :43-48.

Feature extraction and feature selection are the first tasks in pre-processing of input logs in order to detect cybersecurity threats and attacks by utilizing data mining techniques in the field of Artificial Intelligence. When it comes to the analysis of heterogeneous data derived from different sources, these tasks are found to be time-consuming and difficult to be managed efficiently. In this paper, we present an approach for handling feature extraction and feature selection utilizing machine learning algorithms for security analytics of heterogeneous data derived from different network sensors. The approach is implemented in Apache Spark, using its python API, named pyspark.

2018-11-14
Zhang, J., Zheng, L., Gong, L., Gu, Z..  2018.  A Survey on Security of Cloud Environment: Threats, Solutions, and Innovation. 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC). :910–916.

With the extensive application of cloud computing technology developing, security is of paramount importance in Cloud Computing. In the cloud computing environment, surveys have been provided on several intrusion detection techniques for detecting intrusions. We will summarize some literature surveys of various attack taxonomy, which might cause various threats in cloud environment. Such as attacks in virtual machines, attacks on virtual machine monitor, and attacks in tenant network. Besides, we review massive existing solutions proposed in the literature, such as misuse detection techniques, behavior analysis of network traffic, behavior analysis of programs, virtual machine introspection (VMI) techniques, etc. In addition, we have summarized some innovations in the field of cloud security, such as CloudVMI, data mining techniques, artificial intelligence, and block chain technology, etc. At the same time, our team designed and implemented the prototype system of CloudI (Cloud Introspection). CloudI has characteristics of high security, high performance, high expandability and multiple functions.

2018-03-19
Baron, G..  2017.  On Sequential Selection of Attributes to Be Discretized for Authorship Attribution. 2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA). :229–234.

Different data mining techniques are employed in stylometry domain for performing authorship attribution tasks. Sometimes to improve the decision system the discretization of input data can be applied. In many cases such approach allows to obtain better classification results. On the other hand, there were situations in which discretization decreased overall performance of the system. Therefore, the question arose what would be the result if only some selected attributes were discretized. The paper presents the results of the research performed for forward sequential selection of attributes to be discretized. The influence of such approach on the performance of the decision system, based on Naive Bayes classifier in authorship attribution domain, is presented. Some basic discretization methods and different approaches to discretization of the test datasets are taken into consideration.

2015-05-05
Lomotey, R.K., Deters, R..  2014.  Terms Mining in Document-Based NoSQL: Response to Unstructured Data. Big Data (BigData Congress), 2014 IEEE International Congress on. :661-668.

Unstructured data mining has become topical recently due to the availability of high-dimensional and voluminous digital content (known as "Big Data") across the enterprise spectrum. The Relational Database Management Systems (RDBMS) have been employed over the past decades for content storage and management, but, the ever-growing heterogeneity in today's data calls for a new storage approach. Thus, the NoSQL database has emerged as the preferred storage facility nowadays since the facility supports unstructured data storage. This creates the need to explore efficient data mining techniques from such NoSQL systems since the available tools and frameworks which are designed for RDBMS are often not directly applicable. In this paper, we focused on topics and terms mining, based on clustering, in document-based NoSQL. This is achieved by adapting the architectural design of an analytics-as-a-service framework and the proposal of the Viterbi algorithm to enhance the accuracy of the terms classification in the system. The results from the pilot testing of our work show higher accuracy in comparison to some previously proposed techniques such as the parallel search.