Biblio

Filters: Keyword is Labeling
2023-08-18
Bukharev, Dmitriy A., Ragozin, Andrey N., Sokolov, Alexander N.  2022.  Method for Determining the Optimal Number of Clusters for ICS Information Processes Analysis During Cyberattacks Based on Hierarchical Clustering. 2022 Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT). :309–312.
The development of industrial automation tools and the integration of industrial and corporate networks in order to improve the quality of production management have led to an increase in the risks of successful cyberattacks and, as a result, to the necessity to solve the problems of practical information security of industrial control systems (ICS). Detection of cyberattacks of both known and unknown types can be implemented as anomaly detection in dynamic information processes recorded during the operation of ICS. Anomaly detection methods do not require preliminary analysis and labeling of the training sample. In the context of detecting attacks on ICS, cluster analysis is used as one of the methods that implement anomaly detection. The application of hierarchical cluster analysis for clustering data of ICS information processes exposed to various cyberattacks is studied, and the problem of choosing the level of the cluster hierarchy corresponding to the minimum set of clusters aggregating separately normal and abnormal data is solved. It is shown that the Ward method of hierarchical cluster division produces the best division into clusters. The next stage of the study involves solving the problem of classifying the formed minimum set of clusters, that is, determining which cluster is normal and which cluster is abnormal.
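To make the clustering step in the abstract above more concrete, here is a minimal pure-Python sketch of agglomerative clustering with Ward's merge criterion. The rule used to pick the hierarchy level (cutting just before the largest jump in merge cost) is a common heuristic assumed here for illustration; the abstract does not state the authors' actual selection rule. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def ward_cost(a, b):
    """Ward criterion: increase in within-cluster squared error if a and b merge."""
    na, nb = len(a), len(b)
    centroid_a = [sum(col) / na for col in zip(*a)]
    centroid_b = [sum(col) / nb for col in zip(*b)]
    dist2 = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return na * nb / (na + nb) * dist2


def ward_merge_history(points):
    """Merge clusters bottom-up; return the partition after each merge and its cost.

    partitions[m] is the clustering after m merges, costs[m - 1] the cost of merge m.
    """
    clusters = [[p] for p in points]
    partitions = [[list(c) for c in clusters]]
    costs = []
    while len(clusters) > 1:
        # Naive search for the cheapest pair; fine for a small illustration.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        costs.append(ward_cost(clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        partitions.append([list(c) for c in clusters])
    return partitions, costs


def cut_at_largest_gap(partitions, costs):
    """Assumed heuristic: stop just before the merge whose cost jumps the most.

    Needs at least three points so that two consecutive merge costs exist.
    """
    gaps = [costs[k + 1] - costs[k] for k in range(len(costs) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    return partitions[k + 1]


# Toy data (hypothetical): three "normal" samples and three "abnormal" ones.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (10.0, 10.1), (10.2, 9.9), (9.9, 10.0)]
history, merge_costs = ward_merge_history(samples)
chosen = cut_at_largest_gap(history, merge_costs)
print(len(chosen), "clusters selected")
```

Deciding which of the selected clusters is normal and which is abnormal is a separate step, which the abstract leaves to the next stage of the study.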
2023-08-16
Liu, Lisa, Engelen, Gints, Lynar, Timothy, Essam, Daryl, Joosen, Wouter.  2022.  Error Prevalence in NIDS datasets: A Case Study on CIC-IDS-2017 and CSE-CIC-IDS-2018. 2022 IEEE Conference on Communications and Network Security (CNS). :254–262.
Benchmark datasets are heavily depended upon by the research community to validate theoretical findings and track progression in the state-of-the-art. NIDS dataset creation presents numerous challenges on account of the volume, heterogeneity, and complexity of network traffic, making the process labor intensive, and thus, prone to error. This paper provides a critical review of CIC-IDS-2017 and CSE-CIC-IDS-2018, datasets which have seen extensive usage in the NIDS literature, and are currently considered primary benchmarking datasets for NIDS. We report a large number of previously undocumented errors throughout the dataset creation lifecycle, including in attack orchestration, feature generation, documentation, and labeling. The errors destabilize the results and challenge the findings of numerous publications that have relied on these datasets as benchmarks. We demonstrate the implications of these errors through several experiments. We provide comprehensive documentation to summarize the discovery of these issues, as well as a fully-recreated dataset, with labeling logic that has been reverse-engineered, corrected, and made publicly available for the first time. The findings serve to remind the research community of common pitfalls with dataset creation processes, and of the need to be vigilant when adopting new datasets. Lastly, we strongly recommend the release of labeling logic for any dataset released, to ensure full transparency.
2023-07-21
Paul, Shuva, Kundu, Ripan Kumar.  2022.  A Bagging MLP-based Autoencoder for Detection of False Data Injection Attack in Smart Grid. 2022 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT). :1–5.
The accelerated move toward adopting the Smart Grid paradigm has resulted in numerous drawbacks as far as security is concerned. Traditional power grids are becoming more vulnerable to cyberattacks as all the control decisions are generated based on the data the Smart Grid generates during its operation. This data can be tampered with or attacked in communication lines to mislead the control room in decision-making. The false data injection attack (FDIA) is one of the most severe cyberattacks on today’s cyber-physical power system, as it has the potential to cause significant physical and financial damage. However, detecting such cyberattacks is incredibly challenging since they have no known patterns. In this paper, we launch a random FDIA on the IEEE 39-bus system. We then propose a Bagging MLP-based autoencoder to detect FDIAs in the power system and compare the result with a single ML model. The Bagging MLP-based autoencoder outperforms the Isolation Forest in detecting FDIAs.
2023-06-02
Dalvi, Ashwini, Patil, Gunjan, Bhirud, S G.  2022.  Dark Web Marketplace Monitoring - The Emerging Business Trend of Cybersecurity. 2022 International Conference on Trends in Quantum Computing and Emerging Business Technologies (TQCEBT). :1–6.

Cyber threat intelligence (CTI) is vital for enabling effective cybersecurity decisions by providing timely, relevant, and actionable information about emerging threats. Monitoring the dark web to generate CTI is one of the upcoming trends in cybersecurity. As a result, developing CTI capabilities with the dark web investigation is a significant focus for cybersecurity companies like Deepwatch, DarkOwl, SixGill, ThreatConnect, CyLance, ZeroFox, and many others. In addition, the dark web marketplace (DWM) monitoring tools are of much interest to law enforcement agencies (LEAs). The fact that darknet market participants operate anonymously and online transactions are pseudo-anonymous makes it challenging to identify and investigate them. Therefore, keeping up with the DWMs poses significant challenges for LEAs today. Nevertheless, the offerings on the DWM give insights into the dark web economy to LEAs. The present work is one such attempt to describe and analyze dark web market data collected for CTI using a dark web crawler. After processing and labeling, authors have 53 DWMs with their product listings and pricing.

2023-03-06
Le, Trung-Nghia, Akihiro, Sugimoto, Ono, Shintaro, Kawasaki, Hiroshi.  2020.  Toward Interactive Self-Annotation For Video Object Bounding Box: Recurrent Self-Learning And Hierarchical Annotation Based Framework. 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). :3220–3229.
The amount and variety of training data drastically affect the performance of CNNs. Thus, annotation methods are becoming more and more critical to collect data efficiently. In this paper, we propose a simple yet efficient Interactive Self-Annotation framework to cut down both time and human labor cost for video object bounding box annotation. Our method is based on recurrent self-supervised learning and consists of two processes: an automatic process and an interactive process, where the automatic process aims to build a supporting detector to speed up the interactive process. In the Automatic Recurrent Annotation, we let an off-the-shelf detector watch unlabeled videos repeatedly to reinforce itself automatically. At each iteration, we utilize the trained model from the previous iteration to generate better pseudo ground-truth bounding boxes than those at the previous iteration, recurrently improving the self-supervised training of the detector. In the Interactive Recurrent Annotation, we tackle the human-in-the-loop annotation scenario where the detector receives feedback from the human annotator. To this end, we propose a novel Hierarchical Correction module, where the annotated frame-distance binarizedly decreases at each time step, to utilize the strength of CNN for neighbor frames. Experimental results on various video datasets demonstrate the advantages of the proposed framework in generating high-quality annotations while reducing annotation time and human labor costs.
ISSN: 2642-9381
2022-12-09
Lin, Yuhang, Tunde-Onadele, Olufogorehan, Gu, Xiaohui, He, Jingzhu, Latapie, Hugo.  2022.  SHIL: Self-Supervised Hybrid Learning for Security Attack Detection in Containerized Applications. 2022 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS). :41–50.
Container security has received much research attention recently. Previous work has proposed to apply various machine learning techniques to detect security attacks in containerized applications. On one hand, supervised machine learning schemes require sufficient labelled training data to achieve good attack detection accuracy. On the other hand, unsupervised machine learning methods are more practical by avoiding training data labelling requirements, but they often suffer from high false alarm rates. In this paper, we present SHIL, a self-supervised hybrid learning solution, which combines unsupervised and supervised learning methods to achieve high accuracy without requiring any manual data labelling. We have implemented a prototype of SHIL and conducted experiments over 41 real world security attacks in 28 commonly used server applications. Our experimental results show that SHIL can reduce false alarms by 39-91% compared to existing supervised or unsupervised machine learning schemes while achieving a higher or similar detection rate.
2022-10-16
Trautsch, Alexander, Herbold, Steffen, Grabowski, Jens.  2020.  Static source code metrics and static analysis warnings for fine-grained just-in-time defect prediction. 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). :127–138.
Software quality evolution and predictive models to support decisions about resource distribution in software quality assurance tasks are an important part of software engineering research. Recently, a fine-grained just-in-time defect prediction approach was proposed which has the ability to find bug-inducing files within changes instead of only complete changes. In this work, we utilize this approach and improve it in multiple places: data collection, labeling and features. We include manually validated issue types, an improved SZZ algorithm which discards comments, whitespaces and refactorings. Additionally, we include static source code metrics as well as static analysis warnings and warning density derived metrics as features. To assess whether we can save cost we incorporate a specialized defect prediction cost model. To evaluate our proposed improvements of the fine-grained just-in-time defect prediction approach we conduct a case study that encompasses 38 Java projects, 492,241 file changes in 73,598 commits and spans 15 years. We find that static source code metrics and static analysis warnings are correlated with bugs and that they can improve the quality and cost saving potential of just-in-time defect prediction models.
2022-02-07
Lee, Shan-Hsin, Lan, Shen-Chieh, Huang, Hsiu-Chuan, Hsu, Chia-Wei, Chen, Yung-Shiu, Shieh, Shiuhpyng.  2021.  EC-Model: An Evolvable Malware Classification Model. 2021 IEEE Conference on Dependable and Secure Computing (DSC). :1–8.
Malware evolves quickly as new attack, evasion and mutation techniques are commonly used by hackers to build new malware families. For malware detection and classification, the multi-class learning model is one of the most popular machine learning models in use. To recognize malicious programs, a multi-class model requires malware types to be predefined as output classes in advance, and these cannot be dynamically adjusted after the model is trained. When a new variant or type of malicious program is discovered, the trained multi-class model is no longer valid and has to be retrained completely. This consumes a significant amount of time and resources, and cannot adapt quickly enough to meet the timeliness requirement of dealing with dynamically evolving malware types. To cope with the problem, an evolvable malware classification deep learning model, namely EC-Model, is proposed in this paper, which can dynamically adapt to new malware types without the need for full retraining. Consequently, the reaction time can be significantly reduced to meet the timeliness requirement of malware classification. To the best of our knowledge, our work is the first attempt to adopt multi-task deep learning for evolvable malware classification.
Zhang, Ruichao, Wang, Shang, Burton, Renee, Hoang, Minh, Hu, Juhua, Nascimento, Anderson C A.  2021.  Clustering Analysis of Email Malware Campaigns. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :95–102.
The task of malware labeling on real datasets faces huge challenges (ever-changing datasets and a lack of ground-truth labels) owing to the rapid growth of malware. Clustering malware into their respective families is a well-known tool for improving the efficiency of the malware labeling process. In this paper, we address the challenge of clustering email malware, and carry out a cluster analysis on a real dataset collected from email campaigns over a 13-month period. Our main original contribution is to analyze the usefulness of email header information for malware clustering (a novel approach proposed by Burton [1]), and to compare it with features collected from the malware directly. We compare clustering based on email header information with traditional features extracted from varied resources provided by VirusTotal [2], including static and dynamic analysis. We show that email header information performs excellently.
2021-09-21
Chamotra, Saurabh, Barbhuiya, Ferdous Ahmed.  2020.  Analysis and Modelling of Multi-Stage Attacks. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1268–1275.
Honeypots are information system resources used for capturing and analyzing cyber attacks. High-interaction honeypots are capable of capturing attacks in their totality and hence are an ideal choice for capturing multi-stage cyber attacks. The term multi-stage attack is an abstraction that refers to a class of cyber attacks consisting of multiple attack stages. These attack stages are executed either by malicious code, scripts or sometimes even inbuilt system tools. In the work presented in this paper we propose a framework for capturing, analyzing and modelling multi-stage cyber attacks. The objective of our work is to devise an effective mechanism for the classification of multi-stage cyber attacks. The proposed framework comprises a network of high-interaction honeypots augmented with an attack analysis engine. The analysis engine performs rule-based labeling of captured honeypot data. The labeling engine labels the attack data as generic events. These events are further fused to generate attack graphs. The resulting attack graphs are used to characterize and later classify the multi-stage cyber attacks.
Jin, Xiang, Xing, Xiaofei, Elahi, Haroon, Wang, Guojun, Jiang, Hai.  2020.  A Malware Detection Approach Using Malware Images and Autoencoders. 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). :1–6.
Most machine learning-based malware detection systems use various supervised learning methods to classify different instances of software as benign or malicious. This approach provides no information regarding the behavioral characteristics of malware. It also requires a large amount of training data and is prone to labeling difficulties and can reduce accuracy due to redundant training data. Therefore, we propose a malware detection method based on deep learning, which uses malware images and a set of autoencoders to detect malware. The method is to design an autoencoder to learn the functional characteristics of malware, and then to observe the reconstruction error of autoencoder to realize the classification and detection of malware and benign software. The proposed approach achieves 93% accuracy and comparatively better F1-score values while detecting malware and needs little training data when compared with traditional malware detection systems.
2021-05-20
Mheisn, Alaa, Shurman, Mohammad, Al-Ma’aytah, Abdallah.  2020.  WSNB: Wearable Sensors with Neural Networks Located in a Base Station for IoT Environment. 2020 7th International Conference on Internet of Things: Systems, Management and Security (IOTSMS). :1–4.
The Internet of Things (IoT) is a recently introduced system paradigm that includes different smart devices and applications, especially in smart cities, e.g., manufacturing, homes, and offices. To improve their awareness capabilities, it is attractive to add more sensors to their framework. In this paper, we propose adding a new sensor as a wearable sensor connected wirelessly to a neural network located on the base station (WSNB). WSNB enables the added sensor to refine its labels through active learning. The new sensors achieve an average accuracy of 93.81%, which is 4.5% higher than the existing method, while removing human support and increasing the life cycle of the sensors by using a neural network approach in the base station.
2021-05-18
Zeng, Jingxiang, Nie, Xiaofan, Chen, Liwei, Li, Jinfeng, Du, Gewangzi, Shi, Gang.  2020.  An Efficient Vulnerability Extrapolation Using Similarity of Graph Kernel of PDGs. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1664–1671.
Discovering the potential vulnerabilities in software plays a crucial role in ensuring the security of computer systems. This paper proposes a method that can assist security auditors with the analysis of source code. When security auditors identify new vulnerabilities, our method can be adopted to make a list of recommendations that may have the same vulnerabilities for the security auditors. Our method relies on graph representation to automatically extract the mode of the PDG (program dependence graph, a structure composed of control dependence and data dependence). Besides, it can be applied to the vulnerability extrapolation scenario, thus reducing the amount of code to audit. We worked on an open-source vulnerability test set called Juliet. According to the evaluation results, the clustering effect produced is satisfactory, so that the feature vectors extracted by the Graph2Vec model are applied to labeling, and supervised learning indicators are adopted to assess the model for its ability to extract features. On a total of 12,000 small data sets, the training score of the model can reach up to 99.2%, and the test score can reach a maximum of 85.2%. Finally, the recommendation effect of our work is verified as satisfactory.
2021-04-29
Lu, Y., Zhang, C.  2020.  Nontransitive Security Types for Coarse-grained Information Flow Control. 2020 IEEE 33rd Computer Security Foundations Symposium (CSF). :199–213.

Language-based information flow control (IFC) aims to provide guarantees about information propagation in computer systems having multiple security levels. Existing IFC systems extend the lattice model of Denning's, enforcing transitive security policies by tracking information flows along with a partially ordered set of security levels. They yield a transitive noninterference property of either confidentiality or integrity. In this paper, we explore IFC for security policies that are not necessarily transitive. Such nontransitive security policies avoid unwanted or unexpected information flows implied by transitive policies and naturally accommodate high-level coarse-grained security requirements in modern component-based software. We present a novel security type system for enforcing nontransitive security policies. Unlike traditional security type systems that verify information propagation by subtyping security levels of a transitive policy, our type system relaxes strong transitivity by inferring information flow history through security levels and ensuring that they respect the nontransitive policy in effect. Such a type system yields a new nontransitive noninterference property that offers more flexible information flow relations induced by security policies that do not have to be transitive, therefore generalizing the conventional transitive noninterference. This enables us to directly reason about the extent of information flows in the program and restrict interactions between security-sensitive and untrusted components.
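The difference between a transitive and a nontransitive policy in the abstract above can be illustrated with a small runtime check. This is only a sketch under stated assumptions: the paper describes a static type system, whereas the code below tracks flow history dynamically, and the names `Labeled`, `flow_allowed`, and `relabel`, as well as the levels "A", "B", and "C", are hypothetical.

```python
class Labeled:
    """A value together with the set of security levels it has passed through."""

    def __init__(self, value, history):
        self.value = value
        self.history = frozenset(history)


def flow_allowed(policy, history, sink):
    """Nontransitive check: every level in the history must be allowed to flow
    to `sink` directly; chaining permitted flows is not enough."""
    return all(level == sink or (level, sink) in policy for level in history)


def relabel(policy, data, sink):
    """Move data to `sink`, extending its history, or raise if the policy forbids it."""
    if not flow_allowed(policy, data.history, sink):
        raise PermissionError(f"flow to {sink!r} violates the policy")
    return Labeled(data.value, data.history | {sink})


# Hypothetical policy: A may flow to B and B may flow to C, but A may not reach C.
policy = {("A", "B"), ("B", "C")}
secret = Labeled(42, {"A"})
at_b = relabel(policy, secret, "B")      # permitted: A -> B is in the policy
try:
    relabel(policy, at_b, "C")           # rejected: the history contains A
except PermissionError as err:
    print("rejected:", err)
```

Under a transitive reading of the same policy, the second relabeling would be accepted because A reaches C through B; keeping the flow history is what lets the check reject it.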

2021-03-04
Wang, Y., Wang, Z., Xie, Z., Zhao, N., Chen, J., Zhang, W., Sui, K., Pei, D.  2020.  Practical and White-Box Anomaly Detection through Unsupervised and Active Learning. 2020 29th International Conference on Computer Communications and Networks (ICCCN). :1–9.

To ensure quality of service and user experience, large Internet companies often monitor various Key Performance Indicators (KPIs) of their systems so that they can detect anomalies and identify failures in real time. However, due to the large number of varied KPIs and the lack of high-quality labels, existing KPI anomaly detection approaches either perform well only on certain types of KPIs or consume excessive resources. Therefore, to realize generic and practical KPI anomaly detection in the real world, we propose a KPI anomaly detection framework named iRRCF-Active, which contains an unsupervised and white-box anomaly detector based on Robust Random Cut Forest (RRCF), and an active learning component. Specifically, we propose an improved RRCF (iRRCF) algorithm to overcome the drawbacks of applying the original RRCF to KPI anomaly detection. We also incorporate the idea of active learning so that our model benefits from high-quality labels given by experienced operators. We conduct extensive experiments on a large-scale public dataset and a private dataset collected from a large commercial bank. The experimental results demonstrate that iRRCF-Active performs better than existing traditional statistical methods, unsupervised learning methods and supervised learning methods. Besides, each component in iRRCF-Active has also been demonstrated to be effective and indispensable.

2020-12-28
Abazar, T., Masjedi, P., Taheri, M.  2020.  A Binary Relevance Adaptive Model-Selection for Ensemble Steganalysis. 2020 17th International ISC Conference on Information Security and Cryptology (ISCISC). :77–81.

Steganalysis is an interesting classification problem whose goal is to discriminate images containing hidden messages from clean ones. There are many methods, including deep CNNs, for extracting fine features for this classification task. Nevertheless, little research has been conducted on improving the final classifier. Some state-of-the-art methods ensemble the networks with a voting strategy to achieve more stable performance. In this paper, a selection phase is proposed to filter out improper networks before any voting. This filtering is done by a binary relevance multi-label classification approach. Logistic Regression (LR) is chosen here as the last layer of the network for classification. The large-margin Fisher’s linear discriminant (FLD) classifier is assigned to each one of the networks. It learns to discriminate the training instances for which the associated network is suitable from those for which it is not. Xu-Net, one of the most famous state-of-the-art steganalysis models, is chosen as the base network. The proposed method with different approaches is applied to the BOSSbase dataset and is compared with traditional voting and also some state-of-the-art related ensemble techniques. The results show a significant accuracy improvement of the proposed method in comparison with the others.

2020-09-18
Yao, Bing, Zhao, Meimei, Mu, Yarong, Sun, Yirong, Zhang, Xiaohui, Zhang, Mingjun, Yang, Sihua.  2019.  Matrices From Topological Graphic Coding of Network Security. 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). 1:1992–1996.
Matrices as mathematical models have been used in every branch of science for hundreds of years. We propose a new type of matrices, called topological coding matrices (Topcode-matrices). Topcode-matrices offer the following advantages: Topcode-matrices can be stored in a computer easily and processed quickly in computation; since a Topcode-matrix corresponds to two or more Topsnut-gpws, Topcode-matrices can be used to encrypt networks such that the encrypted networks have higher security; Topcode-matrices can be investigated and applied by people working in more domains; Topcode-matrices can help us to form new operations, new parameters and new topics of graph theory, such as vertex/edge splitting operations and connectivities of graphs. Several properties and applications of Topcode-matrices, and particular Topcode-matrices, as well as unknown problems are introduced.
2020-07-30
Holland, Martin, Stjepandić, Josip, Nigischer, Christopher.  2018.  Intellectual Property Protection of 3D Print Supply Chain with Blockchain Technology. 2018 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC). :1–8.
Within “Industrie 4.0” approach 3D printing technology is characterized as one of the disruptive innovations. Conventional supply chains are replaced by value-added networks. The spatially distributed development of printed components, e.g. for the rapid delivery of spare parts, creates a new challenge when differentiating between “original part”, “copy” or “counterfeit” becomes necessary. This is especially true for safety-critical products. Based on these changes classic branded products adopt the characteristics of licensing models as we know them in the areas of software and digital media. This paper describes the use of digital rights management as a key technology for the successful transition to Additive Manufacturing methods and a key for its commercial implementation and the prevention of intellectual property theft. Risks will be identified along the process chain and solution concepts are presented. These are currently being developed by an 8-partner project named SAMPL (Secure Additive Manufacturing Platform).
2020-07-20
Pengcheng, Li, Yi, Jinfeng, Zhang, Lijun.  2018.  Query-Efficient Black-Box Attack by Active Learning. 2018 IEEE International Conference on Data Mining (ICDM). :1200–1205.
The deep neural network (DNN), a popular machine learning model, has been found to be vulnerable to adversarial attacks. Such an attack constructs adversarial examples by adding small perturbations to the raw input, so that the examples appear unmodified to human eyes but are misclassified by a well-trained classifier. In this paper, we focus on the black-box attack setting where attackers have almost no access to the underlying models. To conduct a black-box attack, a popular approach aims to train a substitute model based on the information queried from the target DNN. The substitute model can then be attacked using existing white-box attack approaches, and the generated adversarial examples will be used to attack the target DNN. Despite its encouraging results, this approach suffers from poor query efficiency, i.e., attackers usually need to issue a huge number of queries to collect enough information for training an accurate substitute model. To this end, we first utilize state-of-the-art white-box attack methods to generate samples for querying, and then introduce an active learning strategy to significantly reduce the number of queries needed. Besides, we also propose a diversity criterion to avoid sampling bias. Our extensive experimental results on MNIST and CIFAR-10 show that the proposed method can eliminate more than 90% of queries while preserving attack success rates and obtaining an accurate substitute model which is more than 85% similar to the target oracle.
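The idea of spending queries only where they are most informative can be sketched with margin-based uncertainty sampling. This is a generic active learning criterion assumed here for illustration; the abstract does not specify the selection rule the authors use, and their diversity criterion is not modeled. `predict_proba` stands in for the substitute model and is hypothetical.

```python
def margin(probs):
    """Gap between the two most likely classes; a small gap means high uncertainty."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_queries(candidates, predict_proba, budget):
    """Return the `budget` candidates on which the substitute model is least certain.

    Only these would be sent to the target model for labeling.
    """
    ranked = sorted(candidates, key=lambda x: margin(predict_proba(x)))
    return ranked[:budget]


# Hypothetical substitute-model outputs for three candidate samples.
toy_probs = {
    "sample_a": [0.90, 0.10],
    "sample_b": [0.55, 0.45],
    "sample_c": [0.60, 0.40],
}
to_query = select_queries(list(toy_probs), toy_probs.get, budget=2)
print(to_query)
```

With a fixed query budget, samples the substitute model already classifies confidently are skipped, which is where the query savings would come from in a scheme of this kind.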
2020-07-06
Chai, Yadeng, Liu, Yong.  2019.  Natural Spoken Instructions Understanding for Robot with Dependency Parsing. 2019 IEEE 9th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER). :866–871.
This paper presents a method based on syntactic information, which can be used for intent determination and slot filling tasks in a spoken language understanding system, including the spoken instruction understanding module for a robot. Some studies in recent years have attempted to solve the problem of spoken language understanding via syntactic information. This research is a further extension of these approaches, based on dependency parsing. In this model, the inputs to the neural network are vectors generated from a dependency parsing tree, which we call window vectors. These vectors contain dependency features that improve the performance of the syntactic-based model. The model has been evaluated on the benchmark ATIS task, and the results show that it outperforms many other syntactic-based approaches; in terms of slot filling in particular, its performance is on par with some state-of-the-art deep learning algorithms of recent years. The model has also been evaluated on FBM3, a dataset of the RoCKIn@Home competition. The overall rate of correctly understanding the instructions for the robot is quite good but still not acceptable for practical use, which is caused by the small scale of FBM3.
2020-03-18
Li, Tao, Guo, Yuanbo, Ju, Ankang.  2019.  A Self-Attention-Based Approach for Named Entity Recognition in Cybersecurity. 2019 15th International Conference on Computational Intelligence and Security (CIS). :147–150.
With the cybersecurity situation growing more and more complex, data-driven security has become indispensable. A large amount of cybersecurity data exists in textual sources, and data analysis is difficult for both security analysts and machines. To convert the textual information into structured data for further automatic analysis, we extract cybersecurity-related entities and propose a self-attention-based neural network model for named entity recognition in cybersecurity. Considering that the single word feature is not enough for identifying an entity, we introduce a CNN to extract character features, which are then concatenated with the word feature. Then we add the self-attention mechanism to the existing BiLSTM-CRF model. Finally, we evaluate the proposed model on the labelled dataset and obtain better performance than the previous entity extraction model.
2019-12-16
Park, Chan Mi, Lee, Jung Yeon, Baek, Hyoung Woo, Lee, Hae-Sung, Lee, JeeHang, Kim, Jinwoo.  2019.  Lifespan Design of Conversational Agent with Growth and Regression Metaphor for the Natural Supervision on Robot Intelligence. 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI). :646–647.
Direct human supervision of a robot's erroneous behavior is crucial to enhancing robot intelligence for 'flawless' human-robot interaction. Motivating humans to engage more actively for this purpose is, however, difficult. To alleviate such strain, this research proposes a novel approach, a growth and regression metaphoric interaction design inspired by the communicative, intellectual, and social competence aspects of human developmental stages. We implemented the interaction design principle in a conversational agent combined with a set of synthetic sensors. Within this context, we aim to show that the agent successfully encourages the online labeling activity in response to the faulty behavior of robots as a supervision process. A field study will be conducted to evaluate the efficacy of our proposal by measuring the annotation performance of real-time activity events in the wild. We expect to provide a more effective and practical means of supervising robots through a real-time data labeling process for long-term usage in human-robot interaction.
2019-10-15
Panagiotakis, C., Papadakis, H., Fragopoulou, P.  2018.  Detection of Hurriedly Created Abnormal Profiles in Recommender Systems. 2018 International Conference on Intelligent Systems (IS). :499–506.

Recommender systems try to predict the preferences of users for specific items. These systems suffer from profile injection attacks, where the attackers have some prior knowledge of the system ratings and their goal is to promote or demote a particular item by introducing abnormal (anomalous) ratings. The detection of both cases is a challenging problem. In this paper, we propose a framework to spot anomalous rating profiles (outliers), where the outliers hurriedly create a profile that injects into the system either random ratings or specific ratings, without any prior knowledge of the existing ratings. The proposed detection method is based on the unpredictable behavior of the outliers in a validation set, on the user-item rating matrix and on the similarity between users. The proposed system is totally unsupervised, and in the last step it uses the k-means clustering method to spot the spurious profiles automatically. For the cases where labeled sample data are available, a random forest classifier is trained to show how supervised methods outperform unsupervised ones. Experimental results on the MovieLens 100k and the MovieLens 1M datasets demonstrate the high performance of the proposed schemata.
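The final unsupervised step described above, where k-means separates spurious profiles from the rest, can be illustrated with a minimal two-cluster k-means over a single score per user. The one-dimensional "unpredictability score" is an assumption made for this sketch; the abstract does not say which features the authors feed to k-means, and the function name is hypothetical.

```python
def two_means_1d(scores, iterations=20):
    """Split scores into a low and a high cluster with k-means (k = 2).

    Returns one boolean per score: True if it falls in the high cluster,
    i.e. the profile would be flagged as suspicious in this sketch.
    """
    lo, hi = min(scores), max(scores)        # deterministic initial centroids
    for _ in range(iterations):
        low = [s for s in scores if abs(s - lo) <= abs(s - hi)]
        high = [s for s in scores if abs(s - lo) > abs(s - hi)]
        if not high:                          # all scores identical: nothing to split
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return [abs(s - lo) > abs(s - hi) for s in scores]


# Hypothetical per-user scores: higher means ratings were harder to predict.
user_scores = [0.10, 0.20, 0.15, 0.90, 0.95]
flags = two_means_1d(user_scores)
print(flags)
```

A real system would cluster on several features at once; the single score only keeps the example short.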

2019-06-10
Jain, D., Khemani, S., Prasad, G..  2018.  Identification of Distributed Malware. 2018 IEEE 3rd International Conference on Communication and Information Systems (ICCIS). :242-246.

Smartphones have evolved over the years from simple communication devices into fully functional portable computers that, albeit with comparatively less computational power, hold multiple applications. With the smartphone revolution, the value of personal data has increased. As technological complexities increase, so do the vulnerabilities in the system. Smartphones are the latest target for attacks. Android, being an open source platform and also the most widely used smartphone OS, draws the attention of many malware writers seeking to exploit its vulnerabilities. Attackers try to take advantage of these vulnerabilities to fool users and misuse their data. Malware has come a long way from simple worms to sophisticated DDoS attacks using botnets, and the latest trends in computer malware tend toward the distributed direction, to evade the multiple anti-virus apps developed to counter generic viruses and Trojans. However, the recent trend in the Android system is to have a combination of applications that acts as malware. The applications are benign individually, but when grouped they may result in malicious activity. This paper proposes a new category of distributed malware in the Android system, and describes how it can be used to evade current security and how it can be detected with the help of a graph matching algorithm.

2019-02-25
Lekshmi, M. B., Deepthi, V. R..  2018.  Spam Detection Framework for Online Reviews Using Hadoop’s Computational Capability. 2018 International CET Conference on Control, Communication, and Computing (IC4). :436–440.
Nowadays, online reviews have become one of the vital elements of online shopping for customers. Organizations and individuals use this information to buy the right products and make business decisions. This has led spammers and unethical business people to create false reviews and promote their products to outdo the competition. Spammers develop sophisticated systems to create spam reviews in bulk on any website within hours. To tackle this problem, studies have been conducted to formulate effective ways to detect spam reviews. Various spam detection methods have been introduced, most of which extract meaningful features from the text or use machine learning techniques. These approaches gave little importance to the extracted feature type and the processing rate. NetSpam[1] defines a framework that classifies the review dataset based on spam features and maps them to a spam detection procedure that performs better than previous works in predictive accuracy. In this work, a method is proposed that improves the processing rate by applying a distributed approach to the review dataset using MapReduce. The parallel programming concept behind MapReduce is used for processing big data in Hadoop. The solution involves parallelising the algorithm defined in NetSpam, and it defines a spam detection procedure with better predictive accuracy and processing rate.
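
The map, shuffle, and reduce phases that MapReduce applies to a review dataset can be sketched in a few lines of plain Python. This is only an illustration of the programming model, not the NetSpam algorithm or the authors' Hadoop job: the per-reviewer review count is a hypothetical feature chosen for brevity, and the names `map_review`, `shuffle`, `reduce_counts`, and `run_job` are assumptions introduced here.

```python
from collections import defaultdict


def map_review(review):
    """Map step: emit one (reviewer, 1) pair per review record."""
    yield review["reviewer"], 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: total the values collected for one key."""
    return key, sum(values)


def run_job(reviews):
    """Run the three phases in-process over a list of review dicts."""
    pairs = [pair for review in reviews for pair in map_review(review)]
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


# Hypothetical records; a real job would read them from distributed storage.
counts = run_job([{"reviewer": "a"}, {"reviewer": "b"}, {"reviewer": "a"}])
```

On a cluster, the map and reduce calls run in parallel across nodes, each handling a different slice of the data, which is where the improvement in processing rate described above would come from.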