
Filters: Keyword is Vocabulary
2023-07-10
Zhang, Xiao, Chen, Xiaoming, He, Yuxiong, Wang, Youhuai, Cai, Yong, Li, Bo.  2022.  Neural Network-Based DDoS Detection on Edge Computing Architecture. 2022 4th International Conference on Applied Machine Learning (ICAML). :1—4.
The safety of the power system is inherently vital because of the high risks involved in operating an electric power system. In the wave of digitization in recent years, many power systems have been digitized to some extent. Under these circumstances, network security is particularly important for ensuring the normal operation of the power system. However, with the development of the Internet, network security issues are becoming more and more serious. Among all kinds of network attacks, the Distributed Denial of Service (DDoS) attack is a major threat. Attackers once used huge volumes of traffic over a short time to bring down a victim server; now some attackers use low volumes of traffic over a long time to frustrate attack detection. There are many methods for DDoS detection, but none can detect it fully because of the huge traffic volumes involved. To better detect DDoS attacks and ensure the safety of the electric power system, we propose a novel detection method based on a neural network. The proposed model and its service are deployed to the edge cloud, which improves the real-time performance of detection. The experimental results show that our model detects attacks well and has good real-time performance.
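The abstract does not give the network architecture, so as a minimal sketch of the classification idea, the following trains a single logistic neuron on toy flow features; the feature choice (packet rate and flow duration, capturing both high-rate and low-and-slow attacks) and all data are illustrative assumptions, not the paper's model.

```python
import math

def train_neuron(samples, labels, lr=0.5, epochs=200):
    """Train one logistic neuron by gradient descent (toy stand-in
    for the paper's neural network; features are assumptions)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # sigmoid activation
            err = p - y                          # gradient of log-loss
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0

# Features: [normalized packets/sec, normalized mean flow duration].
# High-rate short flows resemble classic DDoS; low-and-slow attacks
# show up as long-lived low-rate flows, so duration matters too.
flows  = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [1, 1, 0, 0]  # 1 = attack, 0 = benign
w, b = train_neuron(flows, labels)
print(predict(w, b, [0.85, 0.15]))  # expected: 1 (attack-like flow)
```

A deployed edge-cloud model would of course use real flow features and a deeper network; this only shows the train-then-classify shape of the approach.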
2023-02-17
Xu, Mingming, Zhang, Lu, Zhu, Haiting.  2022.  Finding Collusive Spam in Community Question Answering Platforms: A Pattern and Burstiness Based Method. 2021 Ninth International Conference on Advanced Cloud and Big Data (CBD). :89–94.
Community question answering (CQA) websites have become very popular platforms, attracting numerous participants to share and acquire knowledge and information on the Internet. However, with the rapid growth of crowdsourcing systems, many malicious users organize collusive attacks against CQA platforms to promote a target (product or service) by posting suggestive questions and deceptive answers. These manipulated, deceptive contents, aggregated into multiple collusive question-and-answer (Q&A) spam groups, can fully control the sentiment around a target and distort the decisions of users, which pollutes the CQA environment and makes it less credible. In this paper, we propose a Pattern and Burstiness based Collusive Q&A Spam Detection method (PBCSD) to identify deceptive questions and answers. Specifically, we intensively study the campaign process of crowdsourcing tasks and summarize the clues left at the vocabulary-usage level of Q&As when collusive attacks are launched. Based on these clues, we extract Q&A groups using frequent pattern mining and further purify them by the burstiness of the Q&As' posting times. By designing several discriminative features at the Q&A group level, multiple machine-learning-based classifiers can be used to judge the groups as deceptive or ordinary, and the Q&As in deceptive groups are finally identified as collusive Q&A spam. We evaluate the proposed PBCSD method on a real-world dataset collected from Baidu Zhidao, a well-known CQA platform in China, and the experimental results demonstrate that PBCSD is effective for collusive Q&A spam detection and outperforms a number of state-of-the-art methods.
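The mine-then-purify step can be sketched as follows: group posts sharing a frequent word pattern, then keep only groups whose posting times are bursty. This is a simplified stand-in, with word pairs for patterns and an invented time window and support threshold; the paper's actual mining and feature design are richer.

```python
from itertools import combinations

def collusive_groups(posts, min_support=3, burst_window=3600):
    """Group (timestamp, text) posts by shared 2-word patterns, then
    purify by burstiness of posting times (illustrative sketch)."""
    # Map each 2-word pattern to the posts containing both words.
    pattern_posts = {}
    for ts, text in posts:
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            pattern_posts.setdefault(pair, []).append((ts, text))
    groups = []
    for pair, members in pattern_posts.items():
        if len(members) < min_support:
            continue  # pattern not frequent enough
        times = [ts for ts, _ in members]
        if max(times) - min(times) <= burst_window:  # bursty posting
            groups.append((pair, members))
    return groups

# Invented posts: three promotional Q&As in a burst, one ordinary post.
posts = [
    (100, "best acme cream ever"),
    (200, "acme cream cured my skin"),
    (300, "where to buy acme cream"),
    (99999, "how do I fix my bike"),
]
hits = collusive_groups(posts)
print(any(p == ("acme", "cream") for p, _ in hits))  # expected: True
```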
2022-09-09
Teodorescu, Horia-Nicolai.  2021.  Applying Chemical Linguistics and Stylometry for Deriving an Author’s Scientific Profile. 2021 International Symposium on Signals, Circuits and Systems (ISSCS). :1—4.
The study exercises computational linguistics, specifically chemical linguistics methods, for profiling an author. We analyze the vocabulary and style of the titles of the most visible works of Cristofor I. Simionescu, an internationally well-known chemist, to detect specific patterns in his research interests and methods. Somewhat surprisingly, while the tools used are elementary and only a small number of words enter the analysis, some interesting details emerged about the work of the analyzed personality. Some of these aspects were confirmed by experts in the field. We believe this is the first study aimed at author profiling in chemical linguistics, and moreover the first to question the usefulness of Google Scholar for author profiling.
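The vocabulary-level profiling described here is, at its most elementary, a frequency analysis over title words. A minimal sketch, with invented titles in the spirit of polymer chemistry and an assumed stopword list:

```python
from collections import Counter

def title_profile(titles, top=3):
    """Toy stylometric profile: most frequent content words across an
    author's paper titles (stopwords and titles are illustrative)."""
    stop = {"of", "the", "and", "in", "on", "a", "for", "by", "with"}
    words = Counter()
    for t in titles:
        words.update(w for w in t.lower().split() if w not in stop)
    return words.most_common(top)

# Hypothetical titles, not actual works of the analyzed author.
titles = [
    "Graft copolymerization of cellulose derivatives",
    "Plasma-induced polymerization of acrylonitrile",
    "Cellulose modification by graft polymerization",
]
print(title_profile(titles))
```

The recurring content words ("graft", "cellulose", "polymerization") are what such a profile surfaces as research-interest patterns.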
2022-06-06
Yeboah-Ofori, Abel, Ismail, Umar Mukhtar, Swidurski, Tymoteusz, Opoku-Boateng, Francisca.  2021.  Cyberattack Ontology: A Knowledge Representation for Cyber Supply Chain Security. 2021 International Conference on Computing, Computational Modelling and Applications (ICCMA). :65–70.
Cyberattacks on cyber supply chain (CSC) systems and their cascading impacts have brought many challenges and different threat levels with unpredictable consequences. Embedded network nodes have various loopholes that could be exploited by threat actors, leading to various attacks and risks and to the threat of cascading attacks on the various systems. Key factors such as the lack of a common ontology vocabulary, limited semantic interoperability of cyberattack information, and inadequately conceptualized ontology learning and hierarchical approaches to representing relationships in the CSC security domain have hindered explicit knowledge representation. This paper explores cyberattack ontology learning to describe the security concepts, properties, and relationships required to model security goals. Providing a semantic mapping between different organizational and vendor security goals has been inherently challenging, and a cyberattack ontology supports such a mapping. The contributions of this paper are threefold. First, we consider CSC security modelling, including goals, actors, attacks, TTPs, and requirements, using semantic rules for logical representation. Secondly, we model a cyberattack ontology for semantic mapping and knowledge representation. Finally, we discuss concepts for threat intelligence and knowledge reuse. The results show that the cyberattack ontology concepts could be used to improve CSC security.
2022-02-03
García, Kimberly, Zihlmann, Zaira, Mayer, Simon, Tamò-Larrieux, Aurelia, Hooss, Johannes.  2021.  Towards Privacy-Friendly Smart Products. 2021 18th International Conference on Privacy, Security and Trust (PST). :1—7.
Smart products, such as toy robots, must comply with multiple legal requirements of the countries they are sold and used in. Currently, compliance with the legal environment requires manually customizing products for different markets. In this paper, we explore a design approach for smart products that enforces compliance with aspects of the European Union’s data protection principles within a product’s firmware through a toy robot case study. To this end, we present an exchange between computer scientists and legal scholars that identified the relevant data flows, their processing needs, and the implementation decisions that could allow a device to operate while complying with the EU data protection law. By designing a data-minimizing toy robot, we show that the variety, amount, and quality of data that is exposed, processed, and stored outside a user’s premises can be considerably reduced while preserving the device’s functionality. In comparison with a robot designed using a traditional approach, in which 90% of the collected types of information are stored by the data controller or a remote service, our proposed design leads to the mandatory exposure of only 7 out of 15 collected types of information, all of which are legally required by the data controller to demonstrate consent. Moreover, our design is aligned with the Data Privacy Vocabulary, which enables the toy robot to cross geographic borders and seamlessly adjust its data processing activities to the local regulations.
2022-01-31
Peitek, Norman, Apel, Sven, Parnin, Chris, Brechmann, André, Siegmund, Janet.  2021.  Program Comprehension and Code Complexity Metrics: An fMRI Study. 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). :524–536.
Background: Researchers and practitioners have been using code complexity metrics for decades to predict how developers comprehend a program. While it is plausible and tempting to use code metrics for this purpose, their validity is debated, since they rely on simple code properties and rarely consider particularities of human cognition. Aims: We investigate whether and how code complexity metrics reflect the difficulty of program comprehension. Method: We have conducted a functional magnetic resonance imaging (fMRI) study with 19 participants, observing program comprehension of short code snippets at varying complexity levels. We dissected four classes of code complexity metrics and their relationship to neuronal, behavioral, and subjective correlates of program comprehension, overall analyzing more than 41 metrics. Results: While our data corroborate that complexity metrics can, to a limited degree, explain programmers' cognition in program comprehension, fMRI allowed us to gain insights into why some code properties are difficult to process. In particular, a code's textual size drives programmers' attention, and vocabulary size burdens programmers' working memory. Conclusion: Our results provide neuro-scientific evidence supporting warnings of prior research questioning the validity of code complexity metrics and pin down factors relevant to program comprehension. Future Work: We outline several follow-up experiments investigating fine-grained effects of code complexity and describe possible refinements to code complexity metrics.
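The two properties the study singles out, textual size and vocabulary size, are among the simplest metrics to compute. A minimal sketch of both (the identifier-level tokenization is a simplified proxy for a Halstead-style vocabulary count, not the paper's exact operationalization):

```python
import re

def code_metrics(snippet):
    """Textual size (non-blank LOC) and vocabulary size (distinct
    word-like tokens) for a code snippet."""
    lines = [l for l in snippet.splitlines() if l.strip()]
    tokens = re.findall(r"[A-Za-z_]\w*", snippet)  # identifiers/keywords
    return {"loc": len(lines), "vocabulary": len(set(tokens))}

snippet = """
def sum_to(n):
    total = 0
    for i in range(n):
        total += i
    return total
"""
print(code_metrics(snippet))  # expected: {'loc': 5, 'vocabulary': 9}
```

By the study's findings, the first number loads on attention and the second on working memory, which is why two snippets of equal length can differ in comprehension difficulty.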
2021-12-22
Poli, Jean-Philippe, Ouerdane, Wassila, Pierrard, Régis.  2021.  Generation of Textual Explanations in XAI: The Case of Semantic Annotation. 2021 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1–6.
Semantic image annotation is a field of paramount importance in which deep learning excels. However, some application domains, like security or medicine, may need an explanation of this annotation. Explainable Artificial Intelligence is an answer to this need. In this work, an explanation is a sentence in natural language, dedicated to human users, that provides them with clues about the process leading to the decision: the assignment of labels to image parts. We focus on semantic image annotation with fuzzy logic, which has proven to be a useful framework that captures both the imprecision of image segmentation and the vagueness of human spatial knowledge and vocabulary. In this paper, we present an algorithm for generating textual explanations of the semantic annotation of image regions.
2021-11-29
Piazza, Nancirose.  2020.  Classification Between Machine Translated Text and Original Text By Part Of Speech Tagging Representation. 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA). :739–740.
Classifiers distinguishing machine-translated text from original text often tokenize on the vocabulary of the corpora. With N-grams larger than a uni-gram, one can build a model that estimates a decision boundary from the word-frequency probability distribution; however, this approach is exponentially expensive because of high dimensionality and sparsity. Instead, we represent samples of the corpora by part-of-speech tags, which form a significantly smaller vocabulary. With fewer tri-gram permutations, we can build a model from the tri-gram frequency probability distribution. In this paper, we explore less conventional approaches to handling documents, dictionaries, and the like.
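The representational shift described above, from word tri-grams to POS-tag tri-grams, can be sketched as follows; the tag sequences are hand-assigned toy data, whereas in practice a POS tagger would produce them.

```python
from collections import Counter

def pos_trigram_dist(tagged_sentences):
    """Tri-gram frequency distribution over POS tags rather than words,
    shrinking the feature space from the corpus vocabulary to the
    (much smaller) tag set."""
    counts = Counter()
    for tags in tagged_sentences:
        for i in range(len(tags) - 2):
            counts[tuple(tags[i:i + 3])] += 1
    total = sum(counts.values())
    return {tri: c / total for tri, c in counts.items()}

# POS sequences for two toy sentences (e.g. "The cat sat there").
tagged = [
    ["DET", "NOUN", "VERB", "ADV"],
    ["DET", "NOUN", "VERB", "NOUN"],
]
dist = pos_trigram_dist(tagged)
print(dist[("DET", "NOUN", "VERB")])  # expected: 0.5
```

With a tag set of a few dozen symbols instead of a vocabulary of many thousands of words, the tri-gram feature space stays dense enough to estimate reliably.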
2021-10-12
Martiny, Karsten, Denker, Grit.  2020.  Partial Decision Overrides in a Declarative Policy Framework. 2020 IEEE 14th International Conference on Semantic Computing (ICSC). :271–278.
The ability to specify various policies with different overriding criteria allows for complex sets of sharing policies. This is particularly useful in situations in which data privacy depends on various properties of the data, and complex policies are needed to express the conditions under which data is protected. However, if overriding policy decisions constrain the affected data, decisions from overridden policies should not be suppressed completely, because they can still apply to subsets of the affected data. This article describes how a privacy policy framework can be extended with a mechanism to partially override decisions based on specified constraints. Our solution automatically generates complementary sets of decisions for both the overridden and the complementary, non-overridden subsets of the data and thus provides a means to specify complex policies tailored to specific properties of the protected data.
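The partitioning idea can be sketched in a few lines: the overriding decision applies only to items matching its constraint, while the overridden policy's decision survives on the complement instead of being suppressed. The policy structure and data below are hypothetical, not the framework's actual representation.

```python
def resolve(data_items, overriding, overridden):
    """Partial override: split the data by the overriding policy's
    constraint and emit a decision for each subset."""
    matched = [d for d in data_items if overriding["constraint"](d)]
    rest = [d for d in data_items if not overriding["constraint"](d)]
    return ([(d, overriding["decision"]) for d in matched] +
            [(d, overridden["decision"]) for d in rest])

# Hypothetical policies: deny sharing of location records, allow the rest.
items = [{"type": "location"}, {"type": "temperature"}]
deny_location = {"constraint": lambda d: d["type"] == "location",
                 "decision": "deny"}
allow_all = {"decision": "allow"}
print(resolve(items, deny_location, allow_all))
```

Without the partial mechanism, the "allow" decision would be dropped entirely and the temperature record would get no decision at all; the split preserves it on the non-overridden subset.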
2021-09-07
Kumar, Nripesh, Srinath, G., Prataap, Abhishek, Nirmala, S. Jaya.  2020.  Attention-based Sequential Generative Conversational Agent. 2020 5th International Conference on Computing, Communication and Security (ICCCS). :1–6.
In this work, we examine the method of enabling computers to understand human interaction by constructing a generative conversational agent. An experimental approach in trying to apply the techniques of natural language processing using recurrent neural networks (RNNs) to emulate the concept of textual entailment or human reasoning is presented. To achieve this functionality, our experiment involves developing an integrated Long Short-Term Memory cell neural network (LSTM) system enhanced with an attention mechanism. The results achieved by the model are shown in terms of the number of epochs versus loss graphs as well as a brief illustration of the model's conversational capabilities.
2020-08-28
Khomytska, Iryna, Teslyuk, Vasyl.  2019.  The Software for Authorship and Style Attribution. 2019 IEEE 15th International Conference on the Experience of Designing and Application of CAD Systems (CADSM). :1—4.

A new program has been developed for style and authorship attribution. Differentiation of styles by transcription symbols has proved to be efficient. The novel approach involves a combination of two ways of transforming texts into their transcription variants. The Java programming language makes it possible to improve the efficiency of style and authorship attribution.

2019-03-15
Deliu, I., Leichter, C., Franke, K..  2018.  Collecting Cyber Threat Intelligence from Hacker Forums via a Two-Stage, Hybrid Process Using Support Vector Machines and Latent Dirichlet Allocation. 2018 IEEE International Conference on Big Data (Big Data). :5008-5013.

Traditional security controls, such as firewalls, anti-virus and IDS, are ill-equipped to help IT security and response teams keep pace with the rapid evolution of the cyber threat landscape. Cyber Threat Intelligence (CTI) can help remediate this problem by exploiting non-traditional information sources, such as hacker forums and "dark-web" social platforms. Security and response teams can use the collected intelligence to identify emerging threats. Unfortunately, when manual analysis is used to extract CTI from non-traditional sources, it is a time consuming, error-prone and resource intensive process. We address these issues by using a hybrid Machine Learning model that automatically searches through hacker forum posts, identifies the posts that are most relevant to cyber security and then clusters the relevant posts into estimations of the topics that the hackers are discussing. The first (identification) stage uses Support Vector Machines and the second (clustering) stage uses Latent Dirichlet Allocation. We tested our model, using data from an actual hacker forum, to automatically extract information about various threats such as leaked credentials, malicious proxy servers, malware that evades AV detection, etc. The results demonstrate our method is an effective means for quickly extracting relevant and actionable intelligence that can be integrated with traditional security controls to increase their effectiveness.
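The two-stage shape of the pipeline, filter for relevance, then group by topic, can be sketched with stdlib stand-ins: a keyword scorer replaces the SVM relevance classifier and a dominant-term grouping replaces LDA topic clustering. All term lists and posts are invented for illustration; the paper's actual stages are trained models.

```python
def two_stage_pipeline(posts, relevance_terms, topic_terms):
    """Stage 1: keep security-relevant posts. Stage 2: assign each
    relevant post to the topic whose terms it mentions most often."""
    relevant = [p for p in posts
                if sum(t in p.lower() for t in relevance_terms) >= 1]
    topics = {name: [] for name in topic_terms}
    for p in relevant:
        best = max(topic_terms,
                   key=lambda n: sum(t in p.lower()
                                     for t in topic_terms[n]))
        topics[best].append(p)
    return topics

posts = [
    "selling fresh leaked credentials, paypal logins",
    "new crypter makes malware fully undetectable by AV",
    "anyone watched the game last night?",
]
topics = two_stage_pipeline(
    posts,
    relevance_terms=["leaked", "malware", "exploit", "credentials"],
    topic_terms={"credentials": ["leaked", "logins", "paypal"],
                 "malware": ["malware", "crypter", "av"]})
print(len(topics["credentials"]), len(topics["malware"]))  # expected: 1 1
```

The off-topic forum post is discarded in stage 1, so stage 2 clusters only the security-relevant material, which is the point of the hybrid design.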

2017-12-12
Jiang, L., Kuhn, W., Yue, P..  2017.  An interoperable approach for Sensor Web provenance. 2017 6th International Conference on Agro-Geoinformatics. :1–6.

The Sensor Web is evolving into a complex information space, where large volumes of sensor observation data are often consumed by complex applications. Provenance has become an important issue in the Sensor Web, since it allows applications to answer “what”, “when”, “where”, “who”, “why”, and “how” queries related to observations and consumption processes, which helps determine the usability and reliability of data products. This paper investigates characteristics and requirements of provenance in the Sensor Web and proposes an interoperable approach to building a provenance model for the Sensor Web. Our provenance model extends the W3C PROV Data Model with Sensor Web domain vocabularies. It is developed using Semantic Web technologies and thus allows provenance information of sensor observations to be exposed in the Web of Data using the Linked Data approach. A use case illustrates the applicability of the approach.
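A provenance graph in the spirit of the approach can be sketched as PROV-style triples linking an observation (Entity) to the sensing process (Activity) and sensor (Agent), with a domain term layered on top. The identifiers and the `ssn:`-style vocabulary binding below are illustrative assumptions, not the paper's actual model.

```python
def prov_record():
    """Minimal PROV-style graph for one sensor observation,
    as subject-predicate-object triples."""
    return [
        ("obs:temp-42", "rdf:type", "prov:Entity"),
        ("obs:temp-42", "rdf:type", "ssn:Observation"),  # domain extension
        ("obs:temp-42", "prov:wasGeneratedBy", "act:sensing-run-7"),
        ("act:sensing-run-7", "prov:wasAssociatedWith",
         "agent:thermometer-3"),
        ("act:sensing-run-7", "prov:used", "ent:calibration-file"),
    ]

def generated_by(triples, entity):
    """Answer a 'how was this observation produced?' provenance query."""
    return [o for s, p, o in triples
            if s == entity and p == "prov:wasGeneratedBy"]

g = prov_record()
print(generated_by(g, "obs:temp-42"))  # expected: ['act:sensing-run-7']
```

Serialized as RDF, such triples are exactly what lets provenance be published as Linked Data and queried with the "who/how/when" questions the abstract lists.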

2017-11-20
Rudolph, M., Moucha, C., Feth, D..  2016.  A Framework for Generating User-and Domain-Tailored Security Policy Editors. 2016 IEEE 24th International Requirements Engineering Conference Workshops (REW). :56–61.

In modern enterprises, incorrect or inconsistent security policies can lead to massive damage, e.g., through unintended data leakage. As policy authors have different skills and background knowledge, usable policy editors have to be tailored to the author's individual needs and to the corresponding application domain. However, the development of individual policy editors and the customization of existing ones is an effort-consuming task. In this paper, we present a framework for generating tailored policy editors. To enable user-friendly and less error-prone specification of security policies, the framework supports multiple platforms, policy languages, and specification paradigms.

2017-03-07
Almeida, Ricardo, Maio, Paulo, Oliveira, Paulo, Barroso, João.  2016.  Ontology Based Rewriting Data Cleaning Operations. Proceedings of the Ninth International C* Conference on Computer Science & Software Engineering. :85–88.

Dealing with increasing amounts of data creates the need to handle redundant, inconsistent, and/or complementary repositories, which may differ in their data models and/or schemas. Current data cleaning techniques developed to tackle data quality problems are only suitable for scenarios where all repositories share the same model and schema. Recently, an ontology-based methodology was proposed to overcome this limitation. In this paper, this methodology is briefly described and applied to a real scenario in the health domain with data quality problems.