Biblio
Filters: Keyword is Semantics
Detecting and Classifying Self-Deleting Windows Malware Using Prefetch Files. 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC). :0745–0751.
2022. Malware detection and analysis can be a burdensome task for incident responders. As such, research has turned to machine learning to automate malware detection and malware family classification. Existing work extracts and engineers static and dynamic features from the malware sample to train classifiers. Despite promising results, such techniques assume that the analyst has access to the malware executable file. Self-deleting malware invalidates this assumption and requires analysts to find forensic evidence of malware execution for further analysis. In this paper, we present and evaluate an approach to detecting malware that executed on a Windows target and further classifying the malware into its associated family to provide semantic insight. Specifically, we engineer features from the Windows prefetch file, a file system forensic artifact that archives process information. Results show that it is possible to detect the malicious artifact with 99% accuracy; furthermore, classifying the malware into a fine-grained family has comparable performance to techniques that require access to the original executable. We also provide a thorough security discussion of the proposed approach against adversarial diversity.
The Application of 1D-CNN in Microsoft Malware Detection. 2022 7th International Conference on Big Data Analytics (ICBDA). :181–187.
2022. Cybersecurity has long been a focus of attention in computing, and detecting malware effectively is one of the central and most difficult problems in network security research. Existing malware detection schemes fall mainly into two categories: database matching and machine learning. With the rise of deep learning, more and more deep learning methods are being applied to malware detection, since deep neural networks can extract deeper semantic features. The main tasks of this paper are as follows: (1) using machine learning methods and one-dimensional convolutional neural networks (1D-CNNs) to detect malware, and (2) proposing a detection method that combines machine learning and deep learning. Using machine learning alone, LGBM obtains an accuracy of 67.16%, and the one-dimensional CNN obtains an accuracy of 72.47%. In (2), LGBM is used to screen features by importance before a one-dimensional convolutional neural network is applied, which further improves the detection accuracy to 78.64%.
Learning a Deep Reinforcement Learning Policy Over the Latent Space of a Pre-trained GAN for Semantic Age Manipulation. 2021 International Joint Conference on Neural Networks (IJCNN). :1–8.
2021. Learning a disentangled representation of the latent space has become one of the most fundamental problems studied in computer vision. Recently, many Generative Adversarial Networks (GANs) have shown promising results in generating high fidelity images. However, studies to understand the semantic layout of the latent space of pre-trained models are still limited. Several works train conditional GANs to generate faces with required semantic attributes. Unfortunately, in these attempts, the generated output is often not as photo-realistic as the unconditional state-of-the-art models. Besides, they also require large computational resources and specific datasets to generate high fidelity images. In our work, we have formulated a Markov Decision Process (MDP) over the latent space of a pre-trained GAN model to learn a conditional policy for semantic manipulation along specific attributes under defined identity bounds. Further, we have defined a semantic age manipulation scheme using a locally linear approximation over the latent space. Results show that our learned policy samples high fidelity images with required age alterations, while preserving the identity of the person.
Document-Level Biomedical Relation Extraction with Generative Adversarial Network and Dual-Attention Multi-Instance Learning. 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). :438–443.
2021. Document-level relation extraction (RE) aims to extract relations among entities within a document, which is more complex than its sentence-level counterpart, especially in biomedical text mining. Chemical-disease relation (CDR) extraction aims to extract complex semantic relationships between chemical and disease entities in documents. In order to identify the relations within and across multiple sentences at the same time, existing methods try to build different document-level heterogeneous graphs. However, the entity relation representations captured by these models do not make full use of the document information and disregard the noise introduced in the process of integrating various information. In this paper, we propose a novel model, DAM-GAN, for document-level biomedical RE, which can extract entity-level and mention-level representations of relation instances with R-GCN and Dual-Attention Multi-Instance Learning (DAM) respectively, and eliminate the noise with a Generative Adversarial Network (GAN). Entity-level representations of relation instances model the semantic information of all entity pairs from the perspective of the whole document, while the mention-level representations model it from the perspective of the mention pairs related to these entity pairs in different sentences. Therefore, entity- and mention-level representations can be better integrated to represent relation instances. Experimental results demonstrate that our model achieves superior performance on the public document-level biomedical RE dataset BioCreative V Chemical Disease Relation (CDR).
Broadening Horizons of Multilingual Static Analysis: Semantic Summary Extraction from C Code for JNI Program Analysis. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). :127–137.
2020. Most programming languages support foreign language interoperation that allows developers to integrate multiple modules implemented in different languages into a single multilingual program. While utilizing various features from multiple languages expands expressivity, differences in language semantics require developers to understand the semantics of multiple languages and their interoperation. Because current compilers do not support compile-time checking for interoperation, they do not help developers avoid interoperation bugs. Similarly, active research on static analysis and bug detection has focused on programs written in a single language. In this paper, we propose a novel approach to analyzing multilingual programs statically. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries. To show the practicality of our approach, we design and implement a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Our empirical evaluation shows that the analyzer is scalable in that it can construct call graphs for large programs that use JNI interoperation, and useful in that it found 74 genuine interoperation bugs in real-world Android JNI applications.
Chinese named entity recognition method for the field of network security based on RoBERTa. 2021 International Conference on Networking and Network Applications (NaNA). :420–425.
2021. As the mobile Internet develops rapidly, users who access the Internet from cell phones have come to dominate, and the mobile Internet has changed the environment in which online public opinion develops and made online public opinion events spread more widely. In the online environment, any kind of public issue may trigger the formation of public opinion and thus needs to be controlled under network supervision. The method in this paper can identify entities in event texts obtained from mobile sources such as Today's Headlines and People's Daily, and informatize the security of public opinion in event instances, thus strengthening mobile network supervision and control and providing sufficient support for national security event management. In this paper, we present a SW-BiLSTM-CRF model, as well as a model combining the RoBERTa pre-trained model with the classical neural network BiLSTM model. Our experiments show that this approach achieves quite good results on a Chinese emergency corpus, with accuracy and F1 values of 87.21% and 78.78%, respectively.
Investigating the Design Space for Name Confidentiality in Named Data Networking. MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM). :570–576.
2021. As a fundamental departure from the IP design, which encodes source and destination addresses in each packet, Named Data Networking (NDN) directly uses application-defined data names for network layer communications. While bringing important data-centric benefits, the semantic richness of NDN names has also raised confidentiality and privacy concerns. In this paper, we first define the problem of name confidentiality, and then investigate the solution space through a comprehensive examination of all the solutions proposed to date. Our work shows that the proposed solutions are simply different means to hide the actual data names via a layer of translation; they differ in where and how the translation takes place, which leads to different trade-offs in feasibility, efficiency, security, and scalability, and to different degrees of adherence to NDN's data-centric communications. Our investigation suggests the feasibility of a systematic design that can enable NDN to provide stronger name confidentiality and user privacy as compared to today's TCP/IP Internet.
Efficient Modelling of ICS Communication For Anomaly Detection Using Probabilistic Automata. 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM). :81–89.
2021. Industrial Control System (ICS) communication transmits monitoring and control data between industrial processes and the control station. ICS systems cover various domains of critical infrastructure such as power plants, water and gas distribution, or aerospace traffic control. Security of ICS systems is usually implemented on the perimeter of the network using ICS-enabled firewalls or Intrusion Detection Systems (IDSs). These techniques are helpful against external attacks; however, they are not able to effectively detect internal threats originating from a compromised device with malicious software. In order to mitigate or eliminate internal threats against the ICS system, we need to monitor ICS traffic and detect suspicious data transmissions that differ from common operational communication. In our research, we obtain ICS monitoring data using standardized IPFIX flows extended with metadata extracted from ICS protocol headers. Unlike other anomaly detection approaches, we focus on modelling the semantics of ICS communication obtained from the IPFIX flows that describe typical conversational patterns. This paper presents a technique for modelling ICS conversations using frequency prefix trees and Deterministic Probabilistic Automata (DPA). As demonstrated on the attack scenarios, these models are efficient at detecting common cyber attacks such as command injection, packet manipulation, network scanning, or lost connection. An important advantage of our approach is that the proposed technique can be easily integrated into common security information and event management (SIEM) systems with Netflow/IPFIX support. Our experiments are performed on IEC 60870-5-104 (aka IEC 104) control communication, which is widely used for substation control in smart grids.
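As a rough illustration of the frequency prefix tree idea named in the abstract above, the following minimal Python sketch counts how often each message prefix occurs in a set of training conversations and scores a new conversation by the product of its relative transition frequencies. It is not the authors' implementation: the message labels, the training data, and the function names are hypothetical, and the score is only a prefix probability, not a full probabilistic automaton with termination probabilities.

```python
from collections import defaultdict


class PrefixTreeNode:
    """One node of a frequency prefix tree; `count` is the number of
    training conversations whose prefix reaches this node."""

    def __init__(self):
        self.count = 0
        self.children = defaultdict(PrefixTreeNode)


def build_frequency_prefix_tree(conversations):
    """Insert every conversation (a sequence of message labels) into the tree."""
    root = PrefixTreeNode()
    for conversation in conversations:
        root.count += 1
        node = root
        for message in conversation:
            node = node.children[message]
            node.count += 1
    return root


def conversation_probability(root, conversation):
    """Estimate how likely `conversation` is as a prefix, using the relative
    frequency of each transition; an unseen transition yields 0.0."""
    probability = 1.0
    node = root
    for message in conversation:
        if message not in node.children:
            return 0.0
        child = node.children[message]
        probability *= child.count / node.count
        node = child
    return probability


# Hypothetical message labels, used only for illustration.
training = [
    ("act", "actcon", "actterm"),
    ("act", "actcon", "actterm"),
    ("act", "actcon", "actterm"),
    ("act", "negcon"),
]
root = build_frequency_prefix_tree(training)
common = conversation_probability(root, ("act", "actcon"))
rare = conversation_probability(root, ("act", "negcon"))
unseen = conversation_probability(root, ("deact",))
```

A detector built on such a tree could flag conversations whose score falls below a chosen threshold; the threshold itself would have to be tuned on real traffic and is not given in the abstract.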
Anomaly Detection in Unstructured Logs Using Attention-based Bi-LSTM Network. 2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC). :403–407.
2021. System logs record valuable information about the runtime status of IT systems. Therefore, system logs are a naturally excellent source of information for anomaly detection. Most of the existing studies on log-based anomaly detection construct a detection model to identify anomalous logs. Generally, the model treats historical logs as natural language sequences, learns the normal patterns from normal log sequences, and detects deviations from normal patterns as anomalies. However, the majority of existing methods focus on sequential and quantitative information and ignore the semantic information hidden in log sequences, which makes them inefficient in anomaly detection. In this paper, we propose a novel framework for automatically detecting log anomalies by utilizing an attention-based Bi-LSTM model. To demonstrate the effectiveness of our proposed model, we evaluate its performance on a public production log dataset. Extensive experimental results show that the proposed approach outperforms all comparison methods for anomaly detection.
Remote Non-Intrusive Malware Detection for PLCs based on Chain of Trust Rooted in Hardware. 2021 IEEE European Symposium on Security and Privacy (EuroS&P). :369–384.
2021. Digitization has been rapidly integrated with manufacturing industries and critical infrastructure to increase efficiency and productivity and to reduce wastefulness, a transition being labeled as Industry 4.0. However, this expansion, coupled with the poor cybersecurity posture of these Industrial Internet of Things (IIoT) devices, has made them prolific targets for exploitation. Moreover, modern Programmable Logic Controllers (PLCs) used in the Operational Technology (OT) sector are adopting open-source operating systems such as Linux instead of proprietary software, making such devices susceptible to Linux-based malware. Traditional malware detection approaches cannot be applied directly or extended to such environments due to the unique restrictions of these PLC devices, such as limited computational power and real-time requirements. In this paper, we propose ORRIS, a novel lightweight and out-of-the-device framework that detects malware at both the kernel and user level by processing the information collected using the Joint Test Action Group (JTAG) interface. We evaluate ORRIS against in-the-wild Linux malware, achieving a maximum detection accuracy of ≈99.7% with very few false-positive occurrences, a result comparable to state-of-the-art commercial products. Moreover, we also develop and demonstrate a real-time implementation of ORRIS for commercial PLCs.
AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations. 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). :1–12.
2019. Fix pattern-based patch generation is a promising direction in Automated Program Repair (APR). Notably, it has been demonstrated to produce more acceptable and correct patches than the patches obtained with mutation operators through genetic programming. The performance of pattern-based APR systems, however, depends on the fix ingredients mined from fix changes in development histories. Unfortunately, collecting a reliable set of bug fixes in repositories can be challenging. In this paper, we propose to investigate the possibility, in an APR scenario, of leveraging code changes that address violations reported by static bug detection tools. To that end, we build the AVATAR APR system, which exploits fix patterns of static analysis violations as ingredients for patch generation. Evaluated on the Defects4J benchmark, we show that, assuming a perfect localization of faults, AVATAR can generate correct patches to fix 34/39 bugs. We further find that AVATAR yields performance metrics that are comparable to those of closely related approaches in the literature. While AVATAR outperforms many of the state-of-the-art pattern-based APR systems, it is mostly complementary to current approaches. Overall, our study highlights the relevance of static bug finding tools as indirect contributors of fix ingredients for addressing code defects identified with functional test cases.
A Compositional Cost Model for the λ-calculus. 2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). :1–13.
2021. We describe a (time) cost model for the (call-by-value) λ-calculus based on a natural presentation of its game semantics: the cost of computing a finite approximant to the denotation of a term (its evaluation tree) is the size of its smallest derivation in the semantics. This measure has an optimality property enabling compositional reasoning about cost bounds: for any term A, context C[_] and approximants a and c to the trees of A and C[A], the cost of computing c from C[A] is no more than the cost of computing a from A and c from C[a]. Although the natural semantics on which it is based is nondeterministic, our cost model is reasonable: we describe a deterministic algorithm for recognizing evaluation tree approximants which satisfies it (up to a constant factor overhead) on a Random Access Machine. This requires an implementation of the λv-calculus on the RAM which is completely lazy: compositionality of costs entails that work done to evaluate any part of a term cannot be duplicated. This is achieved by a novel implementation of graph reduction for nameless explicit substitutions, to which we compile the λv-calculus via a series of linear cost reductions.
IoT Cooking Workflows for End-Users: A Comparison Between Behaviour Trees and the DX-MAN Model. 2021 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C). :341–350.
2021. A kitchen underpinned by the Internet of Things (IoT) requires the management of complex procedural processes. This is because, when supporting an end-user in the preparation of even a single dish, various devices may need to coordinate with each other. Additionally, it is challenging, yet desirable, to enable end-users to program their kitchen devices according to their preferred behaviour and to allow them to visualise and track their cooking workflows. In this paper, we compared two semantic representations, namely, Behaviour Trees and the DX-MAN model. We analysed these representations based on their suitability for a range of end-users (i.e., novice to experienced). The methodology required the analysis of smart kitchen user requirements, from which we inferred that the main architectural requirements for IoT cooking workflows are variability and compositionality. Guided by the user requirements, we examined various scenarios and analysed workflow complexity and feasibility for each representation. On the one hand, we found that execution complexity tends to be higher for Behaviour Trees. However, due to their fallback node, they provide more transparency on how to recover from unprecedented circumstances. On the other hand, parameter complexity tends to be somewhat higher for the DX-MAN model. Nevertheless, the DX-MAN model can be favourable due to its compositionality aspect and the ease of visualisation it can offer.
Mining Frequent and Rare Itemsets With Weighted Supports Using Additive Neural Itemset Embedding. 2021 International Joint Conference on Neural Networks (IJCNN). :1–8.
2021. Over the past two decades, itemset mining techniques have become an integral part of pattern mining in large databases. We present a novel system for mining frequent and rare itemsets simultaneously with supports weighted by cardinality in transactional datasets. Based on our neural item embedding with additive compositionality, the original mining problems are approximately reduced to polynomial-time convex optimization, namely a series of vector subset selection problems in Euclidean space. The numbers of transactions and items are no longer exponential factors of the time complexity under such reduction, except only the Euclidean space dimension, which can be assigned arbitrarily for a trade-off between mining speed and result quality. The efficacy of our method reveals that additive compositionality can be represented by linear translation in the itemset vector space, which resembles the linguistic regularities in word embedding by similar neural modeling. Experiments show that our learned embedding can bring pattern itemsets with higher accuracy than sampling-based lossy mining techniques in most cases, and the scalability of our mining approach triumphs over several state-of-the-art distributed mining algorithms.
On Compositional Information Flow Aware Refinement. 2021 IEEE 34th Computer Security Foundations Symposium (CSF). :1–16.
2021. The concepts of information flow security and refinement are known to have had a troubled relationship ever since the seminal work of McLean. In this work we study refinements that support changes in data representation and semantics, including the addition of state variables that may induce new observational power or side channels. We propose a new epistemic approach to ignorance-preserving refinement where an abstract model is used as a specification of a system's permitted information flows, which may include the declassification of secret information. The core idea is to require that refinement steps must not induce observer knowledge that is not already available in the abstract model. Our study is set in the context of a class of shared variable multiagent models similar to interpreted systems in epistemic logic. We demonstrate the expressiveness of our framework through a series of small examples and compare our approach to existing, stricter notions of information-flow secure refinement based on bisimulations and noninterference preservation. Interestingly, noninterference preservation is not supported “out of the box” in our setting, because refinement steps may introduce new secrets that are independent of secrets already present at the abstract level. To support verification, we first introduce a “cube-shaped” unwinding condition related to conditions recently studied in the context of value-dependent noninterference, kernel verification, and secure compilation. A fundamental problem with ignorance-preserving refinement, caused by the support for general data and observation refinement, is that sequential composability is lost. We propose a solution based on relational pre- and postconditions and illustrate its use together with unwinding on the oblivious RAM construction of Chung and Pass.
Measuring Trust and Automatic Verification in Multi-Agent Systems. 2021 8th International Conference on Dependable Systems and Their Applications (DSA). :271–277.
2021. Due to the shortage of resources and services, agents are often in competition with each other. Excessive competition leads to a social dilemma. With the aim of breaking social dilemmas, we present a novel trust-based logic framework called Trust Computation Logic (TCL) for measuring trust, in order to find the best partners to collaborate with, and for automatically verifying trust in Multi-Agent Systems (MASs). TCL starts from defining the trust state in Multi-Agent Systems, which is based on a comparison between behavior in the trust behavior library and observed behavior. In particular, a set of reasoning postulates along with formal proofs is put forward to support our measure process. Moreover, we introduce symbolic model checking algorithms to formally and automatically verify the system. Finally, the trust measure method is evaluated, and experimental results are reported, using DeepMind’s Sequential Social Dilemma (SSD) multi-agent game-theoretic environments.
Gradual Security Types and Gradual Guarantees. 2021 IEEE 34th Computer Security Foundations Symposium (CSF). :1–16.
2021. Information flow type systems enforce the security property of noninterference by detecting unauthorized data flows at compile-time. However, they require precise type annotations, making them difficult to use in practice as much of the legacy infrastructure is written in untyped or dynamically-typed languages. Gradual typing seamlessly integrates static and dynamic typing, providing the best of both approaches, and has been applied to information flow control, where information flow monitors are derived from gradual security types. Prior work on gradual information flow typing uncovered tensions between noninterference and the dynamic gradual guarantee, the property that less precise security type annotations in a program should not cause more runtime errors. This paper re-examines the connection between gradual information flow types and information flow monitors to identify the root cause of the tension between the gradual guarantees and noninterference. We develop runtime semantics for a simple imperative language with gradual information flow types that provides both noninterference and gradual guarantees. We leverage a proof technique developed for FlowML and reduce noninterference proofs to preservation proofs.
A Method for Finding Quasi-identifier of Single Structured Relational Data. 2021 7th IEEE Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :93–98.
2021. A quasi-identifier is an attribute that can be combined with other attributes to identify specific tuples or partial tuples. Improper selection of quasi-identifiers will lead to the failure of current privacy protection anonymization technology. Therefore, in this paper, we propose a method, based on functional dependency, for finding the quasi-identifiers of single structured relational data, and we determine the attribute classification standard. Firstly, the solution scope of the quasi-identifier is determined to be all attributes except identity attributes and critical attributes. Secondly, the real data set is used to evaluate the dependency relationship between the indefinite attribute subset and the identity attribute in order to find the set of quasi-identifiers. Finally, we propose an algorithm to find all quasi-identifiers and experiment on real data sets of different sizes. The results show that our method can achieve better performance on the same dataset.
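To make the functional-dependency idea in the abstract above concrete, here is a minimal Python sketch that checks whether a subset of candidate attributes determines an identity attribute in a table, and enumerates the minimal subsets that do. It is an assumption-laden illustration rather than the paper's algorithm: the brute-force subset enumeration, the function names, and the toy records are all hypothetical.

```python
from itertools import combinations


def determines(rows, attributes, target):
    """Return True if the functional dependency attributes -> target holds,
    i.e. rows that agree on `attributes` always agree on `target`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attributes)
        if seen.setdefault(key, row[target]) != row[target]:
            return False
    return True


def minimal_determinants(rows, candidates, target):
    """Enumerate minimal attribute subsets that determine `target`.
    Brute force over subsets by increasing size; fine for a handful of columns."""
    found = []
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            # Skip supersets of a subset that already determines the target.
            if any(set(f) <= set(subset) for f in found):
                continue
            if determines(rows, subset, target):
                found.append(subset)
    return found


# Hypothetical toy records; "id" plays the role of the identity attribute.
rows = [
    {"id": 1, "zip": "10001", "birth_year": 1980, "sex": "F"},
    {"id": 2, "zip": "10001", "birth_year": 1980, "sex": "M"},
    {"id": 3, "zip": "10002", "birth_year": 1980, "sex": "F"},
    {"id": 4, "zip": "10002", "birth_year": 1975, "sex": "F"},
]
quasi_identifiers = minimal_determinants(rows, ["zip", "birth_year", "sex"], "id")
```

In this toy table no single attribute or pair of attributes pins down a record, but all three together do, which is the sense in which a combination of innocuous columns can act as a quasi-identifier.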
A Staffing Recommender System based on Domain-Specific Knowledge Graph. 2021 Eighth International Conference on Social Network Analysis, Management and Security (SNAMS). :1–6.
2021. In the economic environment, Job Matching is always a challenge involving the evolution of knowledge and skills. A good matching of skills and jobs can stimulate economic growth. A Recommender System (RecSys), as one kind of Job Matching, can help candidates predict future jobs relevant to their preferences. However, RecSys still has the problems of cold start and data sparsity. The content-based filtering in RecSys needs adaptive data for the specific staffing tasks of Bidirectional Encoder Representations from Transformers (BERT). In this paper, we propose a job RecSys based on skills and locations using a domain-specific Knowledge Graph (KG). This system has three parts: a pipeline of Named Entity Recognition (NER) and Relation Extraction (RE) using BERT; a standardization system for pre-processing, semantic enrichment and semantic similarity measurement; and a domain-specific Knowledge Graph (KG). Two different relations in the KG are computed by cosine similarity and Term Frequency-Inverse Document Frequency (TF-IDF) respectively. The raw data used in the staffing RecSys include 3000 descriptions of job offers from Indeed, 126 Curricula Vitae (CVs) in English from Kaggle and 106 CVs in French from Linx of Capgemini Engineering. The staffing RecSys is integrated under a Microservices architecture. The autonomy and effectiveness of the staffing RecSys are verified through an experiment using Discounted Cumulative Gain (DCG). Finally, we propose several potential research directions for this research.
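For readers unfamiliar with the evaluation metric named in the abstract above, the following short Python sketch computes Discounted Cumulative Gain for a ranked list of graded relevance scores. It uses the common linear-gain form, DCG = sum of rel_i / log2(i + 1) over ranks i starting at 1; the abstract does not say which DCG variant the authors used, so this choice, the function name, and the example scores are assumptions.

```python
import math


def dcg(relevances, k=None):
    """Discounted Cumulative Gain of a ranked list of relevance scores.
    Rank 1 is undiscounted; later ranks are divided by log2(rank + 1)."""
    if k is not None:
        relevances = relevances[:k]
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


# Hypothetical relevance grades of recommended jobs, in ranked order.
good_ranking = dcg([3, 2, 1, 0])
poor_ranking = dcg([0, 1, 2, 3])
```

Because relevant items placed lower in the list are discounted more heavily, the same set of recommendations scores higher when the most relevant jobs are ranked first.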
A Cyber Physical System based Stochastic Process Language With NuSMV Model Checker. 2021 International Conference on Intelligent Technology, System and Service for Internet of Everything (ITSS-IoE). :1–8.
2021. Nowadays, cyber physical systems play an important role in human life, providing features that make interactions between human and machine easier. In designing and analyzing such systems, the main problem is their complexity. In this paper, we propose a description language for cyber physical systems based on stochastic processes. The proposed language is called SPDL (Stochastic Description Process Language). In designing SPDL, two main parts of Cyber Physical Systems (CPS) are considered: embedded systems and the physical environment. These parts are then defined as stochastic processes, and a CPS is defined as a tuple. The syntax and semantics of SPDL are stated based on the proposed definition. Also, the semantics are defined by set theory. For the implementation of SPDL, dependencies between the words of a requirement are extracted as a tree data structure. Based on the dependencies, SPDL is used for describing the CPS. Also, a lexical analyzer and a parser based on a defined BNF grammar for SPDL are designed and implemented. Finally, the SPDL description of the CPS is transformed to NuSMV, which is a symbolic model checker. The experimental results show that SPDL is capable of describing cyber physical systems using natural language.
A Network Architecture Containing Both Push and Pull Semantics. 2021 7th International Conference on Computer and Communications (ICCC). :2211–2216.
2021. Recently, network usage has evolved from resource sharing between hosts to content distribution and retrieval. Some emerging network architectures, like Named Data Networking (NDN), focus on the design of a content-oriented network paradigm. However, these clean-slate network architectures are difficult to deploy progressively and have difficulty dealing with new communication requirements. Multi-Identifier Network (MIN) is a promising network architecture that contains push and pull communication semantics and supports the resolution, routing and extension of multiple network identifiers. MIN's original design was proposed in 2019 and has been improved over the past two years. In this paper, we present the current design and implementation of MIN. We also propose a fallback-based identifier extension scheme to improve the extensibility of the network. We demonstrate that MIN outperforms NDN in the scenario of progressive deployment via IP tunnel.
Scalable Fact-checking with Human-in-the-Loop. 2021 IEEE International Workshop on Information Forensics and Security (WIFS). :1–6.
2021. Researchers have been investigating automated solutions for fact-checking on various fronts. However, current approaches often overlook the fact that the amount of information released every day is escalating, and much of it overlaps. Intending to accelerate fact-checking, we bridge this gap by proposing a new pipeline: grouping similar messages and summarizing them into aggregated claims. Specifically, we first clean a set of social media posts (e.g., tweets) and build a graph of all posts based on their semantics. Then, we apply two clustering methods to group the messages for further claim summarization. We evaluate the summaries both quantitatively with ROUGE scores and qualitatively with human evaluation. We also generate a graph of summaries to verify that there is no significant overlap among them. The pipeline reduced 28,818 original messages to 700 summary claims, showing the potential to speed up the fact-checking process by organizing and selecting representative claims from massive disorganized and redundant messages.
Noise Reduction Framework for Distantly Supervised Relation Extraction with Human in the Loop. 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC). :1–4.
2020. Distant supervision is a widely used data labeling method for relation extraction. While aligning a knowledge base with the corpus, distant supervision produces a mass of wrong labels, which are defined as noise. The pattern-based denoising model has achieved great progress in selecting trustable sentences (instances). However, the writing of relation-specific patterns relies heavily on expert knowledge and is labor-intensive. To solve these problems, we propose a noise reduction framework, NOIR, to iteratively select trustable sentences with a little help from a human. Under the guidance of experts, the iterative process can avoid semantic drift. Besides, NOIR can help experts discover relation-specific tokens that are hard to think of. Experimental results on three real-world datasets show the effectiveness of the proposed method compared with state-of-the-art methods.
Cyberattack Ontology: A Knowledge Representation for Cyber Supply Chain Security. 2021 International Conference on Computing, Computational Modelling and Applications (ICCMA). :65–70.
2021. Cyberattacks on cyber supply chain (CSC) systems and their cascading impacts have brought many challenges and different threat levels with unpredictable consequences. The embedded network nodes have various loopholes that could be exploited by threat actors, leading to various attacks, risks, and the threat of cascading attacks on the various systems. Key factors such as the lack of a common ontology vocabulary and of semantic interoperability of cyberattack information, inadequate conceptualized ontology learning, and the hierarchical approach to representing the relationships in the CSC security domain have led to explicit knowledge representation. This paper explores cyberattack ontology learning to describe the security concepts, properties and relationships required to model security goals. A cyberattack ontology provides a semantic mapping between different organizational and vendor security goals, which has been inherently challenging. The contributions of this paper are threefold. First, we consider CSC security modelling elements such as goal, actor, attack, TTP, and requirements using semantic rules for logical representation. Secondly, we model a cyberattack ontology for semantic mapping and knowledge representation. Finally, we discuss concepts for threat intelligence and knowledge reuse. The results show that the cyberattack ontology concepts could be used to improve CSC security.
Mixed-mode Information Flow Tracking with Compile-time Taint Semantics Extraction and Offline Replay. 2021 IEEE Conference on Dependable and Secure Computing (DSC). :1–8.
2021. Static information flow analysis (IFA) and dynamic information flow tracking (DIFT) have been widely employed in offline security analysis of computer programs. As security attacks become more sophisticated, there is a rising need for IFA and DIFT in production environments. However, existing systems usually deal with IFA and DIFT separately, and most DIFT systems incur significant performance overhead. We propose MIT to facilitate IFA and DIFT in online production environments. MIT offers mixed-mode information flow tracking at byte granularity and incurs moderate runtime performance overhead. The core techniques consist of the extraction of a taint semantics intermediate representation (TSIR) at compile-time and the decoupled execution of TSIR for information flow analysis. We conducted an extensive performance overhead evaluation of MIT to confirm its applicability in production environments. We also outline potential applications of MIT, including the implementation of data provenance checking and information flow based anomaly detection in real-world applications.