Janapriya, N., Anuradha, K., Srilakshmi, V..
2021.
Adversarial Deep Learning Models With Multiple Adversaries. 2021 Third International Conference on Inventive Research in Computing Applications (ICIRCA). :522–525.
Adversarial machine learning calculations handle adversarial instance age, producing bogus data information with the ability to fool any machine learning model. As the word implies, “foe” refers to a rival, whereas “rival” refers to a foe. In order to strengthen the machine learning models, this section discusses about the weakness of machine learning models and how effectively the misinterpretation occurs during the learning cycle. As definite as it is, existing methods such as creating adversarial models and devising powerful ML computations, frequently ignore semantics and the general skeleton including ML section. This research work develops an adversarial learning calculation by considering the coordinated portrayal by considering all the characteristics and Convolutional Neural Networks (CNN) explicitly. Figuring will most likely express minimal adjustments via data transport represented over positive and negative class markings, as well as a specific subsequent data flow misclassified by CNN. The final results recommend a certain game theory and formative figuring, which obtain incredible favored ensuring about significant learning models against the execution of shortcomings, which are reproduced as attack circumstances against various adversaries.
Abdelnabi, Sahar, Fritz, Mario.
2021.
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding. 2021 IEEE Symposium on Security and Privacy (SP). :121–140.
Recent advances in natural language generation have introduced powerful language models with high-quality output text. However, this raises concerns about the potential misuse of such models for malicious purposes. In this paper, we study natural language watermarking as a defense to help better mark and trace the provenance of text. We introduce the Adversarial Watermarking Transformer (AWT) with a jointly trained encoder-decoder and adversarial training that, given an input text and a binary message, generates an output text that is unobtrusively encoded with the given message. We further study different training and inference strategies to achieve minimal changes to the semantics and correctness of the input text.AWT is the first end-to-end model to hide data in text by automatically learning -without ground truth- word substitutions along with their locations in order to encode the message. We empirically show that our model is effective in largely preserving text utility and decoding the watermark while hiding its presence against adversaries. Additionally, we demonstrate that our method is robust against a range of attacks.
Letychevskyi, Oleksandr, Peschanenko, Volodymyr, Radchenko, Viktor, Hryniuk, Yaroslav, Yakovlev, Viktor.
2019.
Algebraic Patterns of Vulnerabilities in Binary Code. 2019 10th International Conference on Dependable Systems, Services and Technologies (DESSERT). :70–73.
This paper presents an algebraic approach for formalizing and detecting vulnerabilities in binary code. It uses behaviour algebra equations for creating patterns of vulnerabilities and algebraic matching methods for vulnerability detection. Algebraic matching is based on symbolic modelling. This paper considers a known vulnerability, buffer overflow, as an example to demonstrate an algebraic approach for pattern creation.
Ge, X., Pan, Y., Fan, Y., Fang, C..
2019.
AMDroid: Android Malware Detection Using Function Call Graphs. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C). :71—77.
With the rapid development of the mobile Internet, Android has been the most popular mobile operating system. Due to the open nature of Android, c countless malicious applications are hidden in a large number of benign applications, which pose great threats to users. Most previous malware detection approaches mainly rely on features such as permissions, API calls, and opcode sequences. However, these approaches fail to capture structural semantics of applications. In this paper, we propose AMDroid that leverages function call graphs (FCGs) representing the behaviors of applications and applies graph kernels to automatically learn the structural semantics of applications from FCGs. We evaluate AMDroid on the Genome Project, and the experimental results show that AMDroid is effective to detect Android malware with 97.49% detection accuracy.
Anusha, M, Leelavathi, R.
2021.
Analysis on Sentiment Analytics Using Deep Learning Techniques. 2021 Fifth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC). :542–547.
Sentiment analytics is the process of applying natural language processing and methods for text-based information to define and extract subjective knowledge of the text. Natural language processing and text classifications can deal with limited corpus data and more attention has been gained by semantic texts and word embedding methods. Deep learning is a powerful method that learns different layers of representations or qualities of information and produces state-of-the-art prediction results. In different applications of sentiment analytics, deep learning methods are used at the sentence, document, and aspect levels. This review paper is based on the main difficulties in the sentiment assessment stage that significantly affect sentiment score, pooling, and polarity detection. The most popular deep learning methods are a Convolution Neural Network and Recurrent Neural Network. Finally, a comparative study is made with a vast literature survey using deep learning models.
Xixuan, Ren, Lirui, Zhao, Kai, Wang, Zhixing, Xue, Anran, Hou, Qiao, Shao.
2022.
Android Malware Detection Based on Heterogeneous Information Network with Cross-Layer Features. 2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). :1—4.
As a mature and open mobile operating system, Android runs on many IoT devices, which has led to Android-based IoT devices have become a hotbed of malware. Existing static detection methods for malware using artificial intelligence algorithms focus only on the java code layer when extracting API features, however there is a lot of malicious behavior involving native layer code. Thus, to make up for the neglect of the native code layer, we propose a heterogeneous information network-based Android malware detection method with cross-layer features. We first translate the semantic information of apps and API calls into the form of meta-paths, and construct the adjacency of apps based on API calls, then combine information from different meta-paths using multi-core learning. We implemented our method on the dataset from VirusShare and AndroZoo, and the experimental results show that the accuracy of our method is 93.4%, which is at least 2% higher than other related methods using heterogeneous information networks for malware detection.
Bakour, K., Ünver, H. M., Ghanem, R..
2018.
The Android Malware Static Analysis: Techniques, Limitations, and Open Challenges. 2018 3rd International Conference on Computer Science and Engineering (UBMK). :586-593.
This paper aims to explain static analysis techniques in detail, and to highlight the weaknesses and challenges which face it. To this end, more than 80 static analysis-based framework have been studied, and in their light, the process of detecting malicious applications has been divided into four phases that were explained in a schematic manner. Also, the features that is used in static analysis were discussed in detail by dividing it into four categories namely, Manifest-based features, code-based features, semantic features and app's metadata-based features. Also, the challenges facing methods based on static analysis were discussed in detail. Finally, a case study was conducted to test the strength of some known commercial antivirus and one of the stat-of-art academic static analysis frameworks against obfuscation techniques used by developers of malicious applications. The results showed a significant impact on the performance of the most tested antiviruses and frameworks, which is reflecting the urgent need for more accurately tools.
Yu, Dongqing, Hou, Xiaowei, Li, Ce, Lv, Qiujian, Wang, Yan, Li, Ning.
2021.
Anomaly Detection in Unstructured Logs Using Attention-based Bi-LSTM Network. 2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC). :403–407.
System logs record valuable information about the runtime status of IT systems. Therefore, system logs are a naturally excellent source of information for anomaly detection. Most of the existing studies on log-based anomaly detection construct a detection model to identify anomalous logs. Generally, the model treats historical logs as natural language sequences and learns the normal patterns from normal log sequences, and detects deviations from normal patterns as anomalies. However, the majority of existing methods focus on sequential and quantitative information and ignore semantic information hidden in log sequence so that they are inefficient in anomaly detection. In this paper, we propose a novel framework for automatically detecting log anomalies by utilizing an attention-based Bi-LSTM model. To demonstrate the effectiveness of our proposed model, we evaluate the performance on a public production log dataset. Extensive experimental results show that the proposed approach outperforms all comparison methods for anomaly detection.
Xi, Lanlan, Xin, Yang, Luo, Shoushan, Shang, Yanlei, Tang, Qifeng.
2021.
Anomaly Detection Mechanism Based on Hierarchical Weights through Large-Scale Log Data. 2021 International Conference on Computer Communication and Artificial Intelligence (CCAI). :106—115.
In order to realize Intelligent Disaster Recovery and break the traditional reactive backup mode, it is necessary to forecast the potential system anomalies, and proactively backup the real-time datas and configurations. System logs record the running status as well as the critical events (including errors and warnings), which can help to detect system performance, debug system faults and analyze the causes of anomalies. What's more, with the features of real-time, hierarchies and easy-access, log data can be an ideal source for monitoring system status. To reduce the complexity and improve the robustness and practicability of existing log-based anomaly detection methods, we propose a new anomaly detection mechanism based on hierarchical weights, which can deal with unstable log data. We firstly extract semantic information of log strings, and get the word-level weights by SIF algorithm to embed log strings into vectors, which are then feed into attention-based Long Short-Term Memory(LSTM) deep learning network model. In addition to get sentence-level weight which can be used to explore the interdependence between different log sequences and improve the accuracy, we utilize attention weights to help with building workflow to diagnose the abnormal points in the execution of a specific task. Our experimental results show that the hierarchical weights mechanism can effectively improve accuracy of perdition task and reduce complexity of the model, which provides the feasibility foundation support for Intelligent Disaster Recovery.
Zhang, Cuicui, Sun, Jiali, Lu, Ruixuan, Wang, Peng.
2021.
Anomaly Detection Model of Power Grid Data Based on STL Decomposition. 2021 IEEE 5th Information Technology,Networking,Electronic and Automation Control Conference (ITNEC). 5:1262—1265.
This paper designs a data anomaly detection method for power grid data centers. The method uses cloud computing architecture to realize the storage and calculation of large amounts of data from power grid data centers. After that, the STL decomposition method is used to decompose the grid data, and then the decomposed residual data is used for anomaly analysis to complete the detection of abnormal data in the grid data. Finally, the feasibility of the method is verified through experiments.
Huo, Da, Li, Xiaoyong, Li, Linghui, Gao, Yali, Li, Ximing, Yuan, Jie.
2022.
The Application of 1D-CNN in Microsoft Malware Detection. 2022 7th International Conference on Big Data Analytics (ICBDA). :181–187.
In the computer field, cybersecurity has always been the focus of attention. How to detect malware is one of the focuses and difficulties in network security research effectively. Traditional existing malware detection schemes can be mainly divided into two methods categories: database matching and the machine learning method. With the rise of deep learning, more and more deep learning methods are applied in the field of malware detection. Deeper semantic features can be extracted via deep neural network. The main tasks of this paper are as follows: (1) Using machine learning methods and one-dimensional convolutional neural networks to detect malware (2) Propose a machine The method of combining learning and deep learning is used for detection. Machine learning uses LGBM to obtain an accuracy rate of 67.16%, and one-dimensional CNN obtains an accuracy rate of 72.47%. In (2), LGBM is used to screen the importance of features and then use a one-dimensional convolutional neural network, which helps to further improve the detection result has an accuracy rate of 78.64%.
Wenbing Zhao.
2014.
Application-Aware Byzantine Fault Tolerance. Dependable, Autonomic and Secure Computing (DASC), 2014 IEEE 12th International Conference on. :45-50.
Byzantine fault tolerance has been intensively studied over the past decade as a way to enhance the intrusion resilience of computer systems. However, state-machine-based Byzantine fault tolerance algorithms require deterministic application processing and sequential execution of totally ordered requests. One way of increasing the practicality of Byzantine fault tolerance is to exploit the application semantics, which we refer to as application-aware Byzantine fault tolerance. Application-aware Byzantine fault tolerance makes it possible to facilitate concurrent processing of requests, to minimize the use of Byzantine agreement, and to identify and control replica nondeterminism. In this paper, we provide an overview of recent works on application-aware Byzantine fault tolerance techniques. We elaborate the need for exploiting application semantics for Byzantine fault tolerance and the benefits of doing so, provide a classification of various approaches to application-aware Byzantine fault tolerance, and outline the mechanisms used in achieving application-aware Byzantine fault tolerance according to our classification.
Wenbing Zhao.
2014.
Application-Aware Byzantine Fault Tolerance. Dependable, Autonomic and Secure Computing (DASC), 2014 IEEE 12th International Conference on. :45-50.
Byzantine fault tolerance has been intensively studied over the past decade as a way to enhance the intrusion resilience of computer systems. However, state-machine-based Byzantine fault tolerance algorithms require deterministic application processing and sequential execution of totally ordered requests. One way of increasing the practicality of Byzantine fault tolerance is to exploit the application semantics, which we refer to as application-aware Byzantine fault tolerance. Application-aware Byzantine fault tolerance makes it possible to facilitate concurrent processing of requests, to minimize the use of Byzantine agreement, and to identify and control replica nondeterminism. In this paper, we provide an overview of recent works on application-aware Byzantine fault tolerance techniques. We elaborate the need for exploiting application semantics for Byzantine fault tolerance and the benefits of doing so, provide a classification of various approaches to application-aware Byzantine fault tolerance, and outline the mechanisms used in achieving application-aware Byzantine fault tolerance according to our classification.
Ktob, A., Li, Z..
2017.
The Arabic Knowledge Graph: Opportunities and Challenges. 2017 IEEE 11th International Conference on Semantic Computing (ICSC). :48–52.
Semantic Web has brought forth the idea of computing with knowledge, hence, attributing the ability of thinking to machines. Knowledge Graphs represent a major advancement in the construction of the Web of Data where machines are context-aware when answering users' queries. The English Knowledge Graph was a milestone realized by Google in 2012. Even though it is a useful source of information for English users and applications, it does not offer much for the Arabic users and applications. In this paper, we investigated the different challenges and opportunities prone to the life-cycle of the construction of the Arabic Knowledge Graph (AKG) while following some best practices and techniques. Additionally, this work suggests some potential solutions to these challenges. The proprietary factor of data creates a major problem in the way of harvesting this latter. Moreover, when the Arabic data is openly available, it is generally in an unstructured form which requires further processing. The complexity of the Arabic language itself creates a further problem for any automatic or semi-automatic extraction processes. Therefore, the usage of NLP techniques is a feasible solution. Some preliminary results are presented later in this paper. The AKG has very promising outcomes for the Semantic Web in general and the Arabic community in particular. The goal of the Arabic Knowledge Graph is mainly the integration of the different isolated datasets available on the Web. Later, it can be used in both the academic (by providing a large dataset for many different research fields and enhance discovery) and commercial sectors (by improving search engines, providing metadata, interlinking businesses).
Tizio, G. Di, Ngo, C. Nam.
2020.
Are You a Favorite Target For Cryptojacking? A Case-Control Study On The Cryptojacking Ecosystem 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :515—520.
Illicitly hijacking visitors' computational resources for mining cryptocurrency via compromised websites is a consolidated activity.Previous works mainly focused on large-scale analysis of the cryptojacking ecosystem, technical means to detect browser-based mining as well as economic incentives of cryptojacking. So far, no one has studied if certain technical characteristics of a website can increase (decrease) the likelihood of being compromised for cryptojacking campaigns.In this paper, we propose to address this unanswered question by conducting a case-control study with cryptojacking websites obtained crawling the web using Minesweeper. Our preliminary analysis shows some association for certain website characteristics, however, the results obtained are not statistically significant. Thus, more data must be collected and further analysis must be conducted to obtain a better insight into the impact of these relations.
Ding, Steven H. H., Fung, Benjamin C. M., Charland, Philippe.
2019.
Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization. 2019 IEEE Symposium on Security and Privacy (SP). :472–489.
Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar assembly functions appear to be very different. A practical clone search engine relies on a robust vector representation of assembly code. However, the existing clone search approaches, which rely on a manual feature engineering process to form a feature vector for an assembly function, fail to consider the relationships between features and identify those unique patterns that can statistically distinguish assembly functions. To address this problem, we propose to jointly learn the lexical semantic relationships and the vector representation of assembly functions based on assembly code. We have developed an assembly code representation learning model \textbackslashemphAsm2Vec. It only needs assembly code as input and does not require any prior knowledge such as the correct mapping between assembly functions. It can find and incorporate rich semantic relationships among tokens appearing in assembly code. We conduct extensive experiments and benchmark the learning model with state-of-the-art static and dynamic clone search approaches. We show that the learned representation is more robust and significantly outperforms existing methods against changes introduced by obfuscation and optimizations.
Yamaguchi, M., Kikuchi, H..
2017.
Audio-CAPTCHA with distinction between random phoneme sequences and words spoken by multi-speaker. 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC). :3071–3076.
Audio-CAPTCHA prevents malicious bots from attacking Web services and provides Web accessibility for visually-impaired persons. Most of the conventional methods employ statistical noise to distort sounds and let users remember and spell the words, which are difficult and laborious work for humans. In this paper, we utilize the difficulty on speaker-independent recognition for ASR machines instead of distortion with statistical noise. Our scheme synthesizes various voices by changing voice speed, pitch and native language of speakers. Moreover, we employ semantic identification problems between random phoneme sequences and meaningful words to release users from remembering and spelling words, so it improves the accuracy of humans and usability. We also evaluated our scheme in several experiments.
Gaston, J., Narayanan, M., Dozier, G., Cothran, D. L., Arms-Chavez, C., Rossi, M., King, M. C., Xu, J..
2018.
Authorship Attribution vs. Adversarial Authorship from a LIWC and Sentiment Analysis Perspective. 2018 IEEE Symposium Series on Computational Intelligence (SSCI). :920-927.
Although Stylometry has been effectively used for Authorship Attribution, there is a growing number of methods being developed that allow authors to mask their identity [2, 13]. In this paper, we investigate the usage of non-traditional feature sets for Authorship Attribution. By using non-traditional feature sets, one may be able to reveal the identity of adversarial authors who are attempting to evade detection from Authorship Attribution systems that are based on more traditional feature sets. In addition, we demonstrate how GEFeS (Genetic & Evolutionary Feature Selection) can be used to evolve high-performance hybrid feature sets composed of two non-traditional feature sets for Authorship Attribution: LIWC (Linguistic Inquiry & Word Count) and Sentiment Analysis. These hybrids were able to reduce the Adversarial Effectiveness on a test set presented in [2] by approximately 33.4%.
Pfeffer, Tobias, Göthel, Thomas, Glesner, Sabine.
2019.
Automatic Analysis of Critical Sections for Efficient Secure Multi-Execution. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). :318–325.
Enforcement of hypersafety security policies such as noninterference can be achieved through Secure Multi-Execution (SME). While this is typically very resource-intensive, more efficient solutions such as Demand-Driven Secure Multi-Execution (DDSME) exist. Here, the resource requirements are reduced by restricting multi-execution enforcement to critical sections in the code. However, the current solution requires manual binary analysis. In this paper, we propose a fully automatic critical section analysis. Our analysis extracts a context-sensitive boundary of all nodes that handle information from the reachability relation implied by the control-flow graph. We also provide evaluation results, demonstrating the correctness and acceleration of DDSME with our analysis.
Yang, Ping, Shu, Hui, Kang, Fei, Bu, Wenjuan.
2020.
Automatically Generating Malware Summary Using Semantic Behavior Graphs (SBGs). 2020 Information Communication Technologies Conference (ICTC). :282–291.
In malware behavior analysis, there are limitations in the analysis method of control flow and data flow. Researchers analyzed data flow by dynamic taint analysis tools, however, it cost a lot. In this paper, we proposed a method of generating malware summary based on semantic behavior graphs (SBGs, Semantic Behavior Graphs) to address this issue. In this paper, we considered various situation where behaviors be capable of being associated, thus an algorithm of generating semantic behavior graphs was given firstly. Semantic behavior graphs are composed of behavior nodes and associated data edges. Then, we extracted behaviors and logical relationships between behaviors from semantic behavior graphs, and finally generated a summary of malware behaviors with true intension. Experimental results showed that our approach can effectively identify and describe malicious behaviors and generate accurate behavior summary.
Kelly, Daniel M., Wellons, Christopher C., Coffman, Joel, Gearhart, Andrew S..
2019.
Automatically Validating the Effectiveness of Software Diversity Schemes. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks – Supplemental Volume (DSN-S). :1–2.
Software diversity promises to invert the current balance of power in cybersecurity by preventing exploit reuse. Nevertheless, the comparative evaluation of diversity techniques has received scant attention. In ongoing work, we use the DARPA Cyber Grand Challenge (CGC) environment to assess the effectiveness of diversifying compilers in mitigating exploits. Our approach provides a quantitative comparison of diversity strategies and demonstrates wide variation in their effectiveness.
Liu, Kui, Koyuncu, Anil, Kim, Dongsun, Bissyandè, Tegawende F..
2019.
AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations. 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). :1–12.
Fix pattern-based patch generation is a promising direction in Automated Program Repair (APR). Notably, it has been demonstrated to produce more acceptable and correct patches than the patches obtained with mutation operators through genetic programming. The performance of pattern-based APR systems, however, depends on the fix ingredients mined from fix changes in development histories. Unfortunately, collecting a reliable set of bug fixes in repositories can be challenging. In this paper, we propose to investigate the possibility in an APR scenario of leveraging code changes that address violations by static bug detection tools. To that end, we build the AVATAR APR system, which exploits fix patterns of static analysis violations as ingredients for patch generation. Evaluated on the Defects4J benchmark, we show that, assuming a perfect localization of faults, AVATAR can generate correct patches to fix 34/39 bugs. We further find that AVATAR yields performance metrics that are comparable to that of the closely-related approaches in the literature. While AVATAR outperforms many of the state-of-the-art pattern-based APR systems, it is mostly complementary to current approaches. Overall, our study highlights the relevance of static bug finding tools as indirect contributors of fix ingredients for addressing code defects identified with functional test cases.
Lee, Jungbeom, Yi, Jihun, Shin, Chaehun, Yoon, Sungroh.
2021.
BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :2643–2651.
Weakly supervised segmentation methods using bounding box annotations focus on obtaining a pixel-level mask from each box containing an object. Existing methods typically depend on a class-agnostic mask generator, which operates on the low-level information intrinsic to an image. In this work, we utilize higher-level information from the behavior of a trained object detector, by seeking the smallest areas of the image from which the object detector produces almost the same result as it does from the whole image. These areas constitute a bounding-box attribution map (BBAM), which identifies the target object in its bounding box and thus serves as pseudo ground-truth for weakly supervised semantic and instance segmentation. This approach significantly outperforms recent comparable techniques on both the PASCAL VOC and MS COCO benchmarks in weakly supervised semantic and instance segmentation. In addition, we provide a detailed analysis of our method, offering deeper insight into the behavior of the BBAM.
Bharati, Aparna, Moreira, Daniel, Brogan, Joel, Hale, Patricia, Bowyer, Kevin, Flynn, Patrick, Rocha, Anderson, Scheirer, Walter.
2019.
Beyond Pixels: Image Provenance Analysis Leveraging Metadata. 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). :1692–1702.
Creative works, whether paintings or memes, follow unique journeys that result in their final form. Understanding these journeys, a process known as "provenance analysis," provides rich insights into the use, motivation, and authenticity underlying any given work. The application of this type of study to the expanse of unregulated content on the Internet is what we consider in this paper. Provenance analysis provides a snapshot of the chronology and validity of content as it is uploaded, re-uploaded, and modified over time. Although still in its infancy, automated provenance analysis for online multimedia is already being applied to different types of content. Most current works seek to build provenance graphs based on the shared content between images or videos. This can be a computationally expensive task, especially when considering the vast influx of content that the Internet sees every day. Utilizing non-content-based information, such as timestamps, geotags, and camera IDs can help provide important insights into the path a particular image or video has traveled during its time on the Internet without large computational overhead. This paper tests the scope and applicability of metadata-based inferences for provenance graph construction in two different scenarios: digital image forensics and cultural analytics.