Yin, Yifei, Zulkernine, Farhana, Dahan, Samuel.
2020.
Determining Worker Type from Legal Text Data Using Machine Learning. 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :444–450.
This project addresses a classic employment law question in Canada and elsewhere using machine learning approach: how do we know whether a worker is an employee or an independent contractor? This is a central issue for self-represented litigants insofar as these two legal categories entail very different rights and employment protections. In this interdisciplinary research study, we collaborated with the Conflict Analytics Lab to develop machine learning models aimed at determining whether a worker is an employee or an independent contractor. We present a number of supervised learning models including a neural network model that we implemented using data labeled by law researchers and compared the accuracy of the models. Our neural network model achieved an accuracy rate of 91.5%. A critical discussion follows to identify the key features in the data that influence the accuracy of our models and provide insights about the case outcomes.
Hu, Shengze, He, Chunhui, Ge, Bin, Liu, Fang.
2020.
Enhanced Word Embedding Method in Text Classification. 2020 6th International Conference on Big Data and Information Analytics (BigDIA). :18–22.
For the task of natural language processing (NLP), Word embedding technology has a certain impact on the accuracy of deep neural network algorithms. Considering that the current word embedding method cannot realize the coexistence of words and phrases in the same vector space. Therefore, we propose an enhanced word embedding (EWE) method. Before completing the word embedding, this method introduces a unique sentence reorganization technology to rewrite all the sentences in the original training corpus. Then, all the original corpus and the reorganized corpus are merged together as the training corpus of the distributed word embedding model, so as to realize the coexistence problem of words and phrases in the same vector space. We carried out experiment to demonstrate the effectiveness of the EWE algorithm on three classic benchmark datasets. The results show that the EWE method can significantly improve the classification performance of the CNN model.
Hou, Xiaolu, Breier, Jakub, Jap, Dirmanto, Ma, Lei, Bhasin, Shivam, Liu, Yang.
2020.
Security Evaluation of Deep Neural Network Resistance Against Laser Fault Injection. 2020 IEEE International Symposium on the Physical and Failure Analysis of Integrated Circuits (IPFA). :1–6.
Deep learning is becoming a basis of decision making systems in many application domains, such as autonomous vehicles, health systems, etc., where the risk of misclassification can lead to serious consequences. It is necessary to know to which extent are Deep Neural Networks (DNNs) robust against various types of adversarial conditions. In this paper, we experimentally evaluate DNNs implemented in embedded device by using laser fault injection, a physical attack technique that is mostly used in security and reliability communities to test robustness of various systems. We show practical results on four activation functions, ReLu, softmax, sigmoid, and tanh. Our results point out the misclassification possibilities for DNNs achieved by injecting faults into the hidden layers of the network. We evaluate DNNs by using several different attack strategies to show which are the most efficient in terms of misclassification success rates. Outcomes of this work should be taken into account when deploying devices running DNNs in environments where malicious attacker could tamper with the environmental parameters that would bring the device into unstable conditions. resulting into faults.
She, Dongdong, Chen, Yizheng, Shah, Abhishek, Ray, Baishakhi, Jana, Suman.
2020.
Neutaint: Efficient Dynamic Taint Analysis with Neural Networks. 2020 IEEE Symposium on Security and Privacy (SP). :1527–1543.
Dynamic taint analysis (DTA) is widely used by various applications to track information flow during runtime execution. Existing DTA techniques use rule-based taint-propagation, which is neither accurate (i.e., high false positive rate) nor efficient (i.e., large runtime overhead). It is hard to specify taint rules for each operation while covering all corner cases correctly. Moreover, the overtaint and undertaint errors can accumulate during the propagation of taint information across multiple operations. Finally, rule-based propagation requires each operation to be inspected before applying the appropriate rules resulting in prohibitive performance overhead on large real-world applications.In this work, we propose Neutaint, a novel end-to-end approach to track information flow using neural program embeddings. The neural program embeddings model the target's programs computations taking place between taint sources and sinks, which automatically learns the information flow by observing a diverse set of execution traces. To perform lightweight and precise information flow analysis, we utilize saliency maps to reason about most influential sources for different sinks. Neutaint constructs two saliency maps, a popular machine learning approach to influence analysis, to summarize both coarse-grained and fine-grained information flow in the neural program embeddings.We compare Neutaint with 3 state-of-the-art dynamic taint analysis tools. The evaluation results show that Neutaint can achieve 68% accuracy, on average, which is 10% improvement while reducing 40× runtime overhead over the second-best taint tool Libdft on 6 real world programs. Neutaint also achieves 61% more edge coverage when used for taint-guided fuzzing indicating the effectiveness of the identified influential bytes. We also evaluate Neutaint's ability to detect real world software attacks. The results show that Neutaint can successfully detect different types of vulnerabilities including buffer/heap/integer overflows, division by zero, etc. Lastly, Neutaint can detect 98.7% of total flows, the highest among all taint analysis tools.
Ma, Chuang, You, Haisheng, Wang, Li, Zhang, Jiajun.
2020.
Intelligent Cybersecurity Situational Awareness Model Based on Deep Neural Network. 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :76–83.
In recent years, we have faced a series of online threats. The continuous malicious attacks on the network have directly caused a huge threat to the user's spirit and property. In order to deal with the complex security situation in today's network environment, an intelligent network situational awareness model based on deep neural networks is proposed. Use the nonlinear characteristics of the deep neural network to solve the nonlinear fitting problem, establish a network security situation assessment system, take the situation indicators output by the situation assessment system as a guide, and collect on the main data features according to the characteristics of the network attack method, the main data features are collected and the data is preprocessed. This model designs and trains a 4-layer neural network model, and then use the trained deep neural network model to understand and analyze the network situation data, so as to build the network situation perception model based on deep neural network. The deep neural network situational awareness model designed in this paper is used as a network situational awareness simulation attack prediction experiment. At the same time, it is compared with the perception model using gray theory and Support Vector Machine(SVM). The experiments show that this model can make perception according to the changes of state characteristics of network situation data, establish understanding through learning, and finally achieve accurate prediction of network attacks. Through comparison experiments, datatypized neural network deep neural network situation perception model is proved to be effective, accurate and superior.