Visible to the public Biblio

Filters: Keyword is grammars  [Clear All Filters]
2020-10-05
Su, Jinsong, Zeng, Jiali, Xiong, Deyi, Liu, Yang, Wang, Mingxuan, Xie, Jun.  2018.  A Hierarchy-to-Sequence Attentional Neural Machine Translation Model. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 26:623—632.

Although sequence-to-sequence attentional neural machine translation (NMT) has achieved great progress recently, it is confronted with two challenges: learning optimal model parameters for long parallel sentences and well exploiting different scopes of contexts. In this paper, partially inspired by the idea of segmenting a long sentence into short clauses, each of which can be easily translated by NMT, we propose a hierarchy-to-sequence attentional NMT model to handle these two challenges. Our encoder takes the segmented clause sequence as input and explores a hierarchical neural network structure to model words, clauses, and sentences at different levels, particularly with two layers of recurrent neural networks modeling semantic compositionality at the word and clause level. Correspondingly, the decoder sequentially translates segmented clauses and simultaneously applies two types of attention models to capture contexts of interclause and intraclause for translation prediction. In this way, we can not only improve parameter learning, but also well explore different scopes of contexts for translation. Experimental results on Chinese-English and English-German translation demonstrate the superiorities of the proposed model over the conventional NMT model.

Li, Xilai, Song, Xi, Wu, Tianfu.  2019.  AOGNets: Compositional Grammatical Architectures for Deep Learning. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :6213—6223.

Neural architectures are the foundation for improving performance of deep neural networks (DNNs). This paper presents deep compositional grammatical architectures which harness the best of two worlds: grammar models and DNNs. The proposed architectures integrate compositionality and reconfigurability of the former and the capability of learning rich features of the latter in a principled way. We utilize AND-OR Grammar (AOG) as network generator in this paper and call the resulting networks AOGNets. An AOGNet consists of a number of stages each of which is composed of a number of AOG building blocks. An AOG building block splits its input feature map into N groups along feature channels and then treat it as a sentence of N words. It then jointly realizes a phrase structure grammar and a dependency grammar in bottom-up parsing the “sentence” for better feature exploration and reuse. It provides a unified framework for the best practices developed in state-of-the-art DNNs. In experiments, AOGNet is tested in the ImageNet-1K classification benchmark and the MS-COCO object detection and segmentation benchmark. In ImageNet-1K, AOGNet obtains better performance than ResNet and most of its variants, ResNeXt and its attention based variants such as SENet, DenseNet and DualPathNet. AOGNet also obtains the best model interpretability score using network dissection. AOGNet further shows better potential in adversarial defense. In MS-COCO, AOGNet obtains better performance than the ResNet and ResNeXt backbones in Mask R-CNN.

2020-07-06
Chai, Yadeng, Liu, Yong.  2019.  Natural Spoken Instructions Understanding for Robot with Dependency Parsing. 2019 IEEE 9th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER). :866–871.
This paper presents a method based on syntactic information, which can be used for intent determination and slot filling tasks in a spoken language understanding system including the spoken instructions understanding module for robot. Some studies in recent years attempt to solve the problem of spoken language understanding via syntactic information. This research is a further extension of these approaches which is based on dependency parsing. In this model, the input for neural network are vectors generated by a dependency parsing tree, which we called window vector. This vector contains dependency features that improves performance of the syntactic-based model. The model has been evaluated on the benchmark ATIS task, and the results show that it outperforms many other syntactic-based approaches, especially in terms of slot filling, it has a performance level on par with some state of the art deep learning algorithms in recent years. Also, the model has been evaluated on FBM3, a dataset of the RoCKIn@Home competition. The overall rate of correctly understanding the instructions for robot is quite good but still not acceptable in practical use, which is caused by the small scale of FBM3.
2020-02-10
Gao, Hongcan, Zhu, Jingwen, Liu, Lei, Xu, Jing, Wu, Yanfeng, Liu, Ao.  2019.  Detecting SQL Injection Attacks Using Grammar Pattern Recognition and Access Behavior Mining. 2019 IEEE International Conference on Energy Internet (ICEI). :493–498.
SQL injection attacks are a kind of the greatest security risks on Web applications. Much research has been done to detect SQL injection attacks by rule matching and syntax tree. However, due to the complexity and variety of SQL injection vulnerabilities, these approaches fail to detect unknown and variable SQL injection attacks. In this paper, we propose a model, ATTAR, to detect SQL injection attacks using grammar pattern recognition and access behavior mining. The most important idea of our model is to extract and analyze features of SQL injection attacks in Web access logs. To achieve this goal, we first extract and customize Web access log fields from Web applications. Then we design a grammar pattern recognizer and an access behavior miner to obtain the grammatical and behavioral features of SQL injection attacks, respectively. Finally, based on two feature sets, machine learning algorithms, e.g., Naive Bayesian, SVM, ID3, Random Forest, and K-means, are used to train and detect our model. We evaluated our model on these two feature sets, and the results show that the proposed model can effectively detect SQL injection attacks with lower false negative rate and false positive rate. In addition, comparing the accuracy of our model based on different algorithms, ID3 and Random Forest have a better ability to detect various kinds of SQL injection attacks.
2018-02-15
Bieschke, T., Hermerschmidt, L., Rumpe, B., Stanchev, P..  2017.  Eliminating Input-Based Attacks by Deriving Automated Encoders and Decoders from Context-Free Grammars. 2017 IEEE Security and Privacy Workshops (SPW). :93–101.

Software systems nowadays communicate via a number of complex languages. This is often the cause of security vulnerabilities like arbitrary code execution, or injections. Whereby injections such as cross-site scripting are widely known from textual languages such as HTML and JSON that constantly gain more popularity. These systems use parsers to read input and unparsers write output, where these security vulnerabilities arise. Therefore correct parsing and unparsing of messages is of the utmost importance when developing secure and reliable systems. Part of the challenge developers face is to correctly encode data during unparsing and decode it during parsing. This paper presents McHammerCoder, an (un)parser and encoding generator supporting textual and binary languages. Those (un)parsers automatically apply the generated encoding, that is derived from the language's grammar. Therefore manually defining and applying encoding is not required to effectively prevent injections when using McHammerCoder. By specifying the communication language within a grammar, McHammerCoder provides developers with correct input and output handling code for their custom language.

2017-12-20
Mohammadi, M., Chu, B., Lipford, H. R..  2017.  Detecting Cross-Site Scripting Vulnerabilities through Automated Unit Testing. 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS). :364–373.

The best practice to prevent Cross Site Scripting (XSS) attacks is to apply encoders to sanitize untrusted data. To balance security and functionality, encoders should be applied to match the web page context, such as HTML body, JavaScript, and style sheets. A common programming error is the use of a wrong encoder to sanitize untrusted data, leaving the application vulnerable. We present a security unit testing approach to detect XSS vulnerabilities caused by improper encoding of untrusted data. Unit tests for the XSS vulnerability are automatically constructed out of each web page and then evaluated by a unit test execution framework. A grammar-based attack generator is used to automatically generate test inputs. We evaluate our approach on a large open source medical records application, demonstrating that we can detect many 0-day XSS vulnerabilities with very low false positives, and that the grammar-based attack generator has better test coverage than industry best practices.