
Title: A Hierarchy-to-Sequence Attentional Neural Machine Translation Model
Publication Type: Journal Article
Year of Publication: 2018
Authors: Su, Jinsong; Zeng, Jiali; Xiong, Deyi; Liu, Yang; Wang, Mingxuan; Xie, Jun
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 26
Pagination: 623–632
Date Published: March 2018
ISSN: 2329-9304
Keywords: attention models, Chinese-English translation, clause level, compositionality, context modeling, conventional NMT model, decoding, English-German translation, grammars, hierarchical neural network structure, hierarchy-to-sequence attentional NMT model, language translation, learning (artificial intelligence), long parallel sentences, natural language processing, neural machine translation, neural nets, optimal model parameters, parameter learning, recurrent neural networks, segmented clause sequence, segmented clauses, semantic compositionality modeling, semantics, sequence-to-sequence attentional neural machine translation, short clauses, speech processing, text analysis, training, translation prediction
Abstract

Although sequence-to-sequence attentional neural machine translation (NMT) has achieved great progress recently, it still faces two challenges: learning optimal model parameters for long parallel sentences and fully exploiting contexts of different scopes. In this paper, partially inspired by the idea of segmenting a long sentence into short clauses, each of which can be easily translated by NMT, we propose a hierarchy-to-sequence attentional NMT model to address these two challenges. Our encoder takes the segmented clause sequence as input and uses a hierarchical neural network structure to model words, clauses, and sentences at different levels, with two layers of recurrent neural networks modeling semantic compositionality at the word and clause levels. Correspondingly, the decoder translates the segmented clauses sequentially while applying two types of attention models to capture inter-clause and intra-clause contexts for translation prediction. In this way, we not only improve parameter learning but also better exploit contexts of different scopes for translation. Experimental results on Chinese-English and English-German translation demonstrate the superiority of the proposed model over conventional NMT models.
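The architecture sketched in the abstract can be illustrated with a toy numpy example. This is a minimal sketch under stated assumptions, not the paper's implementation: it uses plain tanh RNN cells and dot-product attention in place of the paper's gated units and learned attention scoring, with random toy weights and pre-segmented clauses of word vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy hidden size (hypothetical; not from the paper)

def rnn(inputs, Wx, Wh):
    """Simple tanh RNN over a list of D-dim vectors; returns all hidden states."""
    h = np.zeros(D)
    states = []
    for x in inputs:
        h = np.tanh(Wx @ x + Wh @ h)
        states.append(h)
    return np.stack(states)

def attention(query, keys):
    """Dot-product attention: softmax over scores, then weighted sum of keys."""
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ keys

# Hypothetical input: one source sentence pre-segmented into 3 clauses,
# each a sequence of word embeddings.
clauses = [rng.standard_normal((n, D)) for n in (4, 3, 5)]

Wx, Wh = rng.standard_normal((D, D)), rng.standard_normal((D, D))
Vx, Vh = rng.standard_normal((D, D)), rng.standard_normal((D, D))

# Word-level RNN: encode each clause; the last hidden state summarizes it.
word_states = [rnn(list(c), Wx, Wh) for c in clauses]
clause_summaries = [s[-1] for s in word_states]

# Clause-level RNN: model compositionality across the clause summaries.
clause_states = rnn(clause_summaries, Vx, Vh)

# One decoding step: inter-clause attention over clause-level states, and
# intra-clause attention over the word states of the clause being translated.
dec_state = rng.standard_normal(D)
inter_ctx = attention(dec_state, clause_states)    # inter-clause context
cur = 1                                            # index of current clause
intra_ctx = attention(dec_state, word_states[cur]) # intra-clause context
combined = np.concatenate([inter_ctx, intra_ctx])  # would feed the predictor
print(combined.shape)
```

In the full model both RNN layers would be bidirectional and trained jointly with the decoder; here the sketch only shows how the two attention scopes yield separate context vectors that are combined for each target prediction.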

URL: https://ieeexplore.ieee.org/document/8246560
DOI: 10.1109/TASLP.2018.2789721
Citation Key: su_hierarchy–sequence_2018