Adria de Gispert

CL
4 papers
4,401 citations
Novelty 40%
AI Score 26

4 Papers

CL, Jun 11, 2019
CUED@WMT19:EWC&LMs

Felix Stahlberg, Danielle Saunders, Adria de Gispert et al.

Two techniques provide the fabric of the Cambridge University Engineering Department's (CUED) entry to the WMT19 evaluation campaign: elastic weight consolidation (EWC) and different forms of language modelling (LMs). We report substantial gains by fine-tuning very strong baselines on former WMT test sets using a combination of checkpoint averaging and EWC. A sentence-level Transformer LM and a document-level LM based on a modified Transformer architecture yield further gains. As in previous years, we also extract $n$-gram probabilities from SMT lattices which can be seen as a source-conditioned $n$-gram LM.

CL, Jun 2, 2019
Domain Adaptive Inference for Neural Machine Translation

Danielle Saunders, Felix Stahlberg, Adria de Gispert et al.

We investigate adaptive ensemble weighting for Neural Machine Translation, addressing the case of improving performance on a new and potentially unknown domain without sacrificing performance on the original domain. We adapt sequentially across two Spanish-English and three English-German tasks, comparing unregularized fine-tuning, L2 and Elastic Weight Consolidation. We then report a novel scheme for adaptive NMT ensemble decoding by extending Bayesian Interpolation with source information, and show strong improvements across test domains without access to the domain label.

CL, Aug 28, 2018
The University of Cambridge's Machine Translation Systems for WMT18

Felix Stahlberg, Adria de Gispert, Bill Byrne

The University of Cambridge submission to the WMT18 news translation task focuses on the combination of diverse models of translation. We compare recurrent, convolutional, and self-attention-based neural models on German-English, English-German, and Chinese-English. Our final system combines all neural models together with a phrase-based SMT system in an MBR-based scheme. We report small but consistent gains on top of strong Transformer ensembles.

CL, May 1, 2018
Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT

Danielle Saunders, Felix Stahlberg, Adria de Gispert et al.

We explore strategies for incorporating target syntax into Neural Machine Translation. We specifically focus on syntax in ensembles containing multiple sentence representations. We formulate beam search over such ensembles using WFSTs, and describe a delayed SGD update training procedure that is especially effective for long representations like linearized syntax. Our approach gives state-of-the-art performance on a difficult Japanese-English task.