Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
7-2021
Abstract
We propose a new training objective named order-agnostic cross entropy (OAXE) for fully non-autoregressive translation (NAT) models. OAXE improves the standard cross-entropy loss to ameliorate the effect of word reordering, which is a common source of the critical multimodality problem in NAT. Concretely, OAXE removes the penalty for word order errors, and computes the cross entropy loss based on the best possible alignment between model predictions and target tokens. Since the log loss is very sensitive to invalid references, we leverage cross entropy initialization and loss truncation to ensure the model focuses on a good part of the search space. Extensive experiments on major WMT benchmarks show that OAXE substantially improves translation performance, setting new state of the art for fully NAT models. Further analyses show that OAXE alleviates the multimodality problem by reducing token repetitions and increasing prediction confidence. Our code, data, and trained models are available at https://github.com/tencent-ailab/ICML21_OAXE.
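The core idea in the abstract — cross entropy computed under the best alignment between predicted positions and target tokens — can be illustrated with a small sketch. This is not the authors' implementation (the paper's code, linked above, uses an efficient assignment solver; here a brute-force search over permutations is shown purely for clarity, so it only works for short sequences). The function name and inputs are illustrative assumptions:

```python
import math
from itertools import permutations

def oaxe_loss(log_probs, target):
    """Illustrative order-agnostic cross entropy (brute force, small T only).

    log_probs: T x V nested list, log_probs[i][v] = log P(token v at position i)
    target: list of T target token ids
    Returns the minimum average negative log-likelihood over all one-to-one
    alignments of target tokens to output positions.
    """
    T = len(target)
    best = math.inf
    for perm in permutations(range(T)):
        # Loss if target token target[perm[i]] is aligned to position i.
        loss = -sum(log_probs[i][target[perm[i]]] for i in range(T)) / T
        best = min(best, loss)
    return best
```

Because the loss minimizes over all alignments, permuting the target sequence leaves it unchanged, which is exactly the "order-agnostic" property: word order errors are not penalized, only wrong tokens are.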
Discipline
Artificial Intelligence and Robotics
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 38th International Conference on Machine Learning, Virtual Conference, 2021 July 18-24
First Page
1
Last Page
11
City or Country
Virtual Conference
Citation
DU, Cunxiao; TU, Zhaopeng; and JIANG, Jing.
Order-agnostic cross entropy for non-autoregressive machine translation. (2021). Proceedings of the 38th International Conference on Machine Learning, Virtual Conference, 2021 July 18-24. 1-11.
Available at: https://ink.library.smu.edu.sg/sis_research/6660
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.