Academic Article

Unsupervised Quality Estimation via Multilingual Denoising Autoencoder / 多言語雑音除去自己符号化器による教師なし品質推定

Document Type

Journal Article

Author

Masato Yoshinaka; Takashi Ninomiya; Tetsuro Nishihara; Tomoyuki Kajiwara; Yuji Iwamoto; Yuki Arase; 二宮崇; 吉仲真人; 岩本裕司; 梶原智之; 荒瀬由紀; 西原哲郎

Source

自然言語処理 / Journal of Natural Language Processing. 2022, 29(2):669

Subject

Machine Translation (機械翻訳)
Quality Estimation (品質推定)

Language

Japanese

ISSN

1340-7619
2185-8314

Abstract

Supervised quality estimation methods require a corpus in which the quality of translation outputs has been manually annotated. To avoid this costly annotation process, previous studies have proposed unsupervised quality estimation methods based on machine translation models trained on large-scale parallel corpora. However, these methods are not applicable to low-resource or zero-resource language pairs. This study addresses this problem by utilising a pre-trained multilingual denoising autoencoder. Specifically, the proposed method constructs a machine translation model by fine-tuning the multilingual denoising autoencoder on parallel corpora. It then estimates translation quality as the forced-decoding probability of a translation output given its source sentence. The pre-trained denoising autoencoder captures linguistic characteristics across languages, which allows the method to estimate translation quality for low-resource and zero-resource language pairs. Evaluation results on the WMT20 quality estimation task confirm that the proposed method achieves the best unsupervised quality estimation performance for five language pairs in the black-box setting. Detailed analysis shows that the proposed method also performs well in the zero-shot setting.
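
The forced-decoding score described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_model`, `TOY_LEXICON`, and `VOCAB` are hypothetical stand-ins for a fine-tuned translation model, and the length normalisation in `quality_score` is an assumption that the abstract does not specify.

```python
import math
from typing import Callable, Dict, Sequence

# Hypothetical interface: given the source tokens and the target prefix
# consumed so far, return a distribution over the next target token.
NextTokenDist = Callable[[Sequence[str], Sequence[str]], Dict[str, float]]


def forced_decoding_log_prob(model: NextTokenDist,
                             source: Sequence[str],
                             target: Sequence[str]) -> float:
    """Sum log P(y_t | y_<t, x) over the tokens of a given translation."""
    total = 0.0
    for t, token in enumerate(target):
        dist = model(source, target[:t])
        # The decoder is forced to follow `target`; we only read off the
        # probability it assigns to each token. The floor avoids log(0).
        total += math.log(dist.get(token, 1e-12))
    return total


def quality_score(model: NextTokenDist,
                  source: Sequence[str],
                  target: Sequence[str]) -> float:
    """Per-token geometric mean probability (normalisation is an assumption)."""
    log_prob = forced_decoding_log_prob(model, source, target)
    return math.exp(log_prob / max(len(target), 1))


# Toy stand-in for a translation model: it prefers a word-by-word lexicon
# translation of the source token at the current position.
TOY_LEXICON = {"neko": "cat", "inu": "dog", "</s>": "</s>"}
VOCAB = ["cat", "dog", "</s>"]


def toy_model(source: Sequence[str], prefix: Sequence[str]) -> Dict[str, float]:
    t = len(prefix)
    expected = TOY_LEXICON.get(source[t], "</s>") if t < len(source) else "</s>"
    return {tok: (0.8 if tok == expected else 0.1) for tok in VOCAB}


source = ["neko", "inu", "</s>"]
good = ["cat", "dog", "</s>"]   # matches the toy lexicon
bad = ["dog", "cat", "</s>"]    # swaps the two content words

good_score = quality_score(toy_model, source, good)
bad_score = quality_score(toy_model, source, bad)
print(good_score > bad_score)
```

In this toy setup the translation that agrees with the model receives the higher score, which is the property an unsupervised quality estimator of this kind relies on. With a real model, the next-token distributions would come from the fine-tuned decoder rather than a lexicon.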

Online Access

Open Access (JSTAGE)
