Creating a test suite for machine translation evaluation

Machine translation systems are usually evaluated by measuring how similar their output is to a human reference translation. This yields a single score, which reveals little about the types of errors a system makes. To address this limitation, so-called test suites have been proposed. A typical test suite focuses on one particular aspect of translation, for instance the translation of pronouns, of ambiguous words, or of subtle morphological distinctions.

For your thesis, you will first identify a challenging aspect of the translation between a particular language pair. Then, you will produce a test suite for this aspect, following the protocols proposed in earlier work. Finally, you will collect outputs from diverse machine translation systems and evaluate them with your test suite.
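To make the idea concrete, here is a minimal Python sketch of how a test suite can report accuracy per phenomenon instead of a single overall score. Everything in it is an illustrative assumption rather than a protocol from earlier work: the `TestItem` structure, the `evaluate` function, the English-German examples, and the `toy_system` stand-in for a real translation system are all hypothetical. The simple token matching used here is only one possible way to check an output; published test suites often rely on more careful criteria.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TestItem:
    category: str        # phenomenon being probed, e.g. "pronoun"
    source: str          # source sentence passed to the MT system
    accepted: List[str]  # target tokens that indicate a correct translation


def evaluate(items: List[TestItem], translate: Callable[[str], str]) -> Dict[str, float]:
    """Return the accuracy of `translate` for each category in `items`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        # Compare whole lowercased tokens so that "er" does not match inside "der".
        tokens = set(re.findall(r"\w+", translate(item.source).lower()))
        total[item.category] += 1
        if any(token.lower() in tokens for token in item.accepted):
            correct[item.category] += 1
    return {category: correct[category] / total[category] for category in total}


# Hypothetical English-German test items (assumed for illustration only).
ITEMS = [
    TestItem("pronoun", "The table is old. It is broken.", ["er"]),
    TestItem("pronoun", "The lamp is new. It is bright.", ["sie"]),
    TestItem("word sense", "He sat on the bank of the river.", ["ufer", "flussufer"]),
]

# Canned outputs standing in for a real MT system.
TOY_OUTPUTS = {
    "The table is old. It is broken.": "Der Tisch ist alt. Es ist kaputt.",
    "The lamp is new. It is bright.": "Die Lampe ist neu. Sie ist hell.",
    "He sat on the bank of the river.": "Er saß am Ufer des Flusses.",
}


def toy_system(source: str) -> str:
    """Look up a canned translation; a real system would be called here."""
    return TOY_OUTPUTS[source]


scores = evaluate(ITEMS, toy_system)
print(scores)  # {'pronoun': 0.5, 'word sense': 1.0}
```

In this toy run the system mistranslates one of the two pronouns but picks the right sense of "bank", a difference that a single similarity score would not reveal.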

Pronoun test suites:

Word sense disambiguation test suites:

Morphology test suites:

Published Oct. 6, 2023 10:29 - Last modified Oct. 9, 2023 12:41

Supervisor(s)

Yves Scherrer, Universitetet i Oslo
Vladislav Mikhailov, Universitetet i Oslo
