This project is no longer available

Dependency Syntax as a Foreign Language

Counterintuitive as it may seem, approaching grammatical analysis (e.g. dependency parsing) as an instance of Neural Machine Translation (NMT) has met with some recent success.

NMT has noticeably improved the output quality of on-line translation services in recent years, and the broad success of so-called encoder–decoder or sequence-to-sequence neural network architectures has inspired creative applications to other language processing problems.

Vinyals et al. (2015) show how syntactic parsing can be successfully modeled as a ‘translation’ problem between a natural-language utterance and a string-like serialization of a syntactic tree. Abstractly, this approach can be likened to ‘translating’ an arithmetic expression like 5 + 3 * 2 directly to the bit string 00001011 (representing the numeric value 11), rather than using a conventional evaluation model of hierarchically nested operators and operands. It is, to put it mildly, remarkable that this approach works reasonably well for syntactic structure.
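To make the analogy concrete, the following minimal Python sketch shows the conventional side of the comparison: it evaluates the expression over its nested operator structure and then renders the result as a bit string. The function names `evaluate` and `to_bit_string` and the 8-bit width are assumptions chosen for illustration. A ‘translation’ model would instead have to learn the mapping from the input characters to the output bits directly, without ever building the tree.

```python
import ast
import operator

# Supported binary operators (an illustrative subset).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def evaluate(node):
    """Recursively evaluate a parsed arithmetic expression tree."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported expression")

def to_bit_string(expression, width=8):
    """Evaluate hierarchically, then render the value as a zero-padded bit string."""
    value = evaluate(ast.parse(expression, mode="eval"))
    return format(value, f"0{width}b")

print(to_bit_string("5 + 3 * 2"))  # 00001011
```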

While the original proposal focussed on parsing into constituent-structure representations of syntax, it has recently been adapted with some success to parsing into semantic dependency graphs. A systematic exploration of the ‘parsing by translation’ approach for structurally simpler syntactic dependency trees, however, has yet to be performed. This project will build and train syntactic dependency parsers using the open-source OpenNMT toolkit and will evaluate the success of this approach for various frameworks of syntactic dependencies, different techniques for serializing a dependency tree as a string, and optionally languages other than English.
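As an illustration of what serializing a dependency tree as a string could look like, the sketch below encodes each token as its relative head offset plus its dependency label, and decodes such a string back into heads and labels. This is only one hypothetical scheme, not the one used by Vinyals et al. (2015) or prescribed by the project. The function names, the `offset:label` format, the use of offset 0 for the root, and the example labels are all assumptions.

```python
def serialize(heads, labels):
    """Encode a tree as one 'offset:label' item per token.

    heads are 1-based token positions, with 0 marking the root.
    The root is written with offset 0, which no other token can have.
    """
    items = []
    for position, (head, label) in enumerate(zip(heads, labels), start=1):
        offset = 0 if head == 0 else head - position
        items.append(f"{offset:+d}:{label}")
    return " ".join(items)

def deserialize(target):
    """Invert serialize(): recover absolute heads and labels."""
    heads, labels = [], []
    for position, item in enumerate(target.split(), start=1):
        offset_text, label = item.split(":", 1)
        offset = int(offset_text)
        heads.append(0 if offset == 0 else position + offset)
        labels.append(label)
    return heads, labels

# Hypothetical analysis of "She reads books": 'reads' is the root.
heads = [2, 0, 2]
labels = ["nsubj", "root", "obj"]
encoded = serialize(heads, labels)
print(encoded)  # +1:nsubj +0:root -1:obj
```

In a setup like this, the sentence would be the source side and the encoded string the target side of each training pair. Alternative serializations, such as absolute head indices or bracketed strings, trade off target vocabulary size against how easily malformed output can be detected.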

The topic is in principle suitable for group work and affords some flexibility in relative weight given to engineering, linguistic, or experimental perspectives.  Syntactico-semantic parsing and neural machine translation are focus areas in the Nordic Language Processing Laboratory (NLPL) project, so there may be possibilities for a part-time or summer job funded by NLPL to improve its parsing and translation infrastructures.

Published Oct. 18, 2018 13:25 - Last modified Dec. 7, 2022 14:54

Supervisor(s)

Student(s)

  • Alexandra Thuy-Lan Huynh

Scope (credits)

60