This thesis topic is no longer available

Context-Free Approximations of (Large) Unification Grammars

Unification-based grammatical frameworks (like HPSG or LFG) have gained popularity in Natural Language Processing because they enable a linguistically adequate account of the system of rules in natural languages. However, efficient parsing and generation with large unification-based grammars remains an active field of research, owing in part to the complexity of the descriptive formalism. At the same time, many existing unification-based grammars (UBGs) capture languages that are context-free or ‘nearly context-free’.

This project will adapt, implement, and experimentally validate methods for converting UBGs into context-free grammars (CFGs). The goal is either an equivalent grammar or one that approximates the original UBG to a certain degree, i.e. one that recognizes a superset of the original language and therefore also accepts some utterances that are ungrammatical according to the UBG. Finding the right degree of approximation is a balancing act, because including more information from the original UBG in the approximation can lead to exponential growth in the size of the CFG. At the same time, even a relatively crude CFG approximation may have practical value, for example as a ‘filter’ on full unification-based parsing (for improved efficiency), or as the backbone of probabilistic disambiguation and pruning (for better disambiguation).
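The trade-off described above can be illustrated with a small sketch. It is not the algorithm of Kiefer & Krieger (2004), only a toy under stated assumptions: the grammar, the lexicon, the single agreement feature `num`, and the helper names `approximate` and `recognizes` are all hypothetical, and "unification" is reduced to one shared feature value per rule. Dropping the feature gives a crude CFG that accepts a superset of the language; multiplying the feature values out gives a CFG equivalent to the toy grammar, at the cost of more rules.

```python
from itertools import product

# Hypothetical toy UBG: all categories in a rule share one value per feature
# (a stand-in for real unification constraints).
FEATURES = {"num": ["sg", "pl"]}
RULES = [("S", ("NP", "VP")), ("NP", ("Det", "N"))]
# Lexicon: word -> list of (category, features); None means underspecified.
LEXICON = {
    "the": [("Det", {"num": None})],
    "this": [("Det", {"num": "sg"})],
    "these": [("Det", {"num": "pl"})],
    "dog": [("N", {"num": "sg"})],
    "dogs": [("N", {"num": "pl"})],
    "barks": [("VP", {"num": "sg"})],
    "bark": [("VP", {"num": "pl"})],
}

def approximate(keep):
    """Build a CFG (binary rules, lexical rules) that keeps only the features in `keep`."""
    keep = sorted(keep)
    # One CFG symbol per category and per combination of kept feature values.
    assignments = [dict(zip(keep, vals))
                   for vals in product(*(FEATURES[f] for f in keep))]

    def sym(cat, a):
        return (cat,) + tuple(a[f] for f in keep)

    binary = {(sym(lhs, a), sym(r1, a), sym(r2, a))
              for lhs, (r1, r2) in RULES for a in assignments}
    lexical = set()
    for word, entries in LEXICON.items():
        for cat, feats in entries:
            for a in assignments:
                # An entry is compatible if each kept feature matches or is unspecified.
                if all(feats[f] in (None, a[f]) for f in keep):
                    lexical.add((sym(cat, a), word))
    return binary, lexical

def recognizes(cfg, sentence, start="S"):
    """CKY recognition for a CFG whose non-lexical rules are all binary."""
    binary, lexical = cfg
    words = sentence.split()
    n = len(words)
    chart = {}
    for i, w in enumerate(words):
        chart[i, i + 1] = {lhs for lhs, word in lexical if word == w}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            cell = set()
            for j in range(i + 1, k):
                for lhs, r1, r2 in binary:
                    if r1 in chart[i, j] and r2 in chart[j, k]:
                        cell.add(lhs)
            chart[i, k] = cell
    return n > 0 and any(s[0] == start for s in chart[0, n])

crude = approximate([])        # drops agreement: accepts a superset
exact = approximate(["num"])   # keeps agreement: equivalent to the toy UBG

for s in ["these dogs bark", "this dogs barks", "dog the barks"]:
    print(s, "| crude:", recognizes(crude, s), "| exact:", recognizes(exact, s))
print("binary rules:", len(crude[0]), "vs", len(exact[0]))
```

In this sketch the crude grammar still rejects word-order violations, so it could act as a cheap pre-filter, while the number of CFG rules grows with the product of the value ranges of the kept features. With many interacting features that product is what drives the exponential growth mentioned above.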

This work will take the algorithm of Kiefer & Krieger (2004) and the LinGO English Resource Grammar (ERG) as its points of departure. The project requires a good understanding of unification-based grammar, ideally some knowledge of English syntax, as well as good programming experience. Possible implementation languages are C++ (preferred) or Common Lisp. Please contact Stephan Oepen for details.

Published 14 March 2011 11:27 - Last modified 1 Dec. 2017 10:53

Supervisor(s)

Scope (credits)

60