Rodriguez, Kepa Joseba (2010) Resources for linguistically motivated Multilingual Anaphora Resolution. PhD thesis, University of Trento.
| PDF (PhD dissertation. Kepa J. Rodriguez) - Doctoral Thesis 2402Kb |
Abstract
A current trend in computational linguistics and natural language processing is the implementation of multilingual utilities for tasks such as information retrieval, summarization of documents in different languages, and machine translation, in all of which the resolution of anaphoric references plays a crucial role. This dissertation proposes an annotation scheme for creating corpus resources for linguistically based multilingual anaphora resolution. The scheme has been implemented for the annotation of English and Italian data. Inter-annotator agreement studies show that the annotation scheme is reliable. The annotated corpora have been used for the anaphora resolution task, and the results have been compared with those obtained on well-known corpora. Finally, hand-annotated linguistic features have been used to aid the anaphora resolution process. The results show that the proposed multilingual annotation scheme produces data useful for building anaphora resolution systems for languages with different grammatical and typological features, such as English and Italian.
Item Type: | Doctoral Thesis (PhD) |
---|---|
Doctoral School: | Cognitive and Brain Sciences |
PhD Cycle: | 23 |
Subjects: | Area 10 - Scienze dell'antichità, filologico-letterarie e storico-artistiche > L-LIN/01 GLOTTOLOGIA E LINGUISTICA |
Uncontrolled Keywords: | coreference, anaphora, corpus linguistics, corpus creation |
Funders: | Expert System |
Repository Staff approval on: | 18 Jan 2011 11:57 |