Extracting conceptual structures from multiple sources

Barbu, Eduard (2010) Extracting conceptual structures from multiple sources. PhD thesis, University of Trento.


This thesis extracts conceptual structures from multiple sources: Wordnet, Web Corpora and Wikipedia. The conceptual structures extracted from Wordnet and Web Corpora are inspired by the feature norm effort in cognitive psychology. The conceptual structure extracted from Wikipedia makes the transition between feature norm structures and theory-like structures. The main contributions of this thesis can be grouped into two categories:

1. Novel methods for the extraction of conceptual structures. More precisely, we developed three new methods:
   (a) Conceptual structure extraction from Wordnet. We devise a procedure for property extraction from Wordnet using the notion of semantic neighborhood. The procedure exploits the main relations organizing the nouns, the information in glosses and the principle of property inheritance.
   (b) Feature-norm-like extraction from corpora. We propose a method to acquire feature-norm-like structures from corpora using weakly supervised methods.
   (c) Conceptual structure from Wikipedia. We put forward a novel unsupervised method for the extraction of conceptual structures from the Wikipedia entries of similar concepts. The main idea we follow is that similar concepts (i.e. those classified under the same node in a taxonomy) are described in a comparable way in Wikipedia. Moreover, to understand the kind of information extracted from Wikipedia, we annotate this knowledge with a set of property types.
2. Evaluation. We evaluate Wordnet as a model of semantic memory and suggest the addition of new semantic relations. We also assess the properties extracted from all sources on a unified test set, in a clustering experiment.
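The inheritance-of-properties principle mentioned for the Wordnet procedure can be illustrated with a minimal sketch. It is not the procedure developed in the thesis: the taxonomy, the property sets, and the names HYPERNYM, PROPERTIES and inherited_properties are all hypothetical, chosen only to show how a concept can accumulate the properties of its ancestors in a noun hierarchy.

```python
# Toy illustration only: the taxonomy and properties below are invented,
# not taken from Wordnet or from the thesis.
HYPERNYM = {"dog": "canine", "canine": "mammal", "mammal": "animal"}

PROPERTIES = {
    "dog": {"barks"},
    "canine": {"has_fangs"},
    "mammal": {"has_fur"},
    "animal": {"breathes"},
}


def inherited_properties(concept):
    """Collect the properties of a concept and of all its ancestors."""
    collected = set()
    seen = set()  # guards against accidental cycles in the toy taxonomy
    current = concept
    while current is not None and current not in seen:
        seen.add(current)
        collected |= PROPERTIES.get(current, set())
        current = HYPERNYM.get(current)  # None once the root is reached
    return collected


if __name__ == "__main__":
    print(sorted(inherited_properties("dog")))
```

In this sketch a concept with no entry in either table simply yields an empty set. A real implementation would also have to draw on glosses and on the other noun relations named in the abstract.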

Item Type: Doctoral Thesis (PhD)
Doctoral School: Cognitive and Brain Sciences
PhD Cycle: XXII
Subjects: Area 01 - Scienze matematiche e informatiche > INF/01 INFORMATICA
Repository Staff approval on: 15 Apr 2010 11:59
