Concept Search: Semantics Enabled Information Retrieval

Kharkevich, Uladzimir (2010) Concept Search: Semantics Enabled Information Retrieval. PhD thesis, University of Trento.

[img]
Preview
PDF - Doctoral Thesis
946Kb

Abstract

The goal of information retrieval (IR) is to map a natural language query, which specifies the user information needs, to a set of objects in a given collection, which meet these needs. Historically, there have been two major approaches to IR that we call syntactic IR and semantic IR. In syntactic IR, search engines use words or multi-word phrases that occur in document and query representations. The search procedure, used by these search engines, is principally based on the syntactic matching of document and query representations. The precision and recall achieved by these search engines might be negatively affected by the problems of (i) polysemy, (ii) synonymy, (iii) complex concepts, and (iv) related concepts. Semantic IR is based on fetching document and query representations through a semantic analysis of their contents using natural language processing techniques and then retrieving documents by matching these semantic representations. Semantic IR approaches are developed to improve the quality of syntactic approaches but, in practice, results of semantic IR are often inferior to that of syntactic one. In this thesis, we propose a novel approach to IR which extends syntactic IR with semantics, thus addressing the problem of low precision and low recall of syntactic IR. The main idea is to keep the same machinery which has made syntactic IR so successful, but to modify it so that, whenever possible (and useful), syntactic IR is substituted by semantic IR, thus improving the system performance. As instances of the general approach, we describe the semantics enabled approaches to: (i) document retrieval, (ii) document classification, and (iii) peer-to-peer search.

Item Type:Doctoral Thesis (PhD)
Doctoral School:Information and Communication Technology
PhD Cycle:XXI
Subjects:Area 01 - Scienze matematiche e informatiche > INF/01 INFORMATICA
Repository Staff approval on:05 May 2010 11:54

Repository Staff Only: item control page