Bortoli, Stefano (2013) Knowledge Based Open Entity Matching. PhD thesis, University of Trento.
Available under License Creative Commons Attribution Non-commercial.
In this work we argue for the definition of a knowledge-based entity matching framework for the implementation of a reliable and incrementally scalable solution. Such a knowledge base is formed by an ontology and a set of entity matching rules suitable to be applied as a reliable equational theory in the context of the Semantic Web. In particular, we are going to prove that, by relying on the existence of a set of contextual mappings to ease the semantic heterogeneity characterizing descriptions on the Web, a knowledge-based solution can perform comparably to, and sometimes better than, existing state-of-the-art solutions. We further argue that a knowledge-based solution to the open entity matching problem ought to be considered under the open world assumption, as in some cases the descriptions to be matched may not contain the information necessary to make an accurate matching decision. The main goal of this work is to show how the proposed framework is suitable for pursuing a reliable solution to the entity matching problem, regardless of the set of rules for the ontology adopted. In fact, we believe that the structural and syntactic heterogeneity affecting data on the Web undermines the definition of a unique global solution. However, we argue that a knowledge-driven approach, considering the semantic and meta-properties of compared attributes, can provide important benefits and lead to more reliable solutions. To achieve this goal, we are going to run several experiments to evaluate different sets of rules, testing our thesis and learning important lessons for future developments.
The sets of rules that we will consider to bootstrap the solution proposed in this work are the result of diverse complementary processes: first, we want to investigate whether capturing, through machine learning techniques, the matching knowledge that people employ when making entity matching decisions can produce an effective set of rules (bottom-up strategy); second, we investigate the application of formal ontology tools to analyze the features defined in the ontology and support the definition of entity matching rules (top-down strategy). Moreover, in this work we argue that by merging the rules resulting from these complementary processes, we can define a set of rules that can reliably support entity matching decisions in an open context.
|Item Type:||Doctoral Thesis (PhD)|
|Doctoral School:||Information and Communication Technology|
|Subjects:||Area 01 - Scienze matematiche e informatiche > INF/01 INFORMATICA|
|Uncontrolled Keywords:||semantic web, entity matching, knowledge-base, open world assumption, entity name system|
|Repository Staff approval on:||10 Jun 2013 15:49|