Advanced Query Paradigms for the Novice User

Mottin, Davide (2015) Advanced Query Paradigms for the Novice User. PhD thesis, University of Trento.



Query answering is one of the most important processes in search systems, for it connects users to the information stored in data sources. A query is a set of specifications or constraints that the user provides to describe the objects of interest. As such, answering a query means retrieving those objects from the data source that match the user constraints. An answer from a search system may not fully satisfy the user. This happens if the answer does not contain the required object or if it contains a number of irrelevant results. Commonly, the user does not know how to describe the query and ends up with one that is overly generic or overly specific or, even worse, she is not even aware of the correct conditions to describe the expected results. These problems are particularly evident when the database is interrogated by a novice user who, by definition, does not have sufficient technological skills to understand complicated query languages, or who simply gives up if the system does not respond properly or in a timely manner.

In this dissertation, we focus on three common problems in the broad query answering process to help novice users find the correct answers when the system does not provide sufficient support. First, we look at the empty answer problem, where the user provides a very specific query for which no answer exists in the database. In particular, we concentrate on interactive approaches for novice users. We end up with a rich probabilistic framework that includes user preferences and smoothly guides the user towards the most likely answers by means of simple yes/no questions on the query conditions to be discarded.

Second, we analyze the information overload problem, which is complementary to the first. In this case the user provides an overly generic query that returns a large set of potentially irrelevant results. We tackle the problem in structured databases and, more specifically, in labeled graphs. The solution we propose returns a set of refinements (i.e., more specific queries) of the input query that, once executed, cover all the initial results.

Third, we propose and study a completely novel query paradigm that assumes that the user is not able to describe the query conditions to retrieve the objects of interest. In this regard, we introduce exemplar queries, which allow the user to specify a single element in the result set and let the system infer the others. We provide clear semantics and a solution that works in large knowledge graphs.

Finally, we validate the solutions for the three problems both in terms of theoretical results and experimental evaluation, and we prove that the proposed methods efficiently scale up to very large real datasets.
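As a rough illustration of the empty answer problem described in the abstract, the following minimal Python sketch enumerates the smallest sets of query conditions that must be discarded before a conjunctive query returns any result. It is not the probabilistic framework of the thesis: the toy data, the attribute names, and the functions `answer` and `minimal_relaxations` are all hypothetical, and the exhaustive enumeration is only practical for a handful of conditions.

```python
from itertools import combinations

# Hypothetical toy catalogue; attribute names and values are illustrative only.
CARS = [
    {"make": "Fiat", "color": "red", "doors": 3},
    {"make": "Fiat", "color": "blue", "doors": 5},
    {"make": "Audi", "color": "red", "doors": 5},
]


def answer(conditions, rows):
    """Return the rows that satisfy every attribute=value condition."""
    return [r for r in rows if all(r.get(k) == v for k, v in conditions.items())]


def minimal_relaxations(conditions, rows):
    """Return the smallest sets of conditions whose removal gives a non-empty answer."""
    keys = sorted(conditions)
    # Try dropping 0 conditions, then 1, then 2, ... and stop at the first size that works.
    for n_dropped in range(len(keys) + 1):
        found = []
        for dropped in combinations(keys, n_dropped):
            kept = {k: v for k, v in conditions.items() if k not in dropped}
            if answer(kept, rows):
                found.append(set(dropped))
        if found:
            return found
    return []


# An overly specific query: no red five-door Fiat exists in the toy catalogue.
query = {"make": "Fiat", "color": "red", "doors": 5}
relaxations = minimal_relaxations(query, CARS)
```

In an interactive setting, each candidate in `relaxations` could be put to the user as a yes/no question ("are you willing to drop the condition on color?"). How to order such questions using user preferences is the part the thesis addresses and this sketch does not.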

Item Type: Doctoral Thesis (PhD)
Doctoral School: Information and Communication Technology
PhD Cycle: 26
Subjects: Area 01 - Scienze matematiche e informatiche > INF/01 INFORMATICA
Repository Staff approval on: 16 Jun 2015 09:59
