Active learning methods for classification and regression problems

Pasolli, Edoardo (2011) Active learning methods for classification and regression problems. PhD thesis, University of Trento.

PDF - Doctoral Thesis
5Mb

Abstract

In the pattern recognition community, one of the most critical problems in the design of supervised classification and regression systems is the quality and quantity of the available training samples (ground-truth). This problem is particularly important in applications where collecting training samples is an expensive and time-consuming task subject to different sources of error. Active learning is an approach proposed in the literature to address the problem of ground-truth collection: training samples are selected iteratively in order to minimize both the number of samples involved and the intervention of human users. In this thesis, new active learning methodologies for classification and regression problems are proposed and applied in three main application fields: remote sensing, biomedicine, and chemometrics. In particular, the proposed methodological contributions include: i) three strategies for the support vector machine (SVM) classification of electrocardiographic signals; ii) a strategy for SVM classification in the context of remote sensing images; iii) the combination of spectral and spatial information in the context of active learning for remote sensing image classification; iv) the exploitation of active learning to solve the problem of covariate shift, which may occur when a classifier trained on a portion of the image is applied to the rest of the image; v) several regression strategies for estimating biophysical parameters from remote sensing data; vi) several regression strategies for estimating chemical concentrations from spectroscopic data; vii) a framework for assisting a human user in the design of a ground-truth for classifying a given optical remote sensing image. Experiments conducted on simulated and real data sets are reported and discussed. They all suggest that, despite their complexity, ground-truth collection problems can be tackled satisfactorily by the proposed approaches.
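
The iterative sample selection described in the abstract can be illustrated with a minimal pool-based sketch. This is not the method of the thesis: the nearest-centroid classifier, the margin-based query rule, the one-dimensional toy data, and all function names (`fit_centroids`, `margin`, `predict`, `active_learning`) are assumptions chosen only to keep the example short and free of external dependencies. The thesis itself works with SVMs and Gaussian processes.

```python
def fit_centroids(labeled):
    """Toy classifier (assumption): mean feature value per class."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def margin(centroids, x):
    """Gap between the two smallest centroid distances; small means uncertain."""
    d = sorted(abs(x - c) for c in centroids.values())
    return d[1] - d[0]


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def active_learning(pool, oracle, seed_idx, budget):
    """Pool-based loop: retrain, query the most uncertain sample, label it.

    `oracle` stands in for the human user who provides labels.
    The seed samples must cover at least two classes.
    """
    seed = set(seed_idx)
    labeled = [(pool[i], oracle(pool[i])) for i in seed_idx]
    unlabeled = [x for i, x in enumerate(pool) if i not in seed]
    for _ in range(budget):
        model = fit_centroids(labeled)
        query = min(unlabeled, key=lambda x: margin(model, x))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return fit_centroids(labeled), labeled


# Hypothetical toy problem: points in [0, 1], true class boundary at 0.6.
pool = [i / 50 for i in range(51)]
oracle = lambda x: 1 if x >= 0.6 else 0
model, labeled = active_learning(pool, oracle, seed_idx=[0, 50], budget=5)
```

In this sketch only seven labels are requested from the oracle (two seeds plus five queries), and each query is placed where the current model is least certain, which is the general idea behind reducing labeling effort.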

Item Type: Doctoral Thesis (PhD)
Doctoral School: Information and Communication Technology
PhD Cycle: XXIV
Subjects: Area 09 - Ingegneria industriale e dell'informazione > ING-INF/03 TELECOMUNICAZIONI
Uncontrolled Keywords: Active learning, classification, electrocardiographic signals, Gaussian processes, ground-truth, regression, remote sensing, spectrometric data analysis, support vector machines
Repository Staff approval on: 24 Nov 2011 11:27
