Multimodal Distributional Semantics

Bruni, Elia (2013) Multimodal Distributional Semantics. PhD thesis, University of Trento.


Although it is a very simple statement, the distributional hypothesis (namely, that words occurring in similar contexts are semantically similar) serves as the main assumption of many computational linguistic techniques. This is mostly because it makes it possible to construct a representation of word meaning easily and automatically from a large textual input. Among the corpus-based computational linguistic techniques that adopt the distributional hypothesis, distributional semantic models (DSMs) have been shown to be very effective in many semantics-related tasks. DSMs approximate word meaning with vectors that keep track of the patterns of co-occurrence of words in the processed corpora. In addition, DSMs have been shown to be a plausible computational model of human concept cognition, since they are able to simulate several psychological phenomena. Despite their success, one of their strongest limitations is that they represent word meaning entirely in terms of connections with other words. Cognitive scientists have argued that, in this way, DSMs neglect the fact that humans also rely on non-verbal experience and have access to rich sources of perceptual knowledge when they learn the meaning of words. In this work, the lack of perceptual grounding of distributional models is addressed by exploiting computer vision techniques that automatically identify discrete "visual words" in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of the images it is associated with. A flexible architecture for integrating text- and image-based distributional information is introduced and tested in a set of empirical evaluations, showing that an integrated model is superior to a purely text-based approach and that it provides somewhat complementary semantic information with respect to the latter.
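
The idea in the abstract can be illustrated with a minimal sketch. It is not the architecture of the thesis, which this page does not describe in detail. It assumes a toy corpus, made-up visual-word counts (the labels "vw1" to "vw3" are hypothetical), and a simple weighted concatenation of normalized text and image vectors controlled by a hypothetical parameter alpha.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words that occur within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def l2_normalize(vec):
    """Scale a sparse vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec.values()))
    return {k: x / norm for k, x in vec.items()} if norm else dict(vec)


def combine(text_vec, visual_vec, alpha=0.5):
    """Assumed fusion scheme: weighted concatenation of the two normalized vectors."""
    combined = {("text", k): alpha * x for k, x in l2_normalize(text_vec).items()}
    combined.update(
        {("visual", k): (1 - alpha) * x for k, x in l2_normalize(visual_vec).items()}
    )
    return combined


# Toy corpus (assumption, for illustration only).
sentences = [
    "the cat chased the mouse".split(),
    "the dog chased the cat".split(),
    "the car needs fuel".split(),
]

# Made-up counts of visual words in images associated with each word.
visual_counts = {
    "cat": {"vw1": 5, "vw2": 1},
    "dog": {"vw1": 4, "vw2": 2},
    "car": {"vw3": 6, "vw2": 1},
}

text_vectors = cooccurrence_vectors(sentences)
multimodal = {w: combine(text_vectors[w], visual_counts[w]) for w in visual_counts}

for a, b in [("cat", "dog"), ("cat", "car")]:
    print(a, b,
          "text:", round(cosine(text_vectors[a], text_vectors[b]), 3),
          "multimodal:", round(cosine(multimodal[a], multimodal[b]), 3))
```

With alpha = 0.5 the two channels contribute equally; moving alpha toward 1 gives more weight to the textual co-occurrences and toward 0 to the visual words.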

Item Type: Doctoral Thesis (PhD)
Doctoral School: Cognitive and Brain Sciences
PhD Cycle: 26
Subjects: Area 10 - Scienze dell'antichità, filologico-letterarie e storico-artistiche > L-LIN/01 GLOTTOLOGIA E LINGUISTICA
Repository Staff approval on: 29 Nov 2013 14:36
