Celli, Fabio (2012) Adaptive Personality Recognition from Text. PhD thesis, University of Trento.
PDF (Adaptive Personality Recognition from Text) - Doctoral Thesis
Available under License Creative Commons Attribution Non-commercial Share Alike.
We address the issue of domain adaptation for automatic Personality Recognition from Text (PRT). The PRT task consists of classifying the personality traits of authors, given pieces of text they wrote. The purpose of our work is to improve current approaches to PRT in order to extract personality information from social network sites, which is a challenging task. We argue that current approaches, based on supervised learning, have several limitations when adapted to the social network domain, mainly due to 1) difficulties in data annotation, 2) overfitting, 3) lack of domain adaptability and 4) multilinguality issues. We propose and test a new approach to PRT, which we call Adaptive Personality Recognition (APR). We argue that this new approach solves domain adaptability problems and is suitable for application in social network sites. We start with an introduction that covers the background knowledge required for understanding PRT. It includes topics such as personality, the Big5 factor model, the sets of correlations between language features and personality traits, and a brief survey of learning approaches that also covers feature selection and domain adaptation. We also provide an overview of the state of the art in PRT and outline the problems we see in applying PRT to the social network domain. Our APR approach is based on 1) an external model: a set of features/correlations between language and Big5 personality traits (taken from the literature); 2) an adaptive strategy, which makes the model fit the distribution of the features in the dataset at hand before generating personality hypotheses; 3) an evaluation strategy, which compares all the hypotheses generated for each single text of each author, computing confidence scores.
This allows domain adaptation, semi-supervised learning and the automatic extraction of patterns associated with personality traits, which can be added to the initial correlation set, thus combining top-down and bottom-up approaches. The main contributions of our approach to research in the field of PRT are: 1) the possibility of running top-down PRT from models taken from the literature, adapting them to new datasets; 2) the definition of a small, language-independent and resource-free feature/correlation set, tested on Italian and English; 3) the possibility of integrating top-down and bottom-up PRT strategies, allowing the enrichment of the initial feature/correlation set from the dataset at hand; 4) the development of a system for APR that does not require large labeled datasets for training, but just a small one for testing, minimizing the data annotation problem. Finally, we describe some applications of APR to the analysis of personality in online social network sites, reporting results and findings. We argue that the APR approach is very useful for Social Network Analysis, social marketing, opinion mining, sentiment analysis, mood detection and related fields.
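The three components named in the abstract (an external correlation model, an adaptive strategy, and an evaluation strategy) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and everything concrete in it is an assumption made for illustration: the three toy features, the signs in `CORRELATIONS`, the use of the dataset mean as the adaptation threshold, the `y`/`n`/`o` labels, and majority agreement as the confidence score.

```python
from collections import Counter
from statistics import mean

# Hypothetical external model: feature -> {trait: sign of the correlation}.
# Placeholder entries only; they are NOT the correlation set used in the thesis.
CORRELATIONS = {
    "exclamations": {"extraversion": +1},
    "negations": {"agreeableness": -1},
    "avg_word_len": {"openness": +1, "extraversion": -1},
}
TRAITS = sorted({t for m in CORRELATIONS.values() for t in m})


def extract_features(text):
    """Compute simple, resource-free surface features for one text."""
    words = text.split()
    n = max(len(words), 1)
    negs = sum(w.lower().strip(".,!?") in {"not", "no", "never"} for w in words)
    return {
        "exclamations": text.count("!") / n,
        "negations": negs / n,
        "avg_word_len": sum(len(w) for w in words) / n,
    }


def dataset_means(texts):
    """Adaptive step (assumed): use each feature's mean over the dataset as threshold."""
    feats = [extract_features(t) for t in texts]
    return {name: mean(f[name] for f in feats) for name in CORRELATIONS}


def hypothesis(text, means):
    """One label per trait: 'y' (high), 'n' (low) or 'o' (undecided)."""
    feats = extract_features(text)
    scores = Counter()
    for name, traits in CORRELATIONS.items():
        # Direction of the feature relative to the dataset at hand.
        direction = (feats[name] > means[name]) - (feats[name] < means[name])
        for trait, sign in traits.items():
            scores[trait] += sign * direction
    return {t: "y" if scores[t] > 0 else "n" if scores[t] < 0 else "o" for t in TRAITS}


def author_profile(texts, means):
    """Evaluation step (assumed): majority label per trait, with the share of
    the author's texts that agree on it as a confidence score."""
    hyps = [hypothesis(t, means) for t in texts]
    profile = {}
    for trait in TRAITS:
        label, count = Counter(h[trait] for h in hyps).most_common(1)[0]
        profile[trait] = (label, count / len(hyps))
    return profile


if __name__ == "__main__":
    corpus = {
        "a": ["Great day! Wow! Yes!", "So fun! Love it!"],
        "b": ["I do not know.", "No, never again."],
    }
    means = dataset_means([t for ts in corpus.values() for t in ts])
    for author, texts in corpus.items():
        print(author, author_profile(texts, means))
```

Because the thresholds come from the dataset rather than from labeled training data, the same correlation set can be reapplied to a new corpus by recomputing `dataset_means`, which is the sense in which this sketch is "adaptive".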
Item Type: Doctoral Thesis (PhD)
Doctoral School: Cognitive and Brain Sciences
Subjects: Area 09 - Ingegneria industriale e dell'informazione > ING-INF/05 SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI; Area 11 - Scienze storiche, filosofiche, pedagogiche e psicologiche > M-PSI/05 PSICOLOGIA SOCIALE
Repository Staff approval on: 11 Dec 2012 16:38