Predictive Modeling of Human Behavior: Supervised Learning from Telecom Metadata

Bogomolov, Andrey (2017) Predictive Modeling of Human Behavior: Supervised Learning from Telecom Metadata. PhD thesis, University of Trento.

PDF (Predictive Modeling of Human Behavior: Supervised Learning from Telecom Metadata) - Doctoral Thesis
Available under License Creative Commons Attribution.

Big data, and telecom metadata in particular, opens new opportunities for understanding human behavior by applying machine learning and big data processing methods combined with interdisciplinary knowledge of human behavior. This thesis develops new methods for predictive modeling of human behavior from anonymized telecom metadata at the individual level and at the large-scale group level, in work carried out during research projects held in 2012-2016 in collaboration with Telecom Italia, Telefonica Research, MIT Media Lab and the University of Trento. It is shown that human dynamics patterns can be reliably recognized from human behavior metrics derived from mobile phone and cellular network activity (call log, SMS log, Bluetooth interactions, internet consumption). At the individual level, the results are validated on two use cases: detecting daily stress and estimating subjective happiness. An original approach is introduced for feature extraction, feature selection, recognition model training and validation. Experimental results based on ensemble stochastic classification and regression tree models are discussed. At the large group level, following the big data for social good challenges, the problem of crime hotspot prediction is formulated and solved. The proposed approach uses demographic information along with human mobility characteristics derived from anonymized and aggregated mobile network data. The models, built on and evaluated against real crime data from London, achieve an accuracy of almost 70% when classifying whether a specific area of the city will be a crime hotspot in the following month.
Electric energy consumption patterns are correlated with human behavior patterns in a highly nonlinear way. The second large-scale group behavior prediction result is formulated as predicting next-week energy consumption from human dynamics analysis of anonymized and aggregated telecom data, processed from GSM network call detail records (CDRs). The proposed solution could serve energy producers and distributors as an essential complement to smart meter data, supporting better decisions in three areas: reducing total primary energy consumption by limiting energy production when demand is not predicted, reducing energy distribution costs through efficient buy-side planning in time, and providing insights for peak load planning in geographic space. All the experimental results studied share the introduced methodology, which is efficient to implement for most multimedia and real-time applications owing to its highly reduced, low-dimensional feature space and reduced machine learning pipelines. The indicators with strong predictive power are also discussed, opening new horizons for computational social science studies.
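
As an illustration of what "human behavior metrics derived from a call log" can look like, the following is a minimal sketch in Python. It is not the thesis's feature set or pipeline: the record layout, the sample data, and the names `CALL_LOG`, `contact_entropy` and `behavior_metrics` are all assumptions made for this example. It aggregates raw call records into a small per-user feature vector of the kind that could then be passed to a supervised model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, anonymized call-log records: (caller_id, contact_id, duration_seconds).
# The field layout and values are assumptions for illustration, not the thesis's data schema.
CALL_LOG = [
    ("u1", "a", 60),
    ("u1", "a", 120),
    ("u1", "b", 30),
    ("u1", "c", 90),
    ("u2", "d", 300),
    ("u2", "d", 100),
]


def contact_entropy(contacts):
    """Shannon entropy (in bits) of how a user's calls are spread over contacts."""
    counts = Counter(contacts)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def behavior_metrics(call_log):
    """Aggregate raw call records into a small per-user feature dictionary."""
    by_user = defaultdict(list)
    for caller, contact, duration in call_log:
        by_user[caller].append((contact, duration))

    metrics = {}
    for user, calls in by_user.items():
        contacts = [contact for contact, _ in calls]
        durations = [duration for _, duration in calls]
        metrics[user] = {
            "n_calls": len(calls),
            "n_contacts": len(set(contacts)),
            "mean_duration": sum(durations) / len(durations),
            "contact_entropy": contact_entropy(contacts),
        }
    return metrics


FEATURES = behavior_metrics(CALL_LOG)
```

Each entry of `FEATURES` is a low-dimensional description of one user. A user who spreads calls over several contacts gets a higher `contact_entropy` than one who always calls the same contact.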

Item Type: Doctoral Thesis (PhD)
Doctoral School: Information and Communication Technology
PhD Cycle: 28
Subjects: Area 01 - Scienze matematiche e informatiche > INF/01 INFORMATICA
Uncontrolled Keywords: telecom metadata, call detail records, supervised learning, individual characteristics, affective states, crime hotspots, electric energy consumption, human behavior understanding, social good, human dynamics, big data, machine learning, artificial intelligence
Repository Staff approval on: 20 Apr 2017 15:05