Machine Learning Methods for Urban Computing

Barlacchi, Gianni (2019) Machine Learning Methods for Urban Computing. PhD thesis, University of Trento.

PDF - Disclaimer (328Kb)
PDF - Doctoral Thesis (16Mb)

Abstract

The world population is increasingly moving from rural areas to urban centers, making large cities densely populated. Urban areas offer greater access to work, a wide variety of options for education and training, ease of transport and an abundance of attractive places within a few kilometers. Across huge cities, people tend to move more, and have to do so faster, than in the past. On the other hand, heavy traffic, overbuilding and changes in the urban lifestyle can cause several new problems such as noise, atmospheric pollution (e.g., smog) and severe traffic congestion. However, the rise of novel data sources and machine learning techniques can help to tackle such problems and improve the quality of life of citizens. Indeed, in a smart city environment, the huge amount of data generated daily can be captured by sensors, actuators, and mobile devices. Using such data opens the door to several applications, including forecasting of urban displacements, land use classification and event detection in an urban environment. Motivated by these opportunities, Urban Computing (UC) leverages heterogeneous data sources and applies machine learning techniques to tackle the big challenges that modern cities are facing. In this perspective, one of the core questions when designing UC systems is how to enable models to learn from different urban data sources and thus how to represent urban spaces. The mainstream approach is to represent input objects as feature vectors that encode several aspects of the urban environment such as the presence of people, density of urban activities, and mobility flows. However, manual feature engineering is tedious and can be extremely complex, time-consuming and domain-specific.
Additionally, it can become even more complex when aggregating multiple geographical data sources such as points of interest, administrative boundaries, and mobility data. A valid alternative to feature-based methods is to use kernels: functions that, given a representation of the input objects, implicitly map them into a high-dimensional space where a large number of features are generated, allowing more powerful and robust discriminative decision functions to be learned. In this way the effort for the feature engineering process can be greatly reduced. Kernel methods have been widely applied in Natural Language Processing on tasks such as question answering, semantic role labeling and even solving linguistic games. Taking inspiration from these successful cases, in this thesis we adapt kernel learning for solving novel tasks in UC. First, we focus on the problem of aggregating multiple urban data sources to provide datasets that fuse knowledge from a wide variety of data sources. Next, we focus on the problem of designing an input structure that is representative of urban space. In particular, we propose to model urban areas with tree structures that are fed to tree kernel functions to automatically generate expressive features. We propose several urban space representations that proved very effective in solving novel urban computing tasks such as land use classification and next location prediction in human mobility. Then, by applying a mining algorithm, we enabled the interpretation of urban zones, providing help in the difficult problem of understanding the high-level urban characteristics of a city. In fact, our mined substructures help in identifying the different urban nature of cities.
Finally, we explore the application of machine learning models to novel urban data sources by solving innovative tasks such as predicting the future presence of influenza-like symptoms from people's mobility behaviors.
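
As an illustration of the tree kernel idea mentioned in the abstract, the following is a minimal sketch of a subtree-counting kernel in Python. It is not the kernel or the urban representation used in the thesis: the tree encoding (a `(label, children)` tuple), the function names, and the toy zones with their category labels are all assumptions made for this example. The kernel value is the dot product of two implicit feature vectors, where each feature counts how often one complete subtree occurs in a tree, so no feature is designed by hand.

```python
from collections import Counter

# Hypothetical encoding: a tree is (label, children), children is a tuple of trees.

def _signature(tree, bag):
    """Return a canonical string for the subtree rooted here and count it in bag."""
    label, children = tree
    sig = "(" + label + "".join(_signature(c, bag) for c in children) + ")"
    bag[sig] += 1
    return sig

def subtree_counts(tree):
    """Count every complete subtree of the tree, keyed by its canonical string."""
    bag = Counter()
    _signature(tree, bag)
    return bag

def subtree_kernel(t1, t2):
    """Dot product of the two subtree-count vectors (an implicit feature space)."""
    c1, c2 = subtree_counts(t1), subtree_counts(t2)
    return sum(n * c2[sig] for sig, n in c1.items())

# Toy urban zones described by a made-up hierarchy of activity categories.
zone_a = ("zone", (
    ("food", (("restaurant", ()), ("cafe", ()))),
    ("shop", (("bakery", ()),)),
))
zone_b = ("zone", (
    ("food", (("restaurant", ()), ("cafe", ()))),
    ("education", (("school", ()),)),
))

# The zones share three complete subtrees: restaurant, cafe, and the food branch.
print(subtree_kernel(zone_a, zone_b))  # 3
```

A kernel machine such as an SVM only needs these pairwise values, which is why the explicit feature vectors never have to be written out. Child order matters in this sketch; an order-insensitive variant would sort the child signatures before joining them.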

Item Type: Doctoral Thesis (PhD)
Doctoral School: Information and Communication Technology
PhD Cycle: 30
Subjects: Area 01 - Scienze matematiche e informatiche > INF/01 INFORMATICA
Funders: TIM Telecom Italia
Repository Staff approval on: 29 Apr 2019 13:41
