Publication Type

PhD Dissertation

Publication Date



This thesis proposes a general solution framework that integrates methods in machine
learning in creative ways to solve a diverse set of problems arising in urban environments.
It particularly focuses on modeling spatiotemporal data for the purpose of predicting
urban phenomena. Concretely, the framework is applied to solve three specific real-world
problems: human mobility prediction, traffic speed prediction and incident prediction.
For human mobility prediction, I use visitor trajectories collected at a large theme park
in Singapore as a simplified microcosm of an urban area. A trajectory is an ordered
sequence of attraction visits and corresponding timestamps produced by a visitor. This
problem has two related subproblems: (spatial) bundle prediction and trajectory prediction.
In the first problem, I apply the framework to predict a bundle (i.e., an unordered
set) of attractions that a given visitor would visit given a time budget. In the second
problem, the framework is applied to predict the visitor's actual trajectory given the
current partial trajectory and time budget. In both problems, I combine trajectory
clustering, hidden Markov models, revealed preference learning and (inverse)
reinforcement learning within the integrated framework.
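The two prediction targets defined above differ only in whether visit order is kept. The following is a minimal sketch of that distinction, not the thesis's implementation: the names `Visit`, `within_budget` and `to_bundle` are hypothetical, and measuring the time budget in minutes from the first visit is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Visit:
    attraction: str   # hypothetical attraction identifier
    timestamp: float  # assumed unit: minutes since the park opened


def within_budget(trajectory: List[Visit], time_budget: float) -> List[Visit]:
    """Return the ordered visits that start within `time_budget` of the first visit."""
    if not trajectory:
        return []
    ordered = sorted(trajectory, key=lambda v: v.timestamp)
    start = ordered[0].timestamp
    return [v for v in ordered if v.timestamp - start <= time_budget]


def to_bundle(trajectory: List[Visit]) -> FrozenSet[str]:
    """Collapse a trajectory into a bundle: an unordered set of attractions."""
    return frozenset(v.attraction for v in trajectory)


if __name__ == "__main__":
    traj = [Visit("A", 0.0), Visit("B", 30.0), Visit("A", 70.0), Visit("C", 200.0)]
    kept = within_budget(traj, time_budget=120.0)
    print([v.attraction for v in kept])  # ordered trajectory, repeats preserved
    print(sorted(to_bundle(kept)))       # bundle: order and repeats discarded
```

A bundle predictor would be scored against the output of `to_bundle`, whereas a trajectory predictor would be scored against the ordered list itself.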
In traffic speed prediction, I wish to predict the spatiotemporal distribution of traffic
speed over urban road networks. To this end, I propose local Gaussian processes, which
combine non-negative matrix factorization (NMF) with Gaussian processes (GP) in order
to make model training efficient enough for the solution to be deployed
in real-time use cases. Here, NMF essentially serves as a spatiotemporal clustering technique. The
solution is extensively evaluated using real-world traffic data collected in two U.S. cities.
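The idea of using NMF to cluster the data and then training one small GP per cluster can be sketched as follows. This is an illustration under stated assumptions, not the model developed in the thesis: it assumes a non-negative speed matrix with road segments as rows and time steps as columns, uses the standard multiplicative-update NMF, assigns each segment to its dominant NMF component, and fits a simple RBF-kernel GP over time to each cluster's mean speed. All function names and hyperparameter values are hypothetical.

```python
import numpy as np


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Approximate V (segments x time) as W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def rbf(a, b, length_scale=2.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of time indices."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_predict(t_train, y_train, t_test, noise=1e-2):
    """Posterior mean of a GP with constant prior mean and RBF kernel."""
    mean = y_train.mean()
    K = rbf(t_train, t_train) + noise * np.eye(len(t_train))
    alpha = np.linalg.solve(K, y_train - mean)
    return mean + rbf(t_test, t_train) @ alpha


def local_gp_forecast(V, rank, t_test):
    """Cluster segments by dominant NMF component, then fit one GP per cluster."""
    W, _ = nmf(V, rank)
    labels = W.argmax(axis=1)
    t_train = np.arange(V.shape[1], dtype=float)
    out = np.empty((V.shape[0], len(t_test)))
    for k in np.unique(labels):
        members = labels == k
        # Each local GP sees only its own cluster's mean speed series.
        out[members] = gp_predict(t_train, V[members].mean(axis=0), t_test)
    return labels, out


if __name__ == "__main__":
    # Made-up speeds (four segments, six time steps), for illustration only.
    V = np.array([[60, 58, 55, 50, 52, 57],
                  [61, 59, 54, 49, 51, 58],
                  [30, 28, 25, 20, 22, 27],
                  [31, 29, 24, 21, 23, 26]], dtype=float)
    labels, forecast = local_gp_forecast(V, rank=2, t_test=np.array([6.0]))
    print(labels, forecast.ravel())
```

The efficiency argument is that exact GP training scales cubically with the number of training points, so several small local models are cheaper to fit than a single GP over all the data.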
The incident prediction problem is about predicting the distribution of the number of
crime incidents over urban areas in future time periods. Because of its similarity to
the traffic prediction problem above, its solution greatly benefits from the GP model
developed earlier. In particular, the GP kernel function is inherited and extended to
model the distribution of incidents in urban areas and their features. The proposed
solution is evaluated using real-world incident data collected in a large Asian city.
Conceptually, this thesis uses machine learning techniques to solve three separate urban
problems, and its contribution falls within the broad field of urban computing. At
the core, its technical contribution lies in the unification of separate solutions tailored
to those problems into an integrated framework that reasons with spatiotemporal data
and, thus, is highly generalizable to other problems of similar nature.


framework, machine learning, spatiotemporal, geospatial, reinforcement learning, Gaussian process

Degree Awarded

PhD in Information Systems


Databases and Information Systems | Environmental Design


LAU, Hoong Chuin

Copyright Owner and License

Singapore Management University

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.