Publication Type

PhD Dissertation

Version

publishedVersion

Publication Date

5-2023

Abstract

Capturing and modeling relationship networks consisting of entity nodes and attributes associated with these nodes is an important research topic in network or graph learning. In this dissertation, we focus on modeling an important class of networks present in many real-world domains. These networks involve i) attributes from multiple modalities, also known as multimodal attributes; ii) multimodal attributes that are not static but take the form of time series, i.e., dynamic multimodal attributes; and iii) relationships that evolve across time, i.e., dynamic networks. We refer to such networks as dynamic multimodal networks in this dissertation.

An example of a static multimodal network is one that consists of user interface (UI) design objects (e.g., UI element nodes, UI screen nodes, and element image nodes) as nodes, and links between these design objects as edges. For example, the links between UI screen nodes and their constituent UI element nodes are part of the edges between the respective nodes. The design objects may be associated with visual and element images, text, numerical values, and categorical labels as attributes. An example of dynamic company networks with dynamic multimodal attributes may involve relationships between company nodes that evolve across time (i.e., evolving commercial relationships between company nodes), and the company nodes may be associated with time-series of numerical stock prices, textual news, and categorical event attributes.

While there has been significant progress in the area of network or graph learning, most existing works focus neither on modeling such dynamic multimodal networks nor on modeling static networks with static or dynamic multimodal attributes.

In the first part of this dissertation, we focus on modeling networks with multimodal attributes. We develop four models that jointly capture static networks comprising different node and/or edge types with static multimodal and positional information. For model interpretability, we propose attention weight-based and learnable edge mask-based methods that enable end-users to understand and interpret the contribution of different parts of the network and information from different modalities. We show that our proposed models consistently outperform other state-of-the-art models on six datasets across an extensive set of UI prediction tasks.

Next, in the second part of the dissertation, we focus on networks with dynamic multimodal attributes. We propose two models that jointly capture static networks comprising the same or different node types with dynamic attributes, i.e., time-series attributes, from different modalities, e.g., numerical stock price-related and textual news information. These attributes may be local in nature (directly associated with specific nodes) or global in nature (relevant to multiple nodes). To address the noise inherent in multimodal time-series, we also propose knowledge-enrichment and curriculum learning methods. We show that our proposed models outperform state-of-the-art network learning and time-series models on eight datasets across an extensive set of investment and risk management tasks and applications.

In the third and final part of the dissertation, we focus on modeling dynamic networks with dynamic multimodal attributes. We propose three models that capture dynamic implicit networks and/or dynamic explicit networks. The network nodes may be associated with local or global dynamic multimodal attributes that may be of varying lengths and frequencies. To address noisy and non-stationary dynamic networks and dynamic multimodal attributes, we also propose self-supervised learning and concept learning methods. Beyond applying these models to investment and risk management tasks and applications on another four datasets, we further apply them to environmental, social, and governance rating forecasting tasks on six datasets, and demonstrate that they outperform state-of-the-art models on these tasks.

Keywords

Network learning, graph neural networks, time-series modeling, multimodality, design, finance, sustainability

Degree Awarded

PhD in Computer Science

Discipline

Computer Sciences | OS and Networks

Supervisor(s)

LIM, Ee Peng

Publisher

Singapore Management University

City or Country

Singapore

Copyright Owner and License

Author

Available for download on Tuesday, October 01, 2024
