Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

8-2025

Abstract

Understanding molecular structure and related knowledge is crucial for scientific research. Recent studies integrate molecular graphs with their textual descriptions to enhance molecular representation learning. However, they focus on the whole molecular graph and neglect frequently occurring subgraphs, known as motifs, which are essential for determining molecular properties. Without such fine-grained knowledge, these models struggle to generalize to unseen molecules and tasks that require motif-level insights. To bridge this gap, we propose FineMolTex, a novel Fine-grained Molecular graph-Text pre-training framework to jointly learn coarse-grained molecule-level knowledge and fine-grained motif-level knowledge. Specifically, FineMolTex consists of two pre-training tasks: a contrastive alignment task for coarse-grained matching and a masked multi-modal modeling task for fine-grained matching. In particular, the latter predicts the labels of masked motifs and words, which are selected based on their importance. By leveraging insights from both modalities, FineMolTex is able to understand the fine-grained matching between motifs and words. Finally, we conduct extensive experiments across three downstream tasks, achieving up to 230% improvement in the text-based molecule editing task. Additionally, our case studies reveal that FineMolTex successfully captures fine-grained knowledge, potentially offering valuable insights for drug discovery and catalyst design.

Keywords

Graph Neural Networks, Molecular Graph Pre-training

Discipline

Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

KDD '25: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, Toronto, Canada, August 3-7

Volume

2

First Page

1589

Last Page

1599

Identifier

10.1145/3711896.3736834

Publisher

ACM

City or Country

New York

Additional URL

https://doi.org/10.1145/3711896.3736834
