Publication Type
Journal Article
Version
acceptedVersion
Publication Date
3-2025
Abstract
Natural language (NL)-driven table discovery identifies relevant tables from large table repositories based on NL queries. While current deep-learning-based methods using the traditional dense vector search pipeline, i.e., representation-index-search, achieve remarkable accuracy, they face several limitations that impede further performance improvements: (i) the errors accumulated during the table representation and indexing phases affect the subsequent search accuracy; and (ii) insufficient query-table interaction hinders effective semantic alignment. In this paper, we propose a novel framework, Birdie, that uses a differentiable search index. It unifies indexing and search into a single encoder-decoder language model, thus eliminating error accumulation. Birdie first assigns each table a prefix-aware identifier and leverages a large language model-based query generator to create synthetic queries for each table. It then encodes the mapping between synthetic queries/tables and their corresponding table identifiers into the parameters of an encoder-decoder language model, enabling deep query-table interactions. During search, the trained model directly generates table identifiers for a given query. To accommodate the continual indexing of dynamic tables, we introduce an index update strategy via parameter isolation, which mitigates the issue of catastrophic forgetting. Extensive experiments demonstrate that Birdie outperforms state-of-the-art dense methods by 16.8% in accuracy, and reduces forgetting by over 90% compared to other continual learning approaches.
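The search step described in the abstract, in which a model generates a table identifier token by token, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how identifiers are encoded or whether decoding is constrained. The sketch assumes identifiers are fixed-length integer sequences stored in a prefix tree, so that only tokens continuing a valid identifier are considered at each step. The names `build_trie`, `allowed_next`, `greedy_decode`, and the `score` callback (a stand-in for the language model's next-token scores) are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

# A nested dict used as a prefix tree: token -> subtree.
Trie = Dict[int, "Trie"]


def build_trie(identifiers: Iterable[Sequence[int]]) -> Trie:
    """Insert every identifier (a sequence of integer tokens) into a trie."""
    root: Trie = {}
    for ident in identifiers:
        node = root
        for tok in ident:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: Trie, prefix: Sequence[int]) -> List[int]:
    """Return the tokens that extend `prefix` toward some stored identifier."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []  # prefix does not match any identifier
        node = node[tok]
    return sorted(node)


def greedy_decode(
    trie: Trie, score: Callable[[Sequence[int], int], float]
) -> Tuple[int, ...]:
    """Pick the best-scoring valid token at each step until a leaf is reached.

    Assumes identifiers are prefix-free (e.g. all the same length), so that
    reaching a leaf means a complete identifier has been generated.
    """
    prefix: List[int] = []
    while True:
        options = allowed_next(trie, prefix)
        if not options:
            return tuple(prefix)
        prefix.append(max(options, key=lambda t: score(prefix, t)))


# Hypothetical identifiers: tables sharing a first token are assumed related.
table_ids = {"sales_2023": (0, 1), "sales_2024": (0, 2), "weather": (1, 0)}
trie = build_trie(table_ids.values())
```

In a real system the `score` callback would be replaced by the decoder's logits for the given query, and beam search would typically replace the greedy loop to return a ranked list of tables.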
Discipline
Programming Languages and Compilers
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the VLDB Endowment
Volume
18
Issue
7
First Page
2070
Last Page
2083
ISSN
2150-8097
Identifier
10.14778/3734839.3734845
Publisher
VLDB Endowment
Citation
GUO, Yuxiang; HU, Zhonghao; MAO, Yuren; ZHENG, Baihua; GAO, Yunjun; and ZHOU, Mingwei.
Birdie: Natural language-driven table discovery using differentiable search index. (2025). Proceedings of the VLDB Endowment. 18, (7), 2070-2083.
Available at: https://ink.library.smu.edu.sg/sis_research/10354
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.14778/3734839.3734845