Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2022
Abstract
To facilitate conversational question answering (CQA) over hybrid contexts in finance, we present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text. A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA. In addition, we propose a novel method, namely UniPCQA, to adapt the hybrid format of input and output content in PCQA to a Seq2Seq problem, including the reformulation of the numerical reasoning process as code generation. UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy to alleviate error propagation in multi-task learning by cross-validating top-k sampled Seq2Seq outputs. We benchmark the PACIFIC dataset with extensive baselines and provide comprehensive evaluations on each sub-task of PCQA.
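The two mechanisms named in the abstract, numerical reasoning expressed as generated code and cross-validation of top-k sampled outputs, can be illustrated with a minimal sketch. This is not UniPCQA's implementation: it assumes each sampled output is a plain arithmetic expression string, and it swaps in simple majority voting over the executed results as one possible way to cross-validate. The function names `evaluate_expression` and `vote_on_samples` and the sample strings are hypothetical.

```python
import ast
import operator
from collections import Counter

# Arithmetic operators the sketch is willing to execute (an assumption;
# the abstract does not specify the form of the generated code).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_expression(expr):
    """Evaluate an arithmetic expression string; return None if it is invalid."""

    def _eval(node):
        # Numeric literal (bool is excluded even though it subclasses int).
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        # Binary operation with a supported operator.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary minus, e.g. "-5".
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    try:
        return _eval(ast.parse(expr, mode="eval").body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


def vote_on_samples(samples, ndigits=4):
    """Execute each sampled expression and return the most common valid result."""
    results = [evaluate_expression(s) for s in samples]
    valid = [round(r, ndigits) for r in results if r is not None]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]


# Hypothetical top-k samples for a percentage-change style question.
samples = ["(120 - 100) / 100", "(120 - 100) / 100", "120 / 100", "(120 - 100 /"]
answer = vote_on_samples(samples)
```

In this sketch the malformed sample is discarded because it fails to parse, and the remaining results are compared so that a single faulty derivation is outvoted by the agreeing ones.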
Keywords
Hybrid formats, Key feature, Multitask learning, Novel methods, Numerical reasoning, Proactivity, Question Answering, Subtask, Tabular data, Textual data
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Research Areas
Data Science and Engineering; Information Systems and Management
Areas of Excellence
Digital transformation
Publication
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, December 7-11
First Page
6970
Last Page
6984
Identifier
10.18653/v1/2022.emnlp-main.469
Publisher
Association for Computational Linguistics
City or Country
Texas
Citation
DENG, Yang; LEI, Wenqiang; ZHANG, Wenxuan; LAM, Wai; and CHUA, Tat-Seng.
PACIFIC: Towards proactive conversational question answering over tabular and textual data in finance. (2022). Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, December 7-11. 6970-6984.
Available at: https://ink.library.smu.edu.sg/sis_research/9139
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2022.emnlp-main.469
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons