Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

12-2022

Abstract

To facilitate conversational question answering (CQA) over hybrid contexts in finance, we present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text. A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA. In addition, we propose a novel method, namely UniPCQA, to adapt a hybrid format of input and output content in PCQA into the Seq2Seq problem, including the reformulation of the numerical reasoning process as code generation. UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy to alleviate the error propagation issue in multi-task learning by cross-validating top-k sampled Seq2Seq outputs. We benchmark the PACIFIC dataset with extensive baselines and provide comprehensive evaluations on each sub-task of PCQA.

Keywords

Hybrid formats, Key feature, Multitask learning, Novel methods, Numerical reasoning, Proactivity, Question Answering, Subtask, Tabular data, Textual data

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering; Information Systems and Management

Areas of Excellence

Digital transformation

Publication

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, December 7-11

First Page

6970

Last Page

6984

Identifier

10.18653/v1/2022.emnlp-main.469

Publisher

Association for Computational Linguistics

City or Country

Texas

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.18653/v1/2022.emnlp-main.469
