Publication Type

Journal Article

Version

acceptedVersion

Publication Date

1-2024

Abstract

Existing task-oriented conversational systems rely heavily on domain ontologies with pre-defined slots and candidate values. In practical settings, these prerequisites are hard to meet because new user requirements keep emerging and scenarios keep changing. To mitigate these issues and improve interaction performance, prior efforts have worked towards detecting out-of-vocabulary values or discovering new slots under unsupervised or semi-supervised learning paradigms. However, relying on conversation data patterns alone leads these methods to yield noisy and arbitrary slot results. For practical utility, real-world systems tend to provide a strictly limited human labeling quota, which offers an authoritative way to obtain accurate and meaningful slot assignments. This in turn requires that the quota be used efficiently. Hence, we formulate a general new slot discovery task in an information extraction fashion and incorporate it into an active learning framework to realize human-in-the-loop learning. Specifically, we leverage existing language tools to extract value candidates, and the corresponding labels are further used as weak supervision signals. Based on these, we propose a bi-criteria selection scheme that incorporates two major strategies, namely uncertainty-based and diversity-based sampling, to efficiently identify terms of interest. We conduct extensive experiments on several public datasets and compare with a range of competitive baselines to demonstrate the effectiveness of our method.
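
The abstract names the two selection criteria but does not give the scoring rule, so the following is only a generic sketch of how uncertainty-based and diversity-based sampling can be combined under a fixed labeling quota. It is not the authors' method. The function `select_batch`, the weight `alpha`, the use of entropy as the uncertainty measure, and the use of cosine distance as the diversity measure are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def select_batch(probs, embeddings, budget, alpha=0.5):
    """Greedily pick up to `budget` candidate indices for human labeling.

    Hypothetical scoring (an assumption, not taken from the paper):
        score = alpha * normalized_entropy
                + (1 - alpha) * distance_to_nearest_selected
    probs[i] is the model's label distribution for candidate i, and
    embeddings[i] is any vector representation of that candidate.
    """
    if not probs:
        return []
    num_labels = len(probs[0])
    max_entropy = math.log(num_labels) if num_labels > 1 else 1.0
    uncertainty = [entropy(p) / max_entropy for p in probs]

    selected = []
    remaining = set(range(len(probs)))
    while remaining and len(selected) < budget:

        def score(i):
            if selected:
                diversity = min(
                    cosine_distance(embeddings[i], embeddings[j])
                    for j in selected
                )
            else:
                diversity = 1.0  # nothing chosen yet, so all are equally novel
            return alpha * uncertainty[i] + (1 - alpha) * diversity

        # Iterate in index order so ties resolve to the lowest index.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `alpha=1.0` the sketch reduces to pure uncertainty sampling, which can spend the quota on near-duplicate candidates. Lowering `alpha` penalizes candidates that lie close to ones already chosen.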

Keywords

active learning, Information retrieval, Labeling, language processing, New slot discovery, Noise measurement, Ontologies, Redundancy, Task analysis, task-oriented conversation, Uncertainty

Discipline

Artificial Intelligence and Robotics | Theory and Algorithms

Publication

IEEE/ACM Transactions on Audio, Speech and Language Processing

First Page

1

Last Page

11

ISSN

2329-9290

Identifier

10.1109/TASLP.2024.3374060

Publisher

Association for Computing Machinery (ACM)

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1109/TASLP.2024.3374060
