Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
9-2021
Abstract
Pre-trained language models have been widely adopted as backbones for various natural language processing tasks. However, existing pre-trained language models ignore descriptive meta-information in the text, such as the distinction between the title and the main body, leading to over-weighted attention to insignificant text. In this paper, we propose a hypernetwork-based architecture to model the descriptive meta-information and integrate it into pre-trained language models. Evaluations on three natural language processing tasks show that our method notably improves the performance of pre-trained language models and achieves state-of-the-art results on keyphrase extraction.
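To illustrate the general idea behind the abstract, the sketch below shows one common way a hypernetwork can inject meta-information into a pre-trained encoder: a small network maps a per-token meta-information tag (e.g., title vs. main body) to scale and bias vectors that modulate the encoder's hidden states. This is a minimal hypothetical sketch of the hypernetwork concept only, not the architecture from the paper; the class name, the two-tag scheme, and the scale/bias parameterization are all assumptions for illustration.

import torch
import torch.nn as nn

class MetaHyperNetwork(nn.Module):
    """Hypothetical sketch: a tiny hypernetwork that maps a descriptive
    meta-information tag (0 = title, 1 = main body) to per-token scale
    and bias vectors that modulate a pre-trained LM's hidden states.
    Illustrates the general hypernetwork idea, not the paper's model."""

    def __init__(self, num_meta_types: int, hidden_size: int):
        super().__init__()
        self.meta_embed = nn.Embedding(num_meta_types, hidden_size)
        # Hypernetwork head: generates (scale, bias) from the meta embedding.
        self.to_scale = nn.Linear(hidden_size, hidden_size)
        self.to_bias = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor, meta_ids: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden) from a pre-trained encoder
        # meta_ids:      (batch, seq_len) meta-information tag per token
        meta = self.meta_embed(meta_ids)            # (B, T, H)
        scale = 1.0 + self.to_scale(meta)           # centered at identity
        bias = self.to_bias(meta)
        return scale * hidden_states + bias

# Usage sketch: modulate BERT-sized hidden states by title/body tags.
batch, seq_len, hidden = 2, 16, 768
states = torch.randn(batch, seq_len, hidden)
meta_ids = torch.cat([torch.zeros(batch, 4, dtype=torch.long),    # title tokens
                      torch.ones(batch, 12, dtype=torch.long)], dim=1)  # body tokens
modulated = MetaHyperNetwork(num_meta_types=2, hidden_size=hidden)(states, meta_ids)
print(modulated.shape)  # torch.Size([2, 16, 768])

The scale is initialized around the identity so that, before training, the modulation leaves the pre-trained representations essentially unchanged, a common design choice when adapting a frozen or fine-tuned backbone.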
Keywords
descriptive meta-information, hypernetworks, pre-trained language models
Discipline
Programming Languages and Compilers | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
Proceedings of the Annual Conference of the International Speech Communication Association, Brno, Czechia, August 30 - September 3, 2021
First Page
3216
Last Page
3220
Identifier
10.21437/Interspeech.2021-229
City or Country
Brno, Czechia
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.