Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

9-2021

Abstract

Pre-trained language models have been widely adopted as backbones in various natural language processing tasks. However, existing pre-trained language models ignore the descriptive meta-information in the text, such as the distinction between the title and the main body, leading to over-weighted attention to insignificant text. In this paper, we propose a hypernetwork-based architecture to model the descriptive meta-information and integrate it into pre-trained language models. Evaluations on three natural language processing tasks show that our method notably improves the performance of pre-trained language models and achieves state-of-the-art results on keyphrase extraction.
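
To make the idea concrete, the following is a minimal PyTorch sketch of the general mechanism the abstract describes: a small hypernetwork maps a meta-information type embedding (e.g., title vs. main body) to per-segment parameters that modulate token representations from a pre-trained encoder. The class name, the affine (scale, shift) parameterization, and the exact wiring are illustrative assumptions for exposition, not the paper's actual architecture.

    # Illustrative sketch only; the (scale, shift) modulation scheme is an
    # assumption, not the architecture proposed in the paper.
    import torch
    import torch.nn as nn

    class MetaHypernetwork(nn.Module):
        """Generates per-token affine parameters from a meta-information type."""
        def __init__(self, num_meta_types: int, hidden_size: int):
            super().__init__()
            self.meta_embedding = nn.Embedding(num_meta_types, hidden_size)
            # Hypernetwork: maps each meta embedding to a (scale, shift) pair.
            self.to_params = nn.Linear(hidden_size, 2 * hidden_size)

        def forward(self, token_states: torch.Tensor, meta_ids: torch.Tensor) -> torch.Tensor:
            # token_states: (batch, seq_len, hidden); meta_ids: (batch, seq_len)
            params = self.to_params(self.meta_embedding(meta_ids))
            scale, shift = params.chunk(2, dim=-1)
            # Modulate each token by the parameters generated for its segment type.
            return token_states * (1 + scale) + shift

    # Usage: meta_ids mark each token as title (0) or main body (1).
    hyper = MetaHypernetwork(num_meta_types=2, hidden_size=768)
    states = torch.randn(1, 6, 768)            # e.g., BERT-sized hidden states
    meta_ids = torch.tensor([[0, 0, 1, 1, 1, 1]])
    out = hyper(states, meta_ids)              # same shape: (1, 6, 768)

In this sketch, title tokens and body tokens receive different learned modulations, which is one plausible way to keep attention from over-weighting insignificant text.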

Keywords

descriptive meta-information, hypernetworks, pre-trained language models

Discipline

Programming Languages and Compilers | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2021), Brno, Czechia, August 30 - September 3, 2021

First Page

3216

Last Page

3220

Identifier

10.21437/Interspeech.2021-229

City or Country

Brno, Czechia
