Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
5-2022
Abstract
A recent study by Feldman (2020) proposed a long-tail theory to explain the memorization behavior of deep learning models. However, memorization has not been empirically verified in the context of NLP, a gap addressed by this work. In this paper, we use three different NLP tasks to check whether the long-tail theory holds. Our experiments demonstrate that top-ranked memorized training instances are likely atypical, and that removing the top-memorized training instances leads to a larger drop in test accuracy than removing training instances at random. Furthermore, we develop an attribution method to better understand why a training instance is memorized. We empirically show that our memorization attribution method is faithful, and we find that the top-memorized parts of a training instance tend to be features negatively correlated with the class label.
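For readers unfamiliar with the memorization score the abstract refers to, the sketch below illustrates Feldman's (2020) leave-one-out definition: an example's memorization is the gap between how often models trained with it predict its label correctly and how often models trained without it do. This is an illustrative Monte-Carlo sketch only, not the paper's implementation; `train_model` and `predict` are hypothetical placeholders supplied by the caller.

import random

def memorization_score(train_set, i, train_model, predict, n_trials=10):
    """Estimate mem(A, S, i) = Pr[h(x_i) = y_i | h ~ A(S)] - Pr[h(x_i) = y_i | h ~ A(S \\ {i})]."""
    x_i, y_i = train_set[i]
    held_out = train_set[:i] + train_set[i + 1:]  # training set with example i removed
    with_i, without_i = 0.0, 0.0
    for _ in range(n_trials):
        # Model trained on the full training set (includes example i).
        h_full = train_model(train_set, seed=random.randrange(10**6))
        with_i += float(predict(h_full, x_i) == y_i)
        # Model trained with example i left out.
        h_loo = train_model(held_out, seed=random.randrange(10**6))
        without_i += float(predict(h_loo, x_i) == y_i)
    return (with_i - without_i) / n_trials

Ranking training instances by this score is what "top-memorized" refers to above; in practice Feldman-style estimators use subsampled training sets rather than exact leave-one-out retraining, which the sketch simplifies away.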
Keywords
Natural language processing
Discipline
Computer Engineering
Research Areas
Data Science and Engineering
Publication
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 2022 May 22 - 27
First Page
6265
Last Page
6278
Identifier
10.18653/v1/2022.acl-long.434
City or Country
Dublin, Ireland
Citation
ZHENG, Xiaosen and JIANG, Jing.
An empirical study of memorization in NLP. (2022). Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 2022 May 22 - 27. 6265-6278.
Available at: https://ink.library.smu.edu.sg/sis_research/7705
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2022.acl-long.434