Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
10-2020
Abstract
Extensive research has been conducted on sentiment analysis for software engineering (SA4SE). Researchers have invested much effort in developing customized tools (e.g., SentiStrength-SE, SentiCR) to classify sentiment polarity for Software Engineering (SE)-specific content (e.g., discussions in Stack Overflow and code review comments). Even so, there is still much room for improvement. Recently, pre-trained Transformer-based models (e.g., BERT, XLNet) have brought considerable breakthroughs to the field of natural language processing (NLP). In this work, we conducted a systematic evaluation of five existing SA4SE tools and variants of four state-of-the-art pre-trained Transformer-based models on six SE datasets. Our work is the first to fine-tune pre-trained Transformer-based models for the SA4SE task. Empirically, across all six datasets, our fine-tuned pre-trained Transformer-based models outperform the existing SA4SE tools by 6.5-35.6% in terms of macro/micro-averaged F1 scores.
Keywords
Natural Language Processing, Pre-trained Models, Sentiment Analysis, Software Mining
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
2020 36th IEEE International Conference on Software Maintenance and Evolution (ICSME): Sep 27 - Oct 3, Adelaide, Australia: Proceedings
First Page
70
Last Page
80
ISBN
9781728156194
Identifier
10.1109/ICSME46990.2020.00017
Publisher
IEEE
City or Country
Piscataway, NJ
Citation
ZHANG, Ting; XU, Bowen; THUNG, Ferdian; AGUS HARYONO, Stefanus; LO, David; and JIANG, Lingxiao.
Sentiment analysis for software engineering: How far can pre-trained transformer models go?. (2020). 2020 36th IEEE International Conference on Software Maintenance and Evolution (ICSME): Sep 27 - Oct 3, Adelaide, Australia: Proceedings. 70-80.
Available at: https://ink.library.smu.edu.sg/sis_research/5535
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/ICSME46990.2020.00017