Publication Type
Journal Article
Version
acceptedVersion
Publication Date
2-2025
Abstract
Context: Software development creates and relies on a large volume of information, yet the volume of this information can make it challenging for developers to maintain an overview of all goings-on that a team and external actors contribute to a project. We posit that unexpected or “surprising” events could serve as important signposts amidst this information overload. These unexpected events may indicate underlying anomalies or emergent situations that require immediate attention. To explore this premise, our study leverages the concept of ‘surprisal’ from information theory to identify and quantify these unusual occurrences from the issues and pull requests of popular open-source software repositories.
Objective: Drawing from a previously published research protocol, our study investigates whether a correlation exists between the ‘surprisal’ of issues and their perceived importance or difficulty within software repositories.
Results: We performed a comprehensive analysis of approximately two million issues and pull requests, gathered from 1,270 repositories. Their ‘surprisal’ was then examined in relation to several indicative metrics of difficulty and perceived importance. Our results indicate only a weak correlation. This outcome underscores the need for further research to devise more effective strategies for helping developers prioritise issues.
Keywords
GitHub issues, n-gram, Self-information
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
Empirical Software Engineering
Volume
30
Issue
1
First Page
1
Last Page
34
ISSN
1382-3256
Identifier
10.1007/s10664-024-10587-w
Publisher
Springer
Citation
CADDY, James; TREUDE, Christoph; WAGNER, Markus; and BARR, Earl T.
The role of surprisal in issue trackers. (2025). Empirical Software Engineering. 30, (1), 1-34.
Available at: https://ink.library.smu.edu.sg/sis_research/9845
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1007/s10664-024-10587-w