Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
5-2023
Abstract
Although information theory has found success in other disciplines, the literature on its applications to software evolution is limited. We are still missing artifacts that leverage the available data and tooling to measure how the information content of a project can serve as a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply them across the commit history of 25 open source projects. We produce evidence that the two measures are generally highly correlated. We also observe that they display weak and unstable correlations with other complexity metrics. Our preliminary investigation of outliers shows an unexpectedly high frequency of events in which the information content of the project changes considerably, suggesting that such outliers may inform a definition of surprisal.
Keywords
entropy, information theory, software engineering
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
Proceedings of the 2nd Workshop on Natural Language-based Software Engineering, 2023 May 20
First Page
48
Last Page
55
ISBN
9798350301786
Identifier
10.1109/NLBSE59153.2023.00017
Publisher
IEEE
City or Country
Los Alamitos, CA
Citation
TORRES, Adriano; BALTES, Sebastian; TREUDE, Christoph; and WAGNER, Markus.
Applying information theory to software evolution. (2023). Proceedings of the 2nd Workshop on Natural Language-based Software Engineering, 2023 May 20. 48-55.
Available at: https://ink.library.smu.edu.sg/sis_research/8893
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.