Publication Type
Journal Article
Version
acceptedVersion
Publication Date
2-2022
Abstract
Stack Overflow hosts valuable programming-related knowledge, including 11,926,354 links that reference third-party websites. Links to resources hosted outside Stack Overflow extend its knowledge base substantially. However, with the rapid development of programming-related knowledge, many resources hosted on the Internet are no longer available. Based on our analysis of the Stack Overflow data released on Jun. 2, 2019, 14.2 percent of the links on Stack Overflow are broken. Broken links on Stack Overflow can prevent viewers from obtaining the programming-related knowledge they seek, and can potentially damage the reputation of Stack Overflow, as viewers might regard posts with broken links as obsolete. In this paper, we characterize the broken links on Stack Overflow. 65 percent of the broken links in our sampled questions are used to show examples, e.g., code examples. 70 percent of the broken links in our sampled answers are used to provide supporting information, e.g., explaining a certain concept or describing a step to solve a problem. Only 1.67 percent of the posts with broken links are highlighted as such by viewers in the posts’ comments. In only 5.8 percent of the posts with broken links were the broken links removed. Viewers cannot fully rely on vote scores to detect broken links, as broken links are common across posts with different vote scores. Websites whose resources can be maintained by their users are the most frequent targets of broken links on Stack Overflow; a prominent example is GitHub. Posts and comments related to web technologies, i.e., JavaScript, HTML, CSS, and jQuery, are associated with more broken links. Based on our findings, we shed light on future directions and provide recommendations for practitioners and researchers.
Keywords
Empirical Software Engineering, Stack Overflow, Broken Link
Discipline
Programming Languages and Compilers | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
IEEE Transactions on Software Engineering
Volume
48
Issue
9
First Page
3242
Last Page
3267
ISSN
0098-5589
Identifier
10.1109/TSE.2021.3086494
Publisher
Institute of Electrical and Electronics Engineers
Citation
LIU, Jiakun; XIA, Xin; LO, David; ZHANG, Haoxiang; ZOU, Ying; HASSAN, Ahmed E.; and LI, Shanping.
Broken external links on Stack Overflow. (2022). IEEE Transactions on Software Engineering. 48, (9), 3242-3267.
Available at: https://ink.library.smu.edu.sg/sis_research/7647
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TSE.2021.3086494