Publication Type

Journal Article

Version

acceptedVersion

Publication Date

2-2022

Abstract

Stack Overflow hosts valuable programming-related knowledge with 11,926,354 links that reference to the third-party websites. The links that reference to the resources hosted outside the Stack Overflow websites extend the Stack Overflow knowledge base substantially. However, with the rapid development of programming-related knowledge, many resources hosted on the Internet are not available anymore. Based on our analysis of the Stack Overflow data that was released on Jun. 2, 2019, 14.2 percent of the links on Stack Overflow are broken links. The broken links on Stack Overflow can obstruct viewers from obtaining desired programming-related knowledge, and potentially damage the reputation of the Stack Overflow as viewers might regard the posts with broken links as obsolete. In this paper, we characterize the broken links on Stack Overflow. 65 percent of the broken links in our sampled questions are used to show examples, e.g., code examples. 70 percent of the broken links in our sampled answers are used to provide supporting information, e.g., explaining a certain concept and describing a step to solve a problem. Only 1.67 percent of the posts with broken links are highlighted as such by viewers in the posts’ comments. Only 5.8 percent of the posts with broken links removed the broken links. Viewers cannot fully rely on the vote scores to detect broken links, as broken links are common across posts with different vote scores. The websites that host resources that can be maintained by their users are referenced by broken links the most on Stack Overflow – a prominent example of such websites is GitHub. The posts and comments related to the web technologies, i.e., JavaScript, HTML, CSS, and jQuery, are associated with more broken links. Based on our findings, we shed lights for future directions and provide recommendations for practitioners and researchers.

Keywords

Empirical Software Engineering, Stack Overflow, Broken Link

Discipline

Programming Languages and Compilers | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

IEEE Transactions on Software Engineering

Volume

48

Issue

9

First Page

3242

Last Page

3267

ISSN

0098-5589

Identifier

10.1109/TSE.2021.3086494

Publisher

Institute of Electrical and Electronics Engineers

Additional URL

https://doi.org/10.1109/TSE.2021.3086494

Share

COinS