Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

4-2024

Abstract

While computer science papers frequently include their associated code repositories, establishing a clear link between papers and their corresponding implementations may be challenging due to the number of code repositories used in research publications. In this paper, we describe a lightweight method for effectively identifying bidirectional links between papers and repositories from both LaTeX and PDF sources. We have used our approach to analyze more than 14,000 PDF and LaTeX files in the Software Engineering category of arXiv, generating a dataset of more than 1,400 paper-code implementations and assessing current citation practices on it.

Keywords

Research software, article analysis, software citation, Open Science

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

Proceedings of the 21st International Conference on Mining Software Repositories, Lisbon, Portugal, 2024 April 15-16

First Page

1

Last Page

5

Identifier

10.1145/3643991.3644876

Publisher

ACM

City or Country

New York

Additional URL

https://doi.org/10.1145/3643991.3644876
