Publication Type

Journal Article

Version

acceptedVersion

Publication Date

1-2021

Abstract

Many AI researchers publish the code, data, and other resources that accompany their papers in GitHub repositories. In this paper, we refer to these repositories as academic AI repositories. Our preliminary study shows that highly cited papers are more likely to have popular academic AI repositories (and vice versa). Hence, we perform an empirical study of academic AI repositories to highlight the good software engineering practices of popular ones for AI researchers. We collect 1,149 academic AI repositories, label the top 20% of repositories by number of stars as popular, and label the bottom 70% as unpopular. The remaining 10% serve as a gap between the popular and unpopular groups. We propose 21 features to characterize the software engineering practices of academic AI repositories. Our experimental results show that popular and unpopular academic AI repositories differ statistically significantly in 11 of the studied features, indicating that the two groups follow significantly different software engineering practices. Furthermore, we find that the number of links to other GitHub repositories in the README file, the number of images in the README file, and the inclusion of a license are the most important features for differentiating the two groups. Our dataset and code are publicly available to share with the community.
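
The popular/gap/unpopular split described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name label_by_stars, the rounding of group sizes, and the handling of ties (repositories with equal stars keep their input order) are all assumptions made for illustration only.

```python
from typing import Dict


def label_by_stars(
    stars: Dict[str, int], top: float = 0.20, bottom: float = 0.70
) -> Dict[str, str]:
    """Label repositories by star rank (hypothetical helper, not from the paper).

    The `top` fraction with the most stars is "popular", the `bottom`
    fraction with the fewest stars is "unpopular", and whatever lies
    between the two is the "gap".
    """
    if top < 0 or bottom < 0 or top + bottom > 1:
        raise ValueError("top and bottom must be non-negative and sum to at most 1")

    # Rank repository names from most to fewest stars (ties keep input order).
    ranked = sorted(stars, key=lambda name: stars[name], reverse=True)
    n = len(ranked)

    # Assumption: group sizes are rounded to the nearest whole repository.
    n_top = round(n * top)
    n_bottom = round(n * bottom)

    labels: Dict[str, str] = {}
    for rank, name in enumerate(ranked):
        if rank < n_top:
            labels[name] = "popular"
        elif rank >= n - n_bottom:
            labels[name] = "unpopular"
        else:
            labels[name] = "gap"
    return labels
```

Leaving a gap between the two groups keeps repositories with nearly identical star counts from landing on opposite sides of the popular/unpopular boundary.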

Keywords

Academic AI repository, Software popularity, Mining software repositories

Discipline

Artificial Intelligence and Robotics | Software Engineering

Research Areas

Data Science and Engineering

Publication

Empirical Software Engineering

Volume

26

Issue

2

First Page

1

Last Page

35

ISSN

1382-3256

Identifier

10.1007/s10664-020-09916-6

Publisher

Springer Verlag (Germany)

Citation

FAN, Yuanrui; XIA, Xin; LO, David; HASSAN, Ahmed E.; and LI, Shanping. What makes a popular academic AI repository? (2021). Empirical Software Engineering. 26, (2), 1-35.
Available at: https://ink.library.smu.edu.sg/sis_research/6713

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Included in

Artificial Intelligence and Robotics Commons, Software Engineering Commons

Research Collection School Of Computing and Information Systems

What makes a popular academic AI repository?
