Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

5-2022

Abstract

Software developers often use social media (such as Twitter) to share programming knowledge such as new tools, sample code snippets, and tips on programming. One of the topics they talk about is the software library. The tweets may contain useful information about a library. A good understanding of this information, e.g., on the developer’s views regarding a library can be beneficial to weigh the pros and cons of using the library as well as the general sentiments towards the library. However, it is not trivial to recognize whether a word actually refers to a library or other meanings. For example, a tweet mentioning the word “pandas” may refer to the Python pandas library or to the animal. In this work, we created the first benchmark dataset and investigated the task to distinguish whether a tweet refers to a programming library or something else. Recently, the pre-trained Transformer models (PTMs) have achieved great success in the fields of natural language processing and computer vision. Therefore, we extensively evaluated a broad set of modern PTMs, including both general-purpose and domain-specific ones, to solve this programming library recognition task in tweets. Experimental results show that the use of PTM can outperform the best-performing baseline methods by 5% - 12% in terms of F1-score under within-, cross-, and mixed-library settings.

Keywords

Software libraries, Tweets, Disambiguation, Benchmark study

Discipline

Artificial Intelligence and Robotics | Computer and Systems Architecture | Data Storage Systems | Information Security | Software Engineering

Research Areas

Data Science and Engineering; Cybersecurity; Intelligent Systems and Optimization; Software and Cyber-Physical Systems

Publication

Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, ICPC 2022, Virtual Event, May 16-17, 2022

First Page

343

Last Page

353

ISBN

9781450392983

Identifier

10.1145/3524610.3527916

Publisher

IEEE

City or Country

Virtual Event

Citation

ZHANG, Ting; CHANDRASEKARAN, Divya Prabha; THUNG, Ferdian; and LO, David. Benchmarking library recognition in tweets. (2022). Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, ICPC 2022, Virtual Event, May 16-17, 2022. 343-353.
Available at: https://ink.library.smu.edu.sg/sis_research/7632

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1145/3524610.3527916

Included in

Artificial Intelligence and Robotics Commons, Computer and Systems Architecture Commons, Data Storage Systems Commons, Information Security Commons, Software Engineering Commons

Research Collection School Of Computing and Information Systems

Benchmarking library recognition in tweets
