Publication Type
Journal Article
Version
acceptedVersion
Publication Date
1-2020
Abstract
Informal language and the absence of a standard taxonomy for software technologies make it difficult to reliably analyze technology trends on discussion forums and other on-line venues. We propose an automated approach called Witt for the categorization of software technologies (an expanded version of the hypernym discovery problem). Witt takes as input a phrase describing a software technology or concept and returns a general category that describes it (e.g., integrated development environment), along with attributes that further qualify it (commercial, php, etc.). By extension, the approach enables the dynamic creation of lists of all technologies of a given type (e.g., web application frameworks). Our approach relies on Stack Overflow and Wikipedia, and involves numerous original domain adaptations and a new solution to the problem of normalizing automatically-detected hypernyms. We compared Witt with six independent taxonomy tools and found that, when applied to software terms, Witt demonstrated better coverage than all evaluated alternative solutions, without a corresponding degradation in false positive rate.
Keywords
Software, Encyclopedias, Electronic publishing, Internet, Taxonomy, Tools
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
IEEE Transactions on Software Engineering
Volume
46
Issue
1
First Page
20
Last Page
32
ISSN
0098-5589
Identifier
10.1109/TSE.2018.2836450
Publisher
Institute of Electrical and Electronics Engineers
Citation
NASSIF, Mathieu; TREUDE, Christoph; and ROBILLARD, Martin P.
Automatically categorizing software technologies. (2020). IEEE Transactions on Software Engineering. 46, (1), 20-32.
Available at: https://ink.library.smu.edu.sg/sis_research/8784
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TSE.2018.2836450