Publication Type

Conference Proceeding Article

Version

Postprint

Publication Date

9-2012

Abstract

Millions of people, including those in the software engineering communities, have turned to microblogging services such as Twitter as a means to quickly disseminate information. Past studies by Treude et al., Storey, and Yuan et al. have shown that a wealth of interesting information is stored in these microblogs. However, microblogs also contain a large amount of noisy content that is less relevant to software developers engineering software systems. In this work, we perform a preliminary study to investigate the feasibility of automatically classifying microblogs into two categories: relevant and irrelevant to engineering software systems. We extract features from the textual content of the microblogs and from the titles of any URLs mentioned in them. These features are then used to learn a discriminative model that classifies microblogs as relevant or irrelevant. We show that our trained model achieves promising classification performance.
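The pipeline described in the abstract (features from the microblog text and from linked page titles, fed to a discriminative binary classifier) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the learner, so a plain perceptron stands in for it, and the function names, the `text:`/`title:` feature prefixes, and the toy examples are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_features(tweet_text, url_titles=()):
    """Bag-of-words counts from the microblog text and from linked page titles.

    The prefixes keep the two feature sources separate (an assumption made
    for this sketch, not a detail taken from the paper).
    """
    feats = defaultdict(float)
    for tok in tokenize(tweet_text):
        feats["text:" + tok] += 1.0
    for title in url_titles:
        for tok in tokenize(title):
            feats["title:" + tok] += 1.0
    return dict(feats)


def train_perceptron(examples, epochs=10):
    """Train a perceptron (a stand-in discriminative model).

    examples: list of (features, label), label +1 = relevant, -1 = irrelevant.
    """
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for feats, label in examples:
            score = bias + sum(weights[f] * v for f, v in feats.items())
            if label * score <= 0:  # misclassified (or undecided): update
                for f, v in feats.items():
                    weights[f] += label * v
                bias += label
    return dict(weights), bias


def predict(model, feats):
    """Return +1 (relevant) or -1 (irrelevant) for a feature dictionary."""
    weights, bias = model
    score = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return 1 if score > 0 else -1


# Made-up examples for illustration only: (text, linked page titles, label).
TOY_DATA = [
    ("Fixed a null pointer bug in the parser", ["Commit: fix parser crash"], 1),
    ("New release of our build tool with faster compile", ["Release notes v2"], 1),
    ("Great coffee this morning", [], -1),
    ("Watching the football game tonight", ["Match preview"], -1),
]

model = train_perceptron([(extract_features(t, u), y) for t, u, y in TOY_DATA])
```

Calling `predict(model, extract_features(text, titles))` then labels a new microblog. A real study would need a labeled corpus and a held-out evaluation; the four toy examples here only show the data flow.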

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

ICSM 2012: Proceedings of the 28th IEEE International Conference on Software Maintenance: Riva Del Garda, Trento, Italy: 23-28 September 2012

First Page

67

Last Page

76

ISBN

9781467323130

Identifier

10.1109/ICSM.2012.6405255

Publisher

IEEE

City or Country

Piscataway, NJ

Copyright Owner and License

Authors

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Additional URL

http://doi.org/10.1109/ICSM.2012.6405255
