Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

8-2008

Abstract

Software specifications are often missing, incomplete, or outdated in industry. Missing and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension accounts for up to 45% of software development costs. One root cause of this high cost is the lack of documented specifications. Outdated and incomplete specifications can also cause bugs and compatibility issues. In this paper, we describe novel data mining techniques to mine, or reverse engineer, these specifications from the pool of software engineering data. A large amount of software data is available for analysis. One form of software data is program execution traces. A program trace can be viewed as a sequence of events collected when a program is run. A set of program traces, in turn, can be viewed as a sequence database. We present work on mining software specifications using novel pattern mining and rule mining techniques. Performance studies show the scalability of our technique. Case studies on traces of a real industrial application show the utility of our technique in recovering program specifications from execution traces.

Discipline

Software Engineering

Research Areas

Software Systems

Publication

Proceedings of the 34th International Conference on Very Large Data Bases (VLDB) 2008, August 23-28, Auckland (PhD workshop)

First Page

1609

Last Page

1616

ISBN

9781605583068

Identifier

10.14778/1454159.1454234

Publisher

VLDB Endowment

City or Country

Stanford, CA

Additional URL

https://doi.org/10.14778/1454159.1454234
