Mining Patterns and Rules for Software Specification Discovery
Software specifications in industry are often missing, incomplete, or outdated. Missing and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension accounts for up to 45% of software development costs, and one root cause of this high cost is the lack of documented specifications. Outdated and incomplete specifications can also cause bugs and compatibility issues. In this paper, we describe novel data mining techniques to mine, or reverse engineer, these specifications from the pool of software engineering data. A large amount of software data is available for analysis. One form of such data is program execution traces. A program trace can be viewed as a sequence of events collected when a program is run, and a set of program traces can in turn be viewed as a sequence database. We present novel work on mining software specifications using pattern mining and rule mining techniques. Performance studies show the scalability of our technique. Case studies on traces of a real industrial application show the utility of our technique in recovering program specifications from execution traces.
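To make the view of traces as a sequence database concrete, the following is a minimal sketch in Python. It is not the mining algorithm of the paper: it only shows how a set of traces can be stored as sequences of events, how the support of a candidate pattern can be counted, and how the confidence of a simple "whenever the premise occurs, the consequent eventually follows" rule could be estimated. The event names (`open`, `read`, `close`, and so on) and all function names are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence

# A trace is a sequence of event names; a sequence database is a list of traces.
Trace = List[str]


def is_subsequence(pattern: Sequence[str], trace: Sequence[str]) -> bool:
    """Return True if the events of `pattern` occur in `trace` in order,
    with arbitrary other events allowed in between."""
    remaining = iter(trace)
    # `event in remaining` advances the iterator, so order is enforced.
    return all(event in remaining for event in pattern)


def support(pattern: Sequence[str], database: Sequence[Trace]) -> int:
    """Number of traces in the database that contain `pattern`."""
    return sum(1 for trace in database if is_subsequence(pattern, trace))


def rule_confidence(pre: Sequence[str], post: Sequence[str],
                    database: Sequence[Trace]) -> float:
    """Fraction of traces containing `pre` that also contain `pre`
    followed by `post`. Returns 0.0 if `pre` never occurs."""
    pre_count = support(pre, database)
    if pre_count == 0:
        return 0.0
    return support(list(pre) + list(post), database) / pre_count


# Hypothetical traces, made up for illustration (not data from the paper).
TRACES: List[Trace] = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read"],
    ["init", "open", "read", "close"],
]

if __name__ == "__main__":
    print(support(["open", "close"], TRACES))
    print(rule_confidence(["open"], ["close"], TRACES))
```

In this simplified setting, a pattern with high support or a rule with high confidence is a candidate specification, for example "a resource that is opened is eventually closed". Counting each trace at most once is an assumption of the sketch; practical miners may also count repeated occurrences within a trace.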
Databases and Information Systems
Proceedings of the 34th International Conference on Very Large Data Bases (VLDB) (PhD workshop)
LO, David and Khoo, Siau-Cheng, "Mining Patterns and Rules for Software Specification Discovery" (2008). Research Collection School of Information Systems (Open Access). Paper 425.