Empirical evaluation of the fault-detection effectiveness of smoke regression test cases for GUI-based software

Title: Empirical evaluation of the fault-detection effectiveness of smoke regression test cases for GUI-based software
Publication Type: Conference Papers
Year of Publication: 2004
Authors: Memon AM, Xie Q
Conference Name: Software Maintenance, 2004. Proceedings. 20th IEEE International Conference on
Date Published: 2004/09
Keywords: daily automated regression tester, daily builds, fault-detection effectiveness, graphical user interface, Graphical user interfaces, GUI-based software, program testing, Quality assurance, Regression analysis, smoke regression test cases, software development, software fault tolerance, Software maintenance, Software quality, software quality assurance, test oracle complexity, test oracles, test-case length

Daily builds and smoke regression tests have become popular quality assurance mechanisms to detect defects early during software development and maintenance. In previous work, we addressed a major weakness of current smoke regression testing techniques, namely their inability to automatically (re)test graphical user interface (GUI) event interactions, by presenting a GUI smoke regression testing process called daily automated regression tester (DART). We have deployed DART and have found several interesting characteristics of GUI smoke tests that we empirically demonstrate in this paper. We also combine smoke tests with different types of test oracles and present guidelines for practitioners to help them generate and execute the most effective combinations of test-case length and test oracle complexity. Our experimental subjects consist of four GUI-based applications. We generate 5000-8000 smoke tests (enough to be run in one night) for each application. Our results show that: (1) short GUI smoke tests with certain test oracles are effective at detecting a large number of faults; (2) there are classes of faults that our smoke tests cannot detect; (3) short smoke tests execute a large percentage of code; and (4) the entire smoke testing process is feasible in terms of execution time and storage space.