%0 Journal Article %J Software Engineering, IEEE Transactions on %D 2011 %T Developing a Single Model and Test Prioritization Strategies for Event-Driven Software %A Bryce,R.C. %A Sampath,S. %A Memon, Atif M. %K EDS %K event-driven software %K graphical user interface %K Graphical user interfaces %K GUI testing %K Internet %K program testing %K service-oriented architecture %K test prioritization strategy %K Web application testing %X Event-Driven Software (EDS) can change state based on incoming events; common examples are GUI and Web applications. These EDSs pose a challenge to testing because there are a large number of possible event sequences that users can invoke through a user interface. While valuable contributions have been made for testing these two subclasses of EDS, such efforts have been disjoint. This work provides the first single model that is generic enough to study GUI and Web applications together. In this paper, we use the model to define generic prioritization criteria that are applicable to both GUI and Web applications. Our ultimate goal is to evolve the model and use it to develop a unified theory of how all EDS should be tested. An empirical study reveals that the GUI and Web-based applications, when recast using the new model, show similar behavior. For example, a criterion that gives priority to all pairs of event interactions did well for GUI and Web applications; another criterion that gives priority to the smallest number of parameter value settings did poorly for both. These results reinforce our belief that these two subclasses of applications should be modeled and studied together. %B Software Engineering, IEEE Transactions on %V 37 %P 48 - 64 %8 2011/// %@ 0098-5589 %G eng %N 1 %R 10.1109/TSE.2010.12 %0 Journal Article %J Software Engineering, IEEE Transactions on %D 2011 %T GUI Interaction Testing: Incorporating Event Context %A Xun Yuan %A Cohen,M. B %A Memon, Atif M. 
%K automatic test case generation %K automatic test pattern generation %K combinatorial interaction testing %K event driven nature %K graphical user interface %K Graphical user interfaces %K GUI interaction testing %K program testing %X Graphical user interfaces (GUIs), due to their event-driven nature, present an enormous and potentially unbounded way for users to interact with software. During testing, it is important to “adequately cover” this interaction space. In this paper, we develop a new family of coverage criteria for GUI testing grounded in combinatorial interaction testing. The key motivation of using combinatorial techniques is that they enable us to incorporate “context” into the criteria in terms of event combinations, sequence length, and by including all possible positions for each event. Our new criteria range in both efficiency (measured by the size of the test suite) and effectiveness (the ability of the test suites to detect faults). In a case study on eight applications, we automatically generate test cases and systematically explore the impact of context, as captured by our new criteria. Our study shows that by increasing the event combinations tested and by controlling the relative positions of events defined by the new criteria, we can detect a large number of faults that were undetectable by earlier techniques. %B Software Engineering, IEEE Transactions on %V 37 %P 559 - 574 %8 2011/08//july %@ 0098-5589 %G eng %N 4 %R 10.1109/TSE.2010.50 %0 Journal Article %J Software Engineering, IEEE Transactions on %D 2010 %T Generating Event Sequence-Based Test Cases Using GUI Runtime State Feedback %A Xun Yuan %A Memon, Atif M. 
%K automatic model driven technique %K event interaction coverage equivalent counterparts %K event semantic interaction relationships %K event sequence based test cases %K Graphical user interfaces %K GUI runtime state feedback %K program testing %K Software quality %X This paper presents a fully automatic model-driven technique to generate test cases for graphical user interfaces (GUIs)-based applications. The technique uses feedback from the execution of a “seed test suite,” which is generated automatically using an existing structural event interaction graph model of the GUI. During its execution, the runtime effect of each GUI event on all other events pinpoints event semantic interaction (ESI) relationships, which are used to automatically generate new test cases. Two studies on eight applications demonstrate that the feedback-based technique 1) is able to significantly improve existing techniques and helps identify serious problems in the software and 2) the ESI relationships captured via GUI state yield test suites that most often detect more faults than their code, event, and event-interaction-coverage equivalent counterparts. %B Software Engineering, IEEE Transactions on %V 36 %P 81 - 95 %8 2010/02//jan %@ 0098-5589 %G eng %N 1 %R 10.1109/TSE.2009.68 %0 Conference Paper %B Software Testing, Verification and Validation (ICST), 2010 Third International Conference on %D 2010 %T Repairing GUI Test Suites Using a Genetic Algorithm %A Huang,Si %A Cohen,M. B %A Memon, Atif M. 
%K automated functional testing %K genetic algorithm %K Genetic algorithms %K graph model %K graphical user interface %K Graphical user interfaces %K GUI test suite %K program testing %K random algorithm %K synthetic program %K test case %X Recent advances in automated functional testing of Graphical User Interfaces (GUIs) rely on deriving graph models that approximate all possible sequences of events that may be executed on the GUI, and then use the graphs to generate test cases (event sequences) that achieve a specified coverage goal. However, because these models are only approximations of the actual event flows, the generated test cases may suffer from problems of infeasibility, i.e., some events may not be available for execution causing the test case to terminate prematurely. In this paper we develop a method to automatically repair GUI test suites, generating new test cases that are feasible. We use a genetic algorithm to evolve new test cases that increase our test suite's coverage while avoiding infeasible sequences. We experiment with this algorithm on a set of synthetic programs containing different types of constraints and for test sequences of varying lengths. Our results suggest that we can generate new test cases to cover most of the feasible coverage and that the genetic algorithm outperforms a random algorithm trying to achieve the same goal in almost all cases. %B Software Testing, Verification and Validation (ICST), 2010 Third International Conference on %P 245 - 254 %8 2010/04// %G eng %R 10.1109/ICST.2010.39 %0 Conference Paper %B Software Testing, Verification, and Validation Workshops (ICSTW), 2010 Third International Conference on %D 2010 %T Using methods & measures from network analysis for gui testing %A Elsaka,E. %A Moustafa,W. E %A Nguyen,Bao %A Memon, Atif M. 
%K betweenness clustering method %K event sequences %K event-flow graph model %K Graphical user interfaces %K GUI quality assurance %K GUI testing %K network analysis %K network centrality measures %K program testing %K Software quality %X Graphical user interfaces (GUIs) for today's applications are extremely large. Moreover, they provide many degrees of freedom to the end-user, thus allowing the user to perform a very large number of event sequences on the GUI. The large sizes and degrees of freedom create severe problems for GUI quality assurance, including GUI testing. In this paper, we leverage methods and measures from network analysis to analyze and study GUIs, with the goal of aiding GUI testing activities. We apply these methods and measures on the event-flow graph model of GUIs. Results of a case study show that "network centrality measures" are able to identify the most important events in the GUI as well as the most important sequences of events. These events and sequences are good candidates for test prioritization. In addition, the "betweenness clustering" method is able to partition the GUI into regions that can be tested separately. %B Software Testing, Verification, and Validation Workshops (ICSTW), 2010 Third International Conference on %P 240 - 246 %8 2010/04// %G eng %R 10.1109/ICSTW.2010.61 %0 Conference Paper %B Software Testing, Verification and Validation Workshops, 2009. ICSTW '09. International Conference on %D 2009 %T An Extensible Heuristic-Based Framework for GUI Test Case Maintenance %A McMaster,S. %A Memon, Atif M. %K extensible heuristic-based framework %K graphical user interface %K Graphical user interfaces %K GUI code %K GUI test case maintenance %K program testing %K Software maintenance %X Graphical user interfaces (GUIs) make up a large portion of the code comprising many modern software applications. However, GUI testing differs significantly from testing of traditional software. 
One respect in which this is true is test case maintenance. Due to the way that GUI test cases are often implemented, relatively minor changes to the construction of the GUI can cause a large number of test case executions to malfunction, often because GUI elements referred to by the test cases have been renamed, moved, or otherwise altered. We posit that a general solution to the problem of GUI test case maintenance must be based on heuristics that attempt to match an application’s GUI elements across versions. We demonstrate the use of some heuristics with framework support. Our tool support is general in that it may be used with other heuristics if needed in the future. %B Software Testing, Verification and Validation Workshops, 2009. ICSTW '09. International Conference on %P 251 - 254 %8 2009/04// %G eng %R 10.1109/ICSTW.2009.11 %0 Conference Paper %B Software Testing Verification and Validation, 2009. ICST '09. International Conference on %D 2009 %T An Initial Characterization of Industrial Graphical User Interface Systems %A Brooks,P.A. %A Robinson,B.P. %A Memon, Atif M. %K Graphical user interfaces %K GUI-based software systems %K industrial graphical user interface systems %K model-based GUI testing techniques %K program testing %K software metrics %K source code change metrics %X To date we have developed and applied numerous model-based GUI testing techniques; however, we are unable to provide definitive improvement schemes to real-world GUI test planners, as our data was derived from open source applications, small compared to industrial systems. This paper presents a study of three industrial GUI-based software systems developed at ABB, including data on classified defects detected during late-phase testing and customer usage, test suites, and source code change metrics. 
The results show that (1) 50% of the defects found through the GUI are categorized as data access and handling, control flow and sequencing, correctness, and processing defects, (2) system crashes exposed defects 12-19% of the time, and (3) GUI and non-GUI components are constructed differently, in terms of source code metrics. %B Software Testing Verification and Validation, 2009. ICST '09. International Conference on %P 11 - 20 %8 2009/04// %G eng %R 10.1109/ICST.2009.11 %0 Conference Paper %B Software Maintenance, 2009. ICSM 2009. IEEE International Conference on %D 2009 %T Introducing a test suite similarity metric for event sequence-based test cases %A Brooks,P.A. %A Memon, Atif M. %K event driven software systems %K event sequence-based test cases %K open source systems %K program testing %K public domain software %K software metrics %K Software testing %K test suite similarity metric %X Most of today's event driven software (EDS) systems are tested using test cases that are carefully constructed as sequences of events; they test the execution of an event in the context of its preceding events. Because sizes of these test suites can be extremely large, researchers have developed techniques, such as reduction and minimization, to obtain test suites that are “similar” to the original test suite, but smaller. Existing similarity metrics mostly use code coverage; they do not consider the contextual relationships between events. Consequently, reduction based on such metrics may eliminate desirable test cases. In this paper, we present a new parameterized metric, CONTeSSi(n), which uses the context of n preceding events in test cases to develop a new context-aware notion of test suite similarity for EDS. This metric is defined and evaluated by comparing four test suites for each of four open source applications. Our results show that CONTeSSi(n) is a better indicator of the similarity of EDS test suites than existing metrics. %B Software Maintenance, 2009. 
ICSM 2009. IEEE International Conference on %P 243 - 252 %8 2009/09// %G eng %R 10.1109/ICSM.2009.5306305 %0 Conference Paper %B Software Maintenance, 2009. ICSM 2009. IEEE International Conference on %D 2009 %T Prioritizing component compatibility tests via user preferences %A Yoon,Il-Chul %A Sussman, Alan %A Memon, Atif M. %A Porter, Adam %K compatibility testing prioritization %K component configurations %K computer clusters %K Middleware %K Middleware systems %K object-oriented programming %K program testing %K software engineering %K Software systems %K third-party components %K user preferences %X Many software systems rely on third-party components during their build process. Because the components are constantly evolving, quality assurance demands that developers perform compatibility testing to ensure that their software systems build correctly over all deployable combinations of component versions, also called configurations. However, large software systems can have many configurations, and compatibility testing is often time and resource constrained. We present a prioritization mechanism that enhances compatibility testing by examining the “most important” configurations first, while distributing the work over a cluster of computers. We evaluate our new approach on two large scientific middleware systems and examine tradeoffs between the new prioritization approach and a previously developed lowest-cost-configuration-first approach. %B Software Maintenance, 2009. ICSM 2009. IEEE International Conference on %P 29 - 38 %8 2009/09// %G eng %R 10.1109/ICSM.2009.5306357 %0 Conference Paper %B Software Testing, Verification and Validation Workshops, 2009. ICSTW '09. International Conference on %D 2009 %T Towards Dynamic Adaptive Automated Test Generation for Graphical User Interfaces %A Xun Yuan %A Cohen,M. B %A Memon, Atif M. 
%K adaptive automated test generation %K computational complexity %K event sequence length %K evolutionary algorithm %K evolutionary computation %K graphical user interface %K Graphical user interfaces %K GUI test case %K program testing %X Graphical user interfaces (GUIs) present an enormous number of potential event sequences to users. During testing it is necessary to cover this space, however the complexity of modern GUIs has made this an increasingly difficult task. Our past work has demonstrated that it is important to incorporate “context” into GUI test cases, in terms of event combinations, event sequence length, and by considering all possible starting and ending positions for each event. Despite the use of our most refined modeling techniques, many of the generated test cases remain unexecutable. In this paper, we posit that due to the dynamic state-based nature of GUIs, it is important to incorporate feedback from the execution of tests into test case generation algorithms. We propose the use of an evolutionary algorithm to generate test suites with fewer unexecutable test cases and higher event interaction coverage. %B Software Testing, Verification and Validation Workshops, 2009. ICSTW '09. International Conference on %P 263 - 266 %8 2009/04// %G eng %R 10.1109/ICSTW.2009.26 %0 Conference Paper %B Software Testing, Verification, and Validation, 2008 1st International Conference on %D 2008 %T Relationships between Test Suites, Faults, and Fault Detection in GUI Testing %A Strecker,J. %A Memon, Atif M. %K Fault detection %K fault-related factors %K Graphical user interfaces %K GUI testing %K program testing %K software-testing %K test suites %K test-suite-related factors %X Software-testing researchers have long sought recipes for test suites that detect faults well. In the literature, empirical studies of testing techniques abound, yet the ideal technique for detecting the desired kinds of faults in a given situation often remains unclear. 
This work shows how understanding the context in which testing occurs, in terms of factors likely to influence fault detection, can make evaluations of testing techniques more readily applicable to new situations. We present a methodology for discovering which factors do statistically affect fault detection, and we perform an experiment with a set of test-suite- and fault-related factors in the GUI testing of two fielded, open-source applications. Statement coverage and GUI-event coverage are found to be statistically related to the likelihood of detecting certain kinds of faults. %B Software Testing, Verification, and Validation, 2008 1st International Conference on %P 12 - 21 %8 2008/04// %G eng %R 10.1109/ICST.2008.26 %0 Conference Paper %B Software Maintenance, 2007. ICSM 2007. IEEE International Conference on %D 2007 %T Fault Detection Probability Analysis for Coverage-Based Test Suite Reduction %A McMaster,S. %A Memon, Atif M. %K coverage-based test suite reduction %K fault detection probability analysis %K Fault diagnosis %K force coverage-based reduction %K percentage fault detection reduction %K percentage size reduction %K program testing %K software reliability %K statistical analysis %X Test suite reduction seeks to reduce the number of test cases in a test suite while retaining a high percentage of the original suite's fault detection effectiveness. Most approaches to this problem are based on eliminating test cases that are redundant relative to some coverage criterion. The effectiveness of applying various coverage criteria in test suite reduction is traditionally based on empirical comparison of two metrics derived from the full and reduced test suites and information about a set of known faults: (1) percentage size reduction and (2) percentage fault detection reduction, neither of which quantitatively takes test coverage data into account. 
Consequently, no existing measure expresses the likelihood of various coverage criteria to force coverage-based reduction to retain test cases that expose specific faults. In this paper, we develop and empirically evaluate, using a number of different coverage criteria, a new metric based on the "average expected probability of finding a fault" in a reduced test suite. Our results indicate that the average probability of detecting each fault shows promise for identifying coverage criteria that work well for test suite reduction. %B Software Maintenance, 2007. ICSM 2007. IEEE International Conference on %P 335 - 344 %8 2007/10// %G eng %R 10.1109/ICSM.2007.4362646 %0 Journal Article %J Software Engineering, IEEE Transactions on %D 2007 %T Reliable Effects Screening: A Distributed Continuous Quality Assurance Process for Monitoring Performance Degradation in Evolving Software Systems %A Yilmaz,C. %A Porter, Adam %A Krishna,A. S %A Memon, Atif M. %A Schmidt,D. C %A Gokhale,A.S. %A Natarajan,B. %K configuration subset %K distributed continuous quality assurance process %K evolving software systems %K in house testing %K main effects screening %K performance bottlenecks %K performance degradation monitoring %K performance intensive software systems %K process configuration %K process execution %K program testing %K regression testing %K reliable effects screening %K software benchmarks %K Software performance %K software performance evaluation %K Software quality %K software reliability %K tool support %X Developers of highly configurable performance-intensive software systems often use in-house performance-oriented "regression testing" to ensure that their modifications do not adversely affect their software's performance across its large configuration space. 
Unfortunately, time and resource constraints can limit in-house testing to a relatively small number of possible configurations, followed by unreliable extrapolation from these results to the entire configuration space. As a result, many performance bottlenecks escape detection until systems are fielded. In our earlier work, we improved the situation outlined above by developing an initial quality assurance process called "main effects screening". This process 1) executes formally designed experiments to identify an appropriate subset of configurations on which to base the performance-oriented regression testing, 2) executes benchmarks on this subset whenever the software changes, and 3) provides tool support for executing these actions on in-the-field and in-house computing resources. Our initial process had several limitations, however, since it was manually configured (which was tedious and error-prone) and relied on strong and untested assumptions for its accuracy (which made its use unacceptably risky in practice). This paper presents a new quality assurance process called "reliable effects screening" that provides three significant improvements to our earlier work. First, it allows developers to economically verify key assumptions during process execution. Second, it integrates several model-driven engineering tools to make process configuration and execution much easier and less error prone. Third, we evaluate this process via several feasibility studies of three large, widely used performance-intensive software frameworks. Our results indicate that reliable effects screening can detect performance degradation in large-scale systems more reliably and with significantly less resources than conventional techniques %B Software Engineering, IEEE Transactions on %V 33 %P 124 - 141 %8 2007/02// %@ 0098-5589 %G eng %N 2 %R 10.1109/TSE.2007.20 %0 Conference Paper %B Software Engineering, 2007. ICSE 2007. 
29th International Conference on %D 2007 %T Using GUI Run-Time State as Feedback to Generate Test Cases %A Xun Yuan %A Memon, Atif M. %K application under test %K automated test case generation %K Feedback %K feedback-based technique %K Graphical user interfaces %K GUI run-time state %K model-driven technique %K open-source software %K program testing %K public domain software %K reverse engineering %K reverse-engineering algorithm %K seed test suite %X This paper presents a new automated model-driven technique to generate test cases by using feedback from the execution of a "seed test suite" on an application under test (AUT). The test cases in the seed suite are designed to be generated automatically and executed very quickly. During their execution, feedback obtained from the AUT's run-time state is used to generate new, "improved" test cases. The new test cases subsequently become part of the seed suite. This "anytime technique" continues iteratively, generating and executing additional test cases until resources are exhausted or testing goals have been met. The feedback-based technique is demonstrated for automated testing of graphical user interfaces (GUIs). An existing abstract model of the GUI is used to automatically generate the seed test suite. It is executed; during its execution, state changes in the GUI pinpoint important relationships between GUI events, which evolve the model and help to generate new test cases. Together with a reverse-engineering algorithm used to obtain the initial model and seed suite, the feedback-based technique yields a fully automatic, end-to-end GUI testing process. A feasibility study on four large fielded open-source software (OSS) applications demonstrates that this process is able to significantly improve existing techniques and help identify/report serious problems in the OSS. In response, these problems have been fixed by the developers of the OSS in subsequent versions. %B Software Engineering, 2007. ICSE 2007. 
29th International Conference on %P 396 - 405 %8 2007/05// %G eng %R 10.1109/ICSE.2007.94 %0 Conference Paper %B Software Reliability Engineering, 2006. ISSRE '06. 17th International Symposium on %D 2006 %T Studying the Characteristics of a "Good" GUI Test Suite %A Xie,Qing %A Memon, Atif M. %K Fault detection %K Fault diagnosis %K graphical user interface testing %K Graphical user interfaces %K program debugging %K program testing %X The widespread deployment of graphical-user interfaces (GUIs) has increased the overall complexity of testing. A GUI test designer needs to perform the daunting task of adequately testing the GUI, which typically has very large input interaction spaces, while considering tradeoffs between GUI test suite characteristics such as the number of test cases (each modeled as a sequence of events), their lengths, and the event composition of each test case. There are no published empirical studies on GUI testing that a GUI test designer may reference to make decisions about these characteristics. Consequently, in practice, very few GUI testers know how to design their test suites. This paper takes the first step towards assisting in GUI test design by presenting an empirical study that evaluates the effect of these characteristics on testing cost and fault detection effectiveness. The results show that two factors significantly affect the fault-detection effectiveness of a test suite: (1) the diversity of states in which an event executes and (2) the event coverage of the suite. Test designers need to improve the diversity of states in which each event executes by developing a large number of short test cases to detect the majority of "shallow" faults, which are artifacts of modern GUI design. Additional resources should be used to develop a small number of long test cases to detect a small number of "deep" faults %B Software Reliability Engineering, 2006. ISSRE '06. 
17th International Symposium on %P 159 - 168 %8 2006/11// %G eng %R 10.1109/ISSRE.2006.45 %0 Conference Paper %B Software Maintenance, 2005. ICSM'05. Proceedings of the 21st IEEE International Conference on %D 2005 %T Call stack coverage for test suite reduction %A McMaster,S. %A Memon, Atif M. %K call stack coverage %K component reuse %K Fault detection %K language-independent information %K multi-language implementation %K program testing %K software development %K software fault tolerance %K Software maintenance %K software reusability %K space antenna-steering application %K stringent performance requirement %K systems analysis %K test suite reduction algorithm %X Test suite reduction is an important test maintenance activity that attempts to reduce the size of a test suite with respect to some criteria. Emerging trends in software development such as component reuse, multi-language implementations, and stringent performance requirements present new challenges for existing reduction techniques that may limit their applicability. A test suite reduction technique that is not affected by these challenges is presented; it is based on dynamically generated language-independent information that can be collected with little run-time overhead. Specifically, test cases from the suite being reduced are executed on the application under test and the call stacks produced during execution are recorded. These call stacks are then used as a coverage requirement in a test suite reduction algorithm. Results of experiments on test suites for the space antenna-steering application show significant reduction in test suite size at the cost of a moderate loss in fault detection effectiveness. %B Software Maintenance, 2005. ICSM'05. Proceedings of the 21st IEEE International Conference on %P 539 - 548 %8 2005/09// %G eng %R 10.1109/ICSM.2005.29 %0 Conference Paper %B Software Maintenance, 2005. ICSM'05. 
Proceedings of the 21st IEEE International Conference on %D 2005 %T Rapid "crash testing" for continuously evolving GUI-based software applications %A Xie,Q. %A Memon, Atif M. %K crash testing %K graphical user interface software retesting %K Graphical user interfaces %K GUI-based software application %K immediate feedback %K program testing %K rapid-feedback-based quality assurance %K software evolution %K Software maintenance %K software prototyping %K Software quality %X Several rapid-feedback-based quality assurance mechanisms are used to manage the quality of continuously evolving software. Even though graphical user interfaces (GUIs) are one of the most important parts of software, there are currently no mechanisms to quickly retest evolving GUI software. We leverage our previous work on GUI testing to define a new automatic GUI re-testing process called "crash testing" that is integrated with GUI evolution. We describe two levels of crash testing: (1) immediate feedback-based in which a developer indicates that a GUI bug was fixed in response to a previously reported crash; only select crash test cases are rerun and the developer is notified of the results in a matter of seconds, and (2) between code changes in which new crash test cases are generated on-the-fly and executed on the GUI. Since the code may be changed by another developer before all the crash tests have been executed, hence requiring restarting of the process, we use a simple rotation-based scheme to ensure that all crash tests are executed over a series of code changes. We show, via empirical studies, that our crash tests are effective at revealing serious problems in the GUI. %B Software Maintenance, 2005. ICSM'05. 
Proceedings of the 21st IEEE International Conference on %P 473 - 482 %8 2005/09// %G eng %R 10.1109/ICSM.2005.72 %0 Journal Article %J Software Engineering, IEEE Transactions on %D 2005 %T Studying the fault-detection effectiveness of GUI test cases for rapidly evolving software %A Memon, Atif M. %A Xie,Q. %K daily automated regression tester %K Fault diagnosis %K fault-detection %K formal specification %K formal verification %K Graphical user interfaces %K GUI test cases %K program testing %K quality assurance mechanism %K rapidly evolving software %K smoke regression testing technique %K software development %K software fault tolerance %K Software maintenance %K software prototyping %K Software quality %K test oracles %X Software is increasingly being developed/maintained by multiple, often geographically distributed developers working concurrently. Consequently, rapid-feedback-based quality assurance mechanisms such as daily builds and smoke regression tests, which help to detect and eliminate defects early during software development and maintenance, have become important. This paper addresses a major weakness of current smoke regression testing techniques, i.e., their inability to automatically (re)test graphical user interfaces (GUIs). Several contributions are made to the area of GUI smoke testing. First, the requirements for GUI smoke testing are identified and a GUI smoke test is formally defined as a specialized sequence of events. Second, a GUI smoke regression testing process called daily automated regression tester (DART) that automates GUI smoke testing is presented. Third, the interplay between several characteristics of GUI smoke test suites including their size, fault detection ability, and test oracles is empirically studied. 
The results show that: 1) the entire smoke testing process is feasible in terms of execution time, storage space, and manual effort, 2) smoke tests cannot cover certain parts of the application code, 3) having comprehensive test oracles may make up for not having long smoke test cases, and 4) using certain oracles can make up for not having large smoke test suites. %B Software Engineering, IEEE Transactions on %V 31 %P 884 - 896 %8 2005/10// %@ 0098-5589 %G eng %N 10 %R 10.1109/TSE.2005.117 %0 Conference Paper %B Software Maintenance, 2004. Proceedings. 20th IEEE International Conference on %D 2004 %T Empirical evaluation of the fault-detection effectiveness of smoke regression test cases for GUI-based software %A Memon, Atif M. %A Xie,Qing %K daily automated regression tester %K daily builds %K fault-detection effectiveness %K graphical user interface %K Graphical user interfaces %K GUI-based software %K program testing %K Quality assurance %K Regression analysis %K smoke regression test cases %K software development %K software fault tolerance %K Software maintenance %K Software quality %K software quality assurance %K test oracle complexity %K test oracles %K test-case length %X Daily builds and smoke regression tests have become popular quality assurance mechanisms to detect defects early during software development and maintenance. In previous work, we addressed a major weakness of current smoke regression testing techniques, i.e., their lack of ability to automatically (re)test graphical user interface (GUI) event interactions - we presented a GUI smoke regression testing process called daily automated regression tester (DART). We have deployed DART and have found several interesting characteristics of GUI smoke tests that we empirically demonstrate in this paper. 
We also combine smoke tests with different types of test oracles and present guidelines for practitioners to help them generate and execute the most effective combinations of test-case length and test oracle complexity. Our experimental subjects consist of four GUI-based applications. We generate 5000-8000 smoke tests (enough to be run in one night) for each application. Our results show that: (1) short GUI smoke tests with certain test oracles are effective at detecting a large number of faults; (2) there are classes of faults that our smoke tests cannot detect; (3) short smoke tests execute a large percentage of code; and (4) the entire smoke testing process is feasible in terms of execution time and storage space. %B Software Maintenance, 2004. Proceedings. 20th IEEE International Conference on %P 8 - 17 %8 2004/09// %G eng %R 10.1109/ICSM.2004.1357785 %0 Conference Paper %B Automated Software Engineering, 2004. Proceedings. 19th International Conference on %D 2004 %T Using transient/persistent errors to develop automated test oracles for event-driven software %A Memon, Atif M. %A Xie,Qing %K automated test oracles %K Automatic testing %K event driven software %K Graphical user interfaces %K persistent errors %K program testing %K resource allocation %K resource utilization %K software intensive systems %K test case execution %K transient errors %X Today's software-intensive systems contain an important class of software, namely event-driven software (EDS). All EDS take events as input, change their state, and (perhaps) output an event sequence. EDS is typically implemented as a collection of event-handlers designed to respond to individual events. The nature of EDS creates new challenges for test automation. In this paper, we focus on those relevant to automated test oracles. A test oracle is a mechanism that determines whether software executed correctly for a test case. A test case for an EDS consists of a sequence of events.
The test case is executed on the EDS, one event at a time. Errors in the EDS may "appear" and later "disappear" at several points (e.g., after an event is executed) during test case execution. Because of the behavior of these transient (those that disappear) and persistent (those that don't disappear) errors, EDS require complex and expensive test oracles that compare the expected and actual output multiple times during test case execution. We leverage our previous work to study several applications and observe the occurrence of persistent/transient errors. Our studies show that in practice, a large number of errors in EDS are transient and that there are specific classes of events that lead to transient errors. We use the results of this study to develop a new test oracle that compares the expected and actual output at strategic points during test case execution. We show that the oracle is effective at detecting errors and efficient in terms of resource utilization. %B Automated Software Engineering, 2004. Proceedings. 19th International Conference on %P 186 - 195 %8 2004/09// %G eng %R 10.1109/ASE.2004.1342736 %0 Conference Paper %B Software Maintenance, 2003. ICSM 2003. Proceedings. International Conference on %D 2003 %T DART: a framework for regression testing "nightly/daily builds" of GUI applications %A Memon, Atif M. %A Banerjee,I. %A Hashmi,N. %A Nagarajan,A.
%K automated retesting %K automatic test software %K coverage evaluation %K daily automated regression tester %K DART %K frequent retesting %K graphical user interface %K Graphical user interfaces %K GUI software %K instrumentation coding %K program testing %K regression testing %K Software development management %K Software development process %K Software maintenance %K Software quality %K structural GUI analysis %K Test Case Generation %K test cases regeneration %K Test execution %K test oracle creation %X "Nightly/daily building and smoke testing" have become widespread since they often reveal bugs early in the software development process. During these builds, software is compiled, linked, and (re)tested with the goal of validating its basic functionality. Although successful for conventional software, smoke tests are difficult to develop and automatically rerun for software that has a graphical user interface (GUI). In this paper, we describe a framework called DART (daily automated regression tester) that addresses the needs of frequent and automated re-testing of GUI software. The key to our success is automation: DART automates everything from structural GUI analysis, test case generation, and test oracle creation to code instrumentation, test execution, coverage evaluation, regeneration of test cases, and their re-execution. Together with the operating system's task scheduler, DART can execute frequently with little input from the developer/tester to retest the GUI software. We provide results of experiments showing the time taken and memory required for GUI analysis, test case and test oracle generation, and test execution. We also empirically compare the relative costs of employing different levels of detail in the GUI test cases. %B Software Maintenance, 2003. ICSM 2003. Proceedings. International Conference on %P 410 - 419 %8 2003/09// %G eng %R 10.1109/ICSM.2003.1235451 %0 Conference Paper %B Automated Software Engineering, 2003. Proceedings. 
18th IEEE International Conference on %D 2003 %T What test oracle should I use for effective GUI testing? %A Memon, Atif M. %A Banerjee,I. %A Nagarajan,A. %K empirical studies %K graphical user interface %K Graphical user interfaces %K GUI testing %K oracle information %K oracle procedure %K oracle space requirement %K oracle time requirements %K program testing %K software engineering %K Software testing %K test cost %K test effectiveness %K test oracle %X Test designers widely believe that the overall effectiveness and cost of software testing depends largely on the type and number of test cases executed on the software. In this paper we show that the test oracle used during testing also contributes significantly to test effectiveness and cost. A test oracle is a mechanism that determines whether software executed correctly for a test case. We define a test oracle to contain two essential parts: oracle information that represents expected output; and an oracle procedure that compares the oracle information with the actual output. By varying the level of detail of oracle information and changing the oracle procedure, a test designer can create different types of test oracles. We design 11 types of test oracles and empirically compare them on four software systems. We seed faults in software to create 100 faulty versions, execute 600 test cases on each version, for all 11 types of oracles. In all, we report results of 660,000 test runs on software. We show (1) the time and space requirements of the oracles, (2) that faults are detected early in the testing process when using detailed oracle information and complex oracle procedures, although at a higher cost per test case, and (3) that employing expensive oracles results in detecting a large number of faults using a relatively small number of test cases. %B Automated Software Engineering, 2003. Proceedings. 
18th IEEE International Conference on %P 164 - 173 %8 2003/10// %G eng %R 10.1109/ASE.2003.1240304 %0 Journal Article %J Software Engineering, IEEE Transactions on %D 2001 %T Hierarchical GUI test case generation using automated planning %A Memon, Atif M. %A Pollack,M. E %A Soffa,M. L %K Artificial intelligence %K automated planning %K automatic test case generation %K Automatic testing %K correctness testing %K goal state %K Graphical user interfaces %K hierarchical GUI test case generation %K initial state %K Microsoft WordPad %K operators %K plan-generation system %K planning (artificial intelligence) %K Planning Assisted Tester for Graphical User Interface Systems %K program testing %K software %X The widespread use of GUIs for interacting with software is leading to the construction of more and more complex GUIs. With the growing complexity come challenges in testing the correctness of a GUI and its underlying software. We present a new technique to automatically generate test cases for GUIs that exploits planning, a well-developed and used technique in artificial intelligence. Given a set of operators, an initial state, and a goal state, a planner produces a sequence of the operators that will transform the initial state to the goal state. Our test case generation technique enables efficient application of planning by first creating a hierarchical model of a GUI based on its structure. The GUI model consists of hierarchical planning operators representing the possible events in the GUI. The test designer defines the preconditions and effects of the hierarchical operators, which are input into a plan-generation system. The test designer also creates scenarios that represent typical initial and goal states for a GUI user. The planner then generates plans representing sequences of GUI interactions that a user might employ to reach the goal state from the initial state. 
We implemented our test case generation system, called Planning Assisted Tester for Graphical User Interface Systems (PATHS), and experimentally evaluated its practicality and effectiveness. We describe a prototype implementation of PATHS and report on the results of controlled experiments to generate test cases for Microsoft's WordPad. %B Software Engineering, IEEE Transactions on %V 27 %P 144 - 155 %8 2001/02// %@ 0098-5589 %G eng %N 2 %R 10.1109/32.908959 %0 Conference Paper %D 1999 %T Fault injection based on a partial view of the global state of a distributed system %A Michel Cukier %A Chandra,R. %A Henke,D. %A Pistole,J. %A Sanders,W. H. %K bounding technique %K clock synchronization %K distributed programming %K distributed software systems %K fault injection %K Loki %K post-runtime analysis %K program testing %K program verification %K software reliability %K Synchronisation %X This paper describes the basis for and preliminary implementation of a new fault injector, called Loki, developed specifically for distributed systems. Loki addresses issues related to injecting correlated faults in distributed systems. In Loki, fault injection is performed based on a partial view of the global state of an application. In particular, facilities are provided to pass user-specified state information between nodes to provide a partial view of the global state in order to try to inject complex faults successfully. A post-runtime analysis, using an off-line clock synchronization and a bounding technique, is used to place events and injections on a single global time-line and determine whether the intended faults were properly injected. Finally, observations containing successful fault injections are used to estimate specified dependability measures. 
In addition to describing the details of our new approach, we present experimental results obtained from a preliminary implementation in order to illustrate Loki's ability to inject complex faults predictably. %P 168 - 177 %8 1999/// %G eng %R 10.1109/RELDIS.1999.805093 %0 Conference Paper %B Software Engineering, 1999. Proceedings of the 1999 International Conference on %D 1999 %T Using a goal-driven approach to generate test cases for GUIs %A Memon, Atif M. %A Pollack,M. E %A Soffa,M. L %K Artificial intelligence %K automatic test case generation %K goal state %K goal-driven approach %K Graphical user interfaces %K GUIs %K hierarchical planning operators %K initial state %K Microsoft Word-Pad %K operators %K planning (artificial intelligence) %K program testing %K software %K verification commands %X The widespread use of GUIs for interacting with software is leading to the construction of more and more complex GUIs. With the growing complexity come challenges in testing the correctness of a GUI and the underlying software. We present a new technique to automatically generate test cases for GUIs that exploits planning, a well-developed and used technique in artificial intelligence. Given a set of operators, an initial state and a goal state, a planner produces a sequence of the operators that will change the initial state to the goal state. Our test case generation technique first analyzes a GUI and derives hierarchical planning operators from the actions in the GUI. The test designer determines the preconditions and effects of the hierarchical operators, which are then input into a planning system. With the knowledge of the GUI and the way in which the user will interact with the GUI, the test designer creates sets of initial and goal states. Given these initial and final states of the GUI, a hierarchical planner produces plans, or a set of test cases, that enable the goal state to be reached. 
Our technique has the additional benefit of putting verification commands into the test cases automatically. We implemented our technique by developing the GUI analyzer and extending a planner. We generated test cases for Microsoft's Word-Pad to demonstrate the viability and practicality of the approach. %B Software Engineering, 1999. Proceedings of the 1999 International Conference on %P 257 - 266 %8 1999/05// %G eng %0 Conference Paper %B , The 19th IEEE Real-Time Systems Symposium, 1998. Proceedings %D 1998 %T Performance measurement using low perturbation and high precision hardware assists %A Mink, A. %A Salamon, W. %A Hollingsworth, Jeffrey K %A Arunachalam, R. %K Clocks %K Computerized monitoring %K Counting circuits %K Debugging %K Hardware %K hardware performance monitor %K high precision hardware assists %K low perturbation %K measurement %K MPI message passing library %K MultiKron hardware performance monitor %K MultiKron PCI %K NIST %K online performance monitoring tools %K Paradyn parallel performance measurement tools %K PCI bus slot %K performance bug %K performance evaluation %K performance measurement %K program debugging %K program testing %K real-time systems %K Runtime %K Timing %X We present the design and implementation of MultiKron PCI, a hardware performance monitor that can be plugged into any computer with a free PCI bus slot. The monitor provides a series of high-resolution timers, and the ability to monitor the utilization of the PCI bus. We also demonstrate how the monitor can be integrated with online performance monitoring tools such as the Paradyn parallel performance measurement tools to improve the overhead of key timer operations by a factor of 25. In addition, we present a series of case studies using the MultiKron hardware performance monitor to measure and tune high-performance parallel computing applications. 
By using the monitor, we were able to find and correct a performance bug in a popular implementation of the MPI message passing library that caused some communication primitives to run at one half their potential speed. %B , The 19th IEEE Real-Time Systems Symposium, 1998. Proceedings %I IEEE %P 379 - 388 %8 1998/12/02/4 %@ 0-8186-9212-X %G eng %R 10.1109/REAL.1998.739771 %0 Journal Article %J IEEE Transactions on Parallel and Distributed Systems %D 1995 %T Going beyond integer programming with the Omega test to eliminate false data dependences %A Pugh, William %A Wonnacott,D. %K Algorithm design and analysis %K Arithmetic %K Computer science %K Data analysis %K false data dependences %K integer programming %K Linear programming %K Omega test %K Privatization %K Production %K production compilers %K program compilers %K Program processors %K program testing %K program transformations %K Testing %X Array data dependence analysis methods currently in use generate false dependences that can prevent useful program transformations. These false dependences arise because the questions asked are conservative approximations to the questions we really should be asking. Unfortunately, the questions we really should be asking go beyond integer programming and require decision procedures for a subclass of Presburger formulas. In this paper, we describe how to extend the Omega test so that it can answer these queries and allow us to eliminate these false data dependences. We have implemented the techniques described here and believe they are suitable for use in production compilers. %B IEEE Transactions on Parallel and Distributed Systems %V 6 %P 204 - 211 %8 1995/02// %@ 1045-9219 %G eng %N 2 %R 10.1109/71.342135