%0 Journal Article
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%D 2010
%T Online Empirical Evaluation of Tracking Algorithms
%A Wu, Hao
%A Sankaranarayanan, A. C.
%A Chellappa, Rama
%K Back
%K Biomedical imaging
%K Computer vision
%K Filtering
%K formal model validation techniques
%K formal verification
%K ground truth
%K Kanade Lucas Tomasi feature tracker
%K Karhunen-Loeve transforms
%K lighting
%K Markov processes
%K mean shift tracker
%K model validation
%K online empirical evaluation
%K particle filtering (numerical methods)
%K Particle filters
%K Particle tracking
%K performance evaluation
%K receiver operating characteristic curves
%K Robustness
%K SNR
%K Statistics
%K Surveillance
%K time reversed Markov chain
%K tracking
%K tracking algorithms
%K visual tracking
%X Evaluation of tracking algorithms in the absence of ground truth is a challenging problem. There exist a variety of approaches for this problem, ranging from formal model validation techniques to heuristics that look for mismatches between track properties and the observed data. However, few of these methods scale up to the task of visual tracking, where the models are usually nonlinear and complex and typically lie in a high-dimensional space. Further, scenarios that cause track failures and/or poor tracking performance are also quite diverse for the visual tracking problem. In this paper, we propose an online performance evaluation strategy for tracking systems based on particle filters using a time-reversed Markov chain. The key intuition of our proposed methodology relies on the time-reversible nature of physical motion exhibited by most objects, which in turn should be possessed by a good tracker. In the presence of tracking failures due to occlusion, low SNR, or modeling errors, this reversible nature of the tracker is violated. We use this property for detection of track failures. To evaluate the performance of the tracker at time instant t, we use the posterior of the tracking algorithm to initialize a time-reversed Markov chain. We compute the posterior density of track parameters at the starting time t = 0 by filtering back in time to the initial time instant. The distance between the posterior density of the time-reversed chain (at t = 0) and the prior density used to initialize the tracking algorithm forms the decision statistic for evaluation. It is observed that when the data are generated by the underlying models, the decision statistic takes a low value. We provide a thorough experimental analysis of the evaluation methodology. Specifically, we demonstrate the effectiveness of our approach for tackling common challenges such as occlusion, pose, and illumination changes and provide the Receiver Operating Characteristic (ROC) curves. Finally, we also show the applicability of the core ideas of the paper to other tracking algorithms such as the Kanade-Lucas-Tomasi (KLT) feature tracker and the mean-shift tracker.
%B IEEE Transactions on Pattern Analysis and Machine Intelligence
%V 32
%P 1443 - 1458
%8 2010/08//
%@ 0162-8828
%G eng
%N 8
%R 10.1109/TPAMI.2009.135
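A minimal sketch of the decision statistic described in the abstract above, in Python: a scalar-state particle filter runs forward over the observations, a time-reversed chain is then initialized from its posterior and filtered back, and the statistic is a Gaussian-approximated KL divergence between the reversed posterior and the original prior. The random-walk dynamics, the likelihood interface, and all names are illustrative assumptions, not the paper's code.

import numpy as np

def gaussian_kl(mu0, var0, mu1, var1):
    # KL divergence between 1-D Gaussians fitted to the two particle sets
    return np.log(np.sqrt(var1 / var0)) + (var0 + (mu0 - mu1) ** 2) / (2 * var1) - 0.5

def reversibility_statistic(observations, prior_particles, motion_std, likelihood, seed=0):
    # Forward pass to time t, then a time-reversed chain back toward t = 0;
    # assumes random-walk dynamics, which are their own time reversal.
    rng = np.random.default_rng(seed)
    particles = prior_particles.copy()
    for z in list(observations) + list(observations[::-1]):
        particles = particles + rng.normal(0.0, motion_std, particles.shape)  # propagate
        w = likelihood(z, particles)                                          # weight
        w = w / w.sum()
        particles = rng.choice(particles, size=particles.size, p=w)           # resample
    return gaussian_kl(particles.mean(), particles.var(),
                       prior_particles.mean(), prior_particles.var())

Under the assumed models the reversed chain lands near the prior and the statistic stays small; occlusion, low SNR, or model mismatch breaks reversibility and inflates it, which is the failure signal the abstract describes.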
%0 Journal Article
%J Software Engineering, IEEE Transactions on
%D 2005
%T Studying the fault-detection effectiveness of GUI test cases for rapidly evolving software
%A Memon, Atif M.
%A Xie, Q.
%K daily automated regression tester
%K Fault diagnosis
%K fault-detection
%K formal specification
%K formal verification
%K Graphical user interfaces
%K GUI test cases
%K program testing
%K quality assurance mechanism
%K rapidly evolving software
%K smoke regression testing technique
%K software development
%K software fault tolerance
%K Software maintenance
%K software prototyping
%K Software quality
%K test oracles
%X Software is increasingly being developed/maintained by multiple, often geographically distributed developers working concurrently. Consequently, rapid-feedback-based quality assurance mechanisms such as daily builds and smoke regression tests, which help to detect and eliminate defects early during software development and maintenance, have become important. This paper addresses a major weakness of current smoke regression testing techniques, i.e., their inability to automatically (re)test graphical user interfaces (GUIs). Several contributions are made to the area of GUI smoke testing. First, the requirements for GUI smoke testing are identified and a GUI smoke test is formally defined as a specialized sequence of events. Second, a GUI smoke regression testing process called daily automated regression tester (DART) that automates GUI smoke testing is presented. Third, the interplay between several characteristics of GUI smoke test suites, including their size, fault detection ability, and test oracles, is empirically studied. The results show that: 1) the entire smoke testing process is feasible in terms of execution time, storage space, and manual effort, 2) smoke tests cannot cover certain parts of the application code, 3) having comprehensive test oracles may make up for not having long smoke test cases, and 4) using certain oracles can make up for not having large smoke test suites.
%B Software Engineering, IEEE Transactions on
%V 31
%P 884 - 896
%8 2005/10//
%@ 0098-5589
%G eng
%N 10
%R 10.1109/TSE.2005.117
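The abstract above defines a GUI smoke test as a specialized sequence of events judged by test oracles; the Python sketch below makes that structure concrete. The Event record, the execute driver, and the per-event oracle snapshots are hypothetical illustrations of that definition, not DART's implementation.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Event:
    widget: str          # e.g. a menu item such as "File>Save"
    action: str          # e.g. "click" or "type"
    argument: str = ""   # text to type, if any

def run_smoke_test(events: List[Event],
                   execute: Callable[[Event], Dict[str, str]],
                   oracle: List[Dict[str, str]]) -> List[int]:
    # Replay the event sequence and return the indices of events after
    # which the observed GUI state disagrees with the expected state.
    failures = []
    for i, event in enumerate(events):
        observed = execute(event)   # drives the GUI, returns a widget-state snapshot
        if observed != oracle[i]:   # per-event check: a "comprehensive" oracle
            failures.append(i)
    return failures

Checking a full snapshot after every event sits at the comprehensive end of the oracle trade-off in findings 3 and 4; checking only the final state would be the cheap end that must be compensated with longer test cases or larger suites.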
%0 Journal Article
%J Software, IEEE
%D 2004
%T Preserving distributed systems critical properties: a model-driven approach
%A Yilmaz, C.
%A Memon, Atif M.
%A Porter, Adam
%A Krishna, A. S.
%A Schmidt, D. C.
%A Gokhale, A.
%A Natarajan, B.
%K configuration management
%K formal verification
%K Middleware
%K middleware suite
%K model-driven approach
%K persistent software attributes
%K QoS requirements
%K Quality assurance
%K quality of service
%K quality-of-service
%K Skoll distributed computing resources
%K software configuration
%K Software maintenance
%K Software quality
%K software quality assurance process
%K system dependability
%X The need for creating predictability in distributed systems is most often specified in terms of quality-of-service (QoS) requirements, which help define the acceptable levels of dependability with which capabilities such as processing capacity, data throughput, or service availability reach users. For longer-term properties such as scalability, maintainability, adaptability, and system security, we can similarly use persistent software attributes (PSAs) to specify how and to what degree such properties must remain intact as a network expands and evolves over time. The Skoll distributed continuous software quality assurance process helps to identify viable system and software configurations for meeting stringent QoS and PSA requirements by coordinating the use of distributed computing resources. The authors tested their process using the large, rapidly evolving ACE+TAO middleware suite.
%B Software, IEEE
%V 21
%P 32 - 40
%8 2004///
%@ 0740-7459
%G eng
%N 6
%R 10.1109/MS.2004.50
%0 Conference Paper
%D 2002
%T Formal specification and verification of a group membership protocol for an intrusion-tolerant group communication system
%A Ramasamy, H. V.
%A Cukier, Michel
%A Sanders, W. H.
%K computer network reliability
%K distributed processing
%K distributed systems
%K fault tolerant computing
%K formal specification
%K formal verification
%K group membership protocol
%K intrusion-tolerant group communication system
%K PROMELA
%K Protocols
%X We describe a group membership protocol that is part of an intrusion-tolerant group communication system, and present an effort to use formal tools to model and validate our protocol. We describe in detail the most difficult part of the validation exercise: determining the right level of abstraction at which to formally specify the protocol. The validation exercise not only formally showed that the protocol satisfies its correctness claims, but also provided information that will help us make the protocol more efficient without violating correctness.
%P 9 - 18
%8 2002/12//
%G eng
%R 10.1109/PRDC.2002.1185613
%0 Conference Paper
%D 1997
%T Probabilistic verification of a synchronous round-based consensus protocol
%A Duggal, H. S.
%A Cukier, Michel
%A Sanders, W. H.
%K business-critical applications
%K consensus protocol correctness
%K crash failures
%K formal verification
%K network environment
%K probabilistic verification
%K probabilities
%K probability
%K program verification
%K proper behavior
%K protocol behavior
%K Protocols
%K realistic environment
%K reliable distributed systems
%K safety-critical applications
%K simple consensus protocol
%K software reliability
%K stochastic assumptions
%K synchronous round-based consensus protocol
%K synchronous round-based consensus protocols
%K traditional proof techniques
%X Consensus protocols are used in a variety of reliable distributed systems, including both safety-critical and business-critical applications. The correctness of a consensus protocol is usually shown by making assumptions about the environment in which it executes and then proving properties about the protocol. But proofs about a protocol's behavior are only as good as the assumptions that were made to obtain them, and violation of these assumptions can lead to unpredicted and serious consequences. We present a new approach for the probabilistic verification of synchronous round-based consensus protocols. In doing so, we make stochastic assumptions about the environment in which a protocol operates, and derive probabilities of proper and non-proper behavior. We thus can account for the violation of assumptions made in traditional proof techniques. To obtain the desired probabilities, the approach enumerates possible states that can be reached during an execution of the protocol, and computes the probability of achieving the desired properties for a given fault and network environment. We illustrate the use of this approach via the evaluation of a simple consensus protocol operating under a realistic environment which includes performance, omission, and crash failures.
%P 165 - 174
%8 1997/10//
%G eng
%R 10.1109/RELDIS.1997.632812
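The state-enumeration idea in the 1997 abstract can be illustrated with a toy computation in Python. The sketch below is a drastic simplification: it assumes independent per-process crash probabilities and treats the protocol as behaving properly whenever fewer processes crash than there are rounds; both that correctness condition and the function name are assumptions for illustration, not the paper's model.

from math import comb

def prob_proper_behavior(n, rounds, p_crash):
    # Probability that fewer than `rounds` of the n processes crash,
    # summed over a binomial model of independent crashes.
    total = 0.0
    for crashed in range(min(rounds, n)):
        total += comb(n, crashed) * p_crash**crashed * (1 - p_crash)**(n - crashed)
    return total

# e.g. 5 processes, 3 rounds, 1% crash probability per process
print(prob_proper_behavior(5, 3, 0.01))   # ~0.99999

The paper's actual approach enumerates reachable protocol states under performance, omission, and crash failures rather than merely counting crashes, but the shape of the computation is the same: probability mass summed over the executions that satisfy the desired property.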
%0 Journal Article
%J IEEE Transactions on Software Engineering
%D 1995
%T Comparing detection methods for software requirements inspections: a replicated experiment
%A Porter, Adam
%A Votta, L. G.
%A Basili, Victor R.
%K Assembly
%K Computer science
%K Design for experiments
%K detection methods
%K Fault detection
%K fault detection rate
%K Fault diagnosis
%K formal specification
%K formal verification
%K Gain measurement
%K individual fault detection rate
%K Inspection
%K Loss measurement
%K nonsystematic techniques
%K performance evaluation
%K Performance gain
%K replicated experiment
%K scenario-based method
%K Software development management
%K software requirements inspections
%K software requirements specifications
%K team fault detection rate
%X Software requirements specifications (SRS) are often validated manually. One such process is inspection, in which several reviewers independently analyze all or part of the specification and search for faults. These faults are then collected at a meeting of the reviewers and author(s). Usually, reviewers use Ad Hoc or Checklist methods to uncover faults. These methods force all reviewers to rely on nonsystematic techniques to search for a wide variety of faults. We hypothesize that a Scenario-based method, in which each reviewer uses different, systematic techniques to search for different, specific classes of faults, will have a significantly higher success rate. We evaluated this hypothesis using a 3×2^4 partial factorial, randomized experimental design. Forty-eight graduate students in computer science participated in the experiment. They were assembled into sixteen three-person teams. Each team inspected two SRS using some combination of Ad Hoc, Checklist or Scenario methods. For each inspection we performed four measurements: (1) individual fault detection rate, (2) team fault detection rate, (3) percentage of faults first identified at the collection meeting (meeting gain rate), and (4) percentage of faults first identified by an individual, but never reported at the collection meeting (meeting loss rate). The experimental results are that (1) the Scenario method had a higher fault detection rate than either Ad Hoc or Checklist methods, (2) Scenario reviewers were more effective at detecting the faults their scenarios were designed to uncover, and were no less effective at detecting other faults than Ad Hoc or Checklist reviewers, (3) Checklist reviewers were no more effective than Ad Hoc reviewers, and (4) collection meetings produced no net improvement in the fault detection rate: meeting gains were offset by meeting losses.
%B IEEE Transactions on Software Engineering
%V 21
%P 563 - 575
%8 1995/06//
%@ 0098-5589
%G eng
%N 6
%R 10.1109/32.391380
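The four measurements in the 1995 abstract reduce to simple set algebra over fault reports, and the Python sketch below spells out that reading. The exact definitions used in the experiment may differ; this is one interpretation of the abstract, with illustrative names and data.

def inspection_measures(individual_faults, meeting_faults, total_faults):
    # individual_faults: one set of fault IDs per reviewer (found alone)
    # meeting_faults: set of fault IDs reported at the collection meeting
    # total_faults: number of faults known to be in the SRS
    found_alone = set().union(*individual_faults)
    individual_rates = [len(f) / total_faults for f in individual_faults]
    team_rate = len(found_alone | meeting_faults) / total_faults
    meeting_gain = len(meeting_faults - found_alone) / total_faults  # first found at the meeting
    meeting_loss = len(found_alone - meeting_faults) / total_faults  # found alone, never reported
    return individual_rates, team_rate, meeting_gain, meeting_loss

reviewers = [{"F1", "F2"}, {"F2", "F3"}, {"F4"}]
meeting = {"F1", "F2", "F3", "F5"}   # F4 is a meeting loss, F5 a meeting gain
print(inspection_measures(reviewers, meeting, total_faults=10))

Result (4) of the experiment says that, on average, meeting_gain and meeting_loss cancel, so the meeting leaves the team fault detection rate roughly where the individual reviews already put it.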
%0 Conference Paper
%B Real-Time Systems Symposium, 1994, Proceedings
%D 1994
%T Verifying an intelligent structural control system: a case study
%A Elseaidy, W. M.
%A Cleaveland, Rance
%A Baugh, J. W.
%K automatic verification tool
%K case study
%K Concurrency Workbench
%K Distributed computing
%K Distributed control
%K distributed processing
%K finite automata
%K finite-state processes
%K formal verification
%K graphical specification language
%K high-level design
%K intelligent control
%K intelligent structural control system verification
%K Logic
%K Modechart
%K Process algebra
%K Real time systems
%K real-time systems
%K Specification languages
%K structural engineering computing
%K temporal logic
%K temporal process algebra
%K time-varying systems
%K Timing
%K timing properties
%K visual languages
%X Describes the formal verification of the timing properties of the design of an intelligent structural control system using the Concurrency Workbench, an automatic verification tool for finite-state processes. The high-level design of the system is first given in Modechart, a graphical specification language for real-time systems, and then translated into a temporal process algebra supported by the Workbench. The facilities provided by this tool are then used to analyze the system and ultimately show it to be correct.
%B Real-Time Systems Symposium, 1994, Proceedings
%I IEEE
%P 271 - 275
%8 1994/12/07/9
%@ 0-8186-6600-5
%G eng
%R 10.1109/REAL.1994.342708
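The timing property verified in the 1994 case study can be suggested with a toy reachability check in Python: explore a finite-state design breadth-first and confirm that every trigger is followed by a response within a deadline. This is only a stand-in for what the Concurrency Workbench does over a temporal process algebra; the transition encoding and the event names below are invented for illustration.

from collections import deque

def always_responds_within(transitions, start, trigger, response, deadline):
    # transitions: state -> list of (event_label, next_state)
    # Returns False if some reachable path lets more than `deadline`
    # steps pass after `trigger` without `response`.
    queue = deque([(start, None)])   # (state, steps elapsed since pending trigger)
    seen = set()
    while queue:
        state, pending = queue.popleft()
        if (state, pending) in seen:
            continue
        seen.add((state, pending))
        for label, nxt in transitions.get(state, []):
            if label == response:
                p = None                                 # pending trigger answered
            elif pending is None:
                p = 0 if label == trigger else None      # trigger starts the clock
            else:
                p = pending + 1                          # clock keeps running
                if p > deadline:
                    return False
            queue.append((nxt, p))
    return True

# A four-state controller: sensing an excitation must lead to damping within 3 steps.
transitions = {
    "idle":    [("quake", "sense")],
    "sense":   [("tick", "compute")],
    "compute": [("tick", "actuate")],
    "actuate": [("damp", "idle")],
}
print(always_responds_within(transitions, "idle", "quake", "damp", deadline=3))  # True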