Dynamic node management and measure estimation in a state-driven fault injector

Title: Dynamic node management and measure estimation in a state-driven fault injector
Publication Type: Conference Papers
Year of Publication: 2000
Authors: Chandra R, Cukier M, Lefever R, Sanders W, others
Date Published: 2000
Abstract

Validation of distributed systems using fault injection is difficult because of their inherent complexity, lack of a
global clock, and lack of an easily accessible notion of a
global state. To address these challenges, the Loki fault
injector injects faults based on a partial view of the global
state of a distributed system, and performs a post-runtime
analysis using an off-line clock synchronization algorithm
to determine whether the faults were properly injected.
In this paper, we first describe an enhanced runtime
architecture for the Loki fault injector and then present a
new method for obtaining measures in Loki. The enhanced
runtime allows dynamic entry and exit of nodes in the
system. It also offers more efficient multicast of notification
messages and more efficient communication between state
machines on the same host, and is more scalable than
the previous runtime. We then detail a new and flexible
method for obtaining a wide range of performance and
dependability measures in Loki.

URL: https://www.perform.csl.illinois.edu/Papers/USAN_papers/00CHA01.pdf