Automatic software interference detection in parallel applications

Title: Automatic software interference detection in parallel applications
Publication Type: Conference Papers
Year of Publication: 2007
Authors: Tabatabaee V, Hollingsworth J
Conference Name: Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Date Published: 2007
Conference Location: Reno, Nevada
ISBN Number: 978-1-59593-764-3

We present an automated software interference detection methodology for Single Program, Multiple Data (SPMD) parallel applications. Interference can come from the system itself or from unexpected processes. If it is not detected and corrected, such interference may degrade performance. Our goal is to provide a reliable metric for software interference that can be used in soft-failure protection and recovery systems. A unique feature of our algorithm is that it measures the relative timing of application events (i.e., the time between MPI calls) rather than system-level events such as CPU utilization. This approach lets our system automatically accommodate natural variations in an application's utilization of resources. We use performance irregularities and degradation as signs of software interference. However, instead of relying on temporal changes in performance, our system detects spatial performance degradation across multiple processors. We also include a case study that demonstrates our technique's effectiveness, resilience, and robustness.
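
The idea of spatial comparison can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that in an SPMD program every rank records the elapsed time between the same pair of application events, and it flags ranks whose interval is much larger than the median across ranks. The function name `flag_interfered_ranks`, the threshold of 1.5, and the sample timings are all hypothetical and chosen only for illustration.

```python
from statistics import median


def flag_interfered_ranks(intervals_by_rank, threshold=1.5):
    """Return the ranks whose interval exceeds `threshold` times the cross-rank median.

    intervals_by_rank maps a rank id to the time (in seconds) that rank spent
    between the same two application events. The median-ratio rule is an
    assumption made for this sketch, not the metric from the paper.
    """
    baseline = median(intervals_by_rank.values())
    if baseline <= 0:
        # No meaningful baseline to compare against.
        return []
    return sorted(
        rank
        for rank, elapsed in intervals_by_rank.items()
        if elapsed / baseline > threshold
    )


# Hypothetical timings for one interval on four ranks; rank 2 is slowed down.
intervals = {0: 0.101, 1: 0.098, 2: 0.240, 3: 0.102}
print(flag_interfered_ranks(intervals))
```

Because the comparison is made across ranks at the same point in the program rather than against each rank's own history, a phase in which every rank legitimately slows down raises the median as well and is not flagged.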