TY  - CONF
T1  - Automatic software interference detection in parallel applications
T2  - Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Y1  - 2007
A1  - Tabatabaee, Vahid
A1  - Hollingsworth, Jeffrey K
AB  - We present an automated software interference detection methodology for Single Program, Multiple Data (SPMD) parallel applications. Interference comes from the system and unexpected processes. If not detected and corrected such interference may result in performance degradation. Our goal is to provide a reliable metric for software interference that can be used in soft-failure protection and recovery systems. A unique feature of our algorithm is that we measure the relative timing of application events (i.e. time between MPI calls) rather than system level events such as CPU utilization. This approach lets our system automatically accommodate natural variations in an application's utilization of resources. We use performance irregularities and degradation as signs of software interference. However, instead of relying on temporal changes in performance, our system detects spatial performance degradation across multiple processors. We also include a case study that demonstrates our technique's effectiveness, resilience and robustness.
JA  - Proceedings of the 2007 ACM/IEEE conference on Supercomputing
T3  - SC '07
PB  - ACM
CY  - Reno, Nevada
SN  - 978-1-59593-764-3
M3  - 10.1145/1362622.1362642
ER  - 