An integrated runtime and compile-time approach for parallelizing structured and block structured applications

Title: An integrated runtime and compile-time approach for parallelizing structured and block structured applications
Publication Type: Journal Articles
Year of Publication: 1995
Authors: Agrawal G, Sussman A, Saltz J
Journal: IEEE Transactions on Parallel and Distributed Systems
Pagination: 747 - 754
Date Published: 1995/07
ISBN Number: 1045-9219
Keywords: Bandwidth, block structured applications, block structured codes, compile-time approach, compiling applications, data access patterns, Data analysis, Delay, distributed memory machines, distributed memory systems, FORTRAN, Fortran 90D/HPF compiler, High performance computing, HPF-like parallel programming languages, integrated runtime approach, irregularly coupled regular mesh problems, multigrid code, Navier-Stokes solver template, Parallel machines, parallel programming, Pattern analysis, performance evaluation, program compilers, Program processors, Runtime library, Uninterruptible power systems

In compiling applications for distributed memory machines, runtime analysis is required when the data to be communicated cannot be determined at compile time. One class of applications requiring such runtime analysis is block structured codes. These codes employ multiple structured meshes, which may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). In this paper, we present runtime and compile-time analysis for compiling such applications on distributed memory parallel machines in an efficient and machine-independent fashion. We have designed and implemented a runtime library which supports the runtime analysis required. The library is currently implemented on several different systems. We have also developed compiler analysis for determining data access patterns at compile time and inserting calls to the appropriate runtime routines. Our methods can be used by compilers for HPF-like parallel programming languages in compiling codes in which data distribution, loop bounds and/or strides are unknown at compile time. To demonstrate the efficacy of our approach, we have implemented our compiler analysis in the Fortran 90D/HPF compiler developed at Syracuse University. We have experimented with a multiblock Navier-Stokes solver template and a multigrid code. Our experimental results show that our primitives have low runtime communication overheads and that the compiler-parallelized codes perform within 20% of the codes parallelized by manually inserting calls to the runtime library.