TY  - CONF
T1  - Using Hardware Counters to Automatically Improve Memory Performance
T2  - Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Y1  - 2004
A1  - Tikir, Mustafa M.
A1  - Hollingsworth, Jeffrey K
AB  - In this paper, we introduce a profile-driven online page migration scheme and investigate its impact on the performance of multithreaded applications. We use lightweight, inexpensive plug-in hardware counters to profile the memory access behavior of an application, and then migrate pages to memory local to the most frequently accessing processor. Using the Dyninst runtime instrumentation combined with hardware counters, we were able to add page migration capabilities to the system without having to modify the operating system kernel, or to re-compile application programs. This approach reduced the total number of non-local memory accesses of applications by up to 90%. Even on a system with small remote to local memory access latency ratios, this resulted in up to 16% improvement in execution time.
JA  - Proceedings of the 2004 ACM/IEEE conference on Supercomputing
T3  - SC '04
PB  - IEEE Computer Society
SN  - 0-7695-2153-3
UR  - http://dx.doi.org/10.1109/SC.2004.64
M3  - http://dx.doi.org/10.1109/SC.2004.64
ER  - 