TY  - CONF
T1  - The block distributed memory model for shared memory multiprocessors
T2  - Parallel Processing Symposium, 1994. Proceedings., Eighth International
Y1  - 1994
A1  - JaJa, Joseph F.
A1  - Ryu, Kwan Woo
KW  - accesses;shared
KW  - address
KW  - algebra;parallel
KW  - algorithms;performance
KW  - algorithms;performance;pipelined
KW  - allocation;shared
KW  - balancing;matrix
KW  - bandwidth;computation
KW  - block
KW  - Communication
KW  - communication;load
KW  - complexity;computational
KW  - complexity;cost
KW  - complexity;distributed
KW  - complexity;optimal
KW  - data
KW  - distributed
KW  - evaluation;resource
KW  - Fourier
KW  - latency;optimal
KW  - locality;communication
KW  - measure;data
KW  - memory
KW  - model;communication
KW  - model;computational
KW  - multiplication;memory
KW  - multiprocessors;single
KW  - placement;interprocessor
KW  - prefetching;remote
KW  - problems;fast
KW  - rearrangement
KW  - space;sorting;spatial
KW  - speedup;parallel
KW  - systems;fast
KW  - systems;sorting;
KW  - transforms;input
KW  - transforms;matrix
AB  - Introduces a computation model for developing and analyzing parallel algorithms on distributed memory machines. The model allows the design of algorithms using a single address space and does not assume any particular interconnection topology. We capture performance by incorporating a cost measure for interprocessor communication induced by remote memory accesses. The cost measure includes parameters reflecting memory latency, communication bandwidth, and spatial locality. Our model allows the initial placement of the input data and pipelined prefetching. We use our model to develop parallel algorithms for various data rearrangement problems, load balancing, sorting, FFT, and matrix multiplication. We show that most of these algorithms achieve optimal or near optimal communication complexity while simultaneously guaranteeing an optimal speed-up in computational complexity.
JA  - Parallel Processing Symposium, 1994. Proceedings., Eighth International
M3  - 10.1109/IPPS.1994.288220
ER  -