A typical operation in our image processing algorithms requires that each pixel be updated based on the pixel values of a small neighborhood. Such an operation amounts to a regular grid shift of a element parallel variable data array.
Two phases occur in the regular grid shift:
For processors, takes O =
Next we make some observations about the communication phase:
This communication pattern is a block permutation, and the time complexity for this operation is given in (3). Note that each node must send a constant number of upper rows of its subgrid to its north adjacent node in the major grid. There are elements along this border. Thus, , and the communications time is as follows:
Therefore, a regular grid shift of a data array on p nodes connected by a CM-5 fat-tree has the following time and communication complexity: