-
AIX System Software:
-
AIX Parallel System Support Programs (AIX PSSP) is a collection
of administrative software applications designed to provide the unique system
management functions required to manage the Scalable POWERparallel 2 (SP2)
system as a full function parallel system. AIX PSSP provides extensions to
AIX and contains all of the software needed to install, operate, and maintain
an SP2 system from a single point, the control workstation. Its components
are integrated into a single software program product. In addition, a number
of useful publicly available software packages are distributed with AIX PSSP.
AIX PSSP components span the SP2 software framework, including:
-
System Administration: Installation of AIX/6000 and
other system software in the SP2 parallel environment, user and password
management, job accounting, print management, system startup/shutdown, and file
collection support.
-
System Monitor: Performs monitoring and control of the
SP2 and key hardware variables.
-
System Data Repository: Provides a repository for the
administrative data needed to define and manage the SP2 system.
-
Resource Manager: Interfaces with other SP2 job
management software to allocate SP2 resources (node quantity, node type, disk)
for serial and parallel jobs and monitors and controls their execution.
-
Communication Subsystem Support: Provides the functions
needed to support the SP2 High-Performance Switch including switch fault
handling and parallel application communications interfaces.
-
Virtual Shared Disk: Allows creation of logical disk
volumes accessible from multiple nodes executing tasks of a parallel
application.
-
Parallel Operating Environment (POE):
-
The POE consists of components for developing, executing,
debugging, profiling, and tuning parallel programs.
POE eases the transition from serial to parallel processing by
allowing the use of standard AIX tools and techniques. Program compilation,
scheduling, execution, and monitoring are all handled in a manner
familiar to the UNIX programmer.
-
Parallel Desktop:
-
POE has an AIX Windows/6000 Desktop interface, Parallel
Desktop, which simplifies use of POE. Source files, executables, and AIX
Parallel Environment tools are represented on the Parallel Desktop as icons,
easily manipulated to compile and execute parallel programs or start other AIX
Parallel Environment tools.
The Partition Manager of POE lets users compile their programs using
the XL FORTRAN or C Set ++ (C and C++) compilers and link in all required
run-time libraries. It also controls parallel program execution: it allocates
the nodes required to run the job, copies the executable tasks to the
allocated nodes and invokes their execution, monitors the status of task
execution, and communicates with job scheduling systems.
-
System Analysis Tools:
-
POE contains two X Window System analysis tools:
Program Marker Array and System Status Array.
-
The Program Marker Array is a run-time analysis tool allowing the monitoring
of a program's execution by
presenting a graphical representation of the program tasks. This window
representation consists of a number of small squares called lights that
change color under program control. Each parallel task has its own row of
lights. Subroutine calls inserted into these tasks change the light colors.
This visual feedback can be invaluable in detecting application problems.
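The behavior described above can be modeled in a few lines. The following is a minimal sketch only: the MarkerArray class, its set_light method, and the color names are hypothetical names invented for illustration and are not the Program Marker Array's actual interface.

```python
class MarkerArray:
    """Toy model: one row of lights per parallel task."""

    def __init__(self, tasks, lights_per_task, off="black"):
        # Every light starts in the "off" color.
        self.rows = [[off] * lights_per_task for _ in range(tasks)]

    def set_light(self, task, light, color):
        # Stands in for the subroutine call a task would make.
        self.rows[task][light] = color


array = MarkerArray(tasks=3, lights_per_task=4)

# Every task turns its first light green on entering the compute phase.
for task in range(3):
    array.set_light(task, 0, "green")

# Only task 1 has reached the communication phase so far.
array.set_light(1, 1, "yellow")

# Tasks whose second light is not yet yellow have not reached that phase.
stalled = [t for t, row in enumerate(array.rows) if row[1] != "yellow"]
```

Scanning the rows for lights that have not changed is the kind of check the visual display lets a user make at a glance.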
-
The System Status Array is an on-line analysis tool for examining the
utilization of processor nodes. Its window contains colored squares, each
representing an SP2 or RISC System/6000 cluster processor node and showing
that node's percentage of CPU utilization, so that the least busy nodes can
be put to better use.
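As an illustration of how utilization figures can guide node selection, here is a minimal sketch. The node names, the readings, and the least_busy helper are all assumptions made up for this example; the System Status Array itself presents this information graphically.

```python
# Hypothetical readings: percent CPU utilization keyed by node name.
utilization = {"node01": 92, "node02": 15, "node03": 48, "node04": 7}


def least_busy(readings, count):
    """Return the `count` node names with the lowest CPU utilization."""
    return sorted(readings, key=readings.get)[:count]


# Pick the two nodes with the most spare capacity.
candidates = least_busy(utilization, 2)
```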
To help identify problem areas in parallel programs and
to assist in tuning them, the Parallel Operating Environment includes the
familiar AIX profilers, prof and gprof, which report for each
executed task:
Execution time in each routine
Number of times the routine was called
Average milliseconds per call.
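The third figure follows from the first two: average milliseconds per call is the routine's total execution time divided by its call count. A minimal sketch of that arithmetic, using made-up routine names and timings rather than real profiler output:

```python
# Hypothetical per-routine figures for one task (values are illustrative).
profile = {
    "solve": {"seconds": 1.5, "calls": 300},
    "exchange": {"seconds": 0.5, "calls": 1000},
}


def ms_per_call(seconds, calls):
    """Average milliseconds per call = total time in ms / number of calls."""
    return seconds * 1000.0 / calls


report = {name: ms_per_call(r["seconds"], r["calls"])
          for name, r in profile.items()}
```

In this example "solve" averages 5.0 ms per call and "exchange" 0.5 ms per call, so a routine that is called less often can still be the more expensive one per invocation.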
-
Visualization Tool:
-
The AIX Parallel Environment's Visualization Tool (VT)
is designed to show graphically the performance and communications
characteristics of an application program (Trace Visualization)
and to act as an on-line monitor (Performance Monitoring). Essentially,
the VT is a group of displays, or visuals, each of which shows a distinct
execution characteristic of an application or system.
-
Debugger:
-
The AIX Parallel Environment builds on the familiar
and effective AIX dbx debugging facility and adds functions specific
to parallel program debugging. The Parallel Debugger provides a command-line
interface, pdbx, and a graphical user interface, xpdbx,
designed for users not familiar with dbx.
The Parallel Debugger lets you:
-
'Unhook' program processes not to be monitored
-
Debug object files
-
Set breakpoints at selected statements or run a program
line-by-line
-
Debug a program using symbolic values, displaying them
in the correct format.
-
Message Passing Library (MPL):
The Parallel Message Passing Library is a programming interface
for converting a serial FORTRAN, C, or C++ program into a parallel application
via subroutine calls. The MPL provides a rich and diverse set of subroutines
for coding simple operations, such as task-to-task message passing, as well as
the more advanced operations required for highly complex communications.
At execution time, the user specifies whether parallel task
communications occur via the Internet Protocol (IP) through Token Ring,
Ethernet, or SP2 High-Performance Switch adapters (configured for IP), or via
'User Space' access to the SP2 High-Performance Switch for maximum performance.
Collective communication functions of MPL are subroutines for
frequently needed operations involving multiple tasks. For example, the
'scatter' call lets you distribute messages from a single source task to each
task in a group. Conversely, with the 'gather' call, you can gather messages
from each task in a group into a single destination task. These capabilities
can greatly reduce the time and effort spent on developing parallel
applications.
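The scatter and gather pattern can be illustrated without any message passing at all. The sketch below is a plain-Python model of the semantics only: the scatter and gather functions here are hypothetical stand-ins written for this example, not the MPL subroutines, and a dictionary keyed by task number stands in for the group of tasks.

```python
def scatter(source_messages):
    """Deliver message i held by the source task to task i in the group."""
    return {task: msg for task, msg in enumerate(source_messages)}


def gather(task_messages):
    """Collect one message per task at the destination, in task order."""
    return [task_messages[task] for task in sorted(task_messages)]


# The source task holds one chunk of work for each task in a 4-task group.
chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]
per_task = scatter(chunks)

# Each task computes a partial sum over its own chunk.
partials = {task: sum(chunk) for task, chunk in per_task.items()}

# The destination task gathers the partial results and combines them.
gathered = gather(partials)
total = sum(gathered)
```

Two calls replace what would otherwise be a hand-written loop of individual sends and receives on every task, which is where the savings in development effort come from.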