UMD

The University of Maryland and IBM Partnership
Previous Projects

IBM

Project 1: Netfinity Cluster Project

UMD PI: Jeff Hollingsworth, with participating faculty: Keleher, Saltz, Sussman, and Tseng;
IBM Collaborators: Marc Snir, Judy Engles (IBM Power Parallel)

Project 2: High Performance Tools for Data-Intensive Computations with Applications to Earth System Science

PI: Joseph JaJa;
IBM Collaborators: Chung-Sheng Li, Larry Bergman, John Smith

Project 3: Speech-Based Information Retrieval for Digital Libraries

PI: Doug Oard;
IBM Collaborators: Selim Roukos

Project 4: The University of Maryland Electronic Commerce Initiative



Project 1: Netfinity Cluster Project

UMD PI: Jeff Hollingsworth, with participating faculty: Keleher, Saltz, Sussman, and Tseng;
IBM Collaborators: Marc Snir, Judy Engles (IBM Power Parallel)

UMIACS and the Department of Computer Science are in the process of setting up a cluster of multiple types of commodity processors, disks, and networking technologies, managed as an integrated shared resource. The cluster will support research in systems software for information intensive and wide-area computing research, including applications in earth system science, digital pathology, computer vision, and computational neuroscience. The planned cluster will consist of three distinct types of nodes: server, compute, and specialized. The server nodes will each be equipped with multiple processors and multiple fast-wide SCSI disks, used either for performing computationally intensive computations or as I/O server nodes for simulated wide-area tests. The compute nodes will each consist of single processor nodes, a high speed interconnect, and a modest amount of disk space. The specialized nodes will be configured with inexpensive processors, and multiple relatively high capacity, low performance (but inexpensive) disks).

The software infrastructure will be based on Linux. For distributed memory message passing programming, we will use MPI (MPICH or LAM). In addition, we will likely run HPSS to manage migration of files from server nodes to the specialized nodes. We will be developing resource management policies based on Active Harmony, a software architecture that manages distributed execution of computational objects in dynamic environments. In addition, we will use this cluster to support our effort to port IBM's DPCL tools Linux clusters.

We propose to acquire Netfinity SMPs as the high-end server nodes for this cluster. This configuration will provide a significant compute platform, yet retain the cost-effectiveness of small SMP nodes. We will connect these nodes to each other and via the rest of the cluster via gigabit Ethernet.

More details about the research projects to be supported by this cluster are given next.

Active Harmony (Hollingsworth and Keleher)

The Active Harmony project is seeking to expand the interface between applications and system software. We have developed an API to permit applications to be written to be resource-aware. In particular, our model is based on having export options to the runtime system which then selects particular values for the application based on the available resources and other workload on the nodes. In addition, the applications indicate the expected resource utilization for each option and resulting performance level (e.g. completion time or speedup). For example, a parallel application might expose an option that would select the number of nodes the program would use. The expected resource information might indicate the application can effectively use 16 nodes. but that performance improves marginally after that. The harmony system could use this information to run the job on more than 16 nodes if nodes were available, but only on 16 nodes if other jobs could more effectively use those nodes.

To date, we have converted several simple applications to use the harmony interface. In addition, we have "harmonized" a client-server database system to automatically select between running queries on the client or the server node based on sever load and predicted resource utilization for the query. We are currently working with the Computer Vision group of the University of Maryland to develop a "harmonized" version of their real-time computer vision and object tracking system.

Additional information about the harmony project is available from Active Harmony

Active Data Repository (Sussman and Saltz)

We have been developing an infrastructure, called the Active Data Repository (ADR), that integrates storage, retrieval and processing of large multi-dimensional datasets on distributed memory parallel architectures with multiple disks attached to each node. ADR targets applications with a particular form of processing. The processing steps consist of retrieving input and output data items that fall in the desired portion of the multi-dimensional space, mapping the coordinates of the retrieved input items to the corresponding output items, and aggregating, in some way, all the retrieved input items mapped to the same output data items. Correctness of the output usually does not depend on the order input data items are aggregated.

ADR is designed as a set of modular services implemented in C++. Through use of these services, ADR allows customization for application-specific processing, while providing support for common operations such as memory management, data retrieval, and scheduling of processing across a parallel machine. The system architecture of ADR consists of a front-end and a parallel back-end. The front-end interacts with clients, and forwards queries with references to user-defined processing functions to the parallel back-end. During query execution, back-end nodes retrieve input data and perform user-defined operations over the data items retrieved to generate the output products. Output products can be returned from the back-end nodes to the requesting client, or stored in ADR.

We have implemented several applications with very large storage and computational requirements with ADR. The Titan satellite data processing application stores four kilometer resolution global coverage data, and ten years of that data consists of over 1.4TB. For the Virtual Microscope ADR application, one focal plane of a single slide requires over 7GB (uncompressed) at high power, and a hospital such as Johns Hopkins produces hundreds of thousands of slides per year. Similarly, the computation for one ten day composite Titan query for the entire world takes about 100 seconds per processor on the Maryland sixteen node IBM SP2.

More information is available at Active Data Repository Project.

Multi-level Memory Hierarchies (Tseng)

Modern microprocessors provide high performance by exploiting data locality with carefully designed multilevel caches. However, advanced scientific computations have features which make it difficult to utilize caches effectively. To exploit locality for these complex applications will require sophisticated compile-time analyses to guide both compiler and run-time data and computation transformations. We propose to develop and evaluate software support for improving locality for advanced scientific applications. We will investigate techniques needed on both uniprocessors and shared-memory multiprocessors.

We are targeting three areas. First, iterative solvers for 3D partial differential equations have poor locality because accesses to nearby elements in higher-level dimensions are spread far apart in memory. Careful tiling and padding can frequently recapture such reuse. Second, computations on adaptive meshes and sparse matrices experience many cache misses because they access data in an irregular manner. Data layout and access order can be rearranged according to mesh connections or geometric location to improve locality, with cost models used to guide frequency of transformations for adaptive computations. Third, applications using pointer-based data structures (e.g., linked lists, trees) to organize data also access data irregularly in memory. Compiler and run-time analysis can improve locality by using simple analyses to change memory layout and allocation. Preliminary results show significant performance improvements are possible in all three cases, indicating the importance of providing software support for locality in advanced scientific computations.

More information is available at COSMIC

Software DSM (Keleher)

The CVM group study scalable software distributed shared memory (SDSM) protocols. SDSM is a technique that uses a software abstraction to emulate a shared address space across workstation clusters connected by general-purpose networks. Research in SDSM is enjoying a resurgence because of the new-found popularity of hardware shared-memory (SMP) machines. With more and more desktop platforms becoming multiprocessors, shared-memory becomes the dominant programming model.

Although SDSM was originally targeted at uniprocessors, it can also be used to combine multiprocessor platforms. These SMP-based systems are attractive both because of the ubiquity of small-scale SMP's, but also because they provide support for fine-grain sharing. Conventional wisdom maintains that SDSM's are incapable of supporting fine-grain applications due to the lack of efficient mechanisms for communication. However, unlike SDSM's that use uniprocessor nodes, SMP-based systems provide hardware cache-coherence between processors within each multiprocessor. The result is that an SMP-based SDSM can efficiently support fine-grain sharing for a large variety of application sharing patterns.


The proposed cluster of SMP's will allow us to develop scalable protocols, and to test them in a variety of configurations.
Coherent Virtual Machine (CVM)
Systems Software for Linux Clusters of PCs

To top of page


Project 2: High Performance Tools for Data-Intensive Computations with Applications to Earth System Science

PI: Joseph JaJa;
IBM Collaborators: Chung-Sheng Li, Larry Bergman, John Smith

The University of Maryland is one of the NASA-supported 12 Earth Science Information Partnerships (ESIPs) Type 2 which are developing innovative information technologies and data products based on satellite data for earth system scientists. In September 1998, the University of Maryland inaugurated the Global Land Cover Facility (GLCF) whose computing infrastructure is built on top of IBM SP/HPSS technologies. We have developed browsing, searching, visualization, and processing tools focusing on land cover data and related products. Since then, many scientists have accessed and ordered data sets from the GLCF.

We are planning to collaborate with the IBM research group led by Chung-Sheng Li (who is leading an ESIP effort on progressive mining of remotely sensed data for environmental and public health applications) in exploring the use of several IBM technologies for the GLCF and their potential applications to the Federation of ESIPs. In particular, we hope to test their newly developed tool SFGraph for the progressive transmission of images to help with the multiresolution/progressive browsing of the GLCF data products by remote users. The SFGraph integrates wavelets and spatial quad-tree to allow for progressive viewing of large images. We will also provide some of our data products to the IBM group so that they can test their content-based retrieval tools for classification.

In addition, we are collaborating with the HPSS team at San Diego Supercomputer Center in developing performance models for earth system applications on HPSS. All the GLCF data is currently supported by HPSS, but given the complexity of the software and the many possible configurations, we are exploring different models for optimizing HPSS for earth system science applications.

To top of page
 
Progress Reports Previous Projects Current Projects