NSF Workshop on High Performance Computing and Communications and Health Care

Authors

April 15, 1995

Introduction

During the summer of 1994 the National Science Foundation provided a grant to the University of Maryland Institute for Advanced Computer Studies (UMIACS) (in conjunction with the University of California Berkeley and the Center for Research in Parallel Computation at Rice) to organize a workshop on Computer Science and Health Care. The goal of the workshop was to identify fundamental research problems in computer science, especially high performance computing and communications, that are critical enabling technologies for addressing issues related to improving the quality and cost of health care in the United States.

The workshop was held in Washington, D.C. on December 8-10 and brought together over 50 scientists from academia, industry and government, representing the computer science, medical informatics and medical science research communities. To focus the workshop, four health care application areas were identified, each considered in depth by a working group consisting of computer and medical scientists. The four working groups were:

  1. Robotics, Telemedicine and Image Processing, chaired by Prof. Eric Grimson of M.I.T. and Dr. Anthony DiGioia of the University of Pittsburgh. This working group focused on fundamental problems in vision and image analysis, visualization and planning that arise in robot surgery and telemedicine. It considered how the computational demands of these applications will drive them towards high performance solutions.
  2. Languages and Tools for Medical Computing, chaired by Prof. Kathy Yelick of University of California Berkeley and Dr. Larry Kingsland of the National Library of Medicine. Different data abstractions are useful at different times; this working group considered what HPC/NIE languages and methods would make it possible for users to specify how data should be abstracted and apply these abstractions to massive distributed databases. This will be particularly important when providers need to browse libraries which combine medical literature, patient data and care plans.
  3. Database and multi-database design, chaired by Prof. Geoffrey Fox of Syracuse University and Dr. Bill Braithwaite of the University of Colorado Medical School. This group covered health services research and the use of clinical and financial information in the management of health care organizations. The group envisaged a set of longitudinal patient records recording the care and health of every patient. It addressed issues such as how to design multi-database architectures able both to manage patient records and to manipulate large quantities of aggregate patient data, the roles HPC/NIE will play in supporting these multi-databases, and new HPC/NIE methods that might be employed to explore databases in order to find cost-effective clinical strategies that lead to the best possible outcomes.
  4. Data capture and navigation, chaired by Dr. Ian Foster of Argonne National Laboratory and Dr. Mark Frisse of Washington University. The data capture and navigation group was dedicated to characterizing how medical databases and their interfaces should be structured so that the usefulness of these databases to patients and providers can be optimized. The need here is to make it possible for providers to pose clinical questions and view the results of those queries in a timely and intuitive manner. It will be necessary for providers to be able to access information in any clinical database that might contain data about a given patient.

Working group discussions were focused through the presentation of a sequence of "vignettes" by medical researchers, each representing a hypothetical situation in which an automated health care system assists in developing a solution to a health care application. These vignettes were designed to stress the current state of computing and lead subsequent discussion to the identification of those basic computer science problems which must be addressed for the hypothetical health care systems to become realizable.

The detailed recommendations of the individual working groups, along with the driving medical and health care applications, are presented in sections 2-5 of this report. Here we summarize those computer science research topics which are relevant to the health care applications addressed by some or all of the working groups. These topics were identified by workshop participants as research areas that the Foundation should emphasize in its high performance computing and communications research programs.

  1. Architectures for interoperability of information and computational resources. Making effective use of the database, networking, and computing infrastructure that will be deployed in hospitals in the next five to ten years is a very complex distributed computing problem. Applications will have to be able to locate patient records quickly in an emergency from a large distributed database system. In addition, medical researchers and health care system administrators will want to link multiple patient databases to one another and to auxiliary databases used to define such items as hospital facilities and procedures. There is a fundamental challenge to develop system architectures that will provide the flexibility, scalability and extensibility required to support the wide range of software services needed by large integrated health service providers. A key challenge is to develop extensible systems software architectures so that the health care information systems developed in the next five years will not become the incompatible, hard-to-extend legacy systems of tomorrow.
  2. Security. The extensive expansion of computing through high speed networking has focused attention on security issues throughout the computing community. There are security problems critical to the development of health care information systems whose solution would also contribute to providing security in many commercial systems. These include maintaining privacy while sharing medical information, guarding against denial of service, and guaranteeing that an individual's medical records cannot be lost or tampered with (the analogs in other domains, such as financial ones, being obvious).
  3. Reliability and robustness. Reliability and robustness are issues that arise in the design and development of any system whose performance will affect people's safety or livelihood. These issues become paramount in health care applications where patient safety is an issue, such as in robotic surgical interventions or in automated diagnostic systems. In addition to the problems posed by the design of mechanisms and architectures that promote reliability and robustness, issues related to metrics for system reliability and robustness and experimental methods for evaluating such metrics will be very important for health care applications.
  4. Real time high performance computing. Time-critical problems arise in robot surgery, telepresence applications and crisis management. The fundamental research issues that will allow us to develop real time systems that can assemble heterogeneous computational resources across networks to effectively address such problems need to be identified and addressed.
  5. Knowledge and process based systems. In addition to the traditional issues in knowledge representation that must be addressed to encode high level semantic information about domains for health care information systems, there are also challenging new problems related to the representation of medical "processes" (diseases, care plans) that need to be formalized and studied by the AI community. Methods for inference of process models from databases of case histories and validation of process models across heterogeneous databases of patient records are especially important problems.
  6. Image processing and computer vision. At the core of many advanced health care applications are the requirements to manage, communicate and interpret massive amounts of multi-modal, multi-temporal imagery. Fundamental problems that need to be addressed include registration of imagery across time and modality, segmentation of imagery into natural categories and visualization of multimodal imagery in diagnostic and surgical domains.
  7. Algorithms for Analyzing and Exploring Large Datasets. Historically, challenges posed by medical problems have motivated many advances in the fields of Statistics and Artificial Intelligence. Traditionally, researchers in both fields have had to make do with relatively small medical datasets that typically consisted of no more than a few thousand patient records. This situation will change dramatically over the next decade, by which time we anticipate that most health care organizations will have adopted computerized patient record systems. A decade from now, we can expect some 100 million patient records, and eventually many more; for example, 100 text pages of information for each of 100 million patients corresponds to a full database size of 10 terabytes. Functionalities needed in the use and analysis of distributed medical databases will include segmentation of medical data into typical models or templates (e.g. characterization of disease states), and comparison of individual patients with templates (to aid diagnosis and to establish canonical care maps). The need to explore these large datasets will drive research projects in statistics, optimization and artificial intelligence.
  8. Fast Exploration of Large Datasets. Care providers and managers will want to be able to rapidly analyze data extracted from large distributed and parallel databases that contain both text and image data. We anticipate significant performance issues arising from the demand to interactively analyze large (multi-terabyte) datasets. Users will want to minimize the time and funds wasted on searches that reveal little or no relevant information in response to a query, or that retrieve irrelevant, incorrect or corrupted datasets.
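
The storage estimate quoted in item 7 can be checked with simple arithmetic. The sketch below is illustrative only: it assumes decimal terabytes (10^12 bytes), which the text does not specify, and the variable names are ours. It derives the per-page and per-patient sizes implied by the quoted figures.

```python
# Back-of-envelope check of the storage estimate quoted in item 7.
PATIENTS = 100_000_000          # 100 million patient records (from the text)
PAGES_PER_PATIENT = 100         # 100 text pages per patient (from the text)
TOTAL_BYTES = 10 * 10**12       # 10 terabytes; decimal units are an assumption

# Total number of text pages across all patient records.
total_pages = PATIENTS * PAGES_PER_PATIENT

# Sizes implied by dividing the quoted total evenly.
bytes_per_page = TOTAL_BYTES // total_pages
bytes_per_patient = TOTAL_BYTES // PATIENTS

print(f"total pages:               {total_pages:,}")
print(f"implied bytes per page:    {bytes_per_page:,}")
print(f"implied bytes per patient: {bytes_per_patient:,}")
```

Under these assumptions the quoted figures imply about 1,000 bytes per text page, or roughly 100 kilobytes per patient. This covers text only; image data would add to the total.
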

The current NSF program in High Performance Computing and Communications is already addressing many of these problems, but some redirection of research priorities is needed in parallel computing. The traditional NSF focus on parallel computing as computational science needs to be expanded to meet the challenges of advanced health care systems and other national challenges. The languages, systems, and tools that have been used in physics and chemistry are not sufficient for the applications considered by the workshop. Some of the additional technology areas identified by the workshop as being critical to addressing these national challenges, such as heterogeneous database systems, are being addressed both by other Government research agencies and by industry, but there is a great deal of fundamental research in systems, languages, integration tools and human/computer interaction that only the National Science Foundation is in a position to support.

The workshop also identified a need for the Foundation to identify new (co-) funding mechanisms with the more mission-oriented agencies that address problems in health care and medicine, so that computer scientists and their colleagues from health care and medicine can together design and prototype the computer systems that will demonstrate the feasibility of advanced applications. While the Grand Challenge and National Challenge programs within the Foundation represent the appropriate program structures, these programs all represent alliances between Directorates within the Foundation. Similar cross-agency programs in health care need to be developed, with the Foundation taking the lead in identifying the intellectual content of these programs and working closely with agencies like the National Institutes of Health to identify appropriate funding and administrative structures.

Working group reports