Collaboration, Visualization, and Information Management Technology

Projects

+ Dynamic Query Optimization Project
+ WebSemantics Project
+ Wrapper Generation Project
+ Information Mediation


Innovative Claims of the Proposed Research

Current I3 technology has produced several successful prototypes and systems. As I3 systems begin to be deployed in wide-area network environments, they will encounter significant challenges in dealing with a large number of unpredictable and unreliable repositories (sources). So that the deployment of these systems is not compromised, I3 technology must overcome the following three challenges: (1) the availability of particular (heterogeneous) repositories cannot be determined a priori; (2) the response time to access remote sources can fluctuate dramatically during query processing; and (3) currently, there is little support for identifying and locating a relevant repository. We propose innovative solutions for these three challenges.

The first challenge, dynamically (un)available sources, is that a query must be accepted without exact knowledge of which sources are available. Heterogeneity implies that these sources vary widely in their processing capability and cost, which complicates this decision. With hundreds of sources, we can also expect replication of contents. Replication should be exploited to provide least-cost answers in a dynamic environment of (un)available sources. This challenge is Dynamic Query Optimization. The innovation of our research is the capability to represent "wrapped" sources (with widely varying processing capability, costs, and restrictions) in a source-independent manner. Replication will be exploited to identify similar or identical sources. An innovative cost model, tailored to wide-area networked servers and based on parameterized learning from query feedback, will be developed. We will generate least-cost, non-redundant plans for the available sources.
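As a rough illustration of the idea above, the following minimal sketch picks, for each required relation, the cheapest replica that is currently available, and adjusts cost estimates from observed query feedback. The `Source` class, the function names, the example site names, and the exponential-averaging update are all assumptions made for illustration; they are not the project's actual cost model or planner.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical description of a wrapped source (illustrative only)."""
    name: str
    relations: frozenset      # relations this source can answer
    est_cost: float           # current estimated access cost
    available: bool = True


def update_cost(source, observed_cost, alpha=0.5):
    """Blend an observed cost into the estimate (assumed feedback rule)."""
    source.est_cost = (1 - alpha) * source.est_cost + alpha * observed_cost


def least_cost_plan(required, sources):
    """Map each required relation to exactly one cheapest available source.

    Each relation is fetched from a single replica, so the plan is
    non-redundant with respect to replicated contents.
    """
    plan = {}
    for rel in required:
        candidates = [s for s in sources if s.available and rel in s.relations]
        if not candidates:
            raise LookupError(f"no available source for {rel!r}")
        plan[rel] = min(candidates, key=lambda s: s.est_cost).name
    return plan


if __name__ == "__main__":
    sources = [
        Source("siteA", frozenset({"flights"}), est_cost=4.0),
        Source("siteB", frozenset({"flights", "hotels"}), est_cost=2.0),
        Source("siteC", frozenset({"hotels"}), est_cost=1.0),
    ]
    print(least_cost_plan(["flights", "hotels"], sources))
    sources[1].available = False   # siteB becomes unavailable
    print(least_cost_plan(["flights", "hotels"], sources))
```

When a replica becomes unavailable, re-running the planner falls back to the next-cheapest source for the affected relation. A realistic optimizer would also account for source capabilities and join costs, which this sketch ignores.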

The second challenge is that data access over wide-area networks involves a large number of sources and communication links, which are vulnerable to congestion and failure. Congestion and failure manifest as highly variable response time; that is, the time required to obtain data from sources can vary greatly, depending on the specific data sources and the state of the network. The challenge of addressing response-time variability is Dynamic Query Evaluation; we note that current I3 technology has largely dealt with static evaluation and will be brittle in dynamic environments. The innovation of this research is a class of dynamic, runtime query plan modification techniques, which we call query plan scrambling. A query is initially executed using a plan obtained from an optimizer. When the variance in response times is low to moderate, query plan scrambling modifies the execution on the fly. Scrambling allows useful work to be done while waiting for sources, and problematic data may be obtained in the background. When delays vary so widely that the evaluation is eventually suspended, partial results can be returned to users and/or used for query processing at a later time.
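The behavior described above can be sketched as a toy scheduler: instead of blocking on the next operator in the optimizer's order, it runs any operator whose source data has already arrived, and it gives up on sources delayed past a deadline, returning what has been computed so far. This is a deliberately simplified simulation under assumed names (`scramble`, `ready_at`, `deadline`); actual scrambling techniques also restructure the operator tree, which is not shown here.

```python
def scramble(plan, ready_at, work=1.0, deadline=float("inf")):
    """Simulate on-the-fly reordering of a query plan.

    plan     -- operator names in the order chosen by the optimizer
    ready_at -- maps each operator to the time its source data arrives
    work     -- simulated time taken to run one operator
    deadline -- latest arrival time we are willing to wait for

    Returns (executed, suspended): operators run, in execution order,
    and operators left pending because their data was delayed too long.
    """
    clock = 0.0
    pending = list(plan)
    executed = []
    while pending:
        runnable = [op for op in pending if ready_at[op] <= clock]
        if runnable:
            op = runnable[0]          # keep optimizer order among runnable ops
            pending.remove(op)
            executed.append(op)
            clock += work
        else:
            next_ready = min(ready_at[op] for op in pending)
            if next_ready > deadline:
                break                 # suspend; caller gets partial results
            clock = next_ready        # idle until the next source responds
    return executed, pending


if __name__ == "__main__":
    plan = ["scan_A", "scan_B", "scan_C"]
    arrivals = {"scan_A": 5.0, "scan_B": 0.0, "scan_C": 0.0}
    print(scramble(plan, arrivals))                # scan_A is deferred, then run
    print(scramble(plan, arrivals, deadline=3.0))  # scan_A is left suspended
```

In the first call the delayed scan is simply moved to the end, so work on the other sources proceeds while waiting. In the second call the delay exceeds the deadline, so the evaluation is suspended and the operators already executed stand in for the partial results.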

The third challenge is locating and accessing relevant repositories on the WWW, which may be characterized as largely unorganized and chaotic. Increasing the number of data sources into the hundreds with current I3 technology, as would be needed in the WWW environment, aggravates two problems. First, current I3 technology assumes that the specific locations of wrappers/translators and data sources are static; there is no technology to dynamically locate any of these components. Second, users have no tools to assemble answers on the fly from these dynamically located sources. This task is Dynamic Location and Assembly of Web-based Data. We utilize the WEBSEMANTICS protocol and architecture to overcome the challenge of locating and assembling information dynamically.

Deliverables Associated with the Proposed Research

Three prototypes will be developed for Dynamic Query Optimization (DynOpt).

The work on Dynamic Query Evaluation (DynEval) will proceed in three phases.

The work on Dynamic Location and Assembly of Web-based Data (DynWeb) is based on the WEBSEMANTICS protocol. We will produce three prototypes.


