BioFast: Efficient and Seamless Life Science Data Management

The amount of data on the web available to the biological enterprise has grown exponentially. This wealth of complex and diverse data presents significant opportunities and challenges for data integration and for seamless access to these sources. This report summarizes research in bioinformatics circa 2002. This workshop, sponsored by NSF and NIH, addressed data management challenges for genomics, proteomics, etc.

Our research will apply prior expertise in data integration architectures based on wrappers and mediators, query planning and optimization, and data modeling and data mining techniques to provide seamless access to heterogeneous Web-accessible sources. We will develop techniques from areas such as query optimization, adaptive query evaluation, machine learning, and schema mapping and integration of heterogeneous databases to solve problems of data integration with biological data sources. Specific research projects include the following: