The CLIP Colloquium Series presents...


An Overview of Bioinformatics Research at Humboldt

Silke Trissl, Humboldt-Universität zu Berlin
October 22, 2008, 11 a.m. AVW 2120

The Knowledge Management Group in the Computer Science Department at Humboldt-University, Berlin, is headed by Ulf Leser. The group has several research interests.

In the talk I will give a brief overview of the projects running at the Knowledge Management Group. I will go into depth on two main research interests: data integration and protein function prediction.

Data integration

Columba is a database that integrates data about protein structures. When a user poses a query, she might expect the search results to be ranked.
In Columba we use a ranking that is based on the following observation: different sources contain information about the same types of biological facts. For example, the data sources KEGG, aMAZE, and Reactome all contain information about metabolic pathways. These three data sources overlap to a certain degree, but also contain diverse data. Querying all three returns results that are supported by one, two, or all three of them. We call data sources with similar content "dimensions". In a setting with many dimensions, the ranking of search results is therefore important. We developed two scores to rank search results, the confidence score and the surprisingness score (work presented at DILS 2007).
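The observation that a result can be supported by one, two, or all three sources can be illustrated with a small sketch. It only counts supporting sources; it is not the confidence or surprisingness score, whose definitions are not given here. The function name rank_by_support and the pathway names in the example are assumptions made up for illustration.

```python
from collections import defaultdict


def rank_by_support(results_by_source):
    """Rank results by how many sources report them, most-supported first."""
    support = defaultdict(set)
    for source, results in results_by_source.items():
        for result in results:
            support[result].add(source)
    # Sort by descending number of supporting sources, then by result name
    # so that ties come out in a predictable order.
    return sorted(
        ((result, sorted(sources)) for result, sources in support.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


# Hypothetical query results per source; the contents are invented.
example = {
    "KEGG": {"glycolysis", "citrate cycle", "urea cycle"},
    "aMAZE": {"glycolysis", "citrate cycle"},
    "Reactome": {"glycolysis", "purine metabolism"},
}

ranking = rank_by_support(example)
for result, sources in ranking:
    print(f"{result}: supported by {len(sources)} source(s) {sources}")
```

A plain support count treats all sources as equally reliable and independent, which is a simplification chosen here only to keep the example short.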

Protein function prediction

The second point I will focus on is protein function prediction using biological networks and text mining methods. Protein-protein interaction data are known for human, mouse, yeast, fruit fly, and Arabidopsis. Finding common subgraphs in those networks allows the functions of a protein in one network to be transferred to the corresponding protein in the other network. Samira Jaeger, a PhD student in our group, could show that using the network information as well gives better prediction accuracy than using sequence information alone.
To enhance the precision of the results, a procedure was furthermore implemented that validates all predictions against findings reported in the literature (work presented at DILS 2008).
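The idea of transferring functions between corresponding proteins can be sketched in its simplest form, where the common subgraph is a single conserved interaction. This is an illustration under stated assumptions, not the method used in the group's work: the function transfer_functions, the protein identifiers, the given mapping between corresponding proteins, and the annotations are all hypothetical.

```python
def transfer_functions(edges_a, edges_b, a_to_b, functions_a):
    """Propose functions for network-B proteins from conserved interactions.

    edges_a, edges_b: iterables of (protein, protein) interaction pairs
    a_to_b: assumed mapping from a protein in A to its counterpart in B
    functions_a: known functions of proteins in A, as sets of labels
    """
    # Store B's interactions as unordered pairs so direction does not matter.
    undirected_b = {frozenset(edge) for edge in edges_b}
    predictions = {}
    for u, v in edges_a:
        if u not in a_to_b or v not in a_to_b:
            continue  # no counterpart known for one of the partners
        mapped_u, mapped_v = a_to_b[u], a_to_b[v]
        if frozenset((mapped_u, mapped_v)) not in undirected_b:
            continue  # the interaction is not conserved in network B
        for source, target in ((u, mapped_u), (v, mapped_v)):
            if source in functions_a:
                predictions.setdefault(target, set()).update(functions_a[source])
    return predictions


# Invented toy networks: A1-A2 is conserved as B1-B2, A2-A3 is not.
edges_a = [("A1", "A2"), ("A2", "A3")]
edges_b = [("B2", "B1")]
a_to_b = {"A1": "B1", "A2": "B2", "A3": "B3"}
functions_a = {"A1": {"kinase activity"}, "A3": {"DNA binding"}}

predictions = transfer_functions(edges_a, edges_b, a_to_b, functions_a)
print(predictions)
```

Only proteins that take part in a conserved interaction receive a prediction, so the annotation of A3 is not transferred in this example.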

About the Speaker

Silke Trissl is a PhD student at Humboldt-University Berlin, Germany. Her supervisor is Ulf Leser, who holds the chair for Knowledge Management in Bioinformatics in the Computer Science Department. From 1996 to 2001 Silke studied Biotechnology at the University of Applied Sciences in Weihenstephan, Germany, graduating with a diploma. In 2002 she received an MSc in Bioinformatics from the University of Ulster, UK. She joined the group of Professor Leser in 2003 and has worked on two projects, Columba and graph querying.


This talk is part of the CLIP Colloquium Series. For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.