Representing Tuple and Attribute Uncertainty in Probabilistic Databases

Title: Representing Tuple and Attribute Uncertainty in Probabilistic Databases
Publication Type: Conference Papers
Year of Publication: 2007
Authors: Sen P, Deshpande A, Getoor L
Conference Name: Seventh IEEE International Conference on Data Mining Workshops, 2007. ICDM Workshops 2007
Date Published: 2007/10/28-31
ISBN Number: 978-0-7695-3019-2
Keywords: attribute uncertainty, Computer science, Conferences, correlation structures, data mining, Data models, database management systems, Educational institutions, inference mechanisms, noisy data sources, probabilistic database, probabilistic inference, Probability distribution, Query processing, Relational databases, Sensor phenomena and characterization, tuple representation, Uncertainty, uncertainty handling

There has been a recent surge of work on probabilistic databases, propelled in large part by the huge increase in noisy data sources: sensor data, experimental data, data from uncurated sources, and many others. There is a growing need to represent the uncertainties in the data flexibly and to query the data efficiently. Building on existing probabilistic database work, we present a unifying framework that allows a flexible representation of correlated tuple-level and attribute-level uncertainties. An important capability of our representation is the ability to represent shared correlation structures in the data. We provide motivating examples to illustrate when such shared correlation structures are likely to exist. Representing shared correlation structures allows the use of sophisticated inference techniques based on lifted probabilistic inference, which in turn allow us to achieve significant speedups when computing probabilities for the results of user-submitted queries.
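
As a rough illustration of the ideas in the abstract, the following Python sketch models a tuple whose existence and one attribute value are jointly uncertain, and lets several tuples share the same probability table. It is not the authors' representation or their lifted inference algorithm. The names `SHARED_FACTOR`, `prob_tuple_matches`, `prob_any_match_naive`, and `prob_any_match_shared`, the car-colour scenario, and all probabilities are hypothetical. The sketch also assumes the tuples are independent of one another, which is a simplification.

```python
from itertools import product

# Hypothetical shared factor: joint distribution over (tuple exists, colour value).
# Every tuple below reuses this one table, a toy stand-in for a shared
# correlation structure between tuple-level and attribute-level uncertainty.
SHARED_FACTOR = {
    (True, "red"): 0.5,
    (True, "blue"): 0.3,
    (False, None): 0.2,
}


def prob_tuple_matches(factor, predicate):
    """Probability that one tuple exists and its attribute satisfies predicate."""
    return sum(p for (exists, value), p in factor.items()
               if exists and predicate(value))


def prob_any_match_naive(factors, predicate):
    """Sum the weights of all possible worlds in which some tuple matches.

    Assumes the tuples are mutually independent, so a world's weight is the
    product of the per-tuple entries. Cost grows exponentially in len(factors).
    """
    total = 0.0
    for world in product(*(f.items() for f in factors)):
        weight = 1.0
        hit = False
        for (exists, value), p in world:
            weight *= p
            hit = hit or (exists and predicate(value))
        if hit:
            total += weight
    return total


def prob_any_match_shared(factor, n, predicate):
    """Same query, but the per-tuple marginal is computed once and reused
    for all n tuples because they share a single factor."""
    q = prob_tuple_matches(factor, predicate)
    return 1.0 - (1.0 - q) ** n


def is_red(value):
    return value == "red"


naive = prob_any_match_naive([SHARED_FACTOR] * 3, is_red)
shared = prob_any_match_shared(SHARED_FACTOR, 3, is_red)
print(naive, shared)
```

Both functions answer the query "is there at least one red car among three uncertain tuples?". The naive version enumerates all 27 possible worlds, while the shared version does one marginal computation and reuses it, which is the kind of saving that recognizing shared structure makes possible in this simplified setting.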