Sharing-aware horizontal partitioning for exploiting correlations during query processing

TitleSharing-aware horizontal partitioning for exploiting correlations during query processing
Publication TypeJournal Articles
Year of Publication2010
AuthorsTzoumas K, Deshpande A, Jensen CS
JournalProc. VLDB Endow.
Pagination542 - 553
Date Published2010/09//
ISBN Number2150-8097

Optimization of join queries based on average selectivities is suboptimal in highly correlated databases. In such databases, relations are naturally divided into partitions, each partition having substantially different statistical characteristics. It is very compelling to discover such data partitions during query optimization and create multiple plans for a given query, one plan being optimal for a particular combination of data partitions. This scenario calls for the sharing of state among plans, so that common intermediate results are not recomputed. We study this problem in a setting with a routing-based query execution engine based on eddies [1]. Eddies naturally encapsulate horizontal partitioning and maximal state sharing across multiple plans. We define the notion of a conditional join plan, a novel representation of the search space that enables us to address the problem in a principled way. We present a low-overhead greedy algorithm that uses statistical summaries based on graphical models. Experimental results suggest an order of magnitude faster execution time over traditional optimization for high correlations, while maintaining the same performance for low correlations.