On maximum coverage in the streaming model & application to multi-topic blog-watch

TitleOn maximum coverage in the streaming model & application to multi-topic blog-watch
Publication TypeJournal Articles
Year of Publication2009
AuthorsSaha B, Getoor L
Journal2009 SIAM International Conference on Data Mining (SDM09)
Date Published2009///
Abstract

We generalize the graph streaming model to hypergraphs.In this streaming model, hyperedges are arriving online
and any computation has to be done on-the-fly using a
small amount of space. Each hyperedge can be viewed
as a set of elements (nodes), so we refer to our proposed
model as the “set-streaming” model of computation. We
consider the problem of “maximum coverage”, in which
k sets have to be selected that maximize the total weight
of the covered elements. In the set-streaming model of
computation, we show that our algorithm for maximum-
coverage achieves an approximation factor of 1
. When
multiple passes are allowed, we also provide a Θ(log n)
approximation algorithm for the set-cover. We next consider
a multi-topic blog-watch application, an extension of blog-
alert like applications for handling simultaneous multiple-
topic requests. We show how the problems of maximum-
coverage and set-cover in the set-streaming model can be
utilized to give efficient online solutions to this problem. We
verify the effectiveness of our methods both on synthetic and
real weblog data.