TY - JOUR
T1 - Network Clustering Approximation Algorithm Using One Pass Black Box Sampling
JF - arXiv:1110.3563
Y1 - 2011
A1 - DuBois, Thomas
A1 - Golbeck, Jennifer
A1 - Srinivasan, Aravind
KW - Computer Science - Social and Information Networks
KW - Physics - Physics and Society
AB - Finding a good clustering of vertices in a network, where vertices in the same cluster are more tightly connected than those in different clusters, is a useful, important, and well-studied task. Many clustering algorithms scale well; however, they are not designed to operate upon internet-scale networks with billions of nodes or more. We study one of the fastest and most memory-efficient algorithms possible: clustering based on the connected components in a random edge-induced subgraph. When defining the cost of a clustering to be its distance from such a random clustering, we show that this surprisingly simple algorithm gives a solution that is within an expected factor of two or three of optimal with either of two natural distance functions. In fact, this approximation guarantee works for any problem where there is a probability distribution on clusterings. We then examine the behavior of this algorithm in the context of social network trust inference.
UR - http://arxiv.org/abs/1110.3563
ER -