TY - CONF
T1 - ApproxRank: Estimating Rank for a Subgraph
T2 - IEEE 25th International Conference on Data Engineering, 2009. ICDE '09
Y1 - 2009
A1 - Yao Wu
A1 - Raschid, Louiqa
KW - Application software
KW - ApproxRank
KW - Computer applications
KW - Crawlers
KW - customized semantic query answering
KW - Data engineering
KW - Educational institutions
KW - Explosions
KW - focused crawlers
KW - global Web graph
KW - graph theory
KW - IdealRank algorithm
KW - Internet
KW - localized search engines
KW - PageRank-style
KW - personalized search
KW - Query processing
KW - Runtime
KW - Search engines
KW - Stochastic processes
KW - Web pages
AB - Customized semantic query answering, personalized search, focused crawlers and localized search engines frequently focus on ranking the pages contained within a subgraph of the global Web graph. The challenge for these applications is to compute PageRank-style scores efficiently on the subgraph, i.e., the ranking must reflect the global link structure of the Web graph but it must do so without paying the high overhead associated with a global computation. We propose a framework of an exact solution and an approximate solution for computing ranking on a subgraph. The IdealRank algorithm is an exact solution with the assumption that the scores of external pages are known. We prove that the IdealRank scores for pages in the subgraph converge. Since the PageRank-style scores of external pages may not typically be available, we propose the ApproxRank algorithm to estimate scores for the subgraph. Both IdealRank and ApproxRank represent the set of external pages with an external node L1 and extend the subgraph with links to L1. They also modify the PageRank-style transition matrix with respect to L1. We analyze the L1 distance between IdealRank scores and ApproxRank scores of the subgraph and show that it is within a constant factor of the L1 distance of the external pages (e.g., the true PageRank scores and uniform scores assumed by ApproxRank). We compare ApproxRank and a stochastic complementation approach (SC), a current best solution for this problem, on different types of subgraphs. ApproxRank has similar or superior performance to SC and typically improves on the runtime performance of SC by an order of magnitude or better. We demonstrate that ApproxRank provides a good approximation to PageRank for a variety of subgraphs.
JA - IEEE 25th International Conference on Data Engineering, 2009. ICDE '09
PB - IEEE
SN - 978-1-4244-3422-0
M3 - 10.1109/ICDE.2009.108
ER -