TY  - CHAP
T1  - Distributed Ranked Search
T2  - High Performance Computing – HiPC 2007
Y1  - 2007
A1  - Gopalakrishnan, Vijay
A1  - Morselli, Ruggero
A1  - Bhattacharjee, Bobby
A1  - Keleher, Pete
A1  - Srinivasan, Aravind
ED  - Aluru, Srinivas
ED  - Parashar, Manish
ED  - Badrinath, Ramamurthy
ED  - Prasanna, Viktor
AB  - P2P deployments are a natural infrastructure for building distributed search networks. Proposed systems support locating and retrieving all results, but lack the information necessary to rank them. Users, however, are primarily interested in the most relevant results, not necessarily all possible results. Using random sampling, we extend a class of well-known information retrieval ranking algorithms such that they can be applied in this decentralized setting. We analyze the overhead of our approach, and quantify how our system scales with an increasing number of documents, system size, document-to-node mapping (uniform versus non-uniform), and types of queries (rare versus popular terms). Our analysis and simulations show that a) these extensions are efficient, and scale with little overhead to large systems, and b) the accuracy of the results obtained using distributed ranking is comparable to that of a centralized implementation.
JA  - High Performance Computing – HiPC 2007
T3  - Lecture Notes in Computer Science
PB  - Springer Berlin / Heidelberg
VL  - 4873
SN  - 978-3-540-77219-4
UR  - http://dx.doi.org/10.1007/978-3-540-77220-0_6
ER  -