Space-time tradeoffs for approximate nearest neighbor searching

TitleSpace-time tradeoffs for approximate nearest neighbor searching
Publication TypeJournal Articles
Year of Publication2009
AuthorsArya S, Malamatos T, Mount D
JournalJournal of the ACM (JACM)
Pagination1:1–1:54 - 1:1–1:54
Date Published2009/11//
ISBN Number0004-5411
Keywordsnearest neighbor searching, space-time tradeoffs

Nearest neighbor searching is the problem of preprocessing a set of n point points in d-dimensional space so that, given any query point q, it is possible to report the closest point to q rapidly. In approximate nearest neighbor searching, a parameter ϵ > 0 is given, and a multiplicative error of (1 + ϵ) is allowed. We assume that the dimension d is a constant and treat n and ϵ as asymptotic quantities. Numerous solutions have been proposed, ranging from low-space solutions having space O(n) and query time O(log n + 1/ϵd−1) to high-space solutions having space roughly O((n log n)/ϵd) and query time O(log (n/ϵ)). We show that there is a single approach to this fundamental problem, which both improves upon existing results and spans the spectrum of space-time tradeoffs. Given a tradeoff parameter γ, where 2 ≤ γ ≤ 1/ϵ, we show that there exists a data structure of space O(nγd−1 log(1/ϵ)) that can answer queries in time O(log(nγ) + 1/(ϵγ)(d−1)/2. When γ = 2, this yields a data structure of space O(n log (1/ϵ)) that can answer queries in time O(log n + 1/ϵ(d−1)/2). When γ = 1/ϵ, it provides a data structure of space O((n/ϵd−1)log(1/ϵ)) that can answer queries in time O(log(n/ϵ)). Our results are based on a data structure called a (t,ϵ)-AVD, which is a hierarchical quadtree-based subdivision of space into cells. Each cell stores up to t representative points of the set, such that for any query point q in the cell at least one of these points is an approximate nearest neighbor of q. We provide new algorithms for constructing AVDs and tools for analyzing their total space requirements. We also establish lower bounds on the space complexity of AVDs, and show that, up to a factor of O(log (1/ϵ)), our space bounds are asymptotically tight in the two extremes, γ = 2 and γ = 1/ϵ.