Scalable Geometric & High Dimensional Algorithms

The steadily decreasing cost of commodity computing as well as the democratization of the increasing thirst for computing power, fueled largely by the massive volumes of data generated by powerful search engines, has led to a reexamination of how the traditional methods of solving problems that involve search. As more and more applications become web services (i.e., ``cloud applications'' in technology parlance), there is an increasing expectation that answers be obtained and provided in real time and at a scale that is capable of servicing millions of users at the same time. For many complex problems, this expectation forces a rethink of the solution paradigm. In particular, there is an increasing realization that the speed and scale requirements mean that the only way to provide the service is to decouple the solution process into two steps: one that precomputes a solution space to all possible problems, and possibly provides an encoding of it, and one that navigates the solution space. This research explores the decoupling paradigm in the context of answering shortest path and nearest neighbor queries in a spatial network, where the effect is to decouple the process of computing shortest paths along the network from the process of finding the nearest neighbors.

This research aims at increasing the accessibility to massive amounts of user generated data including maps, documents, videos, images, and music by speeding up the similarity searching task. This invariably involves taking advantage of ways to sort the data which also includes developing methods to build appropriate multidimensional and spatial indexes. By focusing on making use of GPUs and distributed processing, as well as decoupling via pre-computation, it brings the benefits of this increase in computing power to a wider range of users who often have this power at their disposal. In addition, by demonstrating the power obtained by the use of decoupling in computing nearest neighbors in a spatial network it stimulates the development of scalable solutions to other search problems.

Principal Investigators