Indexing Methods for Similarity Searching

Title	Indexing Methods for Similarity Searching
Publication Type	Conference Papers
Year of Publication	2007
Authors	Samet H
Conference Name	Current Trends in Computer Science, 2007. ENC 2007. Eighth Mexican International Conference on
Date Published	2007/09//
Keywords	content-based, database;indexing, database;pattern, databases;, indexing;multimedia, methods;multimedia, recognition;similarity, retrieval;data, searching;data, structures;database, structures;image
Abstract	An overview is given of the various techniques and issues involved in providing indexing support for similarity searching. Similarity searching is a crucial part of retrieval in multimedia databases used for applications such as pattern recognition, image databases, and content-based retrieval. It involves finding objects in a data set S that are similar to a query object q based on some distance measure d which is usually a distance metric. The search process is usually achieved by means of nearest neighbor finding. Existing methods for handling similarity search in this setting fall into one of two classes. The first is based on mapping to a vector space. The vector space is usually of high dimension which requires special handling due to the fact indexing methods do not discriminate well in such spaces. In particular, the query regions often overlap all of the blocks that result from the decomposition of the underlying space. This has led to some special solutions that make use of a sequential scan. An alternative is to use dimension reduction to find a mapping from a high-dimensional space into a low-dimensional space by finding the most discriminating dimensions and then index the data using one of a number of different data structures such as k-d trees, R-trees, quadtrees, etc. The second directly indexes the objects based on distances making use of data structures such as the vp-tree, M-tree, etc.
DOI	10.1109/ENC.2007.9

Indexing Methods for Similarity Searching

Publications