Approximate pattern matching in a pattern database system

TitleApproximate pattern matching in a pattern database system
Publication TypeJournal Articles
Year of Publication1980
AuthorsDavis LS, Roussopoulos N
JournalInformation Systems
Volume5
Issue2
Pagination107 - 119
Date Published1980///
ISBN Number0306-4379
Abstract

This work is concerned with the organization of a large database of binary pictures (normalized for size, rotation and position) and with the efficient "inexact match" of an input pattern against the whole database. Current techniques in pattern analysis use matching algorithms without any regard to the global organization of the storage representations of the models they are to match. Consequently, the algorithms are only practical for small databases. This paper discusses the design of a pattern database system and the economy that it provides for the matching problem. The database organization is based on the quad-tree representation of binary patterns. Given an arbitrary decomposition D (or partition) of a pattern P and an arbitrary function f on the pattern, we repeatedly apply f on D(P), D(D(P))P, ... to obtain higher and higher levels of abstraction f(D)(P)), f(D(D(P))), ... of the pattern. The computed values obtained after the jth application of f are used to label the ith level of the pyramid. The specific representation used in this paper is called the sum-quad-tree, in which each level of the tree stores the sum of the labels of its sons. The lowest level of the sum-quad-tree corresponds to the individual pixels and is the nth level (i.e. node m at level n implies v(m) = 0 or v(m) = 1). Nodes at the jth level of the sum-quad-tree correspond to sums of 2n-j × 2n-j picture points, so that the 0th level contains the number of 1's in the pattern. The pyramid representation is used as a hierarchical (n-level) indexing scheme of the database. The advantage of this organization is that the matching algorithms presented reject most of the patterns in the database by utilizing the relatively small in size index tables and thus avoid the overhead of unnecessary CPU time and operation between main memory and secondary storage.

URLhttp://www.sciencedirect.com/science/article/pii/0306437980900022
DOI