The geometry of visual space distortion

Publication Type: Book Chapters
Year of Publication: 1997
Authors: Fermüller C, Cheong LF, Aloimonos Y
Editors: Sommer G, Koenderink J
Book Title: Algebraic Frames for the Perception-Action Cycle
Series Title: Lecture Notes in Computer Science
Pagination: 249-277
Publisher: Springer Berlin / Heidelberg
ISBN Number: 978-3-540-63517-8

The encounter of perception and action happens at the intermediate representations of space-time. Many computational models employed in the past have assumed that a metric representation of physical space can be derived by visual means. Psychophysical experiments, as well as computational considerations, suggest that the perception of space and shape has a much more complicated nature, and that only a distorted version of actual, physical space can be computed. This paper develops a computational geometric model that explains why such distortion might take place. The basic idea is that, both in stereo and motion, we perceive the world from multiple views. Given the rigid transformation between the views and the properties of the image correspondence, the depth of the scene can be obtained. Even a slight error in the rigid transformation parameters causes distortion of the computed depth of the scene. The unified framework introduced here describes this distortion in computational terms. We characterize the space of distortions by its level sets; that is, we characterize the systematic distortion via a family of iso-distortion surfaces, each of which describes the locus over which depths are distorted by the same multiplicative factor. Clearly, functions of the distorted space that exhibit some sort of invariance produce desirable representations for biological and artificial systems [13]. Given that humans' estimation of egomotion or of the extrinsic parameters of the stereo apparatus is likely to be imprecise, the framework is used to explain a number of psychophysical experiments on the perception of depth from motion or stereo.
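
To make the idea of a multiplicative depth distortion concrete, the following is a minimal sketch and not the chapter's own formulation. It assumes a deliberately simplified setting: a single image row, purely translational motion with no rotation, and the standard horizontal flow equation u = (xW - fU)/Z. The function name `distortion_factor` and all numeric values are hypothetical, chosen only for illustration. Depth is recovered from the flow using a slightly wrong translation estimate, and the ratio of recovered to true depth turns out to depend on image position but not on the depth itself.

```python
import numpy as np

def distortion_factor(x, f, true_t, est_t):
    """Ratio Z_est / Z_true at image coordinate x (simplified 1D model).

    true_t and est_t are (U, W): translation along the image x-axis and
    along the optical axis. Rotation is assumed to be zero and known.
    """
    U, W = true_t
    U_hat, W_hat = est_t
    return (x * W_hat - U_hat * f) / (x * W - U * f)

# Hypothetical setup: unit focal length, sideways motion, and a small
# erroneous forward component in the estimated translation.
f = 1.0
x = np.linspace(-0.5, 0.5, 5)
true_t = (1.0, 0.0)
est_t = (1.0, 0.2)

# Simulate the flow produced by a scene of arbitrary true depths.
Z_true = np.array([2.0, 5.0, 3.0, 8.0, 4.0])
flow = (x * true_t[1] - true_t[0] * f) / Z_true

# Recover depth from the same flow using the erroneous translation.
Z_est = (x * est_t[1] - est_t[0] * f) / flow

D = distortion_factor(x, f, true_t, est_t)
print(D)               # distortion factor per image position
print(Z_est / Z_true)  # same values, independent of Z_true
```

In this toy model the set of points sharing one value of D is simply a fixed image position, that is, a viewing ray. Once an error in the rotation estimate is also allowed, the factor would additionally depend on depth, which is where level sets in the form of surfaces, as described in the abstract, become necessary.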