%0 Conference Paper
%B 2011 IEEE International Conference on Automatic Face & Gesture Recognition and Workshops (FG 2011)
%D 2011
%T Recent advances in age and height estimation from still images and video
%A Chellappa, Rama
%A Turaga, P.
%K age estimation
%K biometrics (access control)
%K Calibration
%K Estimation
%K Geometry
%K height estimation
%K HUMANS
%K image fusion
%K image-formation model fusion
%K Legged locomotion
%K multiview-geometry
%K Robustness
%K SHAPE
%K shape-space geometry
%K soft-biometrics
%K statistical analysis
%K statistical methods
%K video signal processing
%X Soft-biometrics such as gender, age, and race have been found to be useful characterizations that enable fast pre-filtering and organization of data for biometric applications. In this paper, we focus on two useful soft-biometrics - age and height. We discuss their utility and the factors involved in their estimation from images and videos. In this context, we highlight the role that geometric constraints such as multiview geometry and shape-space geometry play. Then, we present methods based on these geometric constraints for age and height estimation. These methods provide a principled means of fusing image-formation models, multi-view geometric constraints, and robust statistical methods for inference.
%B 2011 IEEE International Conference on Automatic Face & Gesture Recognition and Workshops (FG 2011)
%I IEEE
%P 91 - 96
%8 2011/03/21/25
%@ 978-1-4244-9140-7
%G eng
%R 10.1109/FG.2011.5771367
%0 Journal Article
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%D 2011
%T Statistical Computations on Grassmann and Stiefel Manifolds for Image and Video-Based Recognition
%A Turaga, P.
%A Veeraraghavan, A.
%A Srivastava, A.
%A Chellappa, Rama
%K activity based video clustering
%K activity recognition
%K computational geometry
%K Computational modeling
%K Data models
%K face recognition
%K feature representation
%K finite dimensional linear subspaces
%K geometric properties
%K Geometry
%K Grassmann Manifolds
%K Grassmann
%K HUMANS
%K Image and video models
%K image recognition
%K linear dynamic models
%K linear subspace structure
%K Manifolds
%K maximum likelihood classification
%K maximum likelihood estimation
%K Object recognition
%K Riemannian geometry
%K Riemannian metrics
%K SHAPE
%K statistical computations
%K statistical models
%K Stiefel
%K Stiefel Manifolds
%K unsupervised clustering
%K video based face recognition
%K video based recognition
%K video signal processing
%X In this paper, we examine image and video-based recognition applications where the underlying models have a special structure: the linear subspace structure. We discuss how commonly used parametric models for videos and image sets can be described using the unified framework of Grassmann and Stiefel manifolds. We first show that the parameters of linear dynamic models are finite-dimensional linear subspaces of appropriate dimensions. Unordered image sets, as samples from a finite-dimensional linear subspace, naturally fall under this framework. We show that an inference over subspaces can be naturally cast as an inference problem on the Grassmann manifold. To perform recognition using subspace-based models, we need tools from the Riemannian geometry of the Grassmann manifold. This involves a study of the geometric properties of the space, appropriate definitions of Riemannian metrics, and definition of geodesics. Further, we derive statistical models of inter- and intraclass variations that respect the geometry of the space. We apply techniques such as intrinsic and extrinsic statistics to enable maximum-likelihood classification. We also provide algorithms for unsupervised clustering derived from the geometry of the manifold. Finally, we demonstrate the improved performance of these methods in a wide variety of vision applications such as activity recognition, video-based face recognition, object recognition from image sets, and activity-based video clustering.
%B IEEE Transactions on Pattern Analysis and Machine Intelligence
%V 33
%P 2273 - 2286
%8 2011/11//
%@ 0162-8828
%G eng
%N 11
%R 10.1109/TPAMI.2011.52
%0 Conference Paper
%B 2011 IEEE International Conference on Automatic Face & Gesture Recognition and Workshops (FG 2011)
%D 2011
%T Towards view-invariant expression analysis using analytic shape manifolds
%A Taheri, S.
%A Turaga, P.
%A Chellappa, Rama
%K Databases
%K Deformable models
%K Face
%K face recognition
%K facial expression analysis
%K Geometry
%K Gold
%K Human-computer interaction
%K Manifolds
%K projective transformation
%K Riemannian interpretation
%K SHAPE
%K view invariant expression analysis
%X Facial expression analysis is one of the important components of effective human-computer interaction. However, to develop robust and generalizable models for expression analysis, one needs to break the dependence of the models on the choice of the coordinate frame of the camera, i.e., expression models should generalize across facial poses. To do this systematically, one needs to understand the space of observed images subject to projective transformations. However, since the projective shape-space is cumbersome to work with, we address this problem by deriving models for expressions on the affine shape-space, as an approximation to the projective shape-space, using a Riemannian interpretation of the deformations that facial expressions cause on different parts of the face. We use landmark configurations to represent facial deformations and exploit the fact that the affine shape-space can be studied using the Grassmann manifold. This representation enables us to perform various expression analysis and recognition algorithms without the need for normalization as a preprocessing step. We extend some of the available approaches for expression analysis to the Grassmann manifold and experimentally show promising results, paving the way for a more general theory of view-invariant expression analysis.
%B 2011 IEEE International Conference on Automatic Face & Gesture Recognition and Workshops (FG 2011)
%I IEEE
%P 306 - 313
%8 2011/03/21/25
%@ 978-1-4244-9140-7
%G eng
%R 10.1109/FG.2011.5771415
%0 Conference Paper
%B 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP)
%D 2010
%T The role of geometry in age estimation
%A Turaga, P.
%A Biswas, S.
%A Chellappa, Rama
%K age estimation
%K Aging
%K Biometrics
%K computational geometry
%K Face
%K Face Geometry
%K Facial animation
%K Feature extraction
%K function estimation problem
%K geometric face attributes
%K Geometry
%K Grassmann manifold
%K human face modeling
%K human face understanding
%K HUMANS
%K Mouth
%K regression
%K Regression analysis
%K SHAPE
%K Solid modeling
%K solid modelling
%K velocity vector
%X Understanding and modeling of aging in human faces is an important problem in many real-world applications such as biometrics, authentication, and synthesis. In this paper, we consider the role of geometric attributes of faces, as described by a set of landmark points on the face, in age perception. Towards this end, we show that the space of landmarks can be interpreted as a Grassmann manifold. The problem of age estimation is then posed as a problem of function estimation on the manifold. The warping of an average face to a given face is quantified as a velocity vector that transforms the average to the given face along a smooth geodesic in unit-time. This deformation is then shown to contain important information about the age of the face. We show in experiments that exploiting geometric cues in a principled manner provides performance comparable to several systems that utilize both geometric and textural cues. We show results on age estimation using the standard FG-Net dataset and a passport dataset, which illustrate the effectiveness of the approach.
%B 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP)
%I IEEE
%P 946 - 949
%8 2010/03/14/19
%@ 978-1-4244-4295-9
%G eng
%R 10.1109/ICASSP.2010.5495292
%0 Conference Paper
%B IEEE International Conference on Shape Modeling and Applications (SMI 2009)
%D 2009
%T Classification of non-manifold singularities from transformations of 2-manifolds
%A Leon, J.-C.
%A De Floriani, Leila
%A Hetroy, F.
%K computer graphic
%K computer graphics
%K continuous transformation
%K nonmanifold model
%K nonmanifold shape
%K nonmanifold singularity
%K pattern classification
%K search problems
%K SHAPE
%K topological property
%K topology
%X Non-manifold models are frequently encountered in engineering simulations and design as well as in computer graphics. However, these models lack shape characterization for modelling and searching purposes. Topological properties act as a kernel for deriving key features of objects. Here we propose a classification for the non-manifold singularities of non-manifold objects through continuous shape transformations of 2-manifolds without boundary up to the creation of non-manifold singularities. As a result, the non-manifold objects thus created can be categorized and contribute to the definition of a general purpose taxonomy for non-manifold shapes.
%B IEEE International Conference on Shape Modeling and Applications (SMI 2009)
%P 179 - 184
%8 2009/06//
%G eng
%R 10.1109/SMI.2009.5170146
%0 Conference Paper
%B 2009 16th IEEE International Conference on Image Processing (ICIP)
%D 2009
%T How would you look as you age?
%A Ramanathan, N.
%A Chellappa, Rama
%K age-separated face image database
%K Face
%K face recognition
%K face verification
%K facial appearances
%K facial growth model
%K facial transformation models
%K image texture
%K SHAPE
%K TEXTURE
%X Facial appearances change with increase in age. While generic growth patterns that are characteristic of different age groups can be identified, facial growth is also observed to be influenced by individual-specific attributes such as one's gender, ethnicity, and life-style. In this paper, we propose a facial growth model that comprises transformation models for facial shape and texture. We collected empirical data pertaining to facial growth from a database of age-separated face images of adults and used it to develop the aforementioned transformation models. The proposed model finds applications in predicting one's appearance across ages and in performing face verification across ages.
%B 2009 16th IEEE International Conference on Image Processing (ICIP)
%P 53 - 56
%8 2009/11//
%G eng
%R 10.1109/ICIP.2009.5413998
%0 Conference Paper
%B IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009. IROS 2009
%D 2009
%T Real-time shape retrieval for robotics using skip Tri-Grams
%A Li, Yi
%A Bitsakos, K.
%A Fermüller, Cornelia
%A Aloimonos, J.
%K Bullseye retrieval test
%K Clocks
%K closed contour shape retrieval
%K Image retrieval
%K Image segmentation
%K Indexing
%K Information retrieval
%K Intelligent robots
%K Jacobian matrices
%K mobile robot
%K Mobile robots
%K MPEG 7 shape dataset
%K piecewise linear segments
%K Piecewise linear techniques
%K Real time systems
%K real-time shape retrieval
%K robot vision
%K SHAPE
%K shape recognition
%K shape representation
%K skip Tri-Grams
%K Testing
%X The real-time requirement is an additional constraint on many intelligent applications in robotics, such as shape recognition and retrieval using a mobile robot platform. In this paper, we present a scalable approach for efficiently retrieving closed contour shapes. The contour of an object is represented by piecewise linear segments. A skip Tri-Gram is obtained by selecting three segments in clockwise order while allowing a constant number of segments to be "skipped" in between. The main idea is to use skip Tri-Grams of the segments to implicitly encode the distant dependency of the shape. All skip Tri-Grams are used for efficiently retrieving closed contour shapes without pairwise matching of feature points from two shapes. The retrieval is at least an order of magnitude faster than other state-of-the-art algorithms. We score 80% in the Bullseye retrieval test on the whole MPEG 7 shape dataset. We further test the algorithm using a mobile robot platform in an indoor environment. Eight objects are used for testing from different viewing directions, and we achieve 82% accuracy.
%B IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009. IROS 2009
%I IEEE
%P 4731 - 4738
%8 2009/10/10/15
%@ 978-1-4244-3803-7
%G eng
%R 10.1109/IROS.2009.5354738
%0 Conference Paper
%B 19th International Conference on Pattern Recognition, 2008. ICPR 2008
%D 2008
%T Bilateral symmetry of object silhouettes under perspective projection
%A Bitsakos, K.
%A Yi, H.
%A Yi, L.
%A Fermüller, Cornelia
%K Automation
%K bilateral symmetry
%K Computer vision
%K Frequency
%K Image analysis
%K Image coding
%K Image reconstruction
%K Internet
%K Internet images
%K Object detection
%K object silhouettes
%K perspective distortion
%K perspective projection
%K SHAPE
%K symmetric objects
%X Symmetry is an important property of objects and is exhibited in different forms, e.g., bilateral, rotational, etc. This paper presents an algorithm for computing the bilateral symmetry of silhouettes of shallow objects under perspective distortion, exploiting the invariance of the cross ratio to projective transformations. The basic idea is to use the cross ratio to compute a number of midpoints of cross sections and then fit a straight line through them. The goodness-of-fit determines the likelihood of the line to be the axis of symmetry. We analytically estimate the midpoint's location as a function of the vanishing point for a given object silhouette. Hence, finding the symmetry axis amounts to a 2D search in the space of vanishing points. We present experiments on two datasets as well as Internet images of symmetric objects that validate our approach.
%B 19th International Conference on Pattern Recognition, 2008. ICPR 2008
%I IEEE
%P 1 - 4
%8 2008/12/08/11
%@ 978-1-4244-2174-9
%G eng
%R 10.1109/ICPR.2008.4761501
%0 Conference Paper
%B IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008)
%D 2008
%T Statistical analysis on Stiefel and Grassmann manifolds with applications in computer vision
%A Turaga, P.
%A Veeraraghavan, A.
%A Chellappa, Rama
%K activity recognition
%K affine invariant shape analysis
%K computer vision
%K distance measures
%K estimation technique
%K Face
%K geometric structure
%K Grassmann manifold
%K image classification
%K image matching
%K inference algorithm
%K learning theory
%K manifold-valued parameters
%K pattern recognition
%K probability distribution functions
%K SHAPE
%K shape classification
%K spatio-temporal modeling
%K spatiotemporal phenomena
%K statistical analysis
%K statistical distributions
%K Stiefel manifold
%K video based face recognition
%X Many applications in computer vision and pattern recognition involve drawing inferences on certain manifold-valued parameters. In order to develop accurate inference algorithms on these manifolds, we need to a) understand the geometric structure of these manifolds, b) derive appropriate distance measures, and c) develop probability distribution functions (pdfs) and estimation techniques that are consistent with the geometric structure of these manifolds. In this paper, we consider two related manifolds - the Stiefel manifold and the Grassmann manifold - which arise naturally in several vision applications such as spatio-temporal modeling, affine invariant shape analysis, image matching, and learning theory. We show how accurate statistical characterization that reflects the geometry of these manifolds allows us to design efficient algorithms that compare favorably to the state of the art in these very different applications. In particular, we describe appropriate distance measures and parametric and non-parametric density estimators on these manifolds. These methods are then used to learn class conditional densities for applications such as activity recognition, video-based face recognition, and shape classification.
%B IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008)
%P 1 - 8
%8 2008/06//
%G eng
%R 10.1109/CVPR.2008.4587733
%0 Journal Article
%J IEEE Transactions on Multimedia
%D 2008
%T Synthesis of Silhouettes and Visual Hull Reconstruction for Articulated Humans
%A Yue, Zhanfeng
%A Chellappa, Rama
%K active image-based visual hull algorithm
%K approximation theory
%K articulated human pose
%K cameras
%K circular trajectory
%K contour-based body part localization
%K edge detection
%K human body part segmentation
%K image reconstruction
%K image segmentation
%K inner distance shape context
%K pose estimation
%K SHAPE
%K shape recognition
%K silhouette similarity measurement
%K silhouette synthesis
%K turning function distance
%K turntable image collection
%K virtual camera
%K virtual reality
%K visual hull computation
%X In this paper, we propose a complete framework for improved synthesis and understanding of the human pose from a limited number of silhouette images. It combines the active image-based visual hull (IBVH) algorithm and a contour-based body part segmentation technique. We derive a simple, approximate algorithm to decide the extrinsic parameters of a virtual camera, and synthesize a turntable image collection of the person using the IBVH algorithm by actively moving the virtual camera on a properly computed circular trajectory around the person. Using the turning function distance as the silhouette similarity measurement, this approach can be used to generate the desired pose-normalized images for recognition applications. In order to overcome the inability of the visual hull (VH) method to reconstruct concave regions, we propose a contour-based human body part localization algorithm to segment the silhouette images into convex body parts. The body parts observed from the virtual view are generated separately from the corresponding body parts observed from the input views and then assembled together for a more accurate VH reconstruction. Furthermore, the obtained turntable image collection helps to improve the body part segmentation and identification process. By using the inner distance shape context (IDSC) measurement, we are able to estimate the body part locations more accurately from a synthesized view in which the body parts can be localized more precisely. Experiments show that the proposed algorithm can greatly improve body part segmentation and hence shape reconstruction results.
%B IEEE Transactions on Multimedia
%V 10
%P 1565 - 1577
%8 2008/12//
%@ 1520-9210
%G eng
%N 8
%R 10.1109/TMM.2008.2007321
%0 Conference Paper
%B IEEE Conference on Computer Vision and Pattern Recognition (CVPR '07)
%D 2007
%T Efficient Indexing For Articulation Invariant Shape Matching And Retrieval
%A Biswas, S.
%A Aggarwal, G.
%A Chellappa, Rama
%K articulation invariant matching
%K image matching
%K image retrieval
%K indexing
%K pairwise geometric relationships
%K SHAPE
%K shape retrieval
%K shape-wise alignment
%X Most shape matching methods are either fast but too simplistic to give the desired performance, or promising as far as performance is concerned but computationally demanding. In this paper, we present a very simple and efficient approach that not only performs almost as well as many state-of-the-art techniques but also scales up to large databases. In the proposed approach, each shape is indexed based on a variety of simple and easily computable features which are invariant to articulations and rigid transformations. The features characterize pairwise geometric relationships between interest points on the shape, thereby providing robustness to the approach. Shapes are retrieved using an efficient scheme which does not involve costly operations like shape-wise alignment or establishing correspondences. Even for a moderate-size database of 1000 shapes, the retrieval process is several times faster than most techniques with similar performance. Extensive experimental results are presented to illustrate the advantages of our approach as compared to the best in the field.
%B IEEE Conference on Computer Vision and Pattern Recognition (CVPR '07)
%P 1 - 8
%8 2007/06//
%G eng
%R 10.1109/CVPR.2007.383227
%0 Conference Paper
%B IEEE 11th International Conference on Computer Vision (ICCV 2007)
%D 2007
%T Hierarchical Part-Template Matching for Human Detection and Segmentation
%A Zhe Lin
%A Davis, Larry S.
%A David Doermann
%A DeMenthon, D.
%K background subtraction
%K Bayes methods
%K Bayesian approach
%K Bayesian MAP framework
%K fine occlusion analysis
%K global likelihood re-evaluation
%K global shape template-based human detectors
%K hierarchical part-template matching
%K human detection
%K human segmentation
%K image segmentation
%K image sequences
%K local part-based human detectors
%K partial occlusions
%K SHAPE
%K shape articulations
%K video sequences
%X Local part-based human detectors are capable of handling partial occlusions efficiently and modeling shape articulations flexibly, while global shape template-based human detectors are capable of detecting and segmenting human shapes simultaneously. We describe a Bayesian approach to human detection and segmentation that combines local part-based and global template-based schemes. The approach relies on the key ideas of matching a part-template tree to images hierarchically to generate a reliable set of detection hypotheses, and of optimizing that set under a Bayesian MAP framework through global likelihood re-evaluation and fine occlusion analysis. In addition to detection, our approach is able to obtain human shapes and poses simultaneously. We apply the approach to human detection and segmentation in crowded scenes with and without background subtraction. Experimental results show that our approach achieves good performance on images and video sequences with severe occlusion.
%B IEEE 11th International Conference on Computer Vision (ICCV 2007)
%P 1 - 8
%8 2007/10//
%G eng
%R 10.1109/ICCV.2007.4408975
%0 Conference Paper
%B IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007)
%D 2007
%T Joint Acoustic-Video Fingerprinting of Vehicles, Part I
%A Cevher, V.
%A Chellappa, Rama
%A McClellan, J.H.
%K acoustic sensors
%K acoustic signal processing
%K acoustic transducers
%K acoustic wave-pattern
%K envelope shape components
%K joint acoustic-video fingerprinting
%K passive sensor
%K SHAPE
%K vehicle speed estimation
%K video sensors
%K video signal processing
%K wheel detection
%X We address vehicle classification and measurement problems using acoustic and video sensors. In this paper, we show how to estimate a vehicle's speed, width, and length by jointly estimating its acoustic wave-pattern using a single passive sensor that records the vehicle's drive-by noise. The acoustic wave-pattern is approximated using three envelope shape (ES) components, which approximate the shape of the received signal's power envelope. We incorporate the parameters of the ES components along with the estimates of the vehicle engine RPM and number of cylinders to create a vehicle profile vector that forms an intuitive discriminatory feature space. In the companion paper, we discuss vehicle classification and mensuration based on silhouette extraction and wheel detection, using a video sensor. Vehicle speed estimation and classification results are provided using field data.
%B IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007)
%V 2
%P II-745 - II-748
%8 2007/04//
%G eng
%R 10.1109/ICASSP.2007.366343
%0 Journal Article
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%D 2006
%T A 3D shape constraint on video
%A Hui Ji
%A Fermüller, Cornelia
%K 3D motion estimation
%K algorithms
%K Artificial intelligence
%K CAMERAS
%K decoupling translation from rotation
%K Estimation error
%K Fluid flow measurement
%K Image Enhancement
%K Image Interpretation, Computer-Assisted
%K Image reconstruction
%K Imaging, Three-Dimensional
%K Information Storage and Retrieval
%K integration of motion fields
%K Layout
%K minimisation
%K Minimization methods
%K Motion estimation
%K multiple motion fields
%K parameter estimation
%K Pattern Recognition, Automated
%K Photography
%K practical constrained minimization
%K SHAPE
%K shape and rotation
%K shape vectors
%K stability
%K structure estimation
%K surface normals
%K Three-dimensional motion estimation
%K video 3D shape constraint
%K Video Recording
%K video signal processing
%X We propose to combine the information from multiple motion fields by enforcing a constraint on the surface normals (3D shape) of the scene in view. The fact that the shape vectors in the different views are related only by rotation can be formulated as a rank-3 constraint. This constraint is implemented in an algorithm which solves 3D motion and structure estimation as a practical constrained minimization. Experiments demonstrate its usefulness as a tool in structure from motion, providing very accurate estimates of 3D motion.
%B IEEE Transactions on Pattern Analysis and Machine Intelligence
%V 28
%P 1018 - 1023
%8 2006/06//
%@ 0162-8828
%G eng
%N 6
%R 10.1109/TPAMI.2006.109
%0 Conference Paper
%B 2006 International Conference on Parallel Processing Workshops, 2006. ICPP 2006 Workshops
%D 2006
%T Model-based OpenMP implementation of a 3D facial pose tracking system
%A Saha, S.
%A Chung-Ching Shen
%A Chia-Jui Hsu
%A Aggarwal, G.
%A Veeraraghavan, A.
%A Sussman, Alan
%A Bhattacharyya, Shuvra S.
%K 3D facial pose tracking system
%K application modeling
%K application program interfaces
%K application scheduling
%K coarse-grain dataflow graphs
%K Concurrent computing
%K data flow graphs
%K Educational institutions
%K face recognition
%K IMAGE PROCESSING
%K image processing applications
%K Inference algorithms
%K Message passing
%K OpenMP platform
%K parallel implementation
%K PARALLEL PROCESSING
%K parallel programming
%K Particle tracking
%K Processor scheduling
%K SHAPE
%K shared memory systems
%K shared-memory systems
%K Solid modeling
%K tracking
%X Most image processing applications are characterized by computation-intensive operations, and high memory and performance requirements. Parallelized implementation on shared-memory systems offers an attractive solution to this class of applications. However, we cannot thoroughly exploit the advantages of such architectures without proper modeling and analysis of the application. In this paper, we describe our implementation of a 3D facial pose tracking system using the OpenMP platform. Our implementation is based on a design methodology that uses coarse-grain dataflow graphs to model and schedule the application. We present our modeling approach, details of the implementation that we derived based on this modeling approach, and associated performance results. The parallelized implementation achieves significant speedup, and meets or exceeds the target frame rate under various configurations.
%B 2006 International Conference on Parallel Processing Workshops, 2006. ICPP 2006 Workshops
%I IEEE
%P 8 pp.-73
%8 2006///
%@ 0-7695-2637-3
%G eng
%R 10.1109/ICPPW.2006.55
%0 Conference Paper
%B Tenth IEEE International Conference on Computer Vision (ICCV 2005)
%D 2005
%T An algebraic approach to surface reconstruction from gradient fields
%A Agrawal, A.
%A Chellappa, Rama
%A Raskar, R.
%K algebra
%K algebraic approach
%K computer vision
%K discrete domain
%K gradient field
%K graph theory
%K image reconstruction
%K integrability
%K linear system
%K local error confinement
%K photometric stereo
%K SHAPE
%K shape from shading
%K surface reconstruction
%X Several important problems in computer vision, such as shape from shading (SFS) and photometric stereo (PS), require reconstructing a surface from an estimated gradient field, which is usually non-integrable, i.e., has non-zero curl. We propose a purely algebraic approach to enforce integrability in the discrete domain. We first show that enforcing integrability can be formulated as solving a single linear system Ax = b over the image. In general, this system is under-determined. We show conditions under which the system can be solved and a method to reach those conditions based on graph theory. The proposed approach is non-iterative, has the important property of local error confinement, and can be applied to several problems. Results on SFS and PS demonstrate the applicability of our method.
%B Tenth IEEE International Conference on Computer Vision (ICCV 2005)
%V 1
%P 174 - 181 Vol. 1
%8 2005/10//
%G eng
%R 10.1109/ICCV.2005.31
%0 Conference Paper
%B Eighth International Conference on Document Analysis and Recognition (ICDAR 2005)
%D 2005
%T Handwriting matching and its application to handwriting synthesis
%A Yefeng Zheng
%A David Doermann
%K handwriting matching
%K handwriting recognition
%K handwriting synthesis
%K image sampling
%K learning (artificial intelligence)
%K point matching
%K SHAPE
%K shape deformation
%X Since it is extremely expensive to collect a large volume of handwriting samples, synthesized data are often used to enlarge the training set. We argue that, in order to generate good handwriting samples, a synthesis algorithm should learn the shape deformation characteristics of handwriting from real samples. In this paper, we present a point matching algorithm to learn the deformation, and apply it to handwriting synthesis. Preliminary experiments show the advantages of our approach.
%B Eighth International Conference on Document Analysis and Recognition (ICDAR 2005)
%P 861 - 865 Vol. 2
%8 2005/09/01/aug
%G eng
%R 10.1109/ICDAR.2005.122
%0 Conference Paper
%B 12th International Conference on Advanced Robotics, 2005. ICAR '05. Proceedings
%D 2005
%T Identifying and segmenting human-motion for mobile robot navigation using alignment errors
%A Abd-Almageed, Wael
%A Burns, B. J.
%A Davis, Larry S.
%K Computer errors
%K Educational institutions
%K Frequency estimation
%K human-motion identification
%K human-motion segmentation
%K HUMANS
%K Image motion analysis
%K Image segmentation
%K mobile robot navigation
%K Mobile robots
%K Motion estimation
%K Navigation
%K Object detection
%K robot vision
%K SHAPE
%X This paper presents a new human-motion identification and segmentation algorithm for mobile robot platforms. The algorithm is based on computing the alignment error between pairs of object images acquired from a moving platform. Pairs of images generating relatively small alignment errors are used to estimate the fundamental frequency of the object's motion. A decision criterion is then used to test the significance of the estimated frequency and to classify the object's motion. To verify the validity of the proposed approach, experimental results are shown on different classes of objects.
%B 12th International Conference on Advanced Robotics, 2005. ICAR '05. Proceedings
%I IEEE
%P 398 - 403
%8 2005/07//
%@ 0-7803-9178-0
%G eng
%R 10.1109/ICAR.2005.1507441
%0 Conference Paper
%B IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. CVPR 2005
%D 2005
%T Integration of motion fields through shape
%A Ji, H.
%A Fermüller, Cornelia
%K 3D motion estimation
%K Automation
%K CAMERAS
%K computational geometry
%K Computer vision
%K constrained minimization problem
%K decoupling translation from rotation
%K Educational institutions
%K image colour analysis
%K image gradients
%K image resolution
%K Image segmentation
%K image sequence
%K Image sequences
%K integration of motion fields
%K Layout
%K minimisation
%K Motion estimation
%K motion field integration
%K motion segmentation
%K parameter estimation
%K planar patches
%K rank-3 constraint
%K scene patches
%K SHAPE
%K shape and rotation
%K shape estimation
%K structure estimation
%X Structure from motion from single flow fields has been studied intensively, but the integration of information from multiple flow fields has not received much attention. Here we address this problem by enforcing constraints on the shape (surface normals) of the scene in view, as opposed to constraints on the structure (depth). The advantage of integrating shape is two-fold. First, we do not need to estimate feature correspondences over multiple frames, but we only need to match patches. Second, the shape vectors in the different views are related only by rotation. This constraint on shape can be combined easily with motion estimation, thus formulating motion and structure estimation from multiple views as a practical constrained minimization problem using a rank-3 constraint. Based on this constraint, we develop a 3D motion technique, which locates through color and motion segmentation, planar patches in the scene, matches patches over multiple frames, and estimates the motion between multiple frames and the shape of the selected scene patches using the image gradients. Experiments evaluate the accuracy of the 3D motion estimation and demonstrate the motion and shape estimation of the technique by super-resolving an image sequence.
%B IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. CVPR 2005
%I IEEE
%V 2
%P 663 - 669 vol. 2
%8 2005/06/20/25
%@ 0-7695-2372-2
%G eng
%R 10.1109/CVPR.2005.190
%0 Conference Paper
%B IEEE International Conference on Image Processing, 2005. ICIP 2005
%D 2005
%T Pedestrian classification from moving platforms using cyclic motion pattern
%A Yang Ran
%A Qinfen Zheng
%A Weiss, I.
%A Davis, Larry S.
%A Abd-Almageed, Wael
%A Liang Zhao
%K compact shape representation
%K cyclic motion pattern
%K data mining
%K Detectors
%K digital phase locked loop
%K digital phase locked loops
%K feedback loop module
%K gait analysis
%K gait phase information
%K human body pixel oscillations
%K HUMANS
%K image classification
%K Image motion analysis
%K image representation
%K image sequence
%K Image sequences
%K Motion detection
%K Object detection
%K pedestrian classification
%K pedestrian detection system
%K Phase estimation
%K Phase locked loops
%K principle gait angle
%K SHAPE
%K tracking
%K Videos
%X This paper describes an efficient pedestrian detection system for videos acquired from moving platforms. Given a detected and tracked object as a sequence of images within a bounding box, we describe the periodic signature of its motion pattern using a twin-pendulum model. Then a principle gait angle is extracted in every frame providing gait phase information. By estimating the periodicity from the phase data using a digital phase locked loop (dPLL), we quantify the cyclic pattern of the object, which helps us to continuously classify it as a pedestrian. Past approaches have used shape detectors applied to a single image or classifiers based on human body pixel oscillations, but ours is the first to integrate a global cyclic motion model and periodicity analysis. Novel contributions of this paper include: i) development of a compact shape representation of cyclic motion as a signature for a pedestrian, ii) estimation of gait period via a feedback loop module, and iii) implementation of a fast online pedestrian classification system which operates on videos acquired from moving platforms.
%B IEEE International Conference on Image Processing, 2005. ICIP 2005
%I IEEE
%V 2
%P II-854 - II-857
%8 2005/09//
%@ 0-7803-9134-9
%G eng
%R 10.1109/ICIP.2005.1530190
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
%D 2005
%T Using the inner-distance for classification of articulated shapes
%A Ling,H.
%A Jacobs, David W.
%K articulated shapes
%K CE-Shape-1 database
%K dynamic programming
%K human motion silhouette dataset
%K image matching
%K inner-distance
%K Kimia silhouettes
%K landmark points
%K MPEG7
%K shape classification
%K shape descriptor
%K Swedish leaf database
%K visual databases
%X We propose using the inner-distance between landmark points to build shape descriptors. The inner-distance is defined as the length of the shortest path between landmark points within the shape silhouette. We show that the inner-distance is articulation insensitive and more effective at capturing complex shapes with part structures than Euclidean distance. To demonstrate this idea, it is used to build a new shape descriptor based on shape contexts. After that, we design a dynamic programming based method for shape matching and comparison. We have tested our approach on a variety of shape databases including an articulated shape dataset, MPEG7 CE-Shape-1, Kimia silhouettes, a Swedish leaf database and a human motion silhouette dataset. In all the experiments, our method demonstrates effective performance compared with other algorithms.
%B Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
%V 2
%P 719 - 726 vol. 2
%8 2005/06//
%G eng
%R 10.1109/CVPR.2005.362
%0 Journal Article
%J IEEE Robotics & Automation Magazine
%D 2004
%T The Argus eye: a new imaging system designed to facilitate robotic tasks of motion
%A Baker, P.
%A Ogale, A. S
%A Fermüller, Cornelia
%K Argus eye
%K Calibration
%K CAMERAS
%K computational geometry
%K Design automation
%K Eyes
%K image formation
%K imaging system
%K Information geometry
%K Layout
%K Motion estimation
%K multiple stereo configurations
%K panoramic robots
%K robot vision
%K Robot vision systems
%K robotic motion tasks
%K Robotics and automation
%K SHAPE
%K shape model estimation
%K system calibration
%X This article describes an imaging system that has been designed to facilitate robotic tasks of motion. The system consists of a number of cameras in a network, arranged so that they sample different parts of the visual sphere. This geometric configuration has provable advantages compared to small field of view cameras for the estimation of the system's own motion and, consequently, the estimation of shape models from the individual cameras. The reason is that inherent ambiguities of confusion between translation and rotation disappear. Pairs of cameras may also be arranged in multiple stereo configurations, which provide additional advantages for segmentation. Algorithms for the calibration of the system and the three-dimensional (3-D) motion estimation are provided.
%B IEEE Robotics & Automation Magazine
%V 11
%P 31 - 38
%8 2004/12//
%@ 1070-9932
%G eng
%N 4
%R 10.1109/MRA.2004.1371606
%0 Conference Paper
%B 2nd International Symposium on 3D Data Processing, Visualization and Transmission, 2004. 3DPVT 2004. Proceedings
%D 2004
%T The influence of shape on image correspondence
%A Ogale, A. S
%A Aloimonos, J.
%K Automation
%K CAMERAS
%K Computational modeling
%K first order approximation
%K Geometrical optics
%K hidden feature removal
%K image sampling
%K Image segmentation
%K Layout
%K occlusion detection
%K piecewise continuous function
%K Pixel
%K SHAPE
%K Simulated annealing
%K stereo image processing
%K surface fitting
%X We examine the implications of shape on the process of finding dense correspondence and half-occlusions for a stereo pair of images. The desired property of the depth map is that it should be a piecewise continuous function which is consistent with the images and which has the minimum number of discontinuities. To zeroth order, piecewise continuity becomes piecewise constancy. Using this approximation, we first discuss an approach for dealing with such a fronto-parallel shapeless world, and the problems involved therein. We then introduce horizontal and vertical slant to create a first order approximation to piecewise continuity. We highlight the fact that a horizontally slanted surface (i.e., having depth variation in the direction of the separation of the two cameras) appears horizontally stretched in one image as compared to the other image. Thus, while corresponding two images, N pixels on a scanline in one image may correspond to a different number of pixels M in the other image, which has consequences with regard to sampling and occlusion detection. We also discuss the asymmetry between vertical and horizontal slant, and the central role of nonhorizontal edges in the context of vertical slant. Using experiments, we discuss cases where existing algorithms fail, and how the incorporation of new constraints provides correct results.
%B 2nd International Symposium on 3D Data Processing, Visualization and Transmission, 2004. 3DPVT 2004. Proceedings
%I IEEE
%P 945 - 952
%8 2004/09/06/9
%@ 0-7695-2223-8
%G eng
%R 10.1109/TDPVT.2004.1335418
%0 Conference Paper
%B Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004
%D 2004
%T A Rao-Blackwellized particle filter for EigenTracking
%A Zia Khan
%A Balch, T.
%A Dellaert, F.
%K analytically tractable integrals
%K Computer vision
%K EigenTracking
%K Filters
%K Gaussian processes
%K modal analysis
%K multi-modal distributions
%K NOISE
%K noisy targets
%K optimisation
%K optimization-based algorithms
%K Particle filters
%K Particle measurements
%K Particle tracking
%K Principal component analysis
%K probabilistic principal component analysis
%K Rao-Blackwellized particle filter
%K Robustness
%K SHAPE
%K State estimation
%K state vector
%K subspace coefficients
%K Subspace representations
%K target tracking
%K vectors
%X Subspace representations have been a popular way to model appearance in computer vision. In Jepson and Black's influential paper on EigenTracking, they were successfully applied in tracking. For noisy targets, optimization-based algorithms (including EigenTracking) often fail catastrophically after losing track. Particle filters have recently emerged as a robust method for tracking in the presence of multi-modal distributions. To use subspace representations in a particle filter, the number of samples increases exponentially as the state vector includes the subspace coefficients. We introduce an efficient method for using subspace representations in a particle filter by applying Rao-Blackwellization to integrate out the subspace coefficients in the state vector. Fewer samples are needed since part of the posterior over the state vector is analytically calculated. We use probabilistic principal component analysis to obtain analytically tractable integrals. We show experimental results in a scenario in which we track a target in clutter.
%B Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004
%V 2
%P II-980 - II-986 Vol.2
%8 2004/06//
%G eng
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on
%D 2004
%T Role of shape and kinematics in human movement analysis
%A Veeraraghavan,A.
%A Chowdhury, A.R.
%A Chellappa, Rama
%K activity classification
%K autoregressive moving average processes
%K computer vision
%K feature extraction
%K gait recognition
%K hidden Markov models
%K human activity modeling
%K human identification
%K human movement analysis
%K image sequences
%K Kendall's shape definition
%K linear dynamical system
%K SHAPE
%K spherical manifold
%X Human gait and activity analysis from video is presently attracting a lot of attention in the computer vision community. In this paper we analyze the role of two of the most important cues in human motion: shape and kinematics. We present an experimental framework whereby it is possible to evaluate the relative importance of these two cues in computer vision based recognition algorithms. In the process, we propose a new gait recognition algorithm by computing the distance between two sequences of shapes that lie on a spherical manifold. In our experiments, shape is represented using Kendall's definition of shape. Kinematics is represented using a Linear Dynamical system. We place particular emphasis on human gait. Our conclusions show that shape plays a role which is more significant than kinematics in current automated gait based human identification algorithms. As a natural extension we study the role of shape and kinematics in activity recognition. Our experiments indicate that we require models that contain both shape and kinematics in order to perform accurate activity classification. These conclusions also allow us to explain the relative performance of many existing methods in computer-based human activity modeling.
%B Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on
%V 1
%P I-730 - I-737 Vol.1
%8 2004/07/02/june
%G eng
%R 10.1109/CVPR.2004.1315104
%0 Conference Paper
%B Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on
%D 2003
%T Activity recognition using the dynamics of the configuration of interacting objects
%A Vaswani, N.
%A RoyChowdhury, A.
%A Chellappa, Rama
%K 2D plane
%K abnormal event detection
%K abnormality detection statistic
%K acoustic sensors
%K activity monitoring
%K activity recognition
%K change detection
%K computer vision
%K configuration dynamics
%K hand-picked location data
%K infrared sensors
%K interacting objects
%K low resolution video
%K mean shape
%K moving point objects
%K noisy observations
%K particle filter
%K pattern learning
%K polygonal shape
%K probability distribution
%K radar sensors
%K SHAPE
%K Surveillance
%K target tracking
%K test sequence
%K video signal processing
%K visible sensors
%X Monitoring activities using video data is an important surveillance problem. A special scenario is to learn the pattern of normal activities and detect abnormal events from a very low resolution video where the moving objects are small enough to be modeled as point objects in a 2D plane. Instead of tracking each point separately, we propose to model an activity by the polygonal 'shape' of the configuration of these point masses at any time t, and its deformation over time. We learn the mean shape and the dynamics of the shape change using hand-picked location data (no observation noise) and define an abnormality detection statistic for the simple case of a test sequence with negligible observation noise. For the more practical case where observation (point locations) noise is large and cannot be ignored, we use a particle filter to estimate the probability distribution of the shape given the noisy observations up to the current time. Abnormality detection in this case is formulated as a change detection problem. We propose a detection strategy that can detect both 'drastic' and 'slow' abnormalities. Our framework can be directly applied for object location data obtained using any type of sensors - visible, radar, infrared or acoustic.
%B Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on
%V 2
%P II-633 - II-640 vol.2
%8 2003/06//
%G eng
%R 10.1109/CVPR.2003.1211526
%0 Conference Paper
%B Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
%D 2003
%T Human body pose estimation using silhouette shape analysis
%A Mittal,A.
%A Liang Zhao
%A Davis, Larry S.
%K 3D structure
%K clutter
%K feature extraction
%K human body pose estimation
%K image segmentation
%K likelihood function
%K multiple views
%K object detection
%K parameter estimation
%K pixel classification
%K SHAPE
%K silhouette shape analysis
%K surveillance
%X We describe a system for human body pose estimation from multiple views that is fast and completely automatic. The algorithm works in the presence of multiple people by decoupling the problems of pose estimation of different people. The pose is estimated based on a likelihood function that integrates information from multiple views and thus obtains a globally optimal solution. Other characteristics that make our method more general than previous work include: (1) no manual initialization; (2) no specification of the dimensions of the 3D structure; (3) no reliance on some learned poses or patterns of activity; (4) insensitivity to edges and clutter in the background and within the foreground. The algorithm has applications in surveillance and promising results have been obtained.
%B Proceedings. IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
%P 263 - 270
%8 2003/07//
%G eng
%R 10.1109/AVSS.2003.1217930
%0 Conference Paper
%B Intelligent Vehicles Symposium, 2003. Proceedings. IEEE
%D 2003
%T Modelling pedestrian shapes for outlier detection: a neural net based approach
%A Nanda,H.
%A Benabdelkedar,C.
%A Davis, Larry S.
%K complex shapes
%K computer vision
%K custom neural net design
%K learning (artificial intelligence)
%K neural nets
%K object recognition
%K outlier detection
%K pedestrian shape modelling
%K recognition rate
%K SHAPE
%K traffic engineering computing
%K two-layer neural net
%X In this paper we present an example-based approach to learn a given class of complex shapes, and recognize instances of that shape with outliers. The system consists of a two-layer custom-designed neural network. We apply this approach to the recognition of pedestrians carrying objects from a single camera. The system is able to capture and model an ample range of pedestrian shapes at varying poses and camera orientations, and achieves a 90% correct recognition rate.
%B Intelligent Vehicles Symposium, 2003. Proceedings. IEEE
%P 428 - 433
%8 2003/06//
%G eng
%R 10.1109/IVS.2003.1212949
%0 Conference Paper
%B 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003. (IROS 2003). Proceedings
%D 2003
%T New eyes for robotics
%A Baker, P.
%A Ogale, A. S
%A Fermüller, Cornelia
%A Aloimonos, J.
%K 3D motion estimation
%K Argus eye
%K array signal processing
%K Birds
%K Calibration
%K CAMERAS
%K Control systems
%K Eyes
%K geometric configuration
%K imaging
%K imaging system
%K Layout
%K Motion estimation
%K multiple stereo configurations
%K Robot kinematics
%K robot vision
%K Robot vision systems
%K ROBOTICS
%K Robotics and automation
%K SHAPE
%K shape models
%X This paper describes an imaging system that has been designed to facilitate robotic tasks of motion. The system consists of a number of cameras in a network arranged so that they sample different parts of the visual sphere. This geometric configuration has provable advantages compared to small field of view cameras for the estimation of the system's own motion and consequently the estimation of shape models from the individual cameras. The reason is that inherent ambiguities of confusion between translation and rotation disappear. Pairs of cameras may also be arranged in multiple stereo configurations which provide additional advantages for segmentation. Algorithms for the calibration of the system and the 3D motion estimation are provided.
%B 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003. (IROS 2003). Proceedings
%I IEEE
%V 1
%P 1018 - 1023 vol.1
%8 2003/10/27/31
%@ 0-7803-7860-1
%G eng
%R 10.1109/IROS.2003.1250761
%0 Conference Paper
%B Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
%D 2003
%T Statistical shape theory for activity modeling
%A Vaswani, N.
%A Chowdhury, A.R.
%A Chellappa, Rama
%K abnormal behavior detection
%K activity modeling
%K activity monitoring
%K image matching
%K image sequences
%K moving point mass
%K normal activity patterns
%K pattern classification
%K polygonal shape
%K SHAPE
%K statistical shape theory
%K Surveillance
%K video signal processing
%X Monitoring activities in a certain region from video data is an important surveillance problem. The goal is to learn the pattern of normal activities and detect unusual ones by identifying activities that deviate appreciably from the typical ones. We propose an approach using statistical shape theory based on the shape model of D.G. Kendall et al. (see "Shape and Shape Theory", John Wiley and Sons, 1999). In a low resolution video, each moving object is best represented as a moving point mass or particle. In this case, an activity can be defined by the interactions of all or some of these moving particles over time. We model this configuration of the particles by a polygonal shape formed from the locations of the points in a frame and the activity by the deformation of the polygons in time. These parameters are learned for each typical activity. Given a test video sequence, an activity is classified as abnormal if the probability for the sequence (represented by the mean shape and the dynamics of the deviations), given the model, is below a certain threshold. The approach gives very encouraging results in surveillance applications using a single camera and is able to identify various kinds of abnormal behavior.
%B Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
%V 3
%P III-493 - III-496 vol.3
%8 2003/04//
%G eng
%R 10.1109/ICASSP.2003.1199519
%0 Conference Paper
%B Pattern Recognition, 2002. Proceedings. 16th International Conference on
%D 2002
%T Content-based image retrieval using Fourier descriptors on a logo database
%A Folkers,A.
%A Samet, Hanan
%K content-based image retrieval
%K contour abstraction
%K edge detection
%K Fourier descriptors
%K image databases
%K logo database
%K pictorial query specification
%K SHAPE
%K shape analysis
%K spatial constraints
%K visual databases
%X A system that enables the pictorial specification of queries in an image database is described. The queries are comprised of rectangle, polygon, ellipse, and B-spline shapes. The queries specify which shapes should appear in the target image as well as spatial constraints on the distance between them and their relative position. The retrieval process makes use of an abstraction of the contour of the shape which is invariant against translation, scale, rotation, and starting point, that is based on the use of Fourier descriptors. These abstractions are used in a system to locate logos in an image database. The utility of this approach is illustrated using some sample queries.
%B Pattern Recognition, 2002. Proceedings. 16th International Conference on
%V 3
%P 521 - 524 vol.3
%8 2002///
%G eng
%R 10.1109/ICPR.2002.1047991
%0 Conference Paper
%B IEEE/RSJ International Conference on Intelligent Robots and Systems, 2002
%D 2002
%T Contour migration: solving object ambiguity with shape-space visual guidance
%A Abd-Almageed, Wael
%A Smith,C.E.
%K Artificial intelligence
%K camera motion
%K CAMERAS
%K Computer vision
%K contour migration
%K Databases
%K edge detection
%K Intelligent robots
%K Laboratories
%K Machine vision
%K object ambiguity
%K Object recognition
%K pattern matching
%K Robot vision systems
%K servomechanisms
%K SHAPE
%K shape matching
%K shape-space visual guidance
%K silhouette matching
%K visual servoing
%X A fundamental problem in computer vision is the issue of shape ambiguity. Simply stated, a silhouette cannot uniquely identify an object or an object's classification since many unique objects can present identical occluding contours. This problem has no solution in the general case for a monocular vision system. This paper presents a method for disambiguating objects during silhouette matching using a visual servoing system. This method identifies the camera motion(s) that gives disambiguating views of the objects. These motions are identified through a new technique called contour migration. The occluding contour's shape is used to identify objects or object classes that are potential matches for that shape. A contour migration is then determined that disambiguates the possible matches by purposive viewpoint adjustment. The technique is demonstrated using an example set of objects.
%B IEEE/RSJ International Conference on Intelligent Robots and Systems, 2002
%I IEEE
%V 1
%P 330 - 335 vol.1
%8 2002///
%@ 0-7803-7398-7
%G eng
%R 10.1109/IRDS.2002.1041410
%0 Conference Paper
%B Frontiers in Handwriting Recognition, 2002. Proceedings. Eighth International Workshop on
%D 2002
%T Hidden loop recovery for handwriting recognition
%A David Doermann
%A Intrator,N.
%A Rivin,E.
%A Steinherz,T.
%K contour partitioning
%K cursive handwriting
%K edge detection
%K form analysis
%K handwritten character recognition
%K handwritten word recognition
%K hidden loop recovery
%K mutual distance measurements
%K SHAPE
%K strokes
%K symmetric shape
%K truncated ellipses
%X One significant challenge in the recognition of off-line handwriting is in the interpretation of loop structures. Although this information is readily available in online representation, close proximity of strokes often merges their centers making them difficult to identify. In this paper a novel approach to the recovery of hidden loops in off-line scanned document images is presented. The proposed algorithm seeks blobs that resemble truncated ellipses. We use a sophisticated form analysis method based on mutual distance measurements between the two sides of a symmetric shape. The experimental results are compared with the ground truth of the online representations of each off-line word image. More than 86% of the meaningful loops are handled correctly.
%B Frontiers in Handwriting Recognition, 2002. Proceedings. Eighth International Workshop on
%P 375 - 380
%8 2002///
%G eng
%R 10.1109/IWFHR.2002.1030939
%0 Journal Article
%J Image Processing, IEEE Transactions on
%D 2002
%T Optimal edge-based shape detection
%A Moon, H.
%A Chellappa, Rama
%A Rosenfeld, A.
%K 1D optimal step edge operator
%K 2D shape detection
%K aerial images
%K boundary contour
%K contour tracking in video
%K derivative of double exponential (DODE) filter
%K edge detection
%K edge-based shape detection
%K error propagation
%K feature extraction
%K filtering theory
%K global contour detection
%K human facial feature detection
%K imaging conditions
%K localization performance
%K mean squared error
%K NOISE
%K noise power
%K optimisation
%K SHAPE
%K shape geometry
%K statistical analysis
%K step function
%K vehicle detection
%X We propose an approach to accurately detecting two-dimensional (2-D) shapes. The cross section of the shape boundary is modeled as a step function. We first derive a one-dimensional (1-D) optimal step edge operator, which minimizes both the noise power and the mean squared error between the input and the filter output. This operator is found to be the derivative of the double exponential (DODE) function, originally derived by Ben-Arie and Rao (1994). We define an operator for shape detection by extending the DODE filter along the shape's boundary contour. The responses are accumulated at the centroid of the operator to estimate the likelihood of the presence of the given shape. This method of detecting a shape is in fact a natural extension of the task of edge detection at the pixel level to the problem of global contour detection. This simple filtering scheme also provides a tool for a systematic analysis of edge-based shape detection. We investigate how the error is propagated by the shape geometry. We have found that, under general assumptions, the operator is locally linear at the peak of the response. We compute the expected shape of the response and derive some of its statistical properties. This enables us to predict both its localization and detection performance and adjust its parameters according to imaging conditions and given performance specifications. Applications to the problem of vehicle detection in aerial images, human facial feature detection, and contour tracking in video are presented.
%B Image Processing, IEEE Transactions on
%V 11
%P 1209 - 1227
%8 2002/11//
%@ 1057-7149
%G eng
%N 11
%R 10.1109/TIP.2002.800896
%0 Conference Paper
%B Multimedia Signal Processing, 2002 IEEE Workshop on
%D 2002
%T Wide baseline image registration using prior information
%A Chowdhury, AM
%A Chellappa, Rama
%A Keaton, T.
%K 2D shape matching
%K 3D model alignment
%K computer vision
%K doubly stochastic matrix
%K error probability
%K feature correspondence
%K feature extraction
%K global spatial configuration
%K holistic 3D face model creation
%K image registration
%K image sequences
%K panoramic view creation
%K robust correspondence algorithm
%K SHAPE
%K Sinkhorn normalization procedure
%K stochastic processes
%K video signal processing
%K viewing angles
%K wide baseline stereo
%X Establishing correspondence between features in two images of the same scene taken from different viewing angles is a challenging problem in image processing and computer vision. However, its solution is an important step in many applications like wide baseline stereo, 3D model alignment, creation of panoramic views etc. In this paper, we propose a technique for registration of two images of a face obtained from different viewing angles. We show that prior information about the general characteristics of a face obtained from video sequences of different faces can be used to design a robust correspondence algorithm. The method works by matching 2D shapes of the different features of the face. A doubly stochastic matrix, representing the probability of match between the features, is derived using the Sinkhorn normalization procedure. The final correspondence is obtained by minimizing the probability of error of a match between the entire constellations of features in the two sets, thus taking into account the global spatial configuration of the features. The method is applied for creating holistic 3D models of a face from partial representations. Although this paper focuses primarily on faces, the algorithm can also be used for other objects with small modifications.
%B Multimedia Signal Processing, 2002 IEEE Workshop on
%P 37 - 40
%8 2002/12//
%G eng
%R 10.1109/MMSP.2002.1203242
%0 Conference Paper
%B Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. CVPR 2001
%D 2001
%T A spherical eye from multiple cameras (makes better models of the world)
%A Baker, P.
%A Fermüller, Cornelia
%A Aloimonos, J.
%A Pless, R.
%K 3D motion estimation
%K Calibration
%K camera network
%K CAMERAS
%K Computer vision
%K egomotion recovery
%K geometric configuration
%K geometric constraint
%K image gradients
%K image sampling
%K imaging system
%K Laboratories
%K Layout
%K Motion estimation
%K multiple cameras
%K Pixel
%K Robot vision systems
%K SHAPE
%K shape models
%K Space technology
%K spherical eye
%K system calibration
%K video
%K video cameras
%K video signal processing
%K visual sphere sampling
%X The paper describes an imaging system that has been designed specifically for the purpose of recovering egomotion and structure from video. The system consists of six cameras in a network arranged so that they sample different parts of the visual sphere. This geometric configuration has provable advantages compared to small field of view cameras for the estimation of the system's own motion and consequently the estimation of shape models from the individual cameras. The reason is that inherent ambiguities of confusion between translation and rotation disappear. We provide algorithms for the calibration of the system and 3D motion estimation. The calibration is based on a new geometric constraint that relates the images of lines parallel in space to the rotation between the cameras. The 3D motion estimation uses a constraint relating structure directly to image gradients.
%B Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. CVPR 2001
%I IEEE
%V 1
%P I-576 - I-583 vol.1
%8 2001///
%@ 0-7695-1272-0
%G eng
%R 10.1109/CVPR.2001.990525
%0 Conference Paper
%B Sixth International Conference on Computer Vision, 1998
%D 1998
%T Which shape from motion?
%A Fermüller, Cornelia
%A Aloimonos, J.
%K 3D motion estimation
%K affine shape
%K Algorithm design and analysis
%K Computer vision
%K distorted version
%K distortion function
%K human visual space distortion
%K HUMANS
%K Image motion analysis
%K image representation
%K Information analysis
%K Layout
%K Motion analysis
%K Motion estimation
%K motion information
%K Psychology
%K rigid transformation
%K SHAPE
%K shape estimation
%K shape representations
%K State estimation
%K visual space
%X In a practical situation, the rigid transformation relating different views is recovered with errors. In such a case, the recovered depth of the scene contains errors, and consequently a distorted version of visual space is computed. What then are meaningful shape representations that can be computed from the images? The result presented in this paper states that if the rigid transformation between different views is estimated in a way that gives rise to a minimum number of negative depth values, then at the center of the image affine shape can be correctly computed. This result is obtained by exploiting properties of the distortion function. The distortion model turns out to be a very powerful tool in the analysis and design of 3D motion and shape estimation algorithms, and as a byproduct of our analysis we present a computational explanation of psychophysical results demonstrating human visual space distortion from motion information.
%B Sixth International Conference on Computer Vision, 1998
%I IEEE
%P 689 - 695
%8 1998/01/04/7
%@ 81-7319-221-9
%G eng
%R 10.1109/ICCV.1998.710792
%0 Journal Article
%J IEEE Software
%D 1995
%T Image-browser taxonomy and guidelines for designers
%A Plaisant, Catherine
%A Carr,D.
%A Shneiderman, Ben
%K analysis
%K Computer Graphics
%K design
%K designer guidelines
%K Equations
%K Europe
%K Evaluation
%K Formal specifications
%K Graphical user interfaces
%K Guidelines
%K IMAGE PROCESSING
%K image-browser taxonomy
%K informal specification technique
%K Laboratories
%K large image browsing
%K Layout
%K Road transportation
%K selected image exploration
%K SHAPE
%K Software design
%K task taxonomy
%K Taxonomy
%K tools
%K two-dimensional browsing
%K user interface management systems
%K visual databases
%X In many applications users must browse large images. Most designers merely use two one-dimensional scroll bars or ad hoc designs for two-dimensional scroll bars. However, the complexity of two-dimensional browsing suggests that more careful analysis, design, and evaluation might lead to significant improvements. Our exploration of existing 2D browsers has led us to identify many features and a wide variety of tasks performed with the browsers. We introduce an informal specification technique to describe 2D browsers and a task taxonomy, suggest design features and guidelines, and assess existing strategies. We focus on the tools to explore a selected image and so do not cover techniques to browse a series of images or to browse large-image databases.
%B IEEE Software
%V 12
%P 21 - 32
%8 1995/03//
%@ 0740-7459
%G eng
%N 2
%R 10.1109/52.368260
%0 Journal Article
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%D 1994
%T A syntactic approach to scale-space-based corner description
%A Fermüller, Cornelia
%A Kropatsch,W.
%K Computer vision
%K corner detection
%K curvature extrema
%K edge detection
%K IMAGE PROCESSING
%K image resolution
%K Image segmentation
%K Laboratories
%K Large-scale systems
%K PARALLEL PROCESSING
%K pattern recognition
%K planar curves
%K resolution
%K Sampling methods
%K sampling problems
%K scale space based corner description
%K SHAPE
%K Smoothing methods
%K syntactic approach
%X Planar curves are described by information about corners integrated over various levels of resolution. The detection of corners takes place on a digital representation. To compensate for ambiguities arising from sampling problems due to the discreteness, results about the local behavior of curvature extrema in continuous scale-space are employed.
%B IEEE Transactions on Pattern Analysis and Machine Intelligence
%V 16
%P 748 - 751
%8 1994/07//
%@ 0162-8828
%G eng
%N 7
%R 10.1109/34.297957
%0 Conference Paper
%B 1993 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1993. Proceedings CVPR '93
%D 1993
%T Early vision processing using a multi-stage diffusion process
%A Yacoob,Yaser
%A Davis, Larry S.
%K 3-D space
%K Computational modeling
%K Computer vision
%K Diffusion processes
%K discontinuity detection
%K early vision processing
%K Educational institutions
%K Image edge detection
%K Image segmentation
%K Laboratories
%K multistage diffusion process
%K Noise shaping
%K noise-free edges
%K noisy edges
%K Performance analysis
%K roof edges
%K segmentation
%K SHAPE
%K shape homogeneous regions
%K step edges
%K valley edges
%X The use of a multistage diffusion process in the early processing of range data is examined. The input range data are interpreted as occupying a volume in 3-D space. Each diffusion stage simulates the process of diffusing part of the boundary of the volume into the volume. The outcome of the process can be used for both discontinuity detection and segmentation into shape homogeneous regions. The process is applied to synthetic noise-free and noisy step, roof, and valley edges as well as to real range images.
%B 1993 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1993. Proceedings CVPR '93
%I IEEE
%P 41 - 46
%8 1993/06//
%@ 0-8186-3880-X
%G eng
%R 10.1109/CVPR.1993.341003
%0 Conference Paper
%B 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1992. Proceedings CVPR '92
%D 1992
%T Multi-resolution shape description by corners
%A Fermüller, Cornelia
%A Kropatsch,W.
%K ambiguities
%K Automation
%K computational geometry
%K Computer vision
%K continuous curves
%K corners
%K curvature extrema
%K curvature information
%K curve fitting
%K digital images
%K Feature extraction
%K IMAGE PROCESSING
%K image resolution
%K Image segmentation
%K Laboratories
%K multiple resolution
%K multiresolution structure
%K parallelizable
%K planar curves
%K Robustness
%K Sampling methods
%K scale-space
%K SHAPE
%K Smoothing methods
%K varying scale
%X A robust method for describing planar curves in multiple resolution using curvature information is presented. The method is developed by taking into account the discrete nature of digital images as well as the discrete aspect of a multiresolution structure (pyramid). The main contribution lies in the robustness of the technique, which is due to the additional information that is extracted from observing the behavior of corners in the whole pyramid. Furthermore, the resulting algorithm is conceptually simple and easily parallelizable. Theoretical results are developed analyzing the curvature of continuous curves in scale-space and showing the behavior of curvature extrema under varying scale. The results are used to eliminate any ambiguities that might arise from sampling problems due to the discreteness of the representation. Experimental results demonstrate the potential of the method.
%B 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1992. Proceedings CVPR '92
%I IEEE
%P 271 - 276
%8 1992/06/15/18
%@ 0-8186-2855-3
%G eng
%R 10.1109/CVPR.1992.223264
%0 Conference Paper
%B Proceedings of 10th International Conference on Pattern Recognition, 1990
%D 1990
%T Purposive and qualitative active vision
%A Aloimonos, J.
%K active vision
%K Automation
%K brain models
%K complex visual tasks
%K Computer vision
%K environmental knowledge
%K highly sophisticated navigational tasks
%K HUMANS
%K Image reconstruction
%K intentions
%K Kinetic theory
%K Laboratories
%K Medusa
%K Motion analysis
%K Navigation
%K planning
%K planning (artificial intelligence)
%K purposive-qualitative vision
%K recovery problem
%K Robust stability
%K Robustness
%K SHAPE
%K stability
%X The traditional view of the problem of computer vision as a recovery problem is questioned, and the paradigm of purposive-qualitative vision is offered as an alternative. This paradigm considers vision as a general recognition problem (recognition of objects, patterns or situations). To demonstrate the usefulness of the framework, the design of the Medusa of CVL is described. It is noted that this machine can perform complex visual tasks without reconstructing the world. If it is provided with intentions, knowledge of the environment, and planning capabilities, it can perform highly sophisticated navigational tasks. It is explained why the traditional structure from motion problem cannot be solved in some cases and why there is reason to be pessimistic about the optimal performance of a structure from motion module. New directions for future research on this problem in the recovery paradigm, e.g., research on stability or robustness, are suggested.
%B Proceedings of 10th International Conference on Pattern Recognition, 1990
%I IEEE
%V i
%P 346 - 360 vol.1
%8 1990/06/16/21
%@ 0-8186-2062-5
%G eng
%R 10.1109/ICPR.1990.118128
%0 Journal Article
%J Proceedings of the IEEE
%D 1988
%T Visual shape computation
%A Aloimonos, J.
%K a priori knowledge
%K active vision
%K computational problems
%K Computer vision
%K computing shape from motion
%K contour
%K cues
%K Focusing
%K HUMANS
%K ill posed problems
%K Machine vision
%K Psychology
%K regularization theory
%K RETINA
%K sense of Hadamard
%K shading
%K SHAPE
%K space of possible solutions
%K Stereo vision
%K Surface texture
%K TEXTURE
%K visual shape computation
%K Visual system
%X Perceptual processes responsible for computing shape from several cues, including shading, texture, contour, and stereo, are examined. It is noted that these computational problems, as well as that of computing shape from motion, are ill-posed in the sense of Hadamard. It is suggested that regularization theory can be used along with a priori knowledge to restrict the space of possible solutions, and thus restore the problem's well-posedness. Some alternative methods are outlined, and the idea of active vision is explored briefly in connection with the problem.
%B Proceedings of the IEEE
%V 76
%P 899 - 916
%8 1988/08//
%@ 0018-9219
%G eng
%N 8
%R 10.1109/5.5964
%0 Report
%D 1986
%T Computing Intrinsic Images.
%A Aloimonos, J.
%K *ARTIFICIAL INTELLIGENCE
%K *COMPUTERS
%K *IMAGE PROCESSING
%K *VISION
%K algorithms
%K CAMERAS
%K CATASTROPHIC CONDITIONS
%K COMPUTATIONS
%K CYBERNETICS
%K ERRORS
%K HUMANS
%K IMAGES
%K INTENSITY
%K LOW LEVEL
%K MATHEMATICS
%K MOTION
%K RETINA
%K Robots
%K SHADOWS
%K SHAPE
%K TEXTURE
%K THEORY
%K THESES.
%X Low level modern computer vision is not domain dependent, but concentrates on problems that correspond to identifiable modules in the human visual system. Several theories have been proposed in the literature for the computation of shape from shading, shape from texture, retinal motion from spatiotemporal derivatives of the image intensity function and the like. The problems with the existing approach are basically the following: (1) The employed assumptions are very strong and so most of the algorithms fail when applied to real images. (2) Usually the constraints from the geometry and the physics of the problem are not enough to guarantee uniqueness of the computed parameters. (3) In most cases the resulting algorithms are not robust, in the sense that if there is a slight error in the input this results in a catastrophic error in the output. In this thesis the problem of machine vision is explored from its basics. A low level mathematical theory is presented for the unique robust computation of intrinsic parameters. The computational aspect of the theory envisages a cooperative highly parallel implementation, bringing in information from five different sources (shading, texture, motion, contour and stereo), to resolve ambiguities and ensure uniqueness and stability of the intrinsic parameters. The problems of shape from texture, shape from shading and motion, visual motion analysis and shape and motion from contour are analyzed in detail.
%8 1986/08//
%G eng
%U http://stinet.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA189440