Friday at 3:30pm in AVW 2120 (or as notified)
Spring, 2005
Organizers: Ramani Duraiswami and David Jacobs
Webmaster: Zhiyun Li
| Date | Speaker | Title |
|---|---|---|
| 02/11/05 | Haibin Ling | Using the Inner-Distance for Classification of Articulated Shapes |
| 02/18/05 | No "Cfar Weekly Seminar" today | |
| 02/25/05 | Bohyung Han | Kernel based Bayesian Filtering for Object Tracking |
| 03/04/05 | Dr. Rakesh (Teddy) Kumar | Creating intelligence by aligning pixels |
| 03/11/05 | Yefeng Zheng | Robust Point Matching for Non-Rigid Shapes: A Relaxation Labeling Based Approach |
| 03/18/05 | | |
| 03/25/05 | Spring break, no seminar | |
| 04/01/05 | Gaurav Aggarwal | Face: Tracking and Recognition |
| 04/08/05 | Naresh Cuntoor | Activity Recognition using HMM-based Event Probability Sequences |
| 04/15/05 | Amit Agrawal | An Algebraic approach to surface reconstruction from gradient fields |
| 04/22/05 | Prof. Ron DeVore | Compression of Digitized Surfaces |
| 04/29/05 | Narayanan Ramanathan | Age Progression in Human Faces: A Computational Approach |
| 05/06/05 | Philip David | Object Recognition in High Clutter Images Using Line Features |
| 02/11/05 | Using the Inner-Distance for Classification of Articulated Shapes |
|---|---|
| Speaker | Haibin Ling, UMD |
| Abstract | We propose using the inner-distance between landmark points to build shape descriptors. The inner-distance is defined as the length of the shortest path between landmark points within the shape silhouette. We show that the inner-distance is articulation insensitive and more effective at capturing complex shapes with part structures than Euclidean distance. To demonstrate this idea, it is used to build a new shape descriptor based on shape contexts. After that, we design a dynamic programming based method for shape matching and comparison. We have tested our approach on a variety of shape databases including an articulated shape dataset, MPEG7 CE-Shape-1, Kimia silhouettes, a Swedish leaf database and a human motion silhouette dataset. In all the experiments, our method demonstrates effective performance compared with other algorithms. PS: the TR version is available at http://www.cs.umd.edu/~hbling/Research/Inner-Distance/inner-dist-tr.pdf |
| 02/25/05 | Kernel based Bayesian Filtering for Object Tracking |
| Speaker | Bohyung Han, UMD |
| Abstract | Particle filtering provides a general framework for propagating probability density functions in non-linear and non-Gaussian systems. However, the algorithm is based on a Monte Carlo approach and sampling is a problematic issue, especially for high dimensional problems. This paper presents a new kernel-based Bayesian filtering framework, which adopts an analytic approach to better approximate and propagate density functions. In this framework, the techniques of density interpolation and density approximation are introduced to represent the likelihood and the posterior densities by Gaussian mixtures, where all parameters such as the number of mixands, their weight, mean, and covariance are automatically determined. The proposed analytic approach is shown to perform sampling more efficiently in high dimensional space. We apply our algorithm to the real-time tracking problem, and demonstrate its performance on real video sequences as well as synthetic examples. |
| 03/04/05 | Creating intelligence by aligning pixels |
| Speaker | Dr. Rakesh (Teddy) Kumar, Sarnoff Corporation, Princeton NJ |
| Abstract | The confluence of Video and Vision research at Sarnoff is strongly influenced by real world applications. Over the past decade or so, advances in fundamental algorithms, and successes in creating customer-oriented technologies and products have produced a unique R&D environment at Sarnoff. This talk will present a tour of algorithm and application development as it has evolved at Sarnoff in the recent past. Specifically, we will present a framework for video representation within which progressively complex video alignment models reveal the underlying 2D and 3D nature of scenes and objects behind moving video pixels. The framework will be highlighted through applications ranging across video surveillance, mapping, entertainment and ATR. Alignment is a key tool in the representation and manipulation of the fundamental information content present in motion video and 3D sequences. Global 2D parametric alignment models reveal the 2D nature of video frames from close-by vantage points. These can be used for stabilization of video sequences and construction of video panoramas and aerial maps. Layered representations of motion video in terms of moving object and multiple scene layers can be exploited for tracking, compression and indexing. Alignment of video frames with generic optical flow and 3D models reveals the rigid and non-rigid structure of objects and scenes, and can be used for video enhancement, 3D recovery, modeling and image based rendering. Alignment of videos and 3D data to stored reference models enables object recognition, augmented reality, video insertion, and targeting. The talk will span these and other related applications of alignment. Finally, we will focus on the key technical area of 3D object recognition. We will present sub-linear techniques for the automatic recognition and classification of 3D objects observed by 3D sensors using a very large database of articulated 3D models. Our methods achieve a probability of correct identification (Pid) of greater than 0.95 for objects in the clear and 0.90 for objects with up to 50% occlusion. |
| Bio | Dr. Rakesh “Teddy” Kumar is currently the Technical Director of the Media Vision Laboratory at Sarnoff Corporation, Princeton, New Jersey. Prior to joining Sarnoff, he was employed at IBM. He received his Ph.D. in Computer Science from the University of Massachusetts at Amherst in 1992. He received his MS in ECE from SUNY Buffalo and BTech in EE from IIT-Kanpur. His technical interests are in the areas of computer vision, computer graphics, image processing and multimedia. At Sarnoff, he has been directing and performing commercial and government research and development projects in the areas of video surveillance and monitoring, video and 3D exploitation and analysis, object recognition, immersive tele-presence, 3D modeling, medical image analysis and multi-sensor registration. He has been one of the principal founders from Sarnoff for multiple spin-off and spin-in companies: VideoBrush, LifeClips and Pyramid Vision Technologies. He is currently the chief technical officer for Pyramid Vision Technologies. Rakesh Kumar received the Sarnoff Technical Achievement awards in 1994 and 1996 for his work in registration of multi-sensor, multi-dimensional medical images and alignment of video to three-dimensional scene models respectively. He was an Associate Editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence from 1999 to 2003. He was an Area Chair for IEEE Computer Vision and Pattern Recognition Conference (CVPR’2004), Program Chair for the first and second IEEE Workshop on Video Registration (IVR’2001 and 2004) and the Fourth IEEE Workshop on Applications of Computer Vision (WACV’98) and has served on the program committee of a number of computer vision conferences and National Science Foundation review panels. He has co-authored more than 40 research publications and has received over 17 patents, with numerous others pending. |
| 03/11/05 | Robust Point Matching for Non-Rigid Shapes: A Relaxation Labeling Based Approach |
| Speaker | Yefeng Zheng, UMD |
| Abstract | Shape matching or image registration, which is often formulated as a point matching problem, is frequently encountered in image analysis, computer vision, and pattern recognition. Although the problem of registering rigid shapes has been widely studied, non-rigid shape matching has recently received more and more attention. For non-rigid shapes, most neighboring points cannot move independently under deformation due to physical constraints. Therefore, though the absolute distance between two points may change significantly, the neighborhood of a point is well preserved in general. Based on this observation, we formulate point matching as a graph matching problem. Each point is a node in the graph, and two nodes are connected by an edge if their Euclidean distance is less than a threshold. The optimal match between two graphs is the one that maximizes the number of matched edges. The shape context distance is used to initialize the graph matching, and relaxation labeling (after enforcing one-to-one matching) is used to refine the matching results. Non-rigid deformation is overcome by bringing one shape closer to the other in each iteration using deformation parameters estimated from the current point correspondence. Experiments on real and synthesized data demonstrate the effectiveness of our approach: it outperforms shape context and TPS-RPM algorithms under non-rigid deformation and noise on a public data set. |
| 04/01/05 | Face: Tracking and Recognition |
| Speaker | Gaurav Aggarwal |
| Abstract | The talk will consist of the following two (related) problems: 1) Video-based Face Recognition: We pose video-to-video face recognition as a dynamical system identification and classification problem. Video-to-video means that both gallery and probe consist of videos. We model a moving face as a linear dynamical system whose appearance changes with pose. An auto-regressive and moving average (ARMA) model is used to represent such a system. The choice of ARMA model is based on its ability to take care of the change in appearance while modeling the dynamics of pose, expression etc. Recognition is performed using the concept of subspace angles to compute distances between probe and gallery video sequences. The results obtained are very promising given the extent of pose, expression and illumination variation in the video data used for experiments. 2) Facial Pose Tracking in Uncalibrated Videos: This paper presents a method to recover the 3D configuration of a face in each frame of a video. The 3D configuration consists of the three translational parameters and the three orientation parameters which correspond to the yaw, pitch and roll of the face. Such information is important for applications like face modeling, recognition, expression analysis, etc. which require head stabilization. The approach combines the structural advantages of geometric modeling with the statistical advantages of a particle-filter based inference. The face is modeled as the curved surface of a cylinder which is free to translate and rotate arbitrarily. The geometric modeling takes care of pose and self-occlusion while the statistical modeling handles moderate occlusion and illumination variations. Experimental results on multiple datasets are provided to show the efficacy of the approach. The insensitivity of our approach to calibration parameters (focal length) is also shown. |
| 04/08/05 | Activity Recognition using HMM-based Event Probability Sequences |
| Speaker | Naresh Cuntoor |
| Abstract | Many activities may be described by using a few key events that are contained in the motion trajectories. These events have a natural physical interpretation, but are not amenable to direct statistical representation, due to semantic ambiguity and large data variability, i.e., the statistical properties of trajectories corresponding to the same activity may vary significantly. It is desirable, however, to learn compact models from the observed data. The hidden Markov model (HMM) is a popular choice for many recognition tasks including speech and simple visual activities. In this talk, I shall present a method to derive events from the observed trajectories using a subset of the state sequences of the HMM. We define a new variable that measures the event probability at each time instant, assuming that the events are localized in time and space. The event probability sequences act as signatures for the activities. Our experiments demonstrate the application of the event-based activity representation to activity recognition and anomaly detection. |
| 04/15/05 | An Algebraic approach to surface reconstruction from gradient fields |
| Speaker | Amit Agrawal |
| Abstract | Several important problems in computer vision such as the Lightness problem, Shape from Shading (SFS), Photometric Stereo (PMS), etc. require reconstructing a surface from an estimated gradient field, which is usually non-integrable, i.e., has non-zero curl. We propose a purely algebraic approach to enforce integrability in the discrete domain. We first show that enforcing integrability can be formulated as solving a single linear system $Ax=b$ over the image. In general, this system is under-determined. We show conditions under which the system can be solved and a method to get to those conditions based on graph theory. The proposed approach is non-iterative, has the important property of local error confinement and can be applied to several problems. Results on the Lightness problem, SFS and PMS demonstrate the applicability of our method. |
| 04/22/05 | Compression of Digitized Surfaces |
| Speaker | Ronald A. DeVore, CSCAMM |
| Abstract | Compression of digitized elevation maps (DEMs) differs from image compression in many ways. Usually there is inherent geometry that needs to be preserved for DEMs. Also, the applications of DEMs require different fidelity metrics than those used for images. This talk will consider the problem of surface encoding at a fundamental level and discuss metrics to measure fidelity, models for surfaces, and geometry-preserving encoders. |
| 04/29/05 | Face verification across Age Progression |
| Speaker | Narayanan Ramanathan |
| Abstract | Human faces undergo a considerable amount of variation with aging. While studies have revealed the extent to which factors such as illumination variations, pose variations, facial expression and occlusions affect face recognition, the role of natural factors such as aging effects in affecting the same is yet to be studied. How does age progression affect the similarity between two images of an individual? What is the confidence associated with establishing the identity between two age separated face images of an individual? On a database of pairs of passport images, we study similarity of faces as a function of time. We propose a Bayesian age-difference classifier that is built on a probabilistic eigenspaces framework. Since age separated face images invariably differ in illumination and have facial variations due to aging, we propose a method to overcome non-uniform illumination across face images. The problem discussed in this paper has direct applications in passport renewal and homeland security. |
| 05/06/05 | Object Recognition in High Clutter Images Using Line Features |
| Speaker | Philip David |
| Abstract | An object recognition algorithm is described that uses model and image line features to locate complex objects in high clutter environments. Finding correspondences between model and image features is the main challenge in most object recognition systems. In our approach, corresponding line features are determined by a three-stage process. The first stage generates a large number of approximate pose hypotheses from correspondences of one or two lines in the model and image. Next, the pose hypotheses from the previous stage are quickly evaluated by comparing local image neighborhoods to the corresponding local model neighborhoods. Fast nearest neighbor algorithms are used to implement a distance measure that is unaffected by clutter and partial occlusion. Finally, a robust pose estimation algorithm is applied for refinement and verification, starting from the few best approximate poses produced by the previous stages. Our algorithm is invariant to changes in image scale and orientation, and partially invariant to affine distortion. Experiments on real images demonstrate robust recognition of partially occluded objects in very high clutter environments. |
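
As a small illustration of the graph formulation described in the 03/11/05 abstract (points as nodes, an edge between two points whose Euclidean distance is below a threshold, and a match scored by its number of matched edges), the sketch below may help. It is not the speaker's implementation: it covers only the scoring step, omits the shape context initialization and relaxation labeling, and the function names, threshold value, and toy point sets are all hypothetical.

```python
from itertools import combinations
from math import dist


def neighborhood_graph(points, threshold):
    """Return the undirected edges (i, j), i < j, joining points closer than threshold."""
    return {
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if dist(points[i], points[j]) < threshold
    }


def matched_edges(edges_a, edges_b, correspondence):
    """Count edges of graph A whose two endpoints map onto an edge of graph B.

    correspondence maps point indices in A to point indices in B (one-to-one);
    unmatched points are simply absent from the dictionary.
    """
    count = 0
    for i, j in edges_a:
        if i in correspondence and j in correspondence:
            u, v = correspondence[i], correspondence[j]
            if (min(u, v), max(u, v)) in edges_b:
                count += 1
    return count


# Hypothetical toy data: a straight three-point chain, and the same chain bent
# at its middle point and listed in a different order.
shape_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
shape_b = [(1.0, 1.0), (0.0, 0.0), (1.0, 0.0)]

edges_a = neighborhood_graph(shape_a, threshold=1.2)
edges_b = neighborhood_graph(shape_b, threshold=1.2)

good = {0: 1, 1: 2, 2: 0}  # respects the chain structure
bad = {0: 1, 1: 0, 2: 2}   # swaps the middle point with an end point

print(matched_edges(edges_a, edges_b, good))
print(matched_edges(edges_a, edges_b, bad))
```

In this toy case the distance between the two end points changes under the bend, yet each point keeps its neighbors, so the correspondence that respects the chain preserves more edges than the one that does not.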
Previous Semesters: 2004-Spring 2004-Fall
Questions/comments: zli(at)cs.umd.edu