Object Tracking and Recognition in Videos

Shape and Behavior Encoded Tracking of Bee Dances or

Simultaneous Tracking and Behavior Recognition

Behavior analysis of social insects has gained momentum in recent years and has led to advances in fields such as control systems and flight navigation. Manually labeling insect motions in order to analyze insect behavior requires a significant investment of time and effort. In this paper, we propose general principles for simultaneous automatic tracking and behavior analysis, with applications to tracking bees and recognizing specific behaviors they exhibit. The state space for tracking is defined by the position, the orientation and the current behavior of the insect being tracked. The position and orientation are parametrized using a shape model, while the behavior is explicitly modeled using a three-tier hierarchical motion model. The first tier (dynamics) models the local motions exhibited by the insect, and the models built in this tier act as a vocabulary for behavior modeling. The second tier is a Markov motion model built on top of the local motion vocabulary; it serves as the behavior model. The third tier models the switching between behaviors and is also a Markov model. We address issues in learning the three-tier behavioral model, in discriminating between models, and in detecting and modeling abnormal behaviors. Another important aspect of this work is that it performs tracking and behavior analysis jointly, instead of following the traditional track-then-recognize approach. We apply these principles to tracking bees in a hive while they execute the waggle dance and the round dance.


Ashok Veeraraghavan, Rama Chellappa and Mandyam Srinivasan. "Shape and Behavior Encoded Tracking of Bee Dances". Accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). [pdf]
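
The three-tier hierarchy described in the abstract above can be illustrated with a small generative sketch. This is not the authors' implementation: the motion labels, the transition probabilities and the function names (simulate, log_likelihood) are all hypothetical, chosen only to show how a vocabulary of local motions, a per-behavior Markov model and a behavior-switching Markov model fit together. The shape model and the tracker itself are omitted.

```python
import math
import random

# Tier 1: vocabulary of local motions (hypothetical labels, not from the paper).
MOTIONS = ["straight", "waggle", "turn_left", "turn_right"]

# Tier 2: one Markov chain over the motion vocabulary per behavior.
# Row i is P(next motion | current motion = MOTIONS[i]); all numbers are made up.
BEHAVIORS = {
    "waggle_dance": [
        [0.10, 0.80, 0.05, 0.05],
        [0.05, 0.75, 0.10, 0.10],
        [0.60, 0.10, 0.25, 0.05],
        [0.60, 0.10, 0.05, 0.25],
    ],
    "round_dance": [
        [0.20, 0.00, 0.40, 0.40],
        [0.25, 0.25, 0.25, 0.25],
        [0.10, 0.00, 0.85, 0.05],
        [0.10, 0.00, 0.05, 0.85],
    ],
}

# Tier 3: Markov switching between behaviors (made-up probabilities).
SWITCH = {
    "waggle_dance": {"waggle_dance": 0.95, "round_dance": 0.05},
    "round_dance": {"waggle_dance": 0.05, "round_dance": 0.95},
}


def simulate(n_steps, seed=0):
    """Sample (behavior, motion) pairs from the three-tier model."""
    rng = random.Random(seed)
    behavior, motion = "waggle_dance", 0
    trace = []
    for _ in range(n_steps):
        # Tier 3: possibly switch behavior.
        names = list(SWITCH[behavior])
        weights = [SWITCH[behavior][name] for name in names]
        behavior = rng.choices(names, weights=weights, k=1)[0]
        # Tier 2: pick the next local motion under the current behavior.
        row = BEHAVIORS[behavior][motion]
        motion = rng.choices(range(len(MOTIONS)), weights=row, k=1)[0]
        trace.append((behavior, MOTIONS[motion]))
    return trace


def log_likelihood(motion_indices, behavior):
    """Log-probability of a motion sequence under one behavior's Markov chain."""
    table = BEHAVIORS[behavior]
    pairs = zip(motion_indices, motion_indices[1:])
    # A small floor avoids log(0) for transitions the model forbids.
    return sum(math.log(max(table[a][b], 1e-12)) for a, b in pairs)
```

Comparing log_likelihood for the same motion sequence under each behavior gives one simple way to discriminate between behavior models; a sequence that scores poorly under every model could be flagged as a candidate abnormal behavior.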

3D Facial Pose Tracking in Uncalibrated Videos

This paper presents a method to recover the 3D configuration of a face in each frame of a video. The 3D configuration consists of three translational parameters and three orientation parameters, the latter corresponding to the yaw, pitch and roll of the face. Recovering this configuration is important for applications such as face modeling, recognition and expression analysis. The approach combines the structural advantages of geometric modeling with the statistical advantages of particle-filter-based inference. The face is modeled as the curved surface of a cylinder that is free to translate and rotate arbitrarily. The geometric modeling takes care of pose and self-occlusion, while the statistical modeling handles moderate occlusion and illumination variations. Experimental results on multiple datasets are provided to show the efficacy of the approach. The insensitivity of our approach to calibration parameters (focal length) is also shown.

Gaurav Aggarwal, Ashok Veeraraghavan and Rama Chellappa. "3D Facial Pose Tracking in Uncalibrated Videos". International Conference on Pattern Recognition and Machine Intelligence (PReMI), 2005. Published in Lecture Notes in Computer Science, Volume 3776, Dec 2005, Pages 515-520. [pdf] [ppt] [TrackingResult1] [TrackingResult2]
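
The geometric half of the approach above, a cylinder with six pose parameters that accounts for self-occlusion, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the rotation order (roll, then yaw, then pitch), the camera convention (pinhole at the origin looking along +z), the default radius and focal length, and the function names are all hypothetical. The particle filter that would score many such pose hypotheses against the image is not shown.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(rot, vec):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(rot[i][k] * vec[k] for k in range(3)) for i in range(3)]


def rotation_matrix(yaw, pitch, roll):
    """R = Rz(roll) * Ry(yaw) * Rx(pitch); this ordering is an assumption."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    rot_y = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rot_x = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    rot_z = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rot_z, matmul(rot_y, rot_x))


def project_cylinder_point(theta, height, pose, radius=1.0, focal=500.0):
    """Project one point of the cylinder surface; return ((u, v), visible).

    pose = (yaw, pitch, roll, tx, ty, tz). The cylinder axis is the object
    y-axis, and theta is the angle around that axis.
    """
    yaw, pitch, roll, tx, ty, tz = pose
    rot = rotation_matrix(yaw, pitch, roll)
    point = [radius * math.sin(theta), height, radius * math.cos(theta)]
    normal = [math.sin(theta), 0.0, math.cos(theta)]
    p = [c + t for c, t in zip(apply(rot, point), (tx, ty, tz))]
    n = apply(rot, normal)
    if p[2] <= 0.0:
        return None, False  # behind the camera
    # Self-occlusion: the point is visible only if its outward normal
    # faces back toward the camera centre.
    visible = sum(pc * nc for pc, nc in zip(p, n)) < 0.0
    return (focal * p[0] / p[2], focal * p[1] / p[2]), visible
```

In a tracker of this kind, each pose hypothesis would typically be evaluated by comparing image appearance at the visible projected points with a reference texture; the sketch stops at the projection and visibility test.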

Motion Based Correspondence for 3D Tracking of Multiple Dim Objects

Tracking multiple objects in a video is a demanding task that is frequently encountered in systems such as surveillance and motion analysis. Tracking objects in 3D requires the use of multiple cameras. When tracking multiple objects with multiple video cameras, establishing correspondence between objects in the various cameras is a nontrivial task. Specifically, when the targets are dim or very far from the camera, appearance cannot be used to establish this correspondence. Here, we propose a technique to establish correspondence across cameras using motion features extracted from the targets, even when the relative position of the cameras is unknown. Experimental results are provided for the problem of tracking multiple bees in natural flight using two cameras. The reconstructed 3D flight paths of the bees show some interesting flight patterns.

Ashok Veeraraghavan, Mandyam Srinivasan, Rama Chellappa, Emily Baird and Richard Lamont. "Motion Based Correspondence for 3D Tracking of Multiple Dim Objects". In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing 2006 (ICASSP). [pdf] [ppt] [StereoTrackingResult] [3DFlightPathReconstruction]
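
The idea of matching targets across cameras by motion rather than appearance can be sketched in a few lines. The abstract does not say which motion features are used, so the feature chosen here (the frame-to-frame image speed profile of each track), the normalized-correlation score and the exhaustive assignment search are all assumptions made for illustration, as are the function names. The sketch also assumes both cameras see the same number of targets over the same synchronized frames.

```python
import itertools
import math


def speed_profile(track):
    """Frame-to-frame image speed of a 2D track [(x, y), ...]."""
    return [math.hypot(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(track, track[1:])]


def normalized_correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def match_tracks(tracks_a, tracks_b):
    """Return a list m where track i in camera A matches track m[i] in camera B.

    Tries every one-to-one assignment, which is only practical for a
    handful of targets.
    """
    profiles_a = [speed_profile(t) for t in tracks_a]
    profiles_b = [speed_profile(t) for t in tracks_b]
    best, best_score = None, -math.inf
    for perm in itertools.permutations(range(len(profiles_b))):
        score = sum(normalized_correlation(profiles_a[i], profiles_b[j])
                    for i, j in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return list(best)
```

Because the correlation is normalized, the score ignores the overall scale and offset of each speed profile, which is one reason a feature of this kind might tolerate unknown relative camera placement; whether it does so in practice depends on the scene geometry.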