Multimodal Tracking for Smart Videoconferencing and Video Surveillance

Title: Multimodal Tracking for Smart Videoconferencing and Video Surveillance
Publication Type: Conference Papers
Year of Publication: 2007
Authors: Zotkin DN, Raykar VC, Duraiswami R, Davis LS
Conference Name: Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference on
Date Published: 2007/06
Keywords: 3D motion; Monte-Carlo simulations; maximum likelihood estimator; multimodal tracking; multiple cameras; multiple microphone arrays; nonlinear least squares problem; particle filter; self-calibration algorithm; smart videoconferencing; video surveillance; Monte Carlo methods; image motion analysis; least squares approximations; particle filtering (numerical methods); teleconferencing; video signal processing

Many applications require the ability to track the 3-D motion of subjects. We build a particle-filter-based framework for multimodal tracking using multiple cameras and multiple microphone arrays. To calibrate the resulting system, we propose a method to determine the locations of all microphones using at least five loudspeakers, under the assumption that each loudspeaker has a microphone very close to it. We derive the maximum likelihood (ML) estimator, which reduces to the solution of a nonlinear least-squares problem. We verify the correctness and robustness of the multimodal tracker and of the self-calibration algorithm both with Monte-Carlo simulations and on real data from three experimental setups.
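
As a rough illustration of the particle-filter idea mentioned in the abstract, the sketch below fuses two measurement streams in a bootstrap particle filter. It is not the authors' implementation. The 1-D state, the random-walk motion model, the Gaussian likelihoods, the conditional independence of the two modalities, and all names (`track`, `video_std`, `audio_std`, `process_std`) are assumptions made purely for illustration.

```python
import math
import random


def track(observations, n_particles=500, process_std=0.3,
          video_std=0.2, audio_std=1.0, seed=0):
    """Bootstrap particle filter for a 1-D position (illustrative sketch).

    observations: list of (video_measurement, audio_measurement) pairs,
    one pair per time step. Returns the weighted-mean estimate per step.
    """
    rng = random.Random(seed)
    # Initialise particles around the first video measurement (assumption).
    first_video = observations[0][0]
    particles = [rng.gauss(first_video, video_std) for _ in range(n_particles)]
    estimates = []
    for z_video, z_audio in observations:
        # Predict: propagate each particle through a random-walk motion model.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Update: assuming the modalities are conditionally independent,
        # the joint likelihood is the product of the per-modality likelihoods.
        weights = [
            math.exp(-0.5 * ((z_video - p) / video_std) ** 2)
            * math.exp(-0.5 * ((z_audio - p) / audio_std) ** 2)
            for p in particles
        ]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: weighted mean of the particle positions.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Under these assumptions the more precise modality (here the hypothetical video measurement) dominates the weights, while the noisier one still contributes. A 3-D tracker would replace the scalar state with a position vector and the Gaussian terms with camera and microphone-array measurement models.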