Probabilistic recognition of human faces from video

Title: Probabilistic recognition of human faces from video
Publication Type: Journal Articles
Year of Publication: 2003
Authors: Zhou S, Krueger V, Chellappa R
Journal: Computer Vision and Image Understanding
Pagination: 214-245
Date Published: 2003/07
ISBN Number: 1077-3142
Keywords: Exemplar-based learning, face recognition, sequential importance sampling, Still-to-video, Time series state space model, Video-to-video

Recognition of human faces using a gallery of still or video images and a probe set of videos is systematically investigated using a probabilistic framework. In still-to-video recognition, where the gallery consists of still images, a time series state space model is proposed to fuse temporal information in a probe video, which simultaneously characterizes the kinematics and identity using a motion vector and an identity variable, respectively. The joint posterior distribution of the motion vector and the identity variable is estimated at each time instant and then propagated to the next time instant. Marginalization over the motion vector yields a robust estimate of the posterior distribution of the identity variable. A computationally efficient sequential importance sampling (SIS) algorithm is developed to estimate the posterior distribution. Empirical results demonstrate that, due to the propagation of the identity variable over time, a degeneracy in posterior probability of the identity variable is achieved to give improved recognition. The gallery is generalized to videos in order to realize video-to-video recognition. An exemplar-based learning strategy is adopted to automatically select video representatives from the gallery, serving as mixture centers in an updated likelihood measure. The SIS algorithm is applied to approximate the posterior distribution of the motion vector, the identity variable, and the exemplar index, whose marginal distribution of the identity variable produces the recognition result. The model formulation is very general and it allows a variety of image representations and transformations. Experimental results using images/videos collected at UMD, NIST/USF, and CMU with pose/illumination variations illustrate the effectiveness of this approach for both still-to-video and video-to-video scenarios with appropriate model choices.
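
The still-to-video idea in the abstract (propagate a joint posterior over motion and identity, then marginalize over motion) can be illustrated with a small toy sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the 1-D "images", the circular-shift motion model, the Gaussian likelihood, the random-walk motion prior, and the names shift, likelihood, and sis_identity_posterior. It also resamples at every step, a common variant of sequential importance sampling, and makes no attempt at the computational efficiency the abstract mentions.

```python
import math
import random


def shift(template, k):
    """Circularly shift a 1-D template by k positions (a toy stand-in for motion)."""
    k %= len(template)
    return template[-k:] + template[:-k]


def likelihood(obs, template, theta, sigma):
    """Unnormalized Gaussian likelihood of obs given a template shifted by theta."""
    pred = shift(template, theta)
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return math.exp(-sq_err / (2.0 * sigma ** 2))


def sis_identity_posterior(gallery, frames, n_particles=300, sigma=0.5, seed=0):
    """Return, for each frame, the marginal posterior over gallery identities.

    Hypothetical toy model: each particle is a joint sample (theta, identity).
    The motion theta follows a random walk; the identity stays fixed over time.
    """
    rng = random.Random(seed)
    length = len(gallery[0])
    # Uniform prior over both the motion parameter and the identity.
    particles = [(rng.randrange(length), rng.randrange(len(gallery)))
                 for _ in range(n_particles)]
    history = []
    for obs in frames:
        # Propagate: perturb the motion, keep the identity unchanged.
        particles = [((th + rng.choice((-1, 0, 1))) % length, n)
                     for th, n in particles]
        # Weight each particle by how well it explains the current frame.
        weights = [likelihood(obs, gallery[n], th, sigma) for th, n in particles]
        total = sum(weights)
        if total > 0.0:
            weights = [w / total for w in weights]
        else:
            weights = [1.0 / n_particles] * n_particles
        # Marginalize over motion: sum the weights of particles per identity.
        posterior = [0.0] * len(gallery)
        for (_, n), w in zip(particles, weights):
            posterior[n] += w
        history.append(posterior)
        # Resample so the next step starts from equally weighted particles.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return history


if __name__ == "__main__":
    # Three toy gallery "stills"; the probe "video" shows identity 1 drifting.
    gallery = [[3, 0, 0, 0, 0, 0, 0, 0],
               [3, 3, 0, 0, 0, 0, 0, 0],
               [3, 0, 3, 0, 0, 0, 0, 0]]
    frames = [shift(gallery[1], 2 + t) for t in range(5)]
    for t, post in enumerate(sis_identity_posterior(gallery, frames)):
        print(t, [round(p, 3) for p in post])
```

In this toy setting the identity posterior tends to concentrate on a single gallery entry as frames accumulate, which mirrors the degeneracy of the identity posterior that the abstract reports. A real system would replace the shift with an image transformation and the scalar templates with image representations.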