Bilattice-based Logical Reasoning for Human Detection

Summary: This work proposes a logic-based approach that explicitly reasons about and detects humans under partial occlusion by integrating visual and non-visual information. In this framework, knowledge about contextual cues, scene geometry and human body constraints is encoded as rules in a logic programming language and applied to the output of low-level, parts-based detectors. Positive and negative information from different rules, as well as uncertainties from the detections, is integrated within the bilattice framework. The bilattice is shown in the figure below, where each element is of the form (evidence_for, evidence_against). The framework also generates proofs, or justifications (shown in the figure below), for each hypothesis it proposes. These justifications (or the lack thereof) are then used by the system to explain and validate, or reject, potential hypotheses. This allows the system to reason explicitly about complex interactions between humans and to handle occlusions. The proofs are also available to the end user as an explanation of why the system considers a particular hypothesis to be a human. We employ a detector based on a boosted cascade of gradient histograms to detect individual body parts.

Example of human detection using the bilattice based logical reasoning approach.

Uncertainties assigned to logical rules and facts are taken from a set structured as a bilattice. In a bilattice for continuous-valued logic, every element is of the form (evidence_for, evidence_against).
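To make the (evidence_for, evidence_against) form concrete, here is a minimal sketch of a bilattice over pairs in [0, 1]. It assumes the textbook min/max operators on the unit square; the class name `Belief` and its method names are illustrative, and the operators used in the paper itself may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    """One bilattice element: (evidence_for, evidence_against), each in [0, 1]."""
    evidence_for: float
    evidence_against: float

    def leq_truth(self, other):
        # Truth ordering: more evidence for and less evidence against is "truer".
        return (self.evidence_for <= other.evidence_for
                and self.evidence_against >= other.evidence_against)

    def leq_knowledge(self, other):
        # Knowledge ordering: more evidence of either kind is "more informed".
        return (self.evidence_for <= other.evidence_for
                and self.evidence_against <= other.evidence_against)

    def conj(self, other):
        # Meet along the truth axis (logical AND), assumed min/max here.
        return Belief(min(self.evidence_for, other.evidence_for),
                      max(self.evidence_against, other.evidence_against))

    def disj(self, other):
        # Join along the truth axis (logical OR).
        return Belief(max(self.evidence_for, other.evidence_for),
                      min(self.evidence_against, other.evidence_against))

    def combine(self, other):
        # Join along the knowledge axis: pool evidence from two sources.
        return Belief(max(self.evidence_for, other.evidence_for),
                      max(self.evidence_against, other.evidence_against))

    def negate(self):
        # Negation swaps the roles of the two kinds of evidence.
        return Belief(self.evidence_against, self.evidence_for)


TRUE = Belief(1.0, 0.0)
FALSE = Belief(0.0, 1.0)
UNKNOWN = Belief(0.0, 0.0)
CONTRADICTION = Belief(1.0, 1.0)
```

Keeping the two kinds of evidence separate lets a hypothesis be "unknown" (no evidence either way) or "contradictory" (strong evidence both ways), which a single probability cannot distinguish.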

Proof for human marked with arrow in figure above.

Related Publications

  • Vinay D. Shet, Jan Neumann, Visvanathan Ramesh and Larry S. Davis, Bilattice-based Logical Reasoning for Human Detection, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2007, Minneapolis, MN [pdf][Abstract & BibRef]

Nonmonotonic Reasoning for Visual Surveillance

Summary: Proposed a logic-based framework that attempts to emulate human "common sense reasoning" to perform certain computer vision tasks. The framework employs "multivalued default logic" to perform high-level reasoning and allows us to augment the output of various low-level image processing modules, in a principled manner, with contextual cues present in the environment being observed. Specifically, we have used this framework in a visual surveillance setup to address the problem of identifying (matching) individuals across large visibility gaps, and further to handle the uncertainties associated with recognizing the activities that these individuals perform. While default logic allows us to commit errors in reasoning and subsequently recover upon acquisition of new information (nonmonotonic reasoning), multivalued belief states allow us to encode our confidence in the decisions we take based on the amount of uncertainty.

Uncertainty in establishing identity feeds into uncertainty in recognizing activity: If X==Y, then no theft has occurred, while if X!=Y, then possible theft.
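The dependency between identity and activity, and the ability to retract a conclusion when new information arrives, can be sketched as follows. This is a toy illustration only: the priority numbers, the two rules, and the function names `conclude` and `theft_status` are assumptions made for the example, not the rules used in the published system.

```python
def conclude(rules):
    """Return the conclusion of the highest-priority rule, or None if there are no rules."""
    if not rules:
        return None
    best = max(rules, key=lambda rule: rule[0])
    return best[1]


def theft_status(same_person):
    """Map the current belief about X == Y to a belief about theft."""
    if same_person is None:
        return "unknown"
    return "no theft" if same_person else "possible theft"


# Each rule is (priority, conclusion about X == Y); both rules are hypothetical.
rules = [(1, True)]        # default: similar appearance, so assume X == Y
before = theft_status(conclude(rules))

rules.append((2, False))   # stronger evidence arrives later: X and Y are distinct
after = theft_status(conclude(rules))

print(before, "->", after)
```

The earlier conclusion is not deleted; it is simply overridden by a higher-priority rule, which is the nonmonotonic behaviour described in the summary.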

The design of our reasoning framework draws heavily upon reasoning exhibited by humans.

Bilattice for prioritized default logic employed in our work.

Overview of logical reasoning based identity maintenance and activity recognition: Narrated by Prof. Larry S. Davis.

Related Publications

  • Vinay D. Shet, David Harwood and Larry S. Davis, Multivalued Default Logic for Identity Maintenance in Visual Surveillance, 9th European Conference on Computer Vision (ECCV), 2006, Graz, Austria [pdf][Abstract & BibRef]

  • Vinay D. Shet, David Harwood and Larry S. Davis, Top-Down, Bottom-up Multivalued Default Reasoning for Identity Maintenance, 4th ACM International Workshop on Video Surveillance & Sensor Networks (VSSN) in conjunction with ACM Multimedia, Santa Barbara, CA, USA, 2006 [pdf]

Logical Reasoning based Activity Recognition

Summary: Developed a surveillance system that combines Prolog-based logic programming with real-time image processing algorithms, such as background subtraction and tracking, to recognize complex human activities. Used this tool to take input from multiple cameras and to recognize several activities, including unauthorized entry, escorted entry, unattended package, attended package, theft, and accomplice/witness to a theft, as well as other events such as interactions between people.
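As an illustration of how a high-level rule can sit on top of tracker output, the sketch below checks for an "unattended package". It is written in Python rather than Prolog, and the track format (frame number mapped to an (x, y) position), the distance threshold, and the frame count are all assumptions made for the example, not details of the actual system.

```python
from math import hypot


def unattended_package(package_track, person_tracks, max_distance=2.0, min_frames=3):
    """True if nobody is near the package for min_frames consecutive observed frames."""
    run = 0
    for frame, (px, py) in sorted(package_track.items()):
        # The package is attended if any tracked person is within max_distance.
        attended = any(
            frame in track
            and hypot(track[frame][0] - px, track[frame][1] - py) <= max_distance
            for track in person_tracks
        )
        run = 0 if attended else run + 1
        if run >= min_frames:
            return True
    return False


# Hypothetical tracks: the owner stands next to the package, then walks away.
package = {frame: (0.0, 0.0) for frame in range(6)}
owner = {0: (1.0, 0.0), 1: (1.0, 0.0), 2: (10.0, 0.0),
         3: (10.0, 0.0), 4: (10.0, 0.0), 5: (10.0, 0.0)}
print(unattended_package(package, [owner]))
```

In a logic programming setting the same condition would be stated declaratively as a rule over facts asserted by the tracking modules.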

Overview of system architecture.

Video showing results of logical reasoning based activity recognition.

Related Publications

  • Vinay D. Shet, David Harwood and Larry S. Davis, VidMAP: Video Monitoring of Activity with Prolog, IEEE International Conference on Advanced Video and Signal based Surveillance (AVSS), Como, Italy, September 2005 [pdf][Abstract & BibRef]

Gesture Recognition

Summary: Developed a gesture recognition system to control a robot-driven vehicle (HUMVee) with hand gestures. Employed a probabilistic framework that combines shape and motion exemplars to differentiate between 14 hand gestures. Addressed the problem of capturing dynamics in exemplar-based classification systems by proposing a non-parametric HMM learning approach that learns observation densities directly from training data. Demonstrated that the system can operate under camera motion by tracking the person as they gesture.
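The idea of learning observation densities directly from training data can be sketched with a kernel density estimate plugged into the standard HMM forward algorithm. This is a simplified illustration under stated assumptions: observations are scalar features rather than shape and motion exemplars, the Gaussian kernel and its bandwidth are arbitrary choices, and the function names are hypothetical.

```python
from math import exp, pi, sqrt


def kernel_density(x, samples, bandwidth=1.0):
    """Gaussian kernel density estimate at x from 1-D training samples."""
    norm = 1.0 / (len(samples) * bandwidth * sqrt(2.0 * pi))
    return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)


def forward_likelihood(observations, initial, transition, state_samples):
    """HMM forward algorithm using kernel-estimated observation densities."""
    n = len(initial)
    # Initialise with the prior over states times the first observation density.
    alpha = [initial[i] * kernel_density(observations[0], state_samples[i])
             for i in range(n)]
    for obs in observations[1:]:
        # Propagate through the transition matrix, then weight by the density.
        alpha = [
            kernel_density(obs, state_samples[j])
            * sum(alpha[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state model with training samples clustered near 0 and 5.
state_samples = [[0.0, 0.1, -0.1], [5.0, 5.1, 4.9]]
initial = [0.5, 0.5]
transition = [[0.9, 0.1], [0.1, 0.9]]
steady = forward_likelihood([0.0, 0.0, 0.0], initial, transition, state_samples)
jumpy = forward_likelihood([0.0, 5.0, 0.0], initial, transition, state_samples)
print(steady > jumpy)
```

For classification, one such model would be kept per gesture and the gesture whose model gives the highest likelihood would be chosen.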

Related Publications

  • Vinay D. Shet, V. Shiv Naga Prasad, Ahmed Elgammal, Yaser Yacoob and Larry Davis, Multi-cue Exemplar-based Nonparametric Model for Gesture Recognition, Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), Kolkata, India, Dec 2004, 656-662. [pdf][Abstract & BibRef] (Recipient of Honorary Mention for Best Paper Award)

  • Ahmed Elgammal, Vinay D. Shet, Yaser Yacoob and Larry Davis, Learning Dynamics for Exemplar based Gesture Recognition, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Wisconsin, 2003 [pdf][Abstract & BibRef]

  • Ahmed Elgammal, Vinay D. Shet, Yaser Yacoob and Larry Davis, Exemplar-based Tracking and Recognition of Arm Gestures, 3rd IEEE International Symposium on Image and Signal Processing and Analysis (ISPA), Rome, Italy, 2003 [pdf]

  • Ahmed Elgammal, Vinay D. Shet, Yaser Yacoob and Larry Davis, Gesture Recognition Using a Probabilistic Framework for Pose Matching, IEEE International Conference on Control Automation Robotics and Computer Vision (ICARCV), Singapore, 2002 [pdf]

Biometrics - Automatic Fingerprint Identification System (Undergraduate research)

Summary: Designed and implemented an algorithm that reliably extracts features (minutiae) from raw fingerprint images and matches them against those already present in the database. Segmentation was achieved using Gabor filtering based on ridge orientation, skeletonisation, and extraction of features such as ridge endings and bifurcations. Fast feature matching was achieved using a deformation model that enforced certain geometric constraints (like radial distortion) that were pertinent to access control applications.
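Once a fingerprint has been skeletonised, ridge endings and bifurcations are commonly located with the crossing-number test, sketched below. Whether this project used that exact test is an assumption; the function names and the representation of the skeleton as a list of rows of 0/1 values are likewise illustrative.

```python
# Offsets of the 8 neighbours, listed in circular order around the centre pixel.
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def crossing_number(skeleton, row, col):
    """Half the number of 0/1 transitions around the 8-neighbourhood."""
    ring = [skeleton[row + dr][col + dc] for dr, dc in NEIGHBOURS]
    return sum(abs(ring[i] - ring[(i + 1) % 8]) for i in range(8)) // 2


def find_minutiae(skeleton):
    """Return (ridge_endings, bifurcations) as lists of (row, col) positions."""
    endings, bifurcations = [], []
    # Border pixels are skipped so every examined pixel has 8 neighbours.
    for row in range(1, len(skeleton) - 1):
        for col in range(1, len(skeleton[0]) - 1):
            if skeleton[row][col] != 1:
                continue
            cn = crossing_number(skeleton, row, col)
            if cn == 1:
                endings.append((row, col))
            elif cn == 3:
                bifurcations.append((row, col))
    return endings, bifurcations


# Hypothetical skeleton: a Y-shaped ridge with one bifurcation and three endings.
y_ridge = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
print(find_minutiae(y_ridge))
```

The resulting minutiae positions are the kind of feature set that a matching stage would then compare against stored templates.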