I am a postdoctoral research associate at the University of Maryland,
associated with the Computer Vision Lab and the Institute for
Advanced Computer Studies. I received my doctoral degree in Computer
Science at the University of Maryland, College Park, with Prof. Larry Davis
as my advisor. I completed my B.S. and M.S. degrees at Penn State University (with Prof.
Octavia Camps as my research advisor). My research interests are in
Computer Vision, Pattern Recognition, Machine Learning, and Artificial
Intelligence.
Analyzing Activities Involving Interacting People
I am currently working on analyzing activities, such as basketball,
that involve multiple interacting people. This work entails low-
and mid-level analysis, such as tracking humans and recognizing
actions, as well as high-level reasoning that incorporates external
knowledge about the world.
Tracking People's Hands and Feet Using Mixed Network AND/OR Search
We describe a framework that leverages mixed probabilistic and deterministic networks and their
AND/OR search space to efficiently find and track the hands and feet of multiple interacting humans in
2D from a single camera view. Our framework detects and tracks multiple people's heads, hands, and feet
through partial or full occlusion; requires few constraints (does not require multiple views, high image
resolution, knowledge of performed activities, or large training sets); and makes use of constraints and
AND/OR Branch-and-Bound with lazy evaluation and carefully computed bounds to efficiently solve the
complex network that results from the consideration of inter-person occlusion. Our main contributions
are 1) a multi-person part-based formulation that emphasizes extremities and allows for the globally
optimal solution to be obtained in each frame, and 2) an efficient and exact optimization scheme that
relies on AND/OR Branch-and-Bound, lazy factor evaluation, and factor cost sensitive bottom-up bound
Automatic Tuning for Fast Gaussian Summation
We provide an algorithm that combines tree methods with the Improved
Fast Gauss Transform (IFGT). As originally proposed, the IFGT suffers
from two problems: (1) the Taylor series expansion does not perform
well for very low bandwidths, and (2) parameter selection is not
trivial and can drastically affect performance and ease of use. We
address the first problem by employing a tree data structure, and the
second problem by using an online tuning approach that results in a
black box method that automatically chooses the evaluation method and
its parameters to yield the best performance for the input data,
desired accuracy, and bandwidth. In addition, the new IFGT parameter
selection approach allows for tighter error bounds. Our approach
chooses the fastest method at negligible additional cost, and has
superior performance in comparisons with previous approaches.
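For reference, the quantity being accelerated can be sketched as a direct double loop. This is only an illustrative baseline, not the IFGT or the tree-based method: it assumes the convention G(y_j) = sum_i q_i exp(-||x_i - y_j||^2 / h^2), and the function name `gauss_transform` and its argument names are hypothetical.

```python
import math


def gauss_transform(sources, weights, targets, h):
    """Direct O(N*M) Gauss transform (assumed convention, see lead-in).

    sources: list of N points (tuples of floats)
    weights: list of N weights q_i
    targets: list of M points (tuples of floats)
    h:       bandwidth (positive float)
    Returns a list of M values G(y_j).
    """
    values = []
    for y in targets:
        total = 0.0
        for x, q in zip(sources, weights):
            # Squared Euclidean distance between source x and target y.
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += q * math.exp(-sq_dist / (h * h))
        values.append(total)
    return values


# Two 1-D sources with weights 1 and 2, evaluated at one target.
result = gauss_transform([(0.0,), (1.0,)], [1.0, 2.0], [(0.0,)], h=1.0)
```

The cost of this loop grows as the product of the number of sources and targets, which is what motivates approximate methods when both are large.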
Representing Visibility Context for Action Understanding
We proposed a representation of visibility/spatial context based on
visibility features (obtained from isovists and visibility graphs)
that is suitable for human action understanding. Using a Bayes net, we
then used our visibility context representation to reason about
2-dimensional trajectories (top view) generated by an agent performing
a simple search-based task in various layouts. Human subjects were
asked to interpret the trajectories 1) to demonstrate that knowledge of
visibility context improves interpretation of our task and 2) to provide
a baseline against which our algorithm can be compared. Our framework
was able to match the performance of humans.
Appearance Modeling for Multi-camera Correspondence and Tracking
We learned generative appearance models by extracting object appearance
from single or multiple views and learning its evolution on a manifold
over time. Using target dynamics, we were able to predict future
appearance in each view. In the multiple-view case, we learned
correspondences between the implicit low-dimensional representation of
each high-dimensional object view by either aligning low-dimensional
coordinates during nonlinear manifold learning, or learning the dynamics
of how low-dimensional coordinates in separate views evolved together
over time. Our model allowed us to "hallucinate" the appearance of
occluded targets by 1) predicting future appearance in each view
temporally or by 2) predicting the appearance in one view given the
appearance in another.
Vlad I. Morariu, Ejaz Ahmed, Venkataraman Santhanam, David Harwood, and Larry S. Davis. Composite Discriminant Factor Analysis.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2014.
coming soon: [code] [vehicle dataset]
Send me your email, and I will let you know when the code and dataset are available.
Radu Dondera, Vlad I. Morariu, Yulu Wang, and Larry S. Davis. Interactive Video Segmentation Using Occlusion Boundaries and Temporally Coherent Superpixels.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2014.
Fatemeh Mirrashed, Vlad I. Morariu, and Larry S. Davis. Sampling
for Unsupervised Domain Adaptive Object Detection.
IEEE International Conference on Image Processing (ICIP), 2013.
Radu Dondera, Vlad I. Morariu, and Larry S. Davis. Learning to Detect
Carried Objects with Minimal Supervision. IEEE Workshop on
Socially Intelligent Surveillance and Monitoring (SISM), 2013.
Vlad I. Morariu, David Harwood, and Larry S. Davis. Tracking
People's Hands and Feet Using Mixed Network AND/OR Search.
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2013.
coming soon: [software] [videos]
Fatemeh Mirrashed, Vlad I. Morariu, Behjat Siddiquie, Rogerio S. Feris, and Larry S. Davis. Domain Adaptive Object Detection.
IEEE Workshop on the Applications of Computer Vision (WACV), 2013.
Hyungtae Lee, Vlad I. Morariu, and Larry S. Davis. Qualitative Pose Estimation by Discriminative Deformable Part Models.
Asian Conference on Computer Vision (ACCV), 2012.
Sameh Khamis, Vlad I. Morariu, and Larry S. Davis. Combining Per-Frame and Per-Track Cues for Multi-Person Action Recognition.
European Conference on Computer Vision (ECCV), 2012.
Sameh Khamis, Vlad I. Morariu, and Larry S. Davis. A Flow Model for Joint Action Recognition and Identity Maintenance.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
Ryan Farrell, Om Oza, Ning Zhang, Vlad I. Morariu, Trevor Darrell, and Larry S. Davis. Subordinate Categorization Using Volumetric Primitives and Pose-Normalized Appearance.
IEEE International Conference on Computer Vision (ICCV), 2011. [ORAL]
Vlad I. Morariu and Larry S. Davis. Multi-agent event recognition
in structured scenarios.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
Vlad I. Morariu, Balaji Vasan Srinivasan, Vikas C. Raykar, Ramani
Duraiswami, and Larry S. Davis. Automatic online tuning for fast
Gaussian summation. Advances in Neural Information Processing
Systems (NIPS), 2008.
Vlad I. Morariu, V. Shiv Naga Prasad, and Larry S. Davis. Human
Activity Understanding using Visibility Context. IEEE/RSJ IROS
Workshop: From sensors to human spatial concepts (FS2HSC), 2007.
Benjamin Fransen, Vlad Morariu, Eric Martinson, Samuel Blisard,
Matthew Marge, Scott Thomas, Alan Schultz, and Dennis Perzanowski.
Using Vision, Acoustics, and Natural Language for Disambiguation.
IEEE International Conference on Human-Robot Interaction (HRI), 2007.
Vlad I. Morariu, Octavia I. Camps, Mario Sznaier, and Hwasup Lim.
Robust Cooperative Visual Tracking: A Combined Nonlinear
Dimensionality Reduction/Robust Identification Approach. In
Advances in Cooperative Control and Optimization, M. Hirsch, R.
Murphey, P. Pardalos and D. Grundel, Eds., Springer Verlag, 2007.
Vlad I. Morariu and Octavia I. Camps. Modeling Correspondences in
Multi-camera Tracking using Nonlinear Manifold Learning and Target
Dynamics. IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2006.
Hwasup Lim, Vlad I. Morariu, Octavia I. Camps, and Mario Sznaier.
Dynamic Appearance Modeling for Human Tracking. IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
FIGTree - Fast Improved Gauss Transform with Tree Data Structure
A library for fast computation of Gauss transforms in multiple
dimensions, using the Improved Fast Gauss Transform and Approximate
Nearest Neighbor searching. The nearest neighbor searching is
performed using the ANN library, available at
This software allows for efficient computation of probabilities by
Kernel Density Estimation (KDE), and can reduce complexity of
algorithms commonly used in Computer Vision, Machine Learning, etc.,
that must evaluate the Gauss transform. The publication describing
the newest improvements in the code is the NIPS 2008 paper by
Morariu et al. Previous publications related to this approach are
listed on Vikas Raykar's page.
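To illustrate how KDE reduces to a Gauss transform, here is a minimal sketch that evaluates a Gaussian KDE by brute force. It does not use or reproduce the FIGTree API: the function name `gaussian_kde` and its parameters are hypothetical, and it assumes an isotropic Gaussian kernel with standard deviation sigma, which corresponds to a Gauss transform of the form sum_i q_i exp(-||x_i - y||^2 / h^2) with bandwidth h = sqrt(2) * sigma and equal weights q_i that carry the normalization.

```python
import math


def gaussian_kde(samples, queries, sigma):
    """Brute-force Gaussian KDE written as a Gauss transform.

    samples: list of N points (tuples of floats), all of dimension d
    queries: list of points at which to evaluate the density
    sigma:   kernel standard deviation (positive float)
    """
    n = len(samples)
    d = len(samples[0])
    # Bandwidth in the assumed Gauss transform convention exp(-r^2 / h^2).
    h = math.sqrt(2.0) * sigma
    # Every sample gets the same weight: 1 / (N * (2*pi)^(d/2) * sigma^d).
    q = 1.0 / (n * (2.0 * math.pi) ** (d / 2.0) * sigma ** d)
    densities = []
    for y in queries:
        total = 0.0
        for x in samples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += q * math.exp(-sq_dist / (h * h))
        densities.append(total)
    return densities


# Density of a single 1-D sample at 0, evaluated at 0 and at 1.
density = gaussian_kde([(0.0,)], [(0.0,), (1.0,)], sigma=1.0)
```

A fast Gauss transform would replace the inner loops while leaving the weights and the bandwidth mapping unchanged.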
NOTE: A new version of the code based on the NIPS 2008
paper has been released! Now FIGTree can be used as a black box.