My general areas of research are Computer Vision, Statistical Learning Theory and Optimization Theory. I am interested in designing efficient algorithms for handling outliers and missing values in statistical data models with applications to high-dimensional and large-scale vision problems. I am also interested in the geometry of multiple views/cameras as encountered in the Structure from Motion problem. Recently, I have worked on face recognition from blurred and poorly illuminated images.
Research Topics

Learning Theory with Applications to Computer Vision

We have proposed robust regression algorithms, both linear and kernel, with applications to high-dimensional problems such as age and pose estimation. The most popular robust regression technique in the vision literature is the Random Sample Consensus (RANSAC) algorithm. However, this algorithm is combinatorial in the dimension of the problem (the number of model parameters) and hence cannot be used to solve high-dimensional problems. We formulate the linear regression problem by modeling the outliers as sparse variables and solve it with a sparse Bayesian learning algorithm. This makes our algorithm polynomial in the dimension of the problem and hence suitable for high-dimensional problems.

A widely used kernel regression technique in vision is relevance vector machine (RVM) regression, which has been applied to problems such as age estimation and pose estimation. However, because of its Gaussian noise assumption, RVM regression is not robust to outliers. We model the outliers as sparse variables and estimate them jointly with the model parameters. We have used this robust version of RVM for image denoising and age estimation, obtaining much better results than the original RVM.

Matrix factorization is another statistical data modeling technique with many applications in vision: structure from motion (SfM), non-rigid SfM and photometric stereo can all be solved by factorizing appropriate "measurement" matrices. However, because of occlusions in SfM and shadows in photometric stereo, these matrices have many missing entries, which makes the factorization challenging. We formulate matrix factorization with missing data as a low-rank semidefinite program (LRSDP), which has two advantages: 1) We can solve large-scale factorization problems, because LRSDP is based on a quasi-Newton algorithm with low computational complexity and memory requirements. 2) Additional constraints, such as the orthonormality required in orthographic SfM, can easily be incorporated into the LRSDP-based formulation.
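
The sparse-outlier regression model described in this section can be illustrated with a small sketch. It is not the sparse Bayesian learning algorithm mentioned above: as a simpler stand-in, it penalizes the outlier variables with an L1 norm and alternates between a least-squares line fit and a soft-thresholding step. The function names, the one-dimensional line model, the penalty weight `lam` and the iteration count are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def soft_threshold(r, lam):
    """Shrink r toward zero by lam; the proximal step for an L1 penalty."""
    if r > lam:
        return r - lam
    if r < -lam:
        return r + lam
    return 0.0


def robust_fit_line(xs, ys, lam=1.0, n_iter=200):
    """Jointly estimate the line (a, b) and a sparse outlier vector s.

    Minimizes 0.5 * sum((y - a*x - b - s)^2) + lam * sum(|s|) by
    alternating over (a, b) and s. lam and n_iter are illustrative.
    """
    s = [0.0] * len(xs)
    for _ in range(n_iter):
        # Fit the line to the data with the current outlier estimate removed.
        a, b = fit_line(xs, [y - si for y, si in zip(ys, s)])
        # Residuals larger than lam are absorbed by the outlier variables.
        s = [soft_threshold(y - a * x - b, lam) for x, y in zip(xs, ys)]
    return a, b, s


# Synthetic data: the line y = 2x + 1 with three gross outliers.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
for i in (3, 10, 17):
    ys[i] += 30.0

a_ols, b_ols = fit_line(xs, ys)
a_rob, b_rob, s = robust_fit_line(xs, ys)
flagged = [i for i, si in enumerate(s) if si != 0.0]
print("least squares:", a_ols, b_ols)
print("robust:", a_rob, b_rob, "flagged outliers:", flagged)
```

Each iteration costs one least-squares solve plus a pass over the data, so the work grows polynomially with the number of parameters rather than combinatorially as in random sampling. The L1 penalty leaves a small bias in the estimate that depends on `lam`.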
Structure from Motion (SfM) is the problem of recovering the 3-D structure of a scene and the camera parameters from multiple images of the scene. Bundle adjustment is the final refinement step of SfM, in which a cost function (the image reprojection error) is minimized. The traditional bundle adjustment algorithm, which minimizes the L2-norm of the reprojection error, has cubic complexity in the number of unknowns. Based on the L-infinity norm of the reprojection error, we proposed an algorithm that has quadratic complexity. Currently, I am interested in the problem of critical camera motion sequences (those for which the SfM problem does not have a unique solution) for a planar scene. Enumerating all the critical camera motion sequences will help us avoid them or design algorithms that handle them properly.
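
To make the two reprojection costs concrete, the following minimal sketch evaluates both for a single pinhole camera. It only computes the objectives. It does not perform the minimization and says nothing about the complexity of either algorithm. The camera model (rotation matrix, translation vector, one focal length, no distortion) and all function names are assumptions chosen for illustration.

```python
import math


def project(point, rotation, translation, focal):
    """Pinhole projection of a 3-D point into image coordinates."""
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    return focal * cam[0] / cam[2], focal * cam[1] / cam[2]


def reprojection_errors(points, observations, rotation, translation, focal):
    """Euclidean distance between each observed and reprojected image point."""
    errors = []
    for point, (u_obs, v_obs) in zip(points, observations):
        u, v = project(point, rotation, translation, focal)
        errors.append(math.hypot(u_obs - u, v_obs - v))
    return errors


def l2_cost(errors):
    """Sum of squared reprojection errors (the classical objective)."""
    return sum(e * e for e in errors)


def linf_cost(errors):
    """Largest single reprojection error (the L-infinity objective)."""
    return max(errors)


# Hypothetical camera at the origin looking down the z-axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 2.0, 4.0)]
observations = [(0.0, 0.0), (0.8, 0.4), (0.0, 0.6)]

errors = reprojection_errors(points, observations, identity, [0.0, 0.0, 0.0], 1.0)
print("L2 cost:", l2_cost(errors))
print("L-infinity cost:", linf_cost(errors))
```

The L2 cost aggregates all residuals, whereas the L-infinity cost is determined entirely by the worst one, so a single mismatched point dominates it.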
We consider the problem of recognizing remotely acquired face images. The main factors that make this problem difficult are image degradation due to blur and variations in appearance due to illumination and pose. We use the convolution model for blur and the low-dimensional linear subspace model for illumination variations to propose a direct face recognition algorithm. We determine the identity of a given probe image by maximizing its likelihood function over the space of identities (gallery images), blur kernels and illumination coefficients. We do not assume any parametric form for the blur kernel; however, if such information is available, it can easily be incorporated into our formulation.
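
The search over identities and blur kernels can be sketched on a toy problem. The example below is a heavily simplified illustration, not the algorithm described above: it uses 1-D signals instead of images, ignores illumination and pose, and assumes i.i.d. Gaussian noise so that maximizing the likelihood reduces to minimizing a least-squares residual. For each gallery signal it fits an unconstrained (non-parametric) blur kernel and keeps the identity with the smallest residual. All names and the kernel length are hypothetical.

```python
def convolve_full(signal, kernel):
    """Full 1-D discrete convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def solve_linear(a, b):
    """Solve a square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_blur_kernel(gallery_signal, probe, kernel_len):
    """Least-squares kernel mapping gallery_signal to probe, plus residual."""
    # Column j of the convolution matrix is the gallery signal shifted by j.
    cols = []
    for j in range(kernel_len):
        unit = [0.0] * kernel_len
        unit[j] = 1.0
        cols.append(convolve_full(gallery_signal, unit))
    # Normal equations for the kernel coefficients.
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, probe)) for ci in cols]
    kernel = solve_linear(gram, rhs)
    blurred = convolve_full(gallery_signal, kernel)
    residual = sum((p - q) ** 2 for p, q in zip(probe, blurred))
    return kernel, residual


def identify(gallery, probe, kernel_len):
    """Return (identity index, kernel) with the smallest fitting residual."""
    fits = [fit_blur_kernel(g, probe, kernel_len) for g in gallery]
    best = min(range(len(gallery)), key=lambda i: fits[i][1])
    return best, fits[best][0]


# Toy gallery of three "identities" and a probe blurred from identity 1.
gallery = [
    [0.0, 1.0, 3.0, 1.0, 0.0, 2.0, 0.0, 1.0],
    [2.0, 0.0, 1.0, 4.0, 1.0, 0.0, 3.0, 0.0],
    [1.0, 1.0, 0.0, 2.0, 2.0, 0.0, 1.0, 3.0],
]
true_kernel = [0.25, 0.5, 0.25]
probe = convolve_full(gallery[1], true_kernel)

best, kernel = identify(gallery, probe, kernel_len=3)
print("identified:", best, "estimated kernel:", kernel)
```

Because the kernel coefficients are free parameters, no parametric blur family is assumed. Prior knowledge about the kernel could be added as constraints on the least-squares fit.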