Vikas Raykar's Course Projects


SPECTRAL CLUSTERING AND KERNEL PCA ARE PURSUING GOOD PROJECTIONS [ report ]

We interpret spectral clustering algorithms in the light of unsupervised learning techniques such as principal component analysis and kernel principal component analysis. We show that the two are equivalent up to a normalization of the dot product or the affinity matrix.
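As an illustration only, the two normalizations can be sketched side by side. The Gaussian affinity, the toy points, the power-iteration solver, and all function names below are assumptions made for this sketch; they are not taken from the report and do not reproduce its derivation. Kernel PCA double-centers the kernel matrix, while one common form of spectral clustering scales the affinity matrix by the node degrees.

```python
import math

def gaussian_affinity(points, sigma=1.0):
    """Affinity (kernel) matrix A[i][j] = exp(-||xi - xj||^2 / (2 sigma^2))."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                      / (2.0 * sigma ** 2))
             for j in range(n)] for i in range(n)]

def center_kernel(K):
    """Kernel PCA normalization: double-center the symmetric kernel matrix K."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def degree_normalize(A):
    """Spectral clustering normalization: D^{-1/2} A D^{-1/2}, D = diag(row sums)."""
    n = len(A)
    d = [sum(row) for row in A]
    return [[A[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]

def leading_eigenpair(M, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    n = len(M)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, Mv))  # Rayleigh quotient
    return eigenvalue, v

# Two well-separated toy clusters (hypothetical data).
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]
A = gaussian_affinity(points)
kpca_value, kpca_vector = leading_eigenpair(center_kernel(A))
spec_value, spec_vector = leading_eigenpair(degree_normalize(A))
```

In this toy setting the leading eigenvector of the centered kernel takes opposite signs on the two clusters, and the leading eigenvalue of the degree-normalized affinity is 1, as expected for that normalization.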

UNSUPERVISED LEARNING OF SEMANTIC CONCEPTS [ report ]

Given a large collection of natural-language articles, we train on the text and extract clusters of semantically related words. When the system is queried with a word, it returns the words most closely related to the same theme. For example, the training set could be all the articles appearing in a magazine. Queried with the word war, the system would return words such as soldiers, aircraft, and Iraq. Note that these words do not mean the same thing; they are related by the common unifying theme of war. We apply nonlinear manifold learning techniques to this unsupervised learning task. We experimented with two techniques: a linear technique, Principal Component Analysis, and a nonlinear manifold learning technique called Isomap.

CMSC733 COMPUTER VISION COURSE PROJECTS

[ Image Mosaicing ]

[ Independent Motion Detection ]

[ Video Manipulation ]

FROM SHAPES TO SOUNDS [ report ]

We present a perceptually inspired mapping that converts a simple two-dimensional image consisting of simple geometrical shapes into a one-dimensional audio waveform consisting of simple harmonic complexes. More specifically, we map objects to harmonic complexes, where the pitch, timbre, and location of the complex correspond to the size, shape, and position of the object, respectively.

NONLINEAR MANIFOLD LEARNING [ report ] [ slides ]

The aim of this report is to study the two nonlinear manifold learning techniques recently proposed [Isomap and Locally Linear Embedding (LLE)] and suggest directions for further research. First the Isomap and the LLE algorithm are discussed in detail. Some of the areas that need further work are pointed out. A few novel applications which could use these two algorithms have been discussed.

PDMA-MEMORY ANALYSIS TOOLKIT

We developed a tool that can be used to test programs for memory bottlenecks such as cache misses. The tool uses the underlying hardware counters supported by existing machines to monitor memory-specific events, and patches the runtime code of the program in order to monitor these events. The performance analysis results are displayed using histograms and stacked bar charts that are hyperlinked to the original source code. We use our tool to run tests on specific benchmarks such as Parkbench and FFT, which demonstrate its utility.

Report

FAST KERNEL PRINCIPAL COMPONENT ANALYSIS

We explored Fast Multipole Method (FMM)-like techniques to speed up Kernel Principal Component Analysis for Gaussian and polynomial kernels.

Report slides

OPTIMISATION METHODS FOR SOUND SOURCE LOCALIZATION

Sound source localization can be formulated as a nonlinear least squares problem. In this project we evaluated different minimization methods, including the Nelder-Mead simplex method, quasi-Newton methods, the Levenberg-Marquardt algorithm, and the Gauss-Newton algorithm. The Gauss-Newton method was the best in terms of localization error and the number of iterations required.
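To make the formulation concrete, here is a minimal Gauss-Newton sketch for a 2-D source. It assumes the measurements are range differences relative to a reference microphone (time delays multiplied by the speed of sound); the microphone layout, the noise-free data, and the function names are hypothetical and are not taken from the report.

```python
import math

def residuals_and_jacobian(x, y, mics, range_diffs):
    """Residuals r_i = (d_i - d_0) - measured_i and their Jacobian w.r.t. (x, y).

    d_i is the distance from the candidate position to microphone i;
    microphone 0 is the reference.
    """
    d = [math.hypot(x - mx, y - my) for mx, my in mics]
    res, jac = [], []
    for i in range(1, len(mics)):
        res.append((d[i] - d[0]) - range_diffs[i - 1])
        jx = (x - mics[i][0]) / d[i] - (x - mics[0][0]) / d[0]
        jy = (y - mics[i][1]) / d[i] - (y - mics[0][1]) / d[0]
        jac.append((jx, jy))
    return res, jac

def gauss_newton(mics, range_diffs, start, iters=20, tol=1e-10):
    """Iterate x <- x - (J^T J)^{-1} J^T r until the step is tiny."""
    x, y = start
    for _ in range(iters):
        res, jac = residuals_and_jacobian(x, y, mics, range_diffs)
        # Entries of the 2x2 matrix J^T J and of the vector J^T r.
        a = sum(jx * jx for jx, _ in jac)
        b = sum(jx * jy for jx, jy in jac)
        c = sum(jy * jy for _, jy in jac)
        gx = sum(jx * r for (jx, _), r in zip(jac, res))
        gy = sum(jy * r for (_, jy), r in zip(jac, res))
        det = a * c - b * b
        # Closed-form solution of the 2x2 normal equations.
        dx = -(c * gx - b * gy) / det
        dy = -(a * gy - b * gx) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            break
    return x, y

# Hypothetical setup: four microphones on a unit square, noise-free data.
mics = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
true_source = (0.3, 0.6)
dist = [math.hypot(true_source[0] - mx, true_source[1] - my) for mx, my in mics]
range_diffs = [dist[i] - dist[0] for i in range(1, len(mics))]
estimate = gauss_newton(mics, range_diffs, start=(0.5, 0.5))
```

The sketch has no damping or line search, so it can diverge from a poor starting point or with a degenerate microphone geometry; handling such cases is one motivation for the Levenberg-Marquardt variant.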

Report

FACE DETECTION

The aim of our project was to detect and localize human faces in any given grayscale image. We evaluated the performance of different approaches to face detection on grayscale images, including neural networks, Principal Component Analysis (PCA), Kernel PCA, Linear Discriminant Analysis (LDA), Kernel LDA, Biased Discriminant Analysis (BDA), Kernel BDA, and AdaBoost. We also combined the above approaches with a skin color detector to speed up the algorithm.

slides.pdf Sample Results

VIDEO CODEC IMPLEMENTATION, DC IMAGE EXTRACTION AND SHOT SEGMENTATION

A video codec has been implemented using transform coding to reduce spatial redundancy and unidirectional prediction to reduce temporal redundancy. Coding decisions [I/P frame, Intra/Inter macroblock] are made adaptively to ensure the best tradeoff between quality and compression. Shot segmentation is an important step in video content analysis. The automatic partitioning of video into shots involves the detection of transitions. Detecting transitions by extracting the DC image in the compressed domain is advantageous. In this report we extract the DC images from the compressed MPEG stream. We implement algorithms for cut and wipe detection. We also propose a new algorithm for wipe detection.

Report Slides1 Slides2

PROBABILITY DENSITY ESTIMATION

The aim of the project was to estimate the probability density function (PDF) of an arbitrary distribution from a set of training samples. PDF estimation was done using parametric (maximum likelihood estimation of a Gaussian model), non-parametric (histogram, kernel-based, and K-nearest neighbor), and semi-parametric methods (EM algorithm and gradient-based optimization). The application of the EM algorithm to binary sequence estimation is also discussed.
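Three of the estimators named above can be sketched in one dimension as follows. This is a minimal illustration under stated assumptions: the Gaussian kernel, the bandwidth `h`, the toy samples, and the function names are choices made here, not details from the report.

```python
import math

def gaussian_mle(samples):
    """Maximum likelihood fit of a 1-D Gaussian: sample mean and 1/n variance."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / n  # ML estimate divides by n
    return mu, var

def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def kernel_density(x, samples, h):
    """Kernel estimate: average of Gaussian bumps of width h centered on the samples."""
    return sum(gaussian_pdf(x, s, h * h) for s in samples) / len(samples)

def knn_density(x, samples, k):
    """K-nearest-neighbor estimate: k / (n * length of the interval holding k samples)."""
    radius = sorted(abs(x - s) for s in samples)[k - 1]
    return k / (len(samples) * 2.0 * radius)  # undefined if radius == 0

samples = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy data
mu, var = gaussian_mle(samples)
parametric = gaussian_pdf(3.0, mu, var)
kernel_based = kernel_density(3.0, samples, h=0.5)
nearest_neighbor = knn_density(3.0, samples, k=3)
```

The parametric and kernel estimates integrate to one by construction. The K-nearest-neighbor estimate generally does not, and it breaks down when the k-th neighbor coincides with the query point.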

Report

LINEAR NETWORKS, MULTILAYER PERCEPTRONS AND RBF'S

The aim of the project was to implement a linear network (LN) for a 3-class pattern classification problem. Training was done using the LMS algorithm. A 3-h-1 multilayer perceptron (MLP) was implemented for a two-class problem. Backpropagation was used for training. The network was then pruned using Optimal Brain Surgeon. A Radial Basis Function (RBF) network using an inverse multiquadric basis function was implemented for function approximation. The training was done using the LMS algorithm. An MLP was also designed and implemented for printed numeral recognition.
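As a rough illustration of the RBF part, the sketch below trains the output weights of an inverse multiquadric network with the LMS rule. The target function (a sine), the fixed center placement, the shape parameter `c`, the step size `mu`, and the function names are all assumptions made for this example, not settings from the report.

```python
import math

def inverse_multiquadric(r, c=1.0):
    """Inverse multiquadric basis function: 1 / sqrt(r^2 + c^2)."""
    return 1.0 / math.sqrt(r * r + c * c)

def rbf_features(x, centers, c=1.0):
    """Hidden-layer activations for a scalar input x."""
    return [inverse_multiquadric(x - m, c) for m in centers]

def rbf_predict(x, weights, centers, c=1.0):
    """Network output: weighted sum of the basis functions."""
    return sum(w * p for w, p in zip(weights, rbf_features(x, centers, c)))

def train_lms(xs, ys, centers, mu=0.05, epochs=200, c=1.0):
    """LMS (Widrow-Hoff) updates of the output weights; centers stay fixed."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            phi = rbf_features(x, centers, c)
            err = y - sum(w * p for w, p in zip(weights, phi))
            weights = [w + mu * err * p for w, p in zip(weights, phi)]
    return weights

def mean_squared_error(xs, ys, weights, centers, c=1.0):
    """Average squared error of the network over the data set."""
    return sum((y - rbf_predict(x, weights, centers, c)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

# Toy function approximation task: sin(x) on [0, pi].
xs = [i * math.pi / 40 for i in range(41)]
ys = [math.sin(x) for x in xs]
centers = [i * math.pi / 6 for i in range(7)]
initial_mse = mean_squared_error(xs, ys, [0.0] * len(centers), centers)
weights = train_lms(xs, ys, centers)
final_mse = mean_squared_error(xs, ys, weights, centers)
```

Because inverse multiquadric functions overlap heavily, the LMS updates converge slowly along some weight directions; the sketch only aims to show the update rule, not a tuned approximator.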

Report

FROST BEAMFORMER

The aim of the project was to study different beamforming techniques and to use the constrained least mean squares (LMS) filter for spatial filtering. A constrained LMS algorithm (also known as the Frost beamformer) was derived; it iteratively adapts the weights of the sensor array to minimize noise power at the array output while maintaining a chosen frequency response in the look direction. We observed a significant improvement in SNR compared to the simple delay-and-sum beamformer. The beamformers were also implemented in real time using two circular arrays of 7 microphones each.
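The constrained update can be sketched in a simplified form. The version below is a deliberate reduction, not the project's implementation: it uses one real-valued weight per sensor (no tapped delay lines), so the look-direction constraint collapses to "the weights sum to one". The simulated signals, the interferer gains, the step size, and the function names are hypothetical.

```python
import random

def frost_beamformer(snapshots, mu=0.002):
    """Single-tap constrained LMS: minimize output power subject to sum(w) == 1.

    Update: w <- P (w - mu * y * x) + F, with P = I - 11^T / K and F = 1 / K,
    which is the Frost recursion for the constraint matrix C = ones and f = 1.
    """
    k = len(snapshots[0])
    f = [1.0 / k] * k
    w = f[:]  # start from the delay-and-sum weights
    outputs = []
    for x in snapshots:
        y = sum(wi * xi for wi, xi in zip(w, x))
        outputs.append(y)
        g = [wi - mu * y * xi for wi, xi in zip(w, x)]  # unconstrained LMS step
        mean_g = sum(g) / k
        w = [gi - mean_g + fi for gi, fi in zip(g, f)]  # project back onto constraint
    return w, outputs

def simulate(n, gains, seed=0):
    """Pre-steered snapshots: desired signal on every sensor, one interferer, sensor noise."""
    rng = random.Random(seed)
    snapshots = []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)  # desired signal, identical on all sensors
        i = rng.gauss(0.0, 1.0)  # interferer, scaled differently per sensor
        snapshots.append([s + a * i + 0.1 * rng.gauss(0.0, 1.0) for a in gains])
    return snapshots

gains = [1.0, 0.5, -0.3, 0.8]  # hypothetical interferer gains at the four sensors
weights, outputs = frost_beamformer(simulate(20000, gains))
interferer_gain = abs(sum(w * a for w, a in zip(weights, gains)))
delay_and_sum_gain = abs(sum(gains)) / len(gains)
```

With these toy settings the adapted weights keep unit gain on the desired signal while passing less of the interferer than the delay-and-sum weights do. A broadband Frost beamformer would add several taps per sensor and a constraint per tap.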

Report