GPUML - Graphical Processing Units for Machine Learning

 

 

Welcome to the GPUML library home page. GPUML is a library that provides C/C++ and MATLAB interfaces for accelerating weighted kernel summation and kernel matrix construction on GPUs. These computations are common in machine learning algorithms such as kernel density estimation, kernel regression, and kernel PCA. The algorithms used in the library are outlined below.
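For reference, the two computations can be written as follows for N source points x_i and M evaluation points y_j. The symbols q_i (weights), k (kernel function), f, and K are notation assumed here for illustration, and the Gaussian kernel with bandwidth h is shown only as one common example of k, not as a statement of which kernels the library supports.

```latex
% Weighted kernel summation: one sum per evaluation point
f(y_j) = \sum_{i=1}^{N} q_i \, k(x_i, y_j), \qquad j = 1, \dots, M

% Kernel matrix construction: one entry per (source, evaluation) pair
K_{ij} = k(x_i, y_j), \qquad i = 1, \dots, N, \quad j = 1, \dots, M

% Example kernel (assumed for illustration): Gaussian with bandwidth h
k(x, y) = \exp\!\left( -\frac{\lVert x - y \rVert^{2}}{h^{2}} \right)
```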

 

Algorithm to evaluate kernel sums on GPUs:

Data: Source points x_i, i = 1, ..., N; evaluation points y_j, j = 1, ..., M
Each thread evaluates the sum corresponding to one evaluation point:
Step 1: Load the evaluation point corresponding to the current thread into a local register.
Step 2: Load the first chunk of source data into shared memory.
Step 3: Evaluate the part of the kernel sum corresponding to the source data in shared memory.
Step 4: Accumulate the result in a local register.
Step 5: If any source points have not yet been processed, load the next chunk and go to Step 3.
Step 6: Write the sum from the local register to global memory.
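The steps above can be mimicked on the CPU to see the control flow. The following is a minimal Python sketch, not the library's implementation: the function names, the weights argument, the chunk_size parameter, and the choice of a Gaussian kernel are all assumptions made for illustration. The outer loop stands in for the GPU threads, and the list slices stand in for the chunks held in shared memory.

```python
import math


def gaussian_kernel(x, y, h):
    """Gaussian kernel exp(-||x - y||^2 / h^2). The kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (h * h))


def kernel_sum_chunked(sources, weights, targets, h, chunk_size=4):
    """CPU emulation of the per-thread kernel summation (hypothetical helper).

    Returns one weighted kernel sum per evaluation point.
    """
    results = []
    for y in targets:  # on the GPU, each iteration would be one thread
        y_reg = tuple(y)  # Step 1: evaluation point kept in a "register"
        acc = 0.0         # local accumulator for this thread
        # Steps 2 and 5: walk over the source data one chunk at a time
        for start in range(0, len(sources), chunk_size):
            shared_x = sources[start:start + chunk_size]  # "shared memory"
            shared_q = weights[start:start + chunk_size]
            # Steps 3 and 4: add this chunk's contribution to the accumulator
            for x, q in zip(shared_x, shared_q):
                acc += q * gaussian_kernel(x, y_reg, h)
        results.append(acc)  # Step 6: write the finished sum to "global memory"
    return results
```

Because the accumulator lives outside the chunk loop, the result does not depend on chunk_size; the chunking only controls how much source data is staged at once.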

Algorithm to construct kernel matrix on GPUs:

 

Data: Source points x_i, i = 1, ..., N; evaluation points y_j, j = 1, ..., M
Each thread evaluates one element of the kernel matrix:
Step 1: Load the source points from global memory into shared memory.
Step 2: If the data dimension is too large to fit into shared memory, divide each source vector into chunks of constant size and load them consecutively.
Step 3: Compute the distance contribution of the current chunk in a local register, then load the next chunk. Repeat until the full dimension has been covered.
Step 4: Use the computed distance to evaluate the matrix entry.
Step 5: Write the computed kernel matrix entries to global memory.
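The dimension-wise chunking in these steps can be illustrated with a small CPU sketch in Python. This is an illustration under stated assumptions rather than the library's code: the function name, the dim_chunk parameter, the Gaussian kernel, and the N x M layout with entry (i, j) equal to k(x_i, y_j) are all hypothetical choices. Each (i, j) pair plays the role of one GPU thread, and the slices play the role of the chunks staged in shared memory.

```python
import math


def kernel_matrix_chunked(sources, targets, h, dim_chunk=3):
    """CPU emulation of kernel matrix construction (hypothetical helper).

    Returns an N x M nested list with entry [i][j] = exp(-||x_i - y_j||^2 / h^2).
    The Gaussian kernel is assumed for illustration.
    """
    n_dims = len(sources[0])
    matrix = []
    for x in sources:
        row = []
        for y in targets:  # on the GPU, each (i, j) pair would be one thread
            sq_dist = 0.0  # squared distance accumulated in a "register"
            # Steps 1 to 3: cover the dimensions one fixed-size chunk at a time
            for start in range(0, n_dims, dim_chunk):
                x_chunk = x[start:start + dim_chunk]  # staged in "shared memory"
                y_chunk = y[start:start + dim_chunk]
                sq_dist += sum((a - b) ** 2 for a, b in zip(x_chunk, y_chunk))
            # Step 4: turn the completed distance into the matrix entry
            row.append(math.exp(-sq_dist / (h * h)))
        matrix.append(row)  # Step 5: store the entries in "global memory"
    return matrix
```

Only the squared distance is accumulated across chunks; the kernel function is applied once per entry after the last chunk, which is what allows vectors of arbitrary dimension to be processed with a fixed amount of staged data.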

GPU-based matrix decompositions by Vasily Volkov are available here and can be used as-is on top of our approach.

 

Related Publications:

  1. Srinivasan BV, Qi H, Duraiswami R, "GPUML: Graphical processors for speeding up kernel machines", Workshop on High Performance Analytics - Algorithms, Implementations, and Applications, SIAM Conference on Data Mining, April 2010 [paper]

  2. Srinivasan BV, Duraiswami R, "Scaling kernel machine learning algorithm via the use of GPUs", GPU Technology Conference, NVIDIA Research Summit, October 2009 [abstract]

Authors:
The library was written by Balaji Vasan Srinivasan and Qi Hu under the supervision of Prof. Ramani Duraiswami. Please send feedback to the primary author at balajiv(at)umiacs(dot)umd(dot)edu.
 
License:
This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; version 2.1 or later. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. 
 
Download:
Click here to download the current version. Also available from mloss.org.
 
Citation:

If you are using GPUML for your work, please cite the following paper:

Srinivasan BV, Qi H, Duraiswami R, "GPUML: Graphical processors for speeding up kernel machines", Workshop on High Performance Analytics - Algorithms, Implementations, and Applications, SIAM Conference on Data Mining, April 2010
