The CLIP Colloquium Series presents...


Computing in the Clouds: Applications of MapReduce in "Web-Scale" Information Processing

Chris Dyer and Jimmy Lin (University of Maryland)
October 17, 2007, 11:00am, KIM Bldg. Pepco room 1105

Lin's slides
Dyer's slides

The University of Maryland is one of six universities across the U.S. involved in a new initiative led by IBM/Google to explore the MapReduce programming paradigm through an open-source implementation called Hadoop (see press release). The initiative aims to provide faculty and students with access to next-generation computing technologies, allowing them to think at "Web scale". Specifically, IBM is providing access to clusters running Hadoop and the accompanying infrastructure support.

The purpose of this informal presentation is to introduce the Maryland community to this project. Jimmy Lin will first provide a brief overview of Hadoop and Maryland's collaboration with IBM/Google. Chris Dyer will then give a demo of Hadoop and talk about preliminary experiments in a machine translation application: estimating phrase-based translation models, which usually takes a day or so with a single-process script, takes about 20 minutes on the IBM cluster. We'll then open the floor for discussion of the new research space that this programming paradigm opens up and of how new collaborations could be built.

About the Speakers

Jimmy Lin is an assistant professor in the College of Information Studies, and leads the IBM/Google "Cloud Computing" initiative at Maryland.

Chris Dyer is a graduate student in the Department of Linguistics who works on statistical machine translation.


This talk is part of the CLIP Colloquium Series, organized by Jimmy Lin (jimmylin -at- umd .dot. edu). For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.