Welcome to the homepage for Cloud9, a MapReduce library for Hadoop designed to serve as both a teaching tool and a repository for code that may be broadly useful for a variety of research problems in human language technology (information retrieval, natural language processing, etc.). Development of this code base began in late October 2007, so there isn't much here yet. However, it is available via anonymous Subversion checkout. Like Hadoop itself, Cloud9 is distributed under the Apache License.
The University of Maryland is one of six universities participating in the IBM/Google cloud computing initiative. Ongoing efforts at Maryland include a cloud computing course in Spring 2008 and the application of this technology to various research problems.
Quick Links
- Hadoop homepage (0.16.0): [local] [live]
- Hadoop API (0.16.0): [local] [live]
- Cloud9 API
Content Pages
- Downloading Cloud9 and getting started
- Adding Cloud9 to your project
- Layout of project directory tree
- Using the tuple library
- Staging records using the Tuple class
- Primer on the Partitioner, or computing conditional probabilities
- Primer on HBase Shell, the command-line interface to HBase
- Using HBase with MapReduce
- PageRank exercise
Related Projects
- Cascading: pipe assembly patterns for Hadoop
- Mahout: machine learning library on Hadoop
- Pig: data analysis platform on top of Hadoop
- HBase: an open source implementation of Google's BigTable
- Hypertable: another open source implementation of Google's BigTable
Subversion Access
- umd-hadoop-core: https://subversion.umiacs.umd.edu/umd-hadoop/core
- umd-hadoop-dist: https://subversion.umiacs.umd.edu/umd-hadoop/dist
Explanation of why the library is split across two separate repositories.