Cloud9: What Hadoop version should I use?

by Jimmy Lin

(Page first created: 30 May 2009; last updated: )

What Hadoop version should I use? I get this question a lot... and the answer is (as you might expect): it depends. Unfortunately, in many cases this is a damned-if-you-do, damned-if-you-don't decision. Do you take your chances with new, unknown bugs, or continue working around known ones? But if you never upgrade, you'll never get to take advantage of new features...

Here are some guidelines that may be helpful, but first, a clarification on terminology. Hadoop releases are currently numbered 0.X.Y, where X is usually known as the major release, and Y is usually known as the minor release.
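
To make the 0.X.Y terminology concrete, here is a minimal sketch in Python that splits a release string into its major (X) and minor (Y) parts. The names HadoopVersion and parse_version are hypothetical helpers made up for this illustration; they are not part of Hadoop or Cloud9.

```python
import re
from typing import NamedTuple


class HadoopVersion(NamedTuple):
    """Hypothetical container for a 0.X.Y release number."""
    major: int  # X in 0.X.Y (the "major release")
    minor: int  # Y in 0.X.Y (the "minor release")


# Accept only the 0.X.Y scheme described above.
_PATTERN = re.compile(r"^0\.(\d+)\.(\d+)$")


def parse_version(text: str) -> HadoopVersion:
    """Parse a string such as '0.18.3' into (major, minor)."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a 0.X.Y version string: {text!r}")
    return HadoopVersion(int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    print(parse_version("0.18.3"))  # HadoopVersion(major=18, minor=3)
    # Tuples compare element by element, so 0.20.0 sorts after 0.18.3.
    print(parse_version("0.20.0") > parse_version("0.18.3"))  # True
```

Because the result is a plain tuple, releases can be sorted or compared directly, which is convenient when deciding whether one release is newer than another.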

  • Hadoop 0.18.3 is currently considered the "stable release". Stability, however, is relative. You might be interested in Cloudera's distribution.
  • Hadoop 0.19.X is a complete disaster: avoid at all costs. This release introduced HDFS file append, a very complex feature that required significant changes to the code base. It was so buggy that the developers had to disable the feature in a subsequent minor release.
  • Hadoop 0.20.0 is the latest release, made available on April 22, 2009. This release is still in its shakeout period, but things look promising from a stability point of view. There are non-trivial API changes in 0.20.0, so upgrading to this version will require changes to existing code.

In general, nearly all initial major releases (i.e., 0.X.0) are buggy. These bugs are usually fixed in subsequent minor releases. Therefore, it is probably a good idea to wait until 0.20.1 before upgrading... let others debug the release first! However, since upgrading to 0.20 will require code changes, I expect that many will remain with 0.18.3 for a while.
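
The "wait for the first minor release" rule of thumb can be written down as a small check. This is only a sketch of the heuristic in Python, assuming the 0.X.Y numbering scheme; the function names are hypothetical, and the rule is no substitute for reading the release notes.

```python
def is_initial_major_release(version: str) -> bool:
    """Return True for 0.X.0 releases, e.g. '0.20.0'."""
    parts = version.split(".")
    if len(parts) != 3 or parts[0] != "0":
        raise ValueError(f"not a 0.X.Y version string: {version!r}")
    int(parts[1])  # raises ValueError if X is not a number
    return int(parts[2]) == 0


def upgrade_advice(version: str) -> str:
    """Apply the rule of thumb: let others debug 0.X.0 first."""
    if is_initial_major_release(version):
        return "initial major release: consider waiting for the first minor release"
    return "minor release: early bugs have had a chance to be fixed"


if __name__ == "__main__":
    for candidate in ("0.18.3", "0.20.0"):
        print(candidate, "->", upgrade_advice(candidate))
```

Note that the check only captures the 0.X.0 rule; it knows nothing about a release line that is best skipped altogether.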

Cloud9 is currently written for 0.17.2, which is included in umd-hadoop-core/hadoop/. It's a bit behind, because the clusters that Maryland uses are still running that release. However, since there are no major API changes from 0.17 to 0.18, the entire code base should run fine with 0.18.3 (for example, on EC2).

Creative Commons: Attribution-Noncommercial-Share Alike 3.0 United States