whirr

DevOps

Run The Latest Whirr And Deploy HBase In Minutes

In a few of my recent posts I have covered the ease of deploying clusters of Hadoop and Cassandra using Whirr. With Whirr you can simply write a

Read
DevOps

Quickly Launch A Cassandra Cluster On Amazon EC2

If you have read my previous post, Map-Reduce With Ruby Using Hadoop, then you will know that firing up a Hadoop cluster is really simple when you use

Read
Data processing

Map-Reduce With Ruby Using Hadoop

Here I demonstrate, with repeatable steps, how to fire-up a Hadoop cluster on Amazon EC2, load data onto the HDFS (Hadoop Distributed File-System), write map-reduce scripts in Ruby and use them to run a map-reduce job on your Hadoop cluster. You will not need to ssh into the cluster, as all tasks are run from your local machine. Below I am using my MacBook Pro as my local machine, but the steps I have provided should be reproducible on other platforms running bash and Java.

Read