Friday, April 20, 2012

Talking to Elastic Map Reduce Jobs

Clusters on EC2 are like clusters on a local set of machines with one important exception: the ports for accessing the Hadoop UI are normally closed. To open them, open the AWS Management Console, select Security Groups, and click ElasticMapReduce-master.
Now create a custom TCP rule with the port range 9000-9103. I would set the source to 0.0.0.0/0 unless you know your IP address and know it will not change. Hit the Add Rule button and you will see the new rule (shown as the last visible rule in the lower right-hand panel).
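The same rule can also be added from a script rather than the console. What follows is only a rough sketch, assuming the boto3 library and configured AWS credentials, neither of which this post otherwise uses. The helper names `emr_master_ingress_rule` and `open_hadoop_ui_ports` are made up for illustration. Looking the group up by name also assumes it lives in the default VPC.

```python
def emr_master_ingress_rule(cidr="0.0.0.0/0", from_port=9000, to_port=9103):
    """Build the custom TCP rule described above as a boto3 IpPermissions entry.

    Hypothetical helper: the defaults mirror the values entered in the console.
    """
    if not 0 < from_port <= to_port <= 65535:
        raise ValueError("invalid port range")
    return {
        "IpProtocol": "tcp",
        "FromPort": from_port,
        "ToPort": to_port,
        "IpRanges": [{"CidrIp": cidr}],
    }


def open_hadoop_ui_ports(group_name="ElasticMapReduce-master", cidr="0.0.0.0/0"):
    """Add the rule to the master security group (makes a real AWS call)."""
    import boto3  # assumption: boto3 is installed; imported lazily on purpose

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupName=group_name,
        IpPermissions=[emr_master_ingress_rule(cidr)],
    )
```

Calling `open_hadoop_ui_ports(cidr="203.0.113.7/32")` would restrict the source to a single address (a placeholder here) instead of 0.0.0.0/0, which is the tighter choice if your IP address is stable.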

Once the security group is set up you can talk to a Hadoop cluster.
You will find the public DNS name by selecting the job and looking at the description page as shown above.
Then talk to the job tracker on port 9100 as shown below.
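If the page does not load, it can help to check from a script whether the port is reachable at all. Here is a minimal sketch using only the Python standard library. The function names and the example host name are placeholders of mine, not something taken from the console.

```python
import socket


def job_tracker_url(public_dns, port=9100):
    """Return the job tracker address for a master node's public DNS name."""
    host = public_dns.strip()
    if not host:
        raise ValueError("public DNS name is empty")
    return "http://{}:{}/".format(host, port)


def port_is_open(host, port=9100, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

For example, `job_tracker_url("ec2-203-0-113-10.compute-1.amazonaws.com")` gives the address to paste into a browser. If `port_is_open` returns False for that host, the likely cause is that the security group rule is missing or its source range does not include your address.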