Automating Cloudera on AWS
The Cloudera user guide is designed to be an all-encompassing instruction set for deploying Hadoop. Sometimes, though, you just want to set up a simple development environment and spin it up quickly. Using this guide, you can get Hadoop started without even having to SSH into your node.
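The trick is to use the user-data script option that AWS provides. A script pasted there runs automatically when the instance first boots, so the installation happens without you logging in.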
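If you launch from the console, pasting the script is all you need. If you ever call the EC2 API directly instead, the user data has to be base64-encoded, must begin with a shebang line to be run as a script, and is limited to 16 KB before encoding. The following is a minimal sketch of that preparation step. The function name `encode_user_data` and the placeholder script are assumptions for illustration; the placeholder is not the Cloudera install script from this guide.

```python
import base64

MAX_USER_DATA_BYTES = 16 * 1024  # EC2 limit on raw (pre-encoding) user data


def encode_user_data(script: str) -> str:
    """Validate a user-data shell script and return it base64-encoded."""
    # A shebang on the first line is what marks user data as a script to run.
    if not script.startswith("#!"):
        raise ValueError("user-data scripts must start with a shebang, e.g. #!/bin/bash")
    raw = script.encode("utf-8")
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError(f"user data is {len(raw)} bytes; the limit is {MAX_USER_DATA_BYTES}")
    return base64.b64encode(raw).decode("ascii")


# Hypothetical placeholder script, NOT the Cloudera install script.
EXAMPLE_SCRIPT = "#!/bin/bash\necho 'bootstrap started' > /tmp/bootstrap.log\n"
```

Calling `encode_user_data(EXAMPLE_SCRIPT)` returns an ASCII string suitable for a raw API request. Higher-level SDKs often do this encoding for you, so check before encoding twice.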
The Steps to Launch Cloudera Automatically
Select Amazon Linux as your operating system
Use any instance size; we used t2.large in this demo. A minimum of 2 cores and 8 GB of RAM is probably best.
Under Advanced Details, paste the user-data script
Continue to the security groups step and be sure to allow connections from your own IP. There are a few ways to configure this, but in a development environment I typically allow all traffic from my own IP.
Launch the instance and wait several minutes
Connect to your instance in a browser at HOSTNAME:7180
The default Cloudera login is admin/admin (username/password)
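The security group and connection steps above can be sketched in code. This is a minimal sketch using only the Python standard library; the helper names (`own_ip_rule`, `cloudera_manager_url`, `wait_for_port`) and the default timeouts are assumptions for illustration, not part of the original script. The rule dictionary follows the shape that AWS SDKs use for ingress permissions, where a protocol of "-1" means all traffic.

```python
import ipaddress
import socket
import time

CM_PORT = 7180  # port used in the connection step above


def own_ip_rule(my_ip: str) -> dict:
    """Build an ingress rule allowing all traffic from one IPv4 address."""
    # IPv4Address raises ValueError if my_ip is not a valid address.
    cidr = f"{ipaddress.IPv4Address(my_ip)}/32"
    return {"IpProtocol": "-1", "IpRanges": [{"CidrIp": cidr}]}


def cloudera_manager_url(hostname: str) -> str:
    """Return the URL to open in a browser for the given instance hostname."""
    return f"http://{hostname}:{CM_PORT}"


def wait_for_port(hostname: str, port: int = CM_PORT,
                  timeout_s: float = 900.0, interval_s: float = 15.0) -> bool:
    """Poll until a TCP connection to hostname:port succeeds or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with socket.create_connection((hostname, port), timeout=5):
                return True  # something is listening, so the UI is likely up
        except OSError:
            pass  # not reachable yet; the install may still be running
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_for_port(HOSTNAME)` followed by opening `cloudera_manager_url(HOSTNAME)` replaces guessing how long "several minutes" is. A successful connection only shows that the port is open, so the login page may still take a little longer to respond.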
Next: Download the Script here
Continue: Learn to move data to Amazon’s S3 Storage