
hadoop-on-ec2's Introduction

EC2 Hadoop Cluster

Creates a Hadoop cluster on AWS EC2

Quickstart

  1. Obtain your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from AWS and add them to your ~/.bashrc, ~/.bash_profile, or /etc/profile.
  2. Obtain your AWS identity file (*.pem)
  3. Update config/deploy/production.rb and config/deploy/production_hadoop.rb to refer to the location of your pem file. Optionally, externalize the location of the identity file: store it in an environment variable and read that variable from both config/deploy/production.rb and config/deploy/production_hadoop.rb.
  4. Create and configure your AWS subnet (currently outside the scope of this automation script). Define your subnet mask and take note of the subnet ID.
  5. Create and configure your cluster security group (currently outside the scope of this automation script) and open the following ports:
    * 22 (so that you can SSH)
    * 8030 - 8100
    * 50070
    * 50075
    * 50475
    * 50105
    * 50470
    * 2181 (not required by Hadoop; this is a ZooKeeper port)
    * 50090
  6. Customize the create_hdp_cluster.sh script according to your subnet choices.
  7. Run the script and monitor your EC2 instance dashboard
$ ./create_hdp_cluster.sh
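
Before running the script, it can help to check the prerequisites from steps 1, 3, and 5. The following is a minimal sketch, not part of this repository: the function names and the EC2_IDENTITY_FILE variable are hypothetical, and the security group ID and CIDR block passed to it are placeholders you would replace with your own. It only prints the `aws ec2 authorize-security-group-ingress` commands (assuming the AWS CLI is installed) so that you can review them before running any of them.

```shell
# Hypothetical pre-flight helpers; none of these names exist in hadoop-on-ec2.

# Ports from step 5. "8030-8100" is a range; the rest are single ports.
CLUSTER_PORTS="22 8030-8100 50070 50075 50475 50105 50470 2181 50090"

# Print the name of each required AWS variable that is unset or empty.
missing_aws_vars() {
  for var_name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Look up the value of the variable whose name is stored in var_name.
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
      echo "$var_name"
    fi
  done
}

# Succeed only when EC2_IDENTITY_FILE (an assumed variable name) points
# to a readable file.
identity_file_ready() {
  [ -n "${EC2_IDENTITY_FILE:-}" ] && [ -r "$EC2_IDENTITY_FILE" ]
}

# Print one ingress command per entry in CLUSTER_PORTS.
# Arguments: security group ID, CIDR block allowed to connect.
print_ingress_commands() {
  for port in $CLUSTER_PORTS; do
    echo "aws ec2 authorize-security-group-ingress --group-id $1 --protocol tcp --port $port --cidr $2"
  done
}
```

For example, `missing_aws_vars` prints nothing when both keys are set, and `print_ingress_commands <your-group-id> <your-cidr>` lists nine commands, one per port entry above. Printing instead of executing keeps the sketch free of side effects on your AWS account.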

Enjoy!

hadoop-on-ec2's People

Contributors

cjjavellana

