Hadoop-Essentials
This is the code repository for Hadoop-Essentials, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.
Software Requirements
All of the code is organized into folders. Each folder starts with C followed by the chapter number. For example, HDFS commands.txt is a file from Chap 1, and HBase commands.txt is one from Chap 5.
What you need for the code files:
Prior knowledge of Java programming and the basics of distributed computing will be very helpful, as will an interest in understanding Hadoop and its ecosystem components. The code and syntax have been tested with Hadoop 2.4.1 and compatible versions of the ecosystem components, but may vary in newer versions.
Software and Hardware requirements:
- Apache Hadoop 2.x - Install Ambari 1.7.0 - At least a 4-node cluster with average configuration and 16 GBit Ethernet - Linux
- Hive, Pig - Install Ambari 1.7.0 - At least a 4-node cluster with average configuration and 16 GBit Ethernet - Linux
- HBase - Install Ambari 1.7.0 - At least a 4-node cluster with average configuration and 16 GBit Ethernet - Linux
- Sqoop, Flume - Sqoop 1.4.5, Flume 1.5.2 - At least a 4-node cluster with average configuration and 16 GBit Ethernet - Linux
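Because the code was tested against Hadoop 2.4.1 and may behave differently on newer releases, it can help to check the installed version before working through the chapters. The following is a minimal sketch, not part of the book's code: it assumes the `hadoop` command is on the PATH and that `hadoop version` prints a first line of the form `Hadoop X.Y.Z`. The helper names `parse_hadoop_version` and `describe` are hypothetical and introduced only for illustration.

```python
import re
import subprocess

# Version this README states the code was tested with.
TESTED_VERSION = (2, 4, 1)


def parse_hadoop_version(output):
    """Return (major, minor, patch) from a 'Hadoop X.Y.Z' line, or None if absent."""
    match = re.search(r"^Hadoop (\d+)\.(\d+)\.(\d+)", output, re.MULTILINE)
    return tuple(int(part) for part in match.groups()) if match else None


def describe(version):
    """Return a short compatibility note (hypothetical helper, illustrative wording)."""
    if version is None:
        return "could not determine the Hadoop version"
    if version == TESTED_VERSION:
        return "matches the tested version"
    if version[0] == TESTED_VERSION[0]:
        return "same 2.x line as the requirements; minor differences are possible"
    return "different major version; commands and syntax may differ"


if __name__ == "__main__":
    try:
        # Assumption: 'hadoop' is installed and available on the PATH.
        result = subprocess.run(
            ["hadoop", "version"], capture_output=True, text=True, check=True
        )
        print(describe(parse_hadoop_version(result.stdout)))
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("the hadoop command was not found or returned an error")
```

Run the script on any cluster node. It only reports how the installed version compares with 2.4.1 and does not change any configuration.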