Apache Hadoop Essentials
This one-day course is designed to help both IT professionals and decision-makers understand the concepts and benefits of Apache Hadoop and how it can help them meet business goals.
You will gain a solid understanding of the Hadoop technology stack, including MapReduce, HDFS, Hive, Pig, and HBase, along with an initial introduction to Mahout and other common utilities.
By the end of this course you will understand:
- The essential components of a Hadoop-based data management solution
- Pros and cons of implementing Hadoop
- How Hadoop fits into your existing environment and architecture
- The differences between various Hadoop distributions
- History & Background
- Real-world use cases and case studies
The Hadoop Platform
- Introduction to MapReduce and the Hadoop Distributed File System (HDFS)
- Data Warehousing with Hive
- Parallel processing with Pig
- Data mining with Mahout
- Data storage with HBase
- Common utilities – Sqoop, Flume, Hue, Scribe, ZooKeeper, HCatalog
- Hadoop distributions – Apache Software Foundation, Cloudera, Hortonworks, MapR, IBM
The Future of Hadoop
- YARN – next-generation MapReduce
- Other programming paradigms on Hadoop