Hadoop Data Science Training

Objective

Hadoop is invaluable when it comes to processing big data. This course introduces you to Hadoop, its file system (HDFS), its processing engine (MapReduce), and its many libraries and programming tools.

This course is designed for beginners who want to start a career as a data scientist, and it progresses from introductory to advanced topics.

Our instructor will show you how to set up a Hadoop development environment, run and optimize MapReduce jobs, code basic queries with Hive and Pig, and build workflows to schedule jobs.

Course Contents

  • Understanding Hadoop core components: HDFS and MapReduce
  • Setting up your Hadoop development environment
  • Working with the Hadoop file system
  • Running and tracking Hadoop jobs
  • Tuning MapReduce
  • Understanding Hive and HBase
  • Exploring Pig tools
  • Building workflows
  • Using other libraries, such as Impala, Mahout, and Storm
  • Understanding Spark
  • Visualizing Hadoop output