
Hadoop Essentials

Course Summary

The Hadoop Essentials training course demonstrates how to develop on the Hadoop platform.

The course begins with an overview of common use cases for Hadoop. Next, it explores the Hadoop Distributed File System (HDFS) Shell. The course concludes by working with MapReduce jobs, Pig, and Apache Hive.

Purpose
Learn the fundamentals of the Hadoop platform.
Audience
Anyone wanting to develop solutions on the Hadoop platform. Basic Java experience recommended.
Role
Business Analyst - Project Manager - Software Developer - Technical Manager - Web Developer
Skill Level
Intermediate
Style
Workshops
Duration
4 Days
Related Technologies
Java | Hadoop | Apache

 

Productivity Objectives
  • Discover the fundamentals of the Hadoop ecosystem
  • Gain exposure to the main Hadoop processing products
  • Identify the fundamentals of MapReduce, both at a low level and through higher-level frameworks

What You'll Learn:

In the Hadoop Essentials training course, you'll learn:
  • Hadoop
    • Overview
    • Common use cases
  • HDFS
    • Overview
    • Architecture
  • HDFS Shell
    • HDFS Components
    • HDFS Shell
  • HDFS Java API (see the sketch after this outline)
    • Overview
    • Java basics
  • Moving Data - Sqoop
    • Use Cases/Examples
    • How to use Sqoop to move data
  • Moving Data - Flume
    • Use Cases/Examples
    • How to use Flume to move data
    • Which tool to use when
  • HBase
    • Overview
    • Use Cases (when to use HBase)
    • HBase Architecture
    • Design HBase tables
    • Storage Model
  • HBase Shell
    • Runtime Modes
    • HBase Shell overview
    • HBase DML
    • HBase DDL
  • HBase Java Client API (Data Access and Admin; see the sketch after this outline)
    • Overview
    • Use the Client API to Access HBase
    • Basic HBase operations
  • HBase: Advanced Data Access
    • Architecture
    • Scan Data Retrieval
    • Result Scanner
    • Scanner Cache
    • Scanner Batch
    • Filters
  • MapReduce on YARN
    • Overview
    • History (V1 vs V2)
    • MapReduce Workflow
    • Case Study/Example
    • MapReduce Framework Components
    • MapReduce Configuration
  • First MapReduce Job with Java (see the sketch after this outline)
    • Overview
    • Job Components (InputFormat, OutputFormat, etc.)
    • Mapper
    • Reducer
    • Job configuration
  • MapReduce Job Execution
    • Components
    • Distributed Cache
    • Job Execution on YARN
    • Failures
  • Hadoop Streaming
    • Overview
    • Other Languages for MapReduce
    • Streaming Job Components
  • Workflows
    • Code workflows in Java
    • Use Oozie for Workflows
    • Comparison
    • Use Case example
  • Apache Pig
    • Pig Architecture
    • Pig and Map Reduce
    • Pig access options
    • Pig Components
    • Run Pig
    • Basic Pig Scripts
  • Joining Data Sets with Pig
    • Inner/Outer/Full Joins
    • Build a Pig Script to Join Datasets
    • Cogroups
  • Apache Hive
    • Overview
    • Example/Use Case from Industry
    • Hive Architecture
    • Hive MetaStore
    • Hive access options
    • Create Databases/Tables
    • Loading data
    • External vs Internal tables
    • Partitions
    • Bucketing
    • Joins
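
The outline above includes an HDFS Java API module. As a taste of what that module covers, here is a minimal sketch that reads a text file from HDFS through org.apache.hadoop.fs.FileSystem. The path /user/example/input.txt is a made-up placeholder, and the sketch assumes core-site.xml and hdfs-site.xml are on the classpath so the client can locate the cluster.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadExample {
        public static void main(String[] args) throws Exception {
            // Cluster settings are picked up from core-site.xml / hdfs-site.xml on the classpath
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Placeholder path; point this at a file that exists on your cluster
            Path path = new Path("/user/example/input.txt");

            // Open the file and print it line by line
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }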
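
For the HBase Java Client API module, the sketch below writes one cell and reads it back with the standard Connection/Table client. The table name "users", the column family "info", and the row key "row1" are illustrative only, and connection settings are assumed to come from an hbase-site.xml on the classpath.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseClientExample {
        public static void main(String[] args) throws Exception {
            // Reads the ZooKeeper quorum and other settings from hbase-site.xml
            Configuration conf = HBaseConfiguration.create();

            // Table "users" with column family "info" is assumed to exist already
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("users"))) {

                // Write one cell: row "row1", family "info", qualifier "name"
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Ada"));
                table.put(put);

                // Read the same cell back
                Get get = new Get(Bytes.toBytes("row1"));
                Result result = table.get(get);
                byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
                System.out.println(Bytes.toString(value));
            }
        }
    }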
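
The First MapReduce Job with Java module assembles a job from a Mapper, a Reducer, and a driver that configures the job. The classic word-count program below is one compact illustration of how those pieces fit together; input and output paths come from the command line, and the class names are examples rather than course code.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emit (word, 1) for every token in the input line
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reducer: sum the counts emitted for each word
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        // Driver: job configuration; input and output paths come from the command line
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
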
“I appreciated the instructor's technique of writing live code examples rather than using fixed slide decks to present the material.”

VMware

Dive in and learn more

When transforming your workforce, it's important to have expert advice and tailored solutions. We can help. Tell us your unique needs and we'll explore ways to address them.
