Hadoop Essentials

The Hadoop Essentials training course is a foundational class for anyone wanting to develop on the Hadoop platform. Attendees will gain experience developing code for the major component projects in the big data ecosystem.

The course starts with an overview of common use cases for Hadoop, then covers working with the Hadoop Distributed File System (HDFS) shell. From there it moves into writing MapReduce jobs and finishes up working with Apache Pig and Apache Hive.
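For a flavor of the HDFS shell material, the basic file operations mirror familiar Unix commands (the paths and file names below are illustrative, and assume a running Hadoop cluster):

```shell
# List the contents of an HDFS directory
hdfs dfs -ls /user/hadoop

# Copy a local file into HDFS
hdfs dfs -put sales.csv /user/hadoop/sales.csv

# Print a file stored in HDFS
hdfs dfs -cat /user/hadoop/sales.csv

# Copy a file from HDFS back to the local filesystem
hdfs dfs -get /user/hadoop/sales.csv ./sales-copy.csv
```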

Course Summary

Purpose: 
Learn the fundamentals of the Hadoop platform.
Audience: 
Anyone wanting to develop solutions on the Hadoop platform. Basic Java experience recommended.
Skill Level: 
Learning Style: 
Hands On

Hands-on training is customized, instructor-led training with an in-depth presentation of a technology and its concepts, featuring such topics as Java, OOAD, and Open Source.
Duration: 
4 Days
Productivity Objectives: 
  • Learn the fundamentals of the Hadoop ecosystem
  • Gain exposure to the main Hadoop processing products
  • Learn the fundamentals of MapReduce at a low level, as well as the higher-level frameworks built on it
Hadoop Essentials is part of the Apache Training curriculum.

What You'll Learn

In the Hadoop Essentials training course you’ll learn:

  • Hadoop
    • Overview
    • Common Use Cases
  • HDFS
    • Overview
    • Architecture
  • HDFS Shell
    • HDFS Components
    • HDFS Shell
  • HDFS Java API
    • Overview
    • Java basics
  • Moving Data – Sqoop
    • Use Cases/Examples
    • How to use Sqoop to move data
  • Moving Data – Flume
    • Use Cases/Examples
    • How to use Flume to move data
    • What tool when?
  • HBase
    • Overview
    • Use Cases (when would you use it?)
    • HBase Architecture
    • Designing HBase Tables
    • Storage Model
  • HBase Shell
    • Runtime Modes
    • HBase Shell Overview
    • HBase DML
    • HBase DDL
  • HBase Java Client API (Data Access and Admin)
    • Overview
    • Using the Client API to Access HBase
    • Basic HBase Operations
  • HBase: Advanced Data Access
    • Architecture
    • Scan Data Retrieval
    • ResultScanner
    • Scanner Caching
    • Scanner Batching
    • Filters
  • MapReduce on YARN
    • Overview
    • History (V1 vs V2)
    • MapReduce Workflow
    • Case Study/Example
    • MapReduce Framework Components
    • MapReduce Configuration
  • First MapReduce Job with Java
    • Overview
    • Job Components (InputFormat, OutputFormat, etc.)
    • Mapper
    • Reducer
    • Job Configuration
  • MapReduce Job Execution
    • Components
    • Distributed Cache
    • Job Execution on YARN
    • Failures
  • Hadoop Streaming
    • Overview
    • Other Languages for MapReduce
    • Streaming Job Components
  • Workflows
    • Coding workflows in Java
    • Using Oozie for Workflows
    • Comparison
    • Use Case example
  • Apache Pig
    • Pig Architecture
    • Pig and MapReduce
    • Pig access options
    • Pig Components
    • Running Pig
    • Basic Pig Scripts
  • Joining Data Sets with Pig
    • Inner/Outer/Full Joins
    • Building a Pig Script to Join Datasets
    • Cogroups
  • Apache Hive
    • Overview
    • Example/Use Case from Industry
    • Hive Architecture
    • Hive MetaStore
    • Hive access options
    • Creating Databases/Tables
    • Loading data
    • External vs Internal tables
    • Partitions
    • Bucketing
    • Joins
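As a taste of the MapReduce material in the outline above, the word-count sketch below simulates the map, shuffle, and reduce phases in a single JVM using plain Java collections. It deliberately uses no Hadoop APIs; the class and method names here are illustrative only, not the Hadoop framework's interfaces.

```java
import java.util.*;
import java.util.stream.*;

// A miniature, single-JVM sketch of the MapReduce word-count pattern.
// Real Hadoop jobs distribute these phases across a cluster; the data
// flow (map -> shuffle/group by key -> reduce) is the same.
public class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every word in an input line.
    static Stream<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1));
    }

    // "Shuffle" + "reduce": group the emitted pairs by key and sum values.
    static Map<String, Integer> run(List<String> lines) {
        return lines.stream()
                .flatMap(WordCountSketch::map)
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                run(List.of("the quick brown fox", "the lazy dog"));
        System.out.println(counts); // "the" appears twice, all others once
    }
}
```

In a real Hadoop job the same logic lives in a `Mapper` and a `Reducer` class, and the framework performs the shuffle between them.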

Meet Your Instructor

Michael

Michael is a practicing software developer, course developer, and trainer with DevelopIntelligence. For the majority of his career, Michael has designed and implemented large-scale, enterprise-grade, Java-based applications at major telecommunications and Internet companies, such as Level3 Communications, US West/Qwest/Century Link, Orbitz, and others.

Michael has a passion for learning new technologies, patterns, and paradigms (or, he has a tendency to get bored or disappointed with current ones)....

Meet Michael »
Rich

Rich is a full-stack generalist with a deep and wide background in architecting, developing, and maintaining web-scale, mission-critical custom applications, and in building and leading extraordinary technology teams.

He has spent roughly equal thirds of his two-decade career in the Fortune 500, government, and start-up arenas, where he’s served in roles from trench-level core developer to VP of Engineering. He currently spends the majority of his time sharing his knowledge about Amazon Web...

Meet Rich »
Sujee

Sujee has been developing software for 15 years. In recent years he has been consulting on and teaching Hadoop, NoSQL, and cloud technologies. Sujee stays active in the Hadoop and open source community: he runs a developer-focused meetup and Hadoop hackathons called ‘Big Data Gurus’, has presented at a variety of meetups, and contributes to the Hadoop project and other open source projects. He writes about Hadoop and other technologies on his website.

Meet Sujee »

Contact us to learn more

Not all training courses are created equal. Let the customization process begin! We'll work with you to design a custom Hadoop Essentials training course that meets your specific needs.

DevelopIntelligence has been in the technical/software development learning and training industry for nearly 20 years. We’ve provided learning solutions to more than 48,000 engineers across 220 organizations worldwide.

About DevelopIntelligence

Need help finding the right learning solution?   Call us: 877-629-5631