Develop Intelligence
Hadoop Administration
Learn to maintain/operate a Hadoop cluster.

The Hadoop Administration training course provides participants with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster, from installation and configuration through load balancing and tuning. You’ll get experience with some of the most common and challenging scenarios Hadoop administrators see in the real world and become familiar with the most up-to-date details of the platform.

After this course, you will be able to:

  • Apply the fundamentals of standing up a Hadoop cluster
  • Configure Hadoop for high availability
  • Apply solid fundamental configurations for maximizing Hadoop operations

This course will be delivered in 4 days.

DI will work with you and your team to define the most appropriate delivery structure, schedule, and dates. These will be determined by project schedule, team availability, classroom availability, and DI’s instructor availability.

In the Hadoop Administration training course you’ll learn:

  • Hadoop Introduction
    • A Brief History of Hadoop
    • Core Hadoop Components
    • Fundamental Concepts

  • Planning Your Hadoop Cluster
    • General Planning Considerations
    • Choosing Hardware
    • Network Considerations
    • Configuring Nodes
    • Planning for Cluster Management
  • HDFS
    • HDFS Features
    • Writing and Reading Files
    • NameNode Considerations
    • HDFS Security
    • NameNode Web UI
    • Hadoop File Shell
  • Getting Data into HDFS
    • Pulling data from External Sources with Flume
    • Importing Data from Relational Databases with Sqoop
    • REST Interfaces
    • Best Practices
  • MapReduce
    • MapReduce Overview
    • Features of MapReduce
    • Architectural Overview
    • YARN – MapReduce Version 2
    • Failure Recovery
    • The JobTracker Web UI
  • Hadoop Installation and Initial Configuration
    • Configuration & Deployment Types
    • Installing Hadoop
    • Specifying the Hadoop Configuration
    • Initial HDFS & MapReduce Configuration
    • Log Files
  • Installing/Configuring Hive, Impala, and Pig
    • Hive
    • Impala
    • Pig
  • Hadoop Clients
    • What is a Hadoop Client?
    • Installing and Configuring Hadoop Clients
    • Installing and Configuring Hue
    • Hue Authentication and Configuration
  • Advanced Cluster Configuration
    • Advanced Configuration Parameters
    • Configuring Hadoop Ports
    • Explicitly Including and Excluding Hosts
    • Configuring HDFS for Rack Awareness & HDFS High Availability
  • Hadoop Security
    • Why Hadoop Security Is Important
    • Hadoop’s Security System Concepts
    • What Kerberos Is and How It Works
    • Securing a Hadoop Cluster with Kerberos
  • Managing and Scheduling Jobs
    • Managing Running Jobs
    • Scheduling Hadoop Jobs
    • Configuring the FairScheduler
  • Cluster Maintenance
    • Checking HDFS Status
    • Copying Data Between Clusters
    • Adding/Removing Cluster Nodes
    • Rebalancing the Cluster
    • NameNode Metadata Backup
    • Cluster Upgrades
  • Cluster Monitoring and Troubleshooting
    • General System Monitoring
    • Managing Hadoop’s Log Files
    • Monitoring the Clusters
    • Common Troubleshooting Issues