Course Summary
The Big Data Fast Track provides a thorough introduction to the big data ecosystem for developer and developer operations (DevOps) job roles. Attendees are introduced to, and work hands-on with, all the major component frameworks in the ecosystem.
- Apache Spark
- Hadoop
- Scala
- Java
- Apache Kafka
Productivity Objectives:
- Gain development as well as operational knowledge of Hadoop
- Gain exposure to the major Hadoop ecosystem products
- Learn the use cases where Big Data technology has the greatest impact
Request Information
Get your team upskilled or reskilled today. Chat with one of our experts to create a custom training proposal. Fully customized at no additional cost.

If you are not completely satisfied with your training class, we'll give you your money back.




About Our Training
- Real-World Content: Project-focused demos and labs using your tool stack and environment, not some canned "training room" lab.
- Expert Practitioners: Industry experts with 15+ years of experience who bring their battle scars into the classroom.
- Experiential Learning: More coding than lecture, coupled with architectural and design discussions.
- Fully Customized: One-size-fits-all doesn't apply to training teams. That's where we come in!
What You'll Learn
In the Big Data Fast Track training course, you'll learn:
- HDFS
- Overview
- Architecture
- HDFS Shell
- HDFS Components
- HDFS Shell
- Getting Data into HDFS
- Pulling data from External Sources with Flume
- Importing Data from Relational Databases with Sqoop
- REST Interfaces
- Best Practices
- Moving Data – Sqoop
- Use Cases/Examples
- How to use Sqoop to move data
- Moving Data – Flume
- Use Cases/Examples
- How to use Flume to move data
- What tool when
- HBase
- Overview
- Use Cases (When would you use it)
- HBase Architecture
- Designing HBase tables
- Storage Model
- HBase Shell
- Runtime Modes
- HBase Shell overview
- HBase DML
- HBase DDL
- HBase Java Client API (Data Access and Admin)
- Overview
- Using the Client API to Access HBase
- Basic HBase operations
- MapReduce on YARN
- Overview
- History (V1 vs V2)
- MapReduce Workflow
- Case Study/Example
- MapReduce Framework Components
- MapReduce Configuration
- First MapReduce Job with Java
- Overview
- Job Components (InputFormat, OutputFormat, etc.)
- Mapper
- Reducer
- Job configuration
- MapReduce Job Execution
- Components
- Distributed Cache
- Job Execution on YARN
- Failures
- Apache Oozie
- Overview
- Job Scheduling with Oozie
- Creating declarative workflows
- Apache Pig
- Pig Architecture
- Pig and MapReduce
- Pig access options
- Pig Components
- Running Pig
- Basic Pig Scripts
- Joining Data Sets with Pig
- Inner/Outer/Full Joins
- Building a Pig Script to Join Datasets
- Cogroups
- Apache Hive
- Overview
- Example Use Case from Industry
- Hive Architecture
- Hive MetaStore
- Hive access options
- Creating Databases/Tables
- Loading data
- External vs Internal tables
- Partitions
- Bucketing
- Joins
- Hadoop Clients
- What is a Hadoop Client
- Installing and Configuring Hadoop Clients
- Installing and Configuring Hue
- Hue Authentication and Configuration
- Hadoop Security
- Why Hadoop Security Is Important
- Hadoop's Security System Concepts
- What Kerberos Is and How it Works
- Securing a Hadoop Cluster with Kerberos
- Managing and Scheduling Jobs
- Managing Running Jobs
- Scheduling Hadoop Jobs
- Configuring the FairScheduler
- Cluster Monitoring and Troubleshooting
- General System Monitoring
- Managing Hadoop's Log Files
- Monitoring the Clusters
- Common Troubleshooting Issues
- Apache Kafka
- Overview
- Use Cases
- Ecosystem
- Producer API
- Consumer API
- High Level
- Simple
- Configuration
- Broker
- Consumer
- Producer
- New Producer
- Design Points
- Persistence
- Producer
- Consumer
- Message Delivery
- Replication
- Log Compaction
- Apache Storm
- Overview
- General Architecture
- Messaging characteristics
- Spouts
- Bolts
- Deploying a topology
- Fault tolerance
- The Trident API
- API Overview
- Spouts
- Storm Metrics
- Integrating Storm with other Big Data frameworks
- Apache Spark
- What is Apache Spark
- Quick Intro to Scala
- Basic Syntax
- Scala Hello World
- Spark Basics
- Using the Spark Shell
- Resilient Distributed Datasets (RDDs)
- Functional Programming with Spark
- The Hadoop Distributed File System
- Why HDFS
- HDFS Architecture
- Using HDFS
- Spark and Hadoop
- Spark and the Hadoop Ecosystem
- Spark and MapReduce
- RDDs
- RDD Operations
- Key-Value Pair RDDs
- MapReduce and Pair RDD Operations
- Running Spark on a Cluster
- Standalone Cluster
- The Spark Standalone Web UI
- Parallel Programming with Spark
- RDD Partitions and HDFS Data Locality
- Working With Partitions
- Executing Parallel Operations
- Caching and Persistence
- Distributed Persistence
- Caching
- Writing Spark Applications
- SparkContext
- Spark Properties
- Building and Running a Spark Application
- Logging
- Spark Streaming
- Streaming Overview
- Sliding Window Operations
- Spark Streaming Applications
- Common Spark Algorithms
- Iterative Algorithms
- Graph Analysis
- Machine Learning
- Improving Spark Performance
- Shared Variables: Broadcast Variables
- Shared Variables: Accumulators
- Common Performance Issues
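To give a flavor of the Mapper and Reducer topics in the outline above, here is a minimal sketch of the map, shuffle, and reduce phases as a word count in plain Java. It deliberately avoids the Hadoop API so it runs with only the JDK (Java 9 or later); the class name `WordCountSketch` and its method names are illustrative assumptions, not course material or Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, JDK-only illustration of the MapReduce idea; not the Hadoop API.
public class WordCountSketch {

    // "Map" phase: turn one line of text into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // "Shuffle" phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // "Reduce" phase: collapse the values for one key into a single count.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in sequence over a small in-memory input.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {big=2, data=2, fast=1, hadoop=1, on=1, track=1}
        System.out.println(wordCount(List.of("big data fast track", "big data on hadoop")));
    }
}
```

In a real Hadoop job the framework performs the shuffle across the cluster and the map and reduce steps run as distributed tasks; this sketch only mirrors the data flow on a single machine.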

Elite Instructor Program
We recently launched our internal Elite Instructor Program. This community-driven program is designed to support instructors in transforming students’ lives by consistently demonstrating world-class engagement, ability, and teaching prowess. Reach out today to learn more about our instructors.
Customized Technical Learning Solutions to Help Attract and Retain Talented Developers
Let DI help you design solutions to onboard, upskill or reskill your software development organization. Fully customized. 100% guaranteed.
DevelopIntelligence leads technical and software development learning programs for Fortune 500 companies. We provide learning solutions for hundreds of thousands of engineers for over 250 global brands.



“I appreciated the instructor’s technique of writing live code examples rather than using fixed slide decks to present the material.”
VMware
About Us
DevelopIntelligence has been in the technical/software development learning and training industry for nearly 20 years. We’ve provided learning solutions to more than 48,000 engineers, across 220 organizations worldwide.

- Boulder, Colorado Headquarters: 980 W. Dillon Road, Louisville, CO 80027
© 2013 - 2022 DevelopIntelligence LLC - Privacy Policy