Introduction to Apache Spark

Course Summary

The Introduction to Apache Spark training course teaches the skills needed to work with Apache Spark, an open-source data processing engine in the Hadoop ecosystem that is optimized for speed and advanced analytics.

The course begins by examining how to use Spark as an alternative to traditional MapReduce processing. Next, it explores how Spark supports streamed data processing and iterative algorithms. The course concludes with a lesson on how Spark enables jobs to run faster than traditional Hadoop MapReduce.

Purpose	Learn how to use Apache Spark as an alternative to traditional MapReduce processing.
Audience	Developers working on projects that use traditional Hadoop MapReduce.
Role	Software Developer
Skill Level	Introduction
Style	Hack-a-thon - Learning Spikes - Workshops
Duration	3 Days
Related Technologies	Apache Spark | Hadoop | Apache

Productivity Objectives

- Describe how Apache Spark and Hadoop fit together
- List three motivations for using Spark
- Understand Resilient Distributed Datasets (RDDs)
- Implement an application using the key Spark concepts

What You'll Learn:

In the Introduction to Apache Spark training course, you'll learn:

Spark Basics
- What is Apache Spark?
- Using the Spark Shell
- Resilient Distributed Datasets (RDDs)
- Functional Programming with Spark
The Hadoop Distributed File System (HDFS)
- Why HDFS?
- HDFS Architecture
- Using HDFS
Spark and Hadoop
- Spark and the Hadoop Ecosystem
- Spark and MapReduce
RDDs
- RDD Operations
- Key-Value Pair RDDs
- MapReduce and Pair RDD Operations
Running Spark on a Cluster
- Standalone Cluster
- The Spark Standalone Web UI
Parallel Programming with Spark
- RDD Partitions and HDFS Data Locality
- Working With Partitions
- Executing Parallel Operations
Caching and Persistence
- Distributed Persistence
- Caches
Writing Spark Applications
- SparkContext
- Spark Properties
- Building and Running a Spark Application
- Logging
Spark Streaming
- Streaming Overview
- Sliding Window Operations
- Spark Streaming Applications

Real-World Content

Project-focused demos and labs using your tool stack and environment, not some canned "training room" lab.

Expert Practitioners

Industry experts who bring their battle scars into the classroom.

Experiential Learning

More coding than lecture, coupled with architectural and design discussions.

Tailored Outlines

One-size-fits-all doesn't apply to training teams. That's where we come in!

“I appreciated the instructor's technique of writing live code examples rather than using fixed slide decks to present the material.”

VMware
