The Introduction to Apache Spark training course is designed to teach the skills needed to work with Apache Spark, an open-source engine in the Hadoop ecosystem optimized for speed and advanced analytics.
The course begins by examining how to use Spark as an alternative to traditional MapReduce processing. Next, it explores how Spark supports streaming data processing and iterative algorithms. The course concludes with a lesson on how Spark enables jobs to run faster than traditional Hadoop MapReduce.
| Purpose | Learn how to use Apache Spark as an alternative to traditional MapReduce processing. |
| Audience | Developers working on projects that use traditional Hadoop MapReduce. |
| Role | Software Developer |
| Skill Level | Introduction |
| Style | Hack-a-thon - Learning Spikes - Workshops |
| Duration | 3 Days |
| Related Technologies | Apache Spark, Hadoop, Apache |
Productivity Objectives
- Describe how Apache Spark and Hadoop fit together
- List three motivations for using Spark
- Explain the role of Resilient Distributed Datasets (RDDs)
- Implement an application using the key Spark concepts
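As a taste of what the course builds toward, here is a minimal word-count sketch using Spark's Scala RDD API. It assumes a local-mode Spark setup, and the file path `input.txt` is a placeholder; it is an illustration of the RDD concepts above, not course material.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Local-mode configuration for experimentation;
    // a real deployment would point at a cluster master URL.
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Build an RDD from a text file (the path is a placeholder).
    val lines = sc.textFile("input.txt")

    // The classic MapReduce word count, expressed as RDD transformations.
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Actions such as take() trigger the lazy computation.
    counts.take(10).foreach(println)
    sc.stop()
  }
}
```

Transformations like `flatMap` and `reduceByKey` are lazy; Spark only computes results when an action such as `take` runs, which is one of the motivations for Spark covered in the course.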