The Introduction to Apache Spark in Production training course is designed to demonstrate the basics of running Spark in a production setting. The course addresses hardware-specific considerations as well as Spark's architecture and internals.
The course begins with a focus on Spark architecture, internals, and hardware considerations. Next, it covers streaming jobs and integrating Spark with Hortonworks. The course concludes with a lesson on job scheduling and monitoring.
Purpose
|
Learn about the architecture and internals of Spark, a fast and general engine for big data processing with built-in modules for streaming, SQL, machine learning, and graph processing. |
Audience
|
Administrators and DevOps engineers who will be responsible for maintaining a Spark deployment |
Role
| Business Analyst - Software Developer - System Administrator - Technical Manager |
Skill Level
| Intermediate |
Style
| Hack-a-thon - Learning Spikes - Workshops |
Duration
| 2 Days |
Related Technologies
| Apache Spark | Hadoop | Apache |
Productivity Objectives
- Identify how to install and configure a production Spark cluster
- Integrate Spark with YARN
- Discover the internals of Spark that apply to running an efficient cluster
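As a taste of the YARN integration covered above, the sketch below shows the kind of `spark-defaults.conf` settings a production cluster typically tunes when Spark runs on YARN. The specific values are illustrative assumptions, not recommendations; appropriate numbers depend on node hardware and workload.

```
# spark-defaults.conf — illustrative sketch for a Spark-on-YARN deployment
# (values are placeholder assumptions; size them to your cluster)

# Run against the YARN resource manager instead of standalone mode
spark.master                     yarn

# Per-executor resources; keep executor memory within the YARN
# container limits (yarn.scheduler.maximum-allocation-mb)
spark.executor.memory            4g
spark.executor.cores             2

# Let YARN grow and shrink the executor pool with the workload
spark.dynamicAllocation.enabled  true
spark.shuffle.service.enabled    true

# Persist event logs so the history server can show completed jobs
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark-logs
```

With such a configuration in place, jobs are typically submitted with `spark-submit --master yarn --deploy-mode cluster ...`, letting YARN handle container placement and resource enforcement.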