Introduction to Apache Spark in Production

The Introduction to Apache Spark in Production training course focuses on running Spark in a production setting. It covers hardware-specific considerations as well as the architecture and internals of Spark. Key topics include job scheduling and special considerations for streaming jobs.

Course Summary

Learn the architecture and internals of Spark, a fast and general engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
Audience: Administrators and DevOps engineers who will be responsible for maintaining a Spark deployment
Skill Level: 
Learning Style: Hands-on

Hands-on training is customized, instructor-led training with an in-depth presentation of a technology and its concepts, featuring such topics as Java, OOAD, and Open Source.

Duration: 2 Days
Productivity Objectives: 
  • Install and configure a production Spark cluster
  • Integrate Spark with YARN
  • Learn the internals of Spark that apply to running an efficient cluster
Introduction to Apache Spark in Production is part of the Apache Training curriculum.

What You'll Learn

In the Introduction to Apache Spark in Production training course you’ll learn:

  • Spark Architecture
  • Spark Internals
  • Hardware Considerations
  • Streaming Jobs
  • Integrating Spark with Hortonworks
  • Job Scheduling
    • Standalone
    • Mesos
    • YARN
    • Fair Scheduler
  • Monitoring
    • What to monitor
    • Diagnosing job issues

Meet Your Instructor

Michael is a practicing software developer, course developer, and trainer with DevelopIntelligence. For the majority of his career, Michael has designed and implemented large-scale, enterprise-grade, Java-based applications at major telecommunications and Internet companies, such as Level 3 Communications, US West/Qwest/CenturyLink, Orbitz, and others.

Michael has a passion for learning new technologies, patterns, and paradigms (or, he has a tendency to get bored or disappointed with current ones)....

Meet Michael »

Sujee has been developing software for 15 years. In the last few years he has been consulting on and teaching Hadoop, NoSQL, and Cloud technologies. Sujee stays active in the Hadoop and open source community. He runs a developer-focused meetup and Hadoop hackathons called ‘Big Data Gurus’. He has presented at a variety of meetups. Sujee contributes to the Hadoop project and other open source projects. He writes about Hadoop and other technologies on his website.

Meet Sujee »
Andrew S

Andrew is a mathematician turned software engineer who loves building systems. After graduating with a PhD in pure math, he became fascinated by software startups and has since spent 20 years learning. During this period, he’s worked on a wide variety of projects and platforms, including big data analytics, enterprise optimization, mathematical finance, cross-platform middleware, and medical imaging.

In 2001, Andrew served as company architect at ProfitLogic, a pricing optimization startup...

Meet Andrew S »

Contact us to learn more

Not all training courses are created equal. Let the customization process begin! We'll work with you to design a custom Introduction to Apache Spark in Production training course that meets your specific needs.

DevelopIntelligence has been in the technical/software development learning and training industry for nearly 20 years. We’ve provided learning solutions to more than 48,000 engineers, across 220 organizations worldwide.

About DevelopIntelligence


Need help finding the right learning solution?   Call us: 877-629-5631