The Introduction to Apache Spark in Production training course focuses on running Spark in a production setting. We cover hardware-specific considerations as well as the architecture and internals of Spark. Key topics include job scheduling and the special considerations that apply to streaming jobs.
- Install and configure a production Spark cluster
- Integrate Spark with YARN
- Learn the internals of Spark that apply to running an efficient cluster
What You'll Learn
In the Introduction to Apache Spark in Production training course you’ll learn:
- Spark Architecture
- Spark Internals
- Hardware Considerations
- Streaming Jobs
- Integrating Spark with Hortonworks
- Job Scheduling
- Fair Scheduler
- What to monitor
- Diagnosing job issues
Meet Your Instructor
Michael is a practicing software developer, course developer, and trainer with DevelopIntelligence. For the majority of his career, Michael has designed and implemented large-scale, enterprise-grade, Java-based applications at major telecommunications and Internet companies, such as Level3 Communications, US West/Qwest/Century Link, Orbitz, and others.
Michael has a passion for learning new technologies, patterns, and paradigms (or, he has a tendency to get bored or disappointed with current ones)...
Sujee
Sujee has been developing software for 15 years. In the last few years he has been consulting and teaching Hadoop, NoSQL, and cloud technologies. Sujee stays active in the Hadoop and open source community. He runs a developer-focused meetup and Hadoop hackathons called ‘Big Data Gurus’. He has presented at a variety of meetups. Sujee contributes to the Hadoop project and other open source projects. He writes about Hadoop and other technologies on his website.
Andrew S
Andrew is a mathematician turned software engineer who loves building systems. After graduating with a PhD in pure math, he became fascinated by software startups and has since spent 20 years learning. During this period, he’s worked on a wide variety of projects and platforms, including big data analytics, enterprise optimization, mathematical finance, cross-platform middleware, and medical imaging.
In 2001, Andrew served as company architect at ProfitLogic, a pricing optimization startup...