Course Summary
The Scalable Machine Learning (SML) course gives students exposure to scalable machine learning. It focuses on using the Hadoop and Spark frameworks to implement SML algorithms in the Scala and Python programming languages.
The course begins with an introduction to SML and why developers use Spark for it. Next, it covers data acquisition, data pre-processing for modeling, and working with iterative algorithms. It concludes with model evaluation, optimization, and deployment.
- Apache Spark - Hadoop - Python
- Productivity Objectives:
  - Describe the role of Spark in machine learning.
  - Apply machine learning to massive datasets.
  - Demonstrate experience in data acquisition, processing, analysis, and modeling using Hadoop and Spark.
  - Evaluate common data formats (e.g., CSV, XML, JSON, and social media data) for pre-processing and for building machine learning models with Spark.
  - Train, tune, test, and deploy machine learning models.
About Our Training
- Real-World Content: Project-focused demos and labs using your tool stack and environment, not some canned "training room" lab.
- Expert Practitioners: Industry experts with 15+ years of experience who bring their battle scars into the classroom.
- Experiential Learning: More coding than lecture, coupled with architectural and design discussions.
- Fully Customized: One-size-fits-all doesn't apply to training teams. That's where we come in!
What You'll Learn
In the Scalable Machine Learning training course, you'll learn:
- Introduction to SML
  - What is SML?
  - Why is it required?
  - Key platforms for performing SML
  - SML Project End-to-End Pipeline
  - Spark Introduction
  - Why Spark for SML?
  - Databricks Platform Demo
  - Approaches for scaling scikit-learn code
  - Hands-on Exercise(s): Experiencing the first notebook
- Why Spark for SML?
  - Problems with Traditional Machine Learning Frameworks
  - Machine Learning at Scale – Various options
  - Iterative Algorithms
  - How Spark performs well for iterative machine learning algorithms
  - Hands-on Exercise(s)
- SML on Enterprise Platform
  - Quick Recap/Introduction to Hadoop
  - Logical View of Cloudera Distribution
  - Big Data Analytics Pipelines
  - Components in Cloudera Distribution for performing SML
  - Hands-on Exercise(s)
- Data Acquisition at Scale
  - Acquiring structured content from relational databases
  - Acquiring semi-structured content from log files
  - Acquiring unstructured content from other key sources, such as the web
  - Tools for performing data acquisition at scale
  - Sqoop, Flume, and Kafka: introduction, use cases, and architectures
  - Hands-on Exercise(s)
- Data Pre-Processing for Modeling
  - Using the Spark Shell
  - Resilient Distributed Datasets (RDDs)
  - Functional Programming with Spark
  - RDD Operations
  - Key-Value Pair RDDs
  - MapReduce and Pair RDD Operations
  - Building and Running a Spark Application
  - Performing Data Validation
  - Data De-Duplication
  - Detecting Outliers
  - Hands-on Exercise(s)
- Working with Iterative Algorithms
  - Dealing with RDD Infinite Lineages
  - Caching Overview
  - Distributed Persistence
  - Checkpointing of an Iterative Machine Learning Algorithm
  - Hands-on Exercise(s)
- Spark SQL
  - Introduction
  - DataFrame API
  - Performing ad-hoc query analysis using Spark SQL
  - Hands-on Exercise(s)
- Spark Machine Learning Using MLlib
  - Spark ML vs. Spark MLlib
  - Data types and key terms
  - Feature Extraction
  - Linear Regression using Spark MLlib
  - Hands-on Exercise(s)
- Spark Machine Learning Using ML
  - Spark ML Overview
  - Transformers and Estimators
  - Pipelines
  - Implementing Decision Trees
  - K-Means Clustering using Spark ML
  - Hands-on Exercise(s)
- Decision Trees and Random Forest
  - Types – Classification and Regression trees
  - Gini Index, Entropy and Information Gain
  - Building Decision Trees
  - Pruning the trees
  - Prediction using Trees
  - Ensemble Models
  - Bagging and Boosting
  - Advantages of using Random Forest
  - Working with Random Forest
- Ensemble Learning
  - How ensemble learning works
  - Building models using Bagging
  - Random Forest algorithm
  - Random Forest model building
  - Fine-tuning hyperparameters
  - Hands-on Exercise(s)
- Model Evaluation, Optimization and Deployment
  - Model Evaluation
  - Optimizing a Model
  - Deploying a Model
  - Best Practices
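As a small taste of the "Gini Index, Entropy and Information Gain" topic listed in the outline, here is a minimal sketch in plain Python. It is not taken from the course materials: the function names and the toy labels are assumptions chosen for illustration, and no Spark is involved.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy data: a balanced parent node and one candidate split.
parent = ["yes"] * 5 + ["no"] * 5
split = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]

print(gini(parent))                     # 0.5
print(entropy(parent))                  # 1.0
print(information_gain(parent, split))  # about 0.278
```

A decision tree learner compares candidate splits with a measure like this and keeps the split with the highest gain (or the lowest weighted impurity).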
- Boulder, Colorado Headquarters: 980 W. Dillon Road, Louisville, CO 80027
© 2013 - 2022 DevelopIntelligence LLC - Privacy Policy