Python for Data Scientists

Course Summary

The Python for Data Scientists training course introduces the Python programming language, which continues to gain popularity among both developers and data scientists thanks to its rich ecosystem for data manipulation, data analytics, and machine learning.

This course begins by covering the fundamentals of Python, including data structures, loops, and list comprehensions. Next, it examines how to use the Python ecosystem to import and manipulate data, create summaries and exploratory visualizations, and perform standard hypothesis tests. The course concludes with an overview of basic machine learning methods and, time permitting, covers dynamic visualizations.
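
As a small illustration of the fundamentals mentioned above, the sketch below contrasts an explicit loop with the equivalent list comprehension. The temperature readings and variable names are hypothetical and chosen only for the example; they are not taken from the course material.

```python
# Hypothetical sample data: temperature readings in degrees Celsius.
celsius = [12.5, 18.0, 21.3, 9.8, 25.1]

# Explicit loop: convert each reading to Fahrenheit and collect the results.
fahrenheit_loop = []
for c in celsius:
    fahrenheit_loop.append(c * 9 / 5 + 32)

# The same conversion written as a list comprehension.
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

# A comprehension can also filter: keep only readings above 15 degrees.
warm_readings = [c for c in celsius if c > 15]

# A dictionary (another core data structure) mapping each reading to its conversion.
conversion_table = {c: f for c, f in zip(celsius, fahrenheit)}

print(fahrenheit == fahrenheit_loop)
print(warm_readings)
```

Both forms produce the same list; the comprehension is usually preferred when the body is a single expression.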

Purpose
Learn how to use Python to explore and analyze data, run basic regression models, visualize data, and apply some basic machine learning models to data.
Audience
Business Analysts, Data Engineers, or working Data Scientists who are new to Python.
Role
Business Analyst - Data Engineer - Data Scientist
Skill Level
Introduction
Style
Fast Track - Workshops
Duration
3 Days
Related Technologies
DevOps Training | Python | Machine Learning Training

 

Productivity Objectives
  • Use Python to read, manipulate, and clean data
  • Analyze and visualize data using Python
  • Perform predictive analysis using basic machine learning models
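
A minimal sketch of what the first two objectives might look like in practice, assuming pandas is installed. The CSV content, column names, and the choice to fill missing values with zero are all hypothetical and serve only as an illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
raw_csv = io.StringIO(
    "region,units,price\n"
    "north,10,2.5\n"
    "south,,3.0\n"
    "north,7,2.5\n"
    "south,4,3.0\n"
)

# Read: load the CSV into a DataFrame.
df = pd.read_csv(raw_csv)

# Clean: treat the missing unit count as 0 (an assumption) and cast to integers.
df["units"] = df["units"].fillna(0).astype(int)

# Manipulate: derive a revenue column from the existing columns.
df["revenue"] = df["units"] * df["price"]

# Analyze: total revenue per region.
revenue_by_region = df.groupby("region")["revenue"].sum()
print(revenue_by_region)
```

Whether a missing value should be filled, dropped, or imputed depends on the data; filling with zero is used here only to keep the example short.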

What You'll Learn:

The Python for Data Scientists training course covers the following topics:
  • Introduction to the Anaconda Python Distribution
    • Understand and utilize Jupyter notebooks
    • Understand the data science and machine learning libraries that are included
  • Introduction to Python
    • Dynamic typing
    • Primitive datatypes
    • Looping/list comprehensions
    • Modules and packages
  • Introduction to Pandas
    • Datatypes
    • Import data
      • CSV
      • Excel
      • SQL
    • Creating numerical summaries
    • Exploring data
    • Descriptive statistics
    • Basic probability distributions (Gaussian/normal, Poisson, chi-squared, binomial, exponential), including generating random numbers and finding critical values
    • Standard hypothesis tests (t-tests, z-tests, ANOVA, chi-squared tests) and basic non-parametric tests such as the Wilcoxon signed-rank and rank-sum tests
    • Dummy variables
    • Linear regression
    • Logistic regression
    • Evaluating regression models
    • Simulating data from probability distributions
    • Permutation tests and the bootstrap
    • Creating publication-quality graphics
  • Introduction to scikit-learn
    • Supervised vs. unsupervised learning
    • Classification vs. regression
    • Linear regression
    • Decision trees
    • Support vector machines
    • Ensemble models
    • Evaluating models
    • Fine-tuning your models
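
One topic in the outline above, permutation tests and the bootstrap, can be sketched with the Python standard library alone. The function names, the sample measurements, and the number of resamples below are hypothetical choices for illustration, not the course's own implementation.

```python
import random
import statistics


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled data and split it into two pseudo-groups.
        rng.shuffle(pooled)
        diff = abs(
            statistics.mean(pooled[: len(a)]) - statistics.mean(pooled[len(a):])
        )
        if diff >= observed:
            extreme += 1
    # Fraction of shuffles at least as extreme as the observed difference.
    return extreme / n_permutations


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.mean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lower = means[int(n_resamples * alpha / 2)]
    upper = means[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical measurements for two groups.
control = [5.1, 4.9, 5.6, 5.0, 5.3, 4.8]
treated = [6.2, 6.0, 5.9, 6.4, 6.1, 6.3]

p_value = permutation_p_value(control, treated)
ci_low, ci_high = bootstrap_mean_ci(treated)
print(f"permutation p-value: {p_value:.4f}")
print(f"95% bootstrap CI for treated mean: ({ci_low:.2f}, {ci_high:.2f})")
```

Both methods rely on resampling rather than distributional assumptions: the permutation test shuffles group labels, while the bootstrap resamples a single sample with replacement.
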
“I appreciated the instructor's technique of writing live code examples rather than using fixed slide decks to present the material.”

VMware
