Google Cloud for Data Scientists

The Google Cloud for Data Scientists training course is designed to prepare beginner data scientists and machine learning practitioners to implement regression and classification models in TensorFlow using both structured and unstructured data and then serve the models, elastically and resiliently, with Google Cloud.

Students will begin by getting to know the core data science and machine learning concepts that will be important throughout the course. Next, students will use Colab, Google's hosted Jupyter notebook environment, to prepare a structured dataset sourced from a publicly accessible, serverless data warehouse.

Students will use the dataset to explore basic features of TensorFlow, including its various Application Programming Interfaces (APIs), and learn how to use Google Cloud Machine Learning Engine for distributed training, hyperparameter tuning, and serving a model as a web-service-based API. Students will learn how to avoid the training-serving skew problem with an effective feature processing pipeline and, through case studies and best practices, explore the importance of feature engineering for building high-performance machine learning systems.
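One common way to avoid training-serving skew is to route both the training data and live prediction requests through a single preprocessing function, so the model sees identically computed features in both settings. The following is a minimal Python sketch of that idea under stated assumptions, not course material: the field names (`distance_km`, `day_of_week`, `fare`), the helper functions, and the toy model are all hypothetical.

```python
import math


def preprocess(raw):
    """Turn one raw record into model features.

    Hypothetical fields: 'distance_km' (float) and 'day_of_week' (str).
    """
    return {
        "log_distance": math.log1p(max(raw["distance_km"], 0.0)),
        "is_weekend": 1.0 if raw["day_of_week"] in ("Sat", "Sun") else 0.0,
    }


def make_training_rows(records):
    """Training path: features come from preprocess(), label from 'fare'."""
    return [(preprocess(r), r["fare"]) for r in records]


def serve(model, request):
    """Serving path: reuses the same preprocess() before calling the model."""
    return model(preprocess(request))


def toy_model(features):
    """Stand-in for a trained model; the coefficients are made up."""
    return 2.5 + 1.2 * features["log_distance"] + 0.8 * features["is_weekend"]


if __name__ == "__main__":
    history = [{"distance_km": 3.0, "day_of_week": "Sat", "fare": 12.0}]
    print(make_training_rows(history))
    print(serve(toy_model, {"distance_km": 3.0, "day_of_week": "Sat"}))
```

Because both paths call the same `preprocess` function, a change to a feature definition cannot silently apply to training but not to serving.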

The remainder of the course will build on what students have learned about using Google Cloud for structured data and apply it to unstructured data and image classification. Students will work with convolutional neural networks, modify TensorFlow models to use convolutional layers, batch normalization, dropout, and transfer learning, and apply image-specific data augmentation techniques.

This course targets beginner data scientists and machine learning engineers who have some experience developing with Python, SQL, and the Linux shell. The course will be conducted on Google Cloud Platform. All students need is a reasonably powerful laptop running an up-to-date browser (preferably Chrome).

Course Summary

Purpose: 
Create and deploy high-performance data science and machine learning systems on Google Cloud for regression and classification use cases leveraging both structured and unstructured datasets.
Audience: 
Developers and developer teams looking to use Google Cloud in a data science context.
Skill Level: 
Learning Style: 

Hands-on training is customized, instructor-led training with an in-depth presentation of a technology and its concepts, featuring such topics as Java, OOAD, and Open Source.

Seminars are highly-focused, lecture-heavy, half-day to multi-day learning events. Seminars are a great way to create an awareness level of knowledge for a large number of concepts, in a short period of time. Think wide (breadth) and thin (depth).

Workshops are instructor-led lab-intensives focused on the practical application of technologies through the facilitation of a project-related lab. Workshops are just the opposite of Seminars. They deliver the highest level of knowledge transfer of any format. Think wide (breadth) and deep (depth).

Duration: 
3 Days
Productivity Objectives: 
  • Use TensorFlow to create regression and classification models
  • Deploy statistical and deep learning models to Google Cloud for training and serving
  • Apply feature engineering to structured and unstructured datasets
  • Optimize performance metrics of regression and classification models
  • Evaluate and use end-to-end data science and machine learning pipelines

What You'll Learn

In the Google Cloud for Data Scientists training course you’ll learn:

  • Google Cloud Basics
    • Why Google Cloud for Data Science
    • Managed Virtual Infrastructure vs. Serverless
  • Data Science with Google Cloud
    • Tensors as Data Structures
    • Machine Learning for Data Science
    • Regression vs. Classification Use Cases
    • Reproducibility in Data Science
  • Google Colaboratory (Colab)
    • Jupyter Notebooks on Google Cloud
    • Using Google Cloud Services from Colab
  • Regression with Structured Data
    • BigQuery for Structured Data Warehousing
    • Python, Pandas, and SQL for Data Preparation
    • Seaborn for Data Visualization
    • Reproducible Datasets with Hashing
    • BigQuery Structured Dataset for Regression
    • Regression Loss vs. Metric
    • Benchmark Loss and Metric for a Dataset
  • Model Training and Evaluation
    • Gradient Descent vs. Alternatives for Training
    • Netflix Prize Model Evaluation Case Study
    • Best Practices for Model Evaluation
  • TensorFlow
    • TensorFlow Models and Frameworks
    • Distributed Training Support
    • Core Python API
    • Eager vs. Lazy Evaluation
  • Classification with Structured Data
    • Deep Neural Network Models
    • Activation Functions
    • TensorFlow Playground
  • TensorFlow Estimator API
    • Use Cases for TensorFlow Keras and Estimator APIs
    • Regression and Classification with TensorFlow Estimator API
    • Processing Sharded Datasets with TensorFlow Data
    • Fault-Tolerant Distributed Training
    • TensorBoard for Monitoring and Analysis
  • Cloud Storage (GCS)
    • Object Storage and Buckets
    • Integration with GCP
    • Web-based and Command Line Interfaces
  • Cloud Machine Learning Engine (MLE)
    • End-to-end Machine Learning Pipeline with Cloud MLE
    • Compute and Parameter Node Capacity
    • Distributed Model Training
    • Autoscalable/elastic Model Serving
  • Feature Engineering
    • Five Criteria for Effective Features
    • Case Studies and Best Practices
    • Feature Crosses, Quantization, One-hot Encoding with TensorFlow
    • Feature Pre-processing and Engineering in a Machine Learning Pipeline
    • Features for Wide-and-Deep Machine Learning Models
    • Cloud Pub/Sub for Streaming Data
    • BigQuery and Dataflow for Feature Engineering
    • Data Pipelines and Feature Engineering
  • Classification with Unstructured Image Data
    • Fashion-MNIST and Flowers Image Datasets
    • Cross-Entropy Loss and Precision, Recall, ROC, AUC Metrics
    • Google Machine Learning APIs
    • Cloud AutoML Vision for Benchmark Image Classification Models
    • Deep Neural Networks for Image Classification
    • Convolutional Neural Networks for Image Classification
    • Convolutional and Max-pooling Layers
  • Training Convolutional Neural Networks
    • TensorFlow Image API
    • L1, L2, and Dropout Regularization
    • Batch Normalization
    • Data Augmentation
    • Transfer Learning
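
As an illustration of the "Reproducible Datasets with Hashing" topic in the outline above, the sketch below assigns each record to a training or evaluation split by hashing a stable key, so the same record lands in the same split on every run. It is a minimal Python example under stated assumptions, not the course's own method: the choice of MD5, the 80/20 bucket ratio, the date-string keys, and the function names are all hypothetical.

```python
import hashlib


def bucket(key, num_buckets=10):
    """Map a key to a stable bucket in [0, num_buckets) via an MD5 digest."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def assign_split(key, train_buckets=8, num_buckets=10):
    """Send roughly train_buckets/num_buckets of the keys to 'train'."""
    return "train" if bucket(key, num_buckets) < train_buckets else "eval"


if __name__ == "__main__":
    # Hypothetical keys: one per calendar day in the dataset.
    for day in ("2019-01-01", "2019-01-02", "2019-01-03"):
        print(day, assign_split(day))
```

Unlike a random shuffle, the assignment depends only on the key, so rerunning the preparation step or adding new rows does not move existing records between splits.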

Get Custom Training Quote

We'll work with you to design a custom Google Cloud for Data Scientists training program that meets your specific needs. A 100% guaranteed plan that works for you, your team, and your budget.

Chat with one of our Program Managers from our Boulder, Colorado office to discuss various training options.

DevelopIntelligence has been in the technical/software development learning and training industry for nearly 20 years. We’ve provided learning solutions to more than 48,000 engineers, across 220 organizations worldwide.

Need help finding the right learning solution?   Call us: 877-629-5631