
OpenCL Training / CUDA Training

Learn to program with OpenCL or CUDA, on-site or online, from Dan Connors, recipient of the NVIDIA Faculty Fellowship Award.

Overview of High-Performance GPU Architectures

Purpose: Learn the approaches and practices for using GPU architectures in high-performance computing

Audience: Group leaders and program managers looking to direct high-performance computing projects

Duration: 1 Day

Summary:
The Overview of High-Performance GPU Architectures training course is the starting point for project leaders using GPU technology. The course focuses on the key concepts, technologies, and practices involved in building a high-performance application for GPUs. It begins with a review of parallel programming techniques such as OpenMP and pthreads, and then covers the parallel programming technologies found in modern multicore and GPU systems.

Topics:
  • Overview of GPU architectures
  • Examination of OpenCL concepts
  • Creating and Programming CUDA applications
  • Multicore Architectures
  • Parallel Programming Concepts: OpenMP and Pthreads
  • Memory Bandwidth

Introduction to High-Performance GPU Architectures

Purpose: Introduce programming techniques required to develop general purpose software applications for GPU hardware

Audience: Software developers who need to implement high-performance applications (for example, in numerical computing areas such as finance and engineering)

Duration: 3 Day hands-on Workshop

Summary:
The goal of this course is to introduce the programming techniques required to develop general purpose software applications for GPU hardware. The course examines the programming models of both OpenCL and NVIDIA's CUDA development framework. Participants will learn how GPU hardware architectures differ from traditional CPU architectures and how the programming environment (development, debugging, and validation) changes as a result.

Topics:
  • OpenCL/CUDA Programming Model
      • Stream computing and SIMD platforms
      • Threads and thread hierarchy
      • Memory hierarchy
      • Synchronisation
      • Host and device interactions
  • GPU Device Architecture
      • Streaming multiprocessors and scalar processors
      • On-chip memory: registers and local shared memory
      • Execution model: warps, scheduling and divergence
      • Device memory and latency
  • Performance Tuning and Optimization
      • Instruction performance
      • Memory access patterns
      • Global memory coalescence
      • Local memory bank conflicts
      • Optimization strategies

Advanced Programming of High-Performance GPU Architectures

Purpose: Examine advanced programming techniques in OpenCL and CUDA programming for GPU hardware

Audience: Experienced programmers wanting to take a leadership role as a GPU project architect

Duration: 3 Day hands-on Workshop

Summary:
The course will provide experienced students with advanced knowledge and hands-on experience in developing and analyzing high-performance application software for processors with massively parallel computing resources (graphics processing units and multicore processors). By the end of the training, participants will understand algorithm styles that are suitable for accelerators, understand the most important architectural performance considerations in developing applications, be exposed to computational thinking skills for accelerating applications in science and engineering, and gain the ability to apply computing accelerators to science and engineering breakthroughs.

Topics:
  • Synchronization
  • Heterogeneous Parallel Programming
  • OpenCL Programming Model
  • CUDA Programming Model
  • CUDA Application Case Study Code Examples
  • NVIDIA Product/Processor Overview
  • CUDA Optimization Techniques
  • GPU Optimization
  • Trends in GPU Architectures

Meet Dan Connors

Dr. Dan Connors is a veteran of the high-performance microprocessor and scientific computing field. He received his Ph.D. in Computer Engineering from the University of Illinois at Urbana-Champaign in 2000. As a professor at the University of Colorado in the Department of Computer Science and Electrical Engineering, Dr. Connors investigates parallel programming models, compiler optimization, fault tolerance, and the design of multicore architectures.

For his commitment to teaching, Dr. Connors was awarded the University of Colorado College of Engineering Peebles Outstanding Teaching Award and the University of Colorado College of Engineering Sullivan-Carlson Teaching Innovation Award in 2008.

Dan Connors serves as a leading industry consultant for the development of education infrastructure for high-performance systems and parallel programming initiatives for emerging applications. He was an early adopter of NVIDIA’s CUDA programming model and teaches courses in parallel programming. He has developed industry courses for Intel Corporation, Hewlett-Packard, Lockheed Martin, and Northrop Grumman. For his work in CUDA development, Dr. Connors received an NVIDIA Faculty Fellowship Award. Finally, Dr. Connors has authored and presented numerous research papers on topics including compiler technology, fault tolerance, operating system design, and parallel programming.



Hands-On Format
Customized, in-depth, instructor-led lecture and lab training.

We'll Come To You
Here are just a few recent training locations: Austin, Baltimore, Boston, Boulder, Calgary, Charlotte, Chicago, Columbus, Dallas, Denver, Detroit, Edmonton, Houston, Indianapolis, Jacksonville, Las Vegas, Los Angeles, Louisville, Memphis, Milwaukee, Montreal, Nashville, New York, Ottawa, Philadelphia, Phoenix, Portland, San Antonio, San Diego, San Francisco, San Jose, Seattle, Toronto, Vancouver, Washington DC

110% Guarantee
If you aren't satisfied with our training, we'll refund your money and give you $750.

Customized for You
We customize every solution to meet your specific learning needs.