
OpenCL Training / CUDA Training
Learn how to program using OpenCL / CUDA on-site or on-line from Dan Connors, recipient of the NVIDIA Faculty Fellowship Award. View Dan's bio.
Purpose: Learn the approaches and practices for using GPU architectures
in high-performance computing
Audience: Group leaders and program managers looking to direct high-performance computing projects
Duration: 1 Day
Summary:
The Overview of High-Performance GPU Architectures training course is the
starting point for project leaders using GPU technology. The course
focuses on the key concepts, technologies, and practices involved in building
a high-performance application for GPUs. It
begins with a review of parallel programming techniques, such as OpenMP and pthreads,
and then transitions to the parallel programming technologies found in
modern multicore and GPU systems.
Topics:
- Overview of GPU architectures
- Examination of OpenCL concepts
- Creating and Programming CUDA applications
- Multicore Architectures
- Parallel Programming Concepts: OpenMP and Pthreads
- Memory Bandwidth
Purpose: Introduce programming techniques required to develop general purpose software applications for GPU hardware
Audience: Software developers who need to implement high-performance applications, for example in numerical computing areas such as finance and engineering
Duration: 3 Day hands-on Workshop
Summary:
The goal of this course is to introduce the programming techniques
required to develop general purpose software applications for GPU
hardware. The course examines the programming models of both OpenCL
and NVIDIA's CUDA development framework. Participants will be able to
understand how GPU hardware architectures differ from traditional CPU
architectures and the changes in the programming environment
(development, debugging, and validation).
Topics:
- OpenCL/CUDA Programming Model
  - Stream computing and SIMD platforms
  - Threads and thread hierarchy
  - Memory hierarchy
  - Synchronization
  - Host and device interactions
- GPU Device Architecture
  - Streaming multiprocessors and scalar processors
  - On-chip memory: registers and local shared memory
  - Execution model: warps, scheduling and divergence
  - Device memory and latency
- Performance Tuning and Optimization
  - Instruction performance
  - Memory access patterns
  - Global memory coalescing
  - Local memory bank conflicts
  - Optimization strategies
Purpose: Examine advanced programming techniques in OpenCL and CUDA programming for GPU hardware
Audience: Experienced programmers who want to take a leadership role as a GPU project architect
Duration: 3 Day hands-on Workshop
Summary:
The course provides experienced students with advanced knowledge
and hands-on experience in developing and analyzing high-performance
application software for processors with massively parallel computing
resources (graphics processing units and multicore processors). By
the end of the training, participants will understand algorithm styles
that are suitable for accelerators, understand the most important
architectural performance considerations in developing applications,
be exposed to computational thinking skills for accelerating
applications in science and engineering, and gain the ability to use
computing accelerators in pursuit of science and engineering breakthroughs.
Topics:
- Synchronization
- Heterogeneous Parallel Programming
- OpenCL Programming Model
- CUDA Programming Model
- CUDA Application Case Study Code Examples
- NVIDIA Product/Processor Overview
- CUDA Optimization Techniques
- GPU Optimization
- Trends in GPU Architectures
Dr. Dan Connors is a veteran of the high performance microprocessor and scientific computing field.
He received his Ph.D. in Computer Engineering from the University of Illinois at Urbana-Champaign in 2000.
As a professor at the University of Colorado in the Department of Computer Science and Electrical Engineering,
Dr. Connors investigates parallel programming models, compiler optimization, fault tolerance, and design of multicore architectures.
For his commitment to teaching, Dr. Connors was awarded the University of Colorado College of Engineering Peebles Outstanding Teaching Award
and the University of Colorado College of Engineering Sullivan-Carlson Teaching Innovation Award in 2008.
Dan Connors serves as a leading industry consultant for the development of education infrastructure for high-performance systems
and parallel programming initiatives for emerging applications. He was an early adopter of NVIDIA’s CUDA programming model and
teaches courses in parallel programming. Dan Connors has developed industry courses for Intel Corporation, Hewlett-Packard, Lockheed Martin, and Northrop Grumman.
For his work in CUDA development, Dr. Connors received an NVIDIA Faculty Fellowship Award. Finally, Dr. Connors has authored and presented
numerous research papers on topics including compiler technology, fault tolerance, operating system design, and parallel programming.
Hands-On Format
Customized, in-depth, instructor-led lecture and lab training.
We'll Come To You
Here are just a few recent training locations:
Austin, Baltimore, Boston, Boulder, Calgary, Charlotte, Chicago, Columbus, Dallas, Denver, Detroit, Edmonton, Houston,
Indianapolis, Jacksonville, Las Vegas, Los Angeles, Louisville, Memphis, Milwaukee, Montreal, Nashville,
New York, Ottawa, Philadelphia, Phoenix, Portland, San Antonio, San Diego, San Francisco, San Jose,
Seattle, Toronto, Vancouver, Washington DC
110% Guarantee
If you aren't satisfied with our training, we'll refund your money and give you $750.