The Introduction to High-Performance GPU Architectures training course introduces the programming techniques required to develop general-purpose software applications for GPU hardware. The course examines the programming models of both OpenCL and NVIDIA’s CUDA development framework. Participants learn how GPU hardware architectures differ from traditional CPU architectures and how the programming environment (development, debugging, and validation) changes as a result.
The DevelopIntelligence remote lab environment uses NVIDIA hardware (NVIDIA GTX 480 and Tesla C2070) to illustrate CUDA/OpenCL concepts and to let training participants experiment with performance issues, debugging techniques, and code examples.
Course Summary
Hands-on training is customized, instructor-led training with an in-depth presentation of a technology and its concepts, featuring such topics as Java, OOAD, and Open Source.
What You'll Learn
In the Introduction to High-Performance GPU Architectures training course, you’ll learn:
- OpenCL/CUDA Programming Model
  - Stream computing and SIMD platforms
  - Threads and thread hierarchy
  - Memory hierarchy
  - Synchronisation
  - Host and device interactions
- GPU Device Architecture
  - Streaming multiprocessors and scalar processors
  - On-chip memory: registers and local shared memory
  - Execution model: warps, scheduling and divergence
  - Device memory and latency
- Performance Tuning and Optimization
  - Instruction performance
  - Memory access patterns
  - Global memory coalescence
  - Local memory bank conflicts
  - Optimization strategies
Meet Your Instructor
Dr. Dan Connors is a veteran of the high-performance microprocessor and scientific computing field. He received his Ph.D. in Computer Engineering from the University of Illinois at Urbana-Champaign in 2000. As a professor at the University of Colorado in the Department of Computer Science and Electrical Engineering, Dr. Connors investigates parallel programming models, compiler optimization, fault tolerance, and the design of multicore architectures.
For his commitment to teaching, Dr. Connors was...






