Introduction to High-Performance GPU Architectures

CUDA Programming on High-Performance GPU Architectures

The Introduction to High-Performance GPU Architectures training course introduces the programming techniques required to develop general-purpose software applications for GPU hardware. The course examines the programming models of both OpenCL and NVIDIA’s CUDA development framework. Participants learn how GPU hardware architectures differ from traditional CPU architectures and how the programming environment changes as a result (development, debugging, and validation).

The DevelopIntelligence remote lab environment uses NVIDIA hardware (GeForce GTX 480 and Tesla C2070) to illustrate CUDA/OpenCL concepts and to let training participants experimentally investigate performance issues, debugging techniques, and code examples.

Course Summary

Purpose: Introduce the CUDA programming, profiling, and debugging techniques required to develop general-purpose software applications for GPU hardware
Audience: Software developers who need to implement high-performance applications (for example, in numerical computing areas such as finance and engineering)
Skill Level: 
Learning Style: Hands On

Hands-on training is customized, instructor-led training with an in-depth presentation of a technology and its concepts, featuring such topics as Java, OOAD, and Open Source.

Duration: 3 Days
Introduction to High-Performance GPU Architectures is part of the OpenCL Training curriculum.

What You'll Learn

In the Introduction to High-Performance GPU Architectures training course you’ll learn:

  • OpenCL/CUDA Programming Model
      • Stream computing and SIMD platforms
      • Threads and thread hierarchy
      • Memory hierarchy
      • Synchronization
      • Host and device interactions
  • GPU Device Architecture
      • Streaming multiprocessors and scalar processors
      • On-chip memory: registers and local shared memory
      • Execution model: warps, scheduling and divergence
      • Device memory and latency
  • Performance Tuning and Optimization
      • Instruction performance
      • Memory access patterns
      • Global memory coalescing
      • Local memory bank conflicts
      • Optimization strategies
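
To make the bank-conflict topic above a little more concrete, here is a minimal host-side sketch in Python (not CUDA device code, and not course material). It models how a strided access pattern from one warp maps onto on-chip memory banks. The numbers are assumptions: 32 banks, 4-byte words, and 32 threads per warp, which matches Fermi-class parts such as the lab GPUs; other hardware may differ. The function name `conflict_degree` is hypothetical and exists only for this illustration.

```python
NUM_BANKS = 32    # assumption: 32 on-chip memory banks (Fermi-class hardware)
WORD_BYTES = 4    # assumption: consecutive 4-byte words map to consecutive banks
WARP_SIZE = 32    # assumption: 32 threads issue a memory request together


def bank_of(byte_address):
    """Return the bank that serves the 4-byte word containing byte_address."""
    return (byte_address // WORD_BYTES) % NUM_BANKS


def conflict_degree(stride_words, warp_size=WARP_SIZE):
    """Worst-case number of distinct words falling in one bank.

    Thread t is modeled as reading the word at index t * stride_words.
    Threads that read the *same* word are counted once, since that case
    is modeled here as a broadcast rather than a conflict.
    A result of 1 means conflict-free; n means an n-way conflict.
    """
    words_per_bank = {}
    for t in range(warp_size):
        word_index = t * stride_words
        bank = bank_of(word_index * WORD_BYTES)
        words_per_bank.setdefault(bank, set()).add(word_index)
    return max(len(words) for words in words_per_bank.values())


if __name__ == "__main__":
    for stride in (0, 1, 2, 3, 16, 32, 33):
        print(f"stride {stride:2d} words: {conflict_degree(stride)}-way")
```

Under this model, odd strides are conflict-free, while a stride equal to the bank count sends every thread to the same bank. That is the usual motivation for padding a 32-column tile to 33 columns, although real hardware should be checked with a profiler rather than trusted to this simplified model.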

Meet Your Instructor

Dan Connors

Dr. Dan Connors is a veteran of the high-performance microprocessor and scientific computing field. He received his Ph.D. in Computer Engineering from the University of Illinois at Urbana-Champaign in 2000. As a professor at the University of Colorado in the Department of Computer Science and Electrical Engineering, Dr. Connors investigates parallel programming models, compiler optimization, fault tolerance, and the design of multicore architectures.

For his commitment to teaching, Dr. Connors was...
