URL: https://www.coursera.org/learn/packt-gpu-programming-with-c-and-cuda

GPU Programming with C++ and CUDA | Coursera



Gain insight into a topic and learn the fundamentals.
Intermediate level

Recommended experience

1 week to complete
at 10 hours a week
Flexible schedule
Learn at your own pace

What you'll learn

  • Accelerate real-world tasks using GPU parallelism

  • Optimize performance with CUDA streams and custom C++ solutions

  • Create and share GPU libraries with Python integration

Details to know

Shareable certificate

Add to your LinkedIn profile

Recently updated!

March 2026

Assessments

10 assignments

Taught in English

There are 10 modules in this course

In this course, you'll master GPU programming using C++ and CUDA to significantly enhance your software's performance. By focusing on parallelism, you'll learn to leverage the full power of GPUs for high-performance computing applications.

You will acquire practical knowledge of managing GPU devices, optimizing GPU resource usage, and integrating GPU code with Python to build scalable and efficient applications. The course emphasizes real-world strategies for optimizing performance and building reusable libraries.

It combines fundamental theory with hands-on applications to help you solve complex performance challenges. You'll not only understand the core concepts but also implement them in real-world projects, such as creating libraries for Python integration.

Ideal for C++ developers with experience in basic programming concepts, this course takes you through advanced topics, from parallel algorithms to multi-GPU usage. A background in operating systems is recommended for tackling the more complex concepts.

Based on the book GPU Programming with C++ and CUDA by Paulo Motta.

In this section, we explore parallelism in software, its importance, and the differences between CPU and GPU architectures to build a foundation for GPU programming.

What's included

2 videos, 4 readings, 1 assignment

2 videos • Total 2 minutes
  • Introduction - Overview Video • 1 minute
  • Introduction to Parallel Programming - Overview Video • 1 minute
4 readings • Total 40 minutes
  • Introduction • 10 minutes
  • Why Is Parallelism Important? • 10 minutes
  • An Overview of GPU Architecture • 10 minutes
  • Memory Management and Access • 10 minutes
1 assignment • Total 10 minutes
  • Parallel Programming Fundamentals • 10 minutes

In this section, we configure a GPU environment using Docker, locate official Linux documentation, and install the CUDA toolkit on Ubuntu 20.04 or 22.04 for AI and machine learning workflows.

What's included

1 video, 3 readings, 1 assignment

1 video • Total 1 minute
  • Setting Up Your Development Environment - Overview Video • 1 minute
3 readings • Total 30 minutes
  • Introduction • 10 minutes
  • Docker at a Glance • 10 minutes
  • Readying Our Development Environment • 10 minutes
1 assignment • Total 10 minutes
  • Docker and CUDA Setup Fundamentals • 10 minutes

In this section, we introduce GPU programming fundamentals, including kernel execution, device inspection, and setting up a working environment for CUDA development.

What's included

1 video, 4 readings, 1 assignment

1 video • Total 1 minute
  • Hello CUDA - Overview Video • 1 minute
4 readings • Total 50 minutes
  • Introduction • 10 minutes
  • A First Running Program • 10 minutes
  • Consulting Devices • 10 minutes
  • VS Code Configuration • 20 minutes
1 assignment • Total 10 minutes
  • Introduction to CUDA and Dev Containers • 10 minutes

In this section, we explore SIMD execution, data movement, and parallel vector addition for GPU programming.

What's included

1 video, 5 readings, 1 assignment

1 video • Total 1 minute
  • Hello Again, but in Parallel - Overview Video • 1 minute
5 readings • Total 50 minutes
  • Introduction • 10 minutes
  • Not-So-Parallel Prime Number Verification • 10 minutes
  • A Kernel to Test for Prime Numbers • 10 minutes
  • How to Measure Execution Time on the GPU • 10 minutes
  • Vector Addition • 10 minutes
1 assignment • Total 10 minutes
  • Parallel Computing Fundamentals • 10 minutes

In this section, we explore GPU thread, block, and grid configurations, asynchronous data transfer, streams, events, and shared memory to optimize performance in parallel computing.

What's included

1 video, 5 readings, 1 assignment

1 video • Total 1 minute
  • A Closer Look into the World of GPUs - Overview Video • 1 minute
5 readings • Total 70 minutes
  • Introduction • 20 minutes
  • Putting It All Together • 10 minutes
  • Asynchronous Data Transfers • 10 minutes
  • Parallelizing with Streams • 20 minutes
  • Following the Events • 10 minutes
1 assignment • Total 10 minutes
  • Exploring GPU Architecture and Optimization Techniques • 10 minutes

In this section, we explore parallel algorithm design, focusing on matrix operations, reduction, and workload balancing for efficient GPU execution.

What's included

1 video, 8 readings, 1 assignment

1 video • Total 1 minute
  • Parallel Algorithms with CUDA - Overview Video • 1 minute
8 readings • Total 80 minutes
  • Introduction • 10 minutes
  • Understanding How to Spot and Exploit Parallelism • 10 minutes
  • Balancing the Workloads • 10 minutes
  • Computing Matrix Addition and Multiplication • 10 minutes
  • Calculating Numerical Integrals • 10 minutes
  • Reducing from Many • 10 minutes
  • Sorting Data • 10 minutes
  • Processing Sensor Data with a Convolution • 10 minutes
1 assignment • Total 10 minutes
  • Data Management and Parallel Execution • 10 minutes

In this section, we explore GPU optimization and profiling with NVIDIA Nsight Compute.

What's included

1 video, 5 readings, 1 assignment

1 video • Total 1 minute
  • Performance Strategies - Overview Video • 1 minute
5 readings • Total 100 minutes
  • Introduction • 10 minutes
  • Profiling with NVIDIA Nsight Compute • 20 minutes
  • Optimizing to Speed Up Our Code • 20 minutes
  • Using the Release Configuration • 20 minutes
  • Using Loop Unroll for Further Improvements • 30 minutes
1 assignment • Total 10 minutes
  • Performance Strategies in Software Development • 10 minutes

In this section, we explore debugging CUDA code with VS Code, using CUDA streams to overlap memory and kernel operations, and configuring multiple GPUs for parallel processing.

What's included

1 video, 4 readings, 1 assignment

1 video • Total 1 minute
  • Overlaying Multiple Operations - Overview Video • 1 minute
4 readings • Total 80 minutes
  • Introduction • 20 minutes
  • Using CUDA Streams to Overlay Operations • 10 minutes
  • Measuring Our Limits • 30 minutes
  • Running Multiple GPUs Together • 20 minutes
1 assignment • Total 10 minutes
  • Multi-GPU and CUDA Stream Fundamentals • 10 minutes

In this section, we explore methods to integrate C++ GPU code with Python, focusing on ctypes, custom wrappers, and performance analysis for efficient cross-language execution.

What's included

1 video, 4 readings, 1 assignment

1 video • Total 1 minute
  • Exposing Your Code to Python - Overview Video • 1 minute
4 readings • Total 80 minutes
  • Introduction • 20 minutes
  • Creating the C++ Library • 20 minutes
  • Using Ctypes • 20 minutes
  • Passing NumPy Arrays to Your Library • 20 minutes
1 assignment • Total 10 minutes
  • Integrating C++ with Python for High-Performance Computing • 10 minutes

In this section, we explore GPU development using cuBLAS and Thrust, optimize code for memory and thread efficiency, and test with GTest and Pytest to ensure reliability and performance.

What's included

1 video, 5 readings, 1 assignment

1 video • Total 1 minute
  • Exploring Existing GPU Models - Overview Video • 1 minute
5 readings • Total 50 minutes
  • Exploring Existing GPU Models • 10 minutes
  • Using Thrust to Write GPU Code • 10 minutes
  • Moving Sequential Code to the GPU • 10 minutes
  • Testing Your Code with GTest and Pytest • 10 minutes
  • Using Pytest with Our Code • 10 minutes
1 assignment • Total 10 minutes
  • GPU Programming Fundamentals • 10 minutes

Instructor

Packt
1,926 Courses • 560,010 learners

Frequently asked questions

Yes, you can preview the first video and view the syllabus before you enroll. You must purchase the course to access content not included in the preview.

If you decide to enroll in the course before the session start date, you will have access to all of the lecture videos and readings for the course. You'll be able to submit assignments once the session starts.

Once you enroll and your session begins, you will have access to all videos and other resources, including reading items and the course discussion forum. You'll be able to view and submit practice assessments, and complete required graded assignments to earn a grade and a Course Certificate.

If you complete the course successfully, your electronic Course Certificate will be added to your Accomplishments page - from there, you can print your Course Certificate or add it to your LinkedIn profile.

This course is currently available only to learners who have paid or received financial aid, when available.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you'll find a link to apply on the description page.
