The Julia programming language is increasingly adopted in High Performance Computing (HPC) because it combines performance with simplicity and interactivity, enabling unprecedented productivity in HPC development. This course covers both basic and advanced topics relevant to single- and multi-GPU computing with Julia. It focuses on the CUDA.jl package, which enables writing native Julia code for GPUs. Topics covered include the following:
- GPU array programming;
- GPU kernel programming;
- kernel launch parameters;
- usage of on-chip memory;
- multi-GPU computing;
- code reflection and introspection; and
- diverse advanced optimization techniques.
This course combines lectures and hands-on sessions.
This course is aimed at scientists interested in developing HPC applications with Julia. Previous knowledge of Julia or GPU computing is advantageous but not mandatory; a good general understanding of programming is expected.
- Dr. Tim Besard (Creator and lead developer of CUDA.jl, JuliaHub Inc.)
- Dr. Samuel Omlin (Computational Scientist | Responsible for Julia computing, CSCS)
This git repository contains the material for part 2 of the course (speaker: Dr. Samuel Omlin, CSCS). The material for part 1 is found in a separate git repository (speaker: Dr. Tim Besard, JuliaHub Inc.).
The edited course recordings are available here (part 1) and here (part 2). The following lists give key entry points into the videos.
Part 1
00:00: Introduction to the course
04:59: Presentation of notebook 1-0: Introduction
24:19: Presentation of notebook 1-1: Array programming
43:18: Presentation of notebook 1-2: Application analysis and optimization
1:33:22: Presentation of notebook 1-3: Kernel programming
2:25:23: Presentation of notebook 1-4: Kernel analysis and optimization
3:19:16: Presentation of notebook 2-1: CUDA libraries
3:41:08: Presentation of notebook 2-2: Memory management
4:03:44: Presentation of notebook 2-3: Concurrent computing
Part 2
00:51: High-speed introduction/thoughts on GPU supercomputing
08:38: Overview of the course notebooks of part 1
11:08: Presentation of notebook 1: Memory copy and performance evaluation
43:59: Walk-through of the solutions to notebook 2: Application performance evaluation and optimization
58:29: Presentation on sustainable HPC building block development in Julia
1:27:56: Walk-through of the solutions to notebook 3: Using shared memory
1:37:35: Walk-through of the solutions to notebook 4: Steering registers and using warp-level functions
1:57:02: Walk-through of the solutions to notebook 5: Distributed parallelization