
Why is the block size in gpu-latency 64? #10

Open
yjl0101 opened this issue Aug 29, 2024 · 2 comments

Comments

yjl0101 commented Aug 29, 2024

Hi, I noticed that the block size in gpu-latency is 64 rather than 32 (the warp size) or a single thread. Is there a particular reason for choosing 64? Looking forward to your reply :)

te42kyfo (Owner) commented

A single thread would work just the same. Using a full warp just feels better, and a full warp is 64 threads on CDNA hardware. On NVIDIA, this actually runs two warps instead of just one, but from what I remember, the interference is low.

So no particularly strong reasoning.
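
For reference, a minimal sketch of such a latency measurement, assuming a pointer-chasing kernel launched as a single 64-thread block (one full CDNA wavefront, or two NVIDIA warps). The kernel name, chain layout, and sizes here are illustrative and not taken from the actual gpu-latency source:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical pointer-chasing latency kernel: each load depends on
// the previous one, so the memory latency cannot be hidden.
__global__ void latencyChase(const unsigned* __restrict__ next,
                             int iters, unsigned* out) {
    unsigned idx = threadIdx.x;   // all 64 threads chase in lockstep
    for (int i = 0; i < iters; ++i)
        idx = next[idx];          // serially dependent load
    out[threadIdx.x] = idx;       // keep the loop from being optimized away
}

int main() {
    const int n = 1 << 20;        // chain length (4 MiB of unsigned)
    const int iters = 1 << 20;    // dependent loads per thread

    unsigned* h = new unsigned[n];
    for (int i = 0; i < n; ++i)
        h[i] = (i + 64) % n;      // simple ring; real benchmarks randomize

    unsigned *dNext, *dOut;
    cudaMalloc(&dNext, n * sizeof(unsigned));
    cudaMalloc(&dOut, 64 * sizeof(unsigned));
    cudaMemcpy(dNext, h, n * sizeof(unsigned), cudaMemcpyHostToDevice);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // One block of 64 threads: one full CDNA wavefront, two NVIDIA warps.
    cudaEventRecord(start);
    latencyChase<<<1, 64>>>(dNext, iters, dOut);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("%.1f ns per dependent load\n", ms * 1e6 / iters);

    cudaFree(dNext);
    cudaFree(dOut);
    delete[] h;
    return 0;
}
```

With this launch configuration, all 64 threads walk the same dependent chain, so the time per iteration approximates the load-to-use latency of whichever cache level the working set lands in, regardless of whether the block maps to one wavefront or two warps.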

yjl0101 (Author) commented Aug 29, 2024

@te42kyfo Aha, 64 for AMD GPUs. Got it, thanks a lot!
