2026-02-02 16:01:42 UTC

Andrew Zonenberg on Nostr:

So, I have an answer to my previous question about GPU transfer efficiency.

Original code: write data to a staging buffer on the CPU, vkCmdCopyBuffer it to GPU-local memory, then run the int-to-float32 conversion on the GPU out of that buffer. The copy operation shows 50% SM occupancy by compute warps and 50% unallocated warp slots in active SMs.
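
For reference, the conversion step can be sketched on the CPU, which is handy for checking the GPU output against a known-good result. This is only an illustration: the post does not give the sample width or the exact math, so the 16-bit signed input, the linear scale/offset form, and the name convert_samples are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference for an int-to-float32 conversion.
// Assumptions (not from the post): samples are signed 16-bit integers and
// the conversion is linear, out = raw * scale + offset.
std::vector<float> convert_samples(const std::vector<std::int16_t>& raw,
                                   float scale,
                                   float offset)
{
    std::vector<float> out(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i)
    {
        // Widen to float first, then apply the gain and offset
        out[i] = static_cast<float>(raw[i]) * scale + offset;
    }
    return out;
}
```

Each output element depends on exactly one input element, so a GPU version would normally map one shader invocation to one sample.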

GPU memory write bandwidth is sitting around 2%, with about 1.9 ms of copy/shader run time.