2026-02-02 16:01:42 UTC

So, I have an answer to my previous question about GPU transfer efficiency.

Original code: write data to a staging buffer on the CPU, vkCmdCopyBuffer to GPU-local memory, then run the int-to-float32 conversion on the GPU out of that buffer. The copy operation shows 50% SM occupancy by compute warps and 50% unallocated warp slots in active SMs.
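
For anyone following along, here is a rough CPU-only model of that data flow. It is a sketch rather than the actual code: it makes no Vulkan calls, the function name `convertViaStaging` is made up for illustration, and it assumes `int16_t` input, which the post does not specify.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// CPU-only stand-in for the three stages described above; no Vulkan calls here.
// Assumption: the integer samples are int16_t (the original type is not stated).
std::vector<float> convertViaStaging(const std::vector<int16_t>& input)
{
    // Stage 1: the CPU writes into the staging buffer
    // (host-visible memory in the real code).
    std::vector<int16_t> staging(input);

    // Stage 2: copy staging -> "device-local" buffer
    // (a vkCmdCopyBuffer in the real code).
    std::vector<int16_t> deviceLocal(staging.size());
    if (!staging.empty())
        std::memcpy(deviceLocal.data(), staging.data(),
                    staging.size() * sizeof(int16_t));

    // Stage 3: int-to-float32 conversion, one element at a time
    // (a compute shader reading the device-local buffer in the real code).
    std::vector<float> output(deviceLocal.size());
    for (std::size_t i = 0; i < deviceLocal.size(); ++i)
        output[i] = static_cast<float>(deviceLocal[i]);

    return output;
}
```

The point of the model is only that the data gets touched three times (CPU write, copy, convert) before any float output exists.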

GPU memory write bandwidth utilization is sitting around 2%, with about 1.9 ms copy/shader run time.

Author Public Key

npub1cddglts94qutscms0qpmk87lel9m8xku7q0wr20u2th5fxvvunqqxz9vpd

Seen on

wss://relay.ditto.pub

Andrew Zonenberg on Nostr