Is fixing/pinning memory also page locking? #67016
-
I'm currently investigating a problem where I have to hand over page-locked memory to an async CUDA call. For that reason I have to pin the managed memory before passing it to the call. Since I'm writing a library that wraps the CUDA calls and hides the internals, I wrote a wrapper class that handles the inner workings and the `unsafe` code:

```csharp
unsafe void SetData<T>(in ReadOnlySpan<T> data) where T : unmanaged
{
    fixed (T* ptr = data)
    {
        cuMemcpy(devptr, ptr, sizeof(T) * data.Length);
    }
}
```

Now comes the tricky part with async CUDA invocation. Using async calls you create a stream object (something like a monitor) that "flows" from method to method, so I can execute multiple operations in different streams that run in parallel. For example, I copy data from the host to the device, execute a kernel on the data, and copy the data back from device to host (shortened, not working code):

```csharp
devptr.SetData(mem);
runKernel(devptr);
devptr.GetData();
```

Here everything runs synchronously, one call after the other. With streams the problem is that each method returns immediately, and the methods additionally require the memory to be page-locked. For that reason CUDA provides a function to page-lock memory, which takes a pinned memory handle/pointer.

So first I need to pin the memory, and then I also need to page-lock it. My first question is: does pinning also mean page-locking the memory?

The reason I'm asking is that I want to provide a safe way to pin and page-lock memory, so that for async calls the user of the library only has to make the following call:

```csharp
static unsafe IDisposable PageLock<T>(in ReadOnlySpan<T> data) where T : unmanaged
{
    // Here is the part I cannot do with Span<T>, since the lifetime of the fixed scope is limited.
    // If I really need it, I have to go with arrays, because only there I can say GCHandle.Alloc(data, Pinned),
    // which is a severe problem, since ReadOnlySpan<T> is the only common denominator: the data can come from
    // managed and unmanaged sources (managed memory, memory-mapped files, or managed CUDA memory).
    // This is what currently prevents me from doing this at all.
    fixed (T* ptr = data)
    {
        cuMemHostRegister(ptr);
        return // some disposable handle
    }
}
```

The final async call in the example above should then look like:

```csharp
float[] data = new float[100];
using var stream = new CudaStream();
using var lockedmem = CudaMem.Lock(mem);
devptr.SetData(lockedmem, stream);
runKernel(devptr, stream);
devptr.GetData(stream);
```

Are there any ideas?
I know there is `Memory<T>`, but it cannot be used in this case.
-
The short answer is that you can't; the long answer is that you can, but really shouldn't. Let that sample code serve as a caution rather than a recommendation (seriously).
-
No. Pinning is a concept of the GC and ensures that an object's virtual address does not change. Page-locking is a concept of the operating system and ensures that a page stays in physical memory.
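Since the two mechanisms are independent, an async CUDA copy from managed memory needs both steps: pin the array so the GC cannot move it, then register the pinned range with the driver so the OS keeps its pages resident. Here is a minimal sketch of such a disposable wrapper. The `cuMemHostRegister`/`cuMemHostUnregister` P/Invoke declarations are assumptions (exact entry points and error handling depend on your CUDA binding); the `GCHandle` part is standard .NET. Note this only works on arrays, not on an arbitrary `ReadOnlySpan<T>`, because `GCHandle.Alloc` needs an object to pin:

```csharp
using System;
using System.Runtime.InteropServices;

public sealed unsafe class PageLockedMemory<T> : IDisposable where T : unmanaged
{
    // Assumed driver-API bindings; verify names/signatures against your wrapper library.
    [DllImport("nvcuda", EntryPoint = "cuMemHostRegister_v2")]
    private static extern int cuMemHostRegister(IntPtr p, UIntPtr bytesize, uint flags);

    [DllImport("nvcuda")]
    private static extern int cuMemHostUnregister(IntPtr p);

    private GCHandle _handle;
    private readonly UIntPtr _bytes;

    public IntPtr Pointer { get; }

    public PageLockedMemory(T[] data)
    {
        // Step 1 (GC): pin the array so its virtual address cannot change.
        _handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        Pointer = _handle.AddrOfPinnedObject();
        _bytes = (UIntPtr)(data.Length * sizeof(T));

        // Step 2 (OS): page-lock the now-stable address range for async CUDA transfers.
        cuMemHostRegister(Pointer, _bytes, 0);
    }

    public void Dispose()
    {
        // Undo in reverse order: unregister with the driver, then release the pin.
        cuMemHostUnregister(Pointer);
        if (_handle.IsAllocated) _handle.Free();
    }
}
```

The reverse order in `Dispose` matters: freeing the `GCHandle` first would let the GC move (or collect) the array while its pages are still registered with the driver.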