
Support Turing and Ampere for CUDA #220

Open

forrestjgq wants to merge 1 commit into base: develop

Conversation

forrestjgq

On Turing and Ampere devices, an "invalid device symbol" error is reported here:

    static __constant__ GpuData cData;
    static GpuData cpuData;

    void SetKernelsGpuData(GpuData* pData)
    {
        cudaError_t status;
        status = cudaMemcpyToSymbol(cData, pData, sizeof(GpuData));
        RTERROR(status, "SetKernelsGpuData copy to cData failed");  // <-- error reported here
        memcpy(&cpuData, pData, sizeof(GpuData));
    }
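
For context, this error typically means the binary contains no device code (cubin or forward-compatible PTX) for the running GPU, so the __constant__ symbol is never registered with the runtime. A minimal standalone sketch (hypothetical, not AutoDock-GPU code) that surfaces the same failure:

    // Hypothetical repro. Build with, e.g.,
    //   nvcc -gencode arch=compute_50,code=sm_50 repro.cu -o repro
    // and run on a Turing/Ampere GPU: with no cubin or compatible PTX for
    // the device, the __constant__ symbol is never registered and the copy
    // fails with cudaErrorInvalidSymbol ("invalid device symbol").
    #include <cstdio>
    #include <cuda_runtime.h>

    struct GpuData { int dummy; };
    static __constant__ GpuData cData;

    int main()
    {
        GpuData host = {42};
        cudaError_t status = cudaMemcpyToSymbol(cData, &host, sizeof(GpuData));
        if (status != cudaSuccess) {
            printf("copy to cData failed: %s\n", cudaGetErrorString(status));
            return 1;
        }
        printf("copy to cData succeeded\n");
        return 0;
    }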

@syntesys87

Tried on a system with an RTX 4070 Ti and CUDA 11.8. Working!

@TitouanCh

Works on an RTX A4500.

@atillack
Member

atillack commented Jul 23, 2024

@forrestjgq Thank you for submitting your PR. While we won't merge it as is, since adding to the static TARGETS list (and that list existing in the first place) is not the best choice, it did inspire me to solve this particular pain point more generally.

To that end, I added PR #270, which by default compiles CUDA code for every supported target of the installed CUDA version above compute capability 50 (lower compute capabilities carry a deprecation warning). In other words, CUDA 11 will compile up to compute capability 86, while CUDA 12 will go to compute capability 90 (and beyond, if/when Nvidia adds more).

As with the current code, TARGETS can be used to override this at compile time, e.g. make DEVICE=GPU TARGETS=86 will only produce code optimized for compute capability 86.
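
To find the right TARGETS value for a given card, the compute capability can be queried at runtime; a minimal sketch (the output format mirroring TARGETS is illustrative):

    // Print the compute capability of device 0 in the two-digit form used
    // by TARGETS (e.g. "86" on an Ampere RTX 30-series card).
    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
            printf("no CUDA device found\n");
            return 1;
        }
        printf("%d%d\n", prop.major, prop.minor);
        return 0;
    }

Recent driver versions also expose this directly, via something like nvidia-smi --query-gpu=compute_cap --format=csv.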
