Updating the order of conditional checks to load library files in the Cuda component #268
Pull Request Description
This PR updates the ordering of conditional checks in the Cuda component, specifically in the functions `load_cudart_sym`, `load_cupti_common_sym`, and `load_nvpw_sym`.
Current master has the following workflow of checks:
1. Check whether `PAPI_CUDA_RUNTIME` / `PAPI_CUDA_CUPTI` / `PAPI_CUDA_PERFWORKS` are set; if so, search that path for the necessary library files.
2. Use `dl_iterate_phdr()` to walk through the list of shared objects linked in the executable.
3. Check whether `PAPI_CUDA_ROOT` is set and the previous two steps failed; if so, search the path that `PAPI_CUDA_ROOT` is set to for the necessary library files.
4. Fall back to `dlopen`.
This PR updates the workflow of checks to:
1. Check whether `PAPI_CUDA_RUNTIME` / `PAPI_CUDA_CUPTI` / `PAPI_CUDA_PERFWORKS` are set; if so, search that path for the necessary library files.
2. Check whether `PAPI_CUDA_ROOT` is set and the previous step failed; if so, search the path that `PAPI_CUDA_ROOT` is set to for the necessary library files.
3. Use `dl_iterate_phdr()` to walk through the list of shared objects linked in the executable.
4. Fall back to `dlopen`.
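The updated search order can be sketched as a small, self-contained C helper that only builds the ordered list of candidates. This is an illustration, not the component's actual code: `build_search_order`, `struct candidate`, the `search_step` enum, and the `lib64/` subdirectory under `PAPI_CUDA_ROOT` are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical labels for the four steps, in the order proposed by this PR. */
enum search_step {
    STEP_OVERRIDE = 1,  /* PAPI_CUDA_RUNTIME / PAPI_CUDA_CUPTI / PAPI_CUDA_PERFWORKS */
    STEP_CUDA_ROOT,     /* PAPI_CUDA_ROOT */
    STEP_PHDR,          /* dl_iterate_phdr() over already linked shared objects */
    STEP_DLOPEN         /* plain dlopen() on the bare library name */
};

struct candidate {
    enum search_step step;
    char path[512];     /* empty for STEP_PHDR, which has no fixed path */
};

/* Write dir + sub + lib into c->path; return 0 if it would not fit. */
static int set_path(struct candidate *c, enum search_step step,
                    const char *dir, const char *sub, const char *lib)
{
    int n = snprintf(c->path, sizeof c->path, "%s%s%s", dir, sub, lib);
    if (n < 0 || (size_t)n >= sizeof c->path)
        return 0;
    c->step = step;
    return 1;
}

/* Fill `out` with the candidates to try, first to last.
 * override_dir and root_dir are the values of the corresponding
 * environment variables, or NULL when unset. Returns the count. */
size_t build_search_order(const char *override_dir, const char *root_dir,
                          const char *lib, struct candidate out[4])
{
    size_t n = 0;

    /* 1. Explicit per-library override takes precedence. */
    if (override_dir != NULL &&
        set_path(&out[n], STEP_OVERRIDE, override_dir, "/", lib))
        n++;

    /* 2. PAPI_CUDA_ROOT is now second (the lib64/ layout is an assumption). */
    if (root_dir != NULL &&
        set_path(&out[n], STEP_CUDA_ROOT, root_dir, "/lib64/", lib))
        n++;

    /* 3. Shared objects already linked into the executable. */
    out[n].step = STEP_PHDR;
    out[n].path[0] = '\0';
    n++;

    /* 4. Last resort: let dlopen() resolve the bare name. */
    if (set_path(&out[n], STEP_DLOPEN, "", "", lib))
        n++;

    return n;
}
```

A caller would walk the returned array in order and stop at the first candidate that loads, using `dlopen` on the path for steps 1, 2, and 4 and a `dl_iterate_phdr()` callback for step 3.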
This is being done because the documentation states that `PAPI_CUDA_ROOT` is needed for both runtime and compilation.
Tested on Leconte with eight Tesla V100s, with the following results:
- If `PAPI_CUDA_RUNTIME` / `PAPI_CUDA_CUPTI` / `PAPI_CUDA_PERFWORKS` are set, these do indeed take precedence, as expected.
- With `PAPI_CUDA_ROOT` set, it is successfully used as the second option to be searched.
Author Checklist
Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
Commits are self contained and only do one thing
Commits have a header of the form: `module: short description`
Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
The PR needs to pass all the tests