-
Notifications
You must be signed in to change notification settings - Fork 223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalid OpenCL version and non-cpp source language lead to crash while generating device-dependent OpenCL binaries from a valid SPIR-V input #2515
Comments
Thanks for opening this issue. I think this documentation in SPIR-V spec needs to be considered in this discussion. It says: Thanks |
Thank you @asudarsa , a good point! For me it sounds like an argument that a difference just in OpSource parameters (or absence of OpSource) can't be tolerated as a reason of crash of device-dependent binaries building. In other words, Translator has to account for the attached reproducer in a way that guarantees successful building of the reproducer with all mentioned in the description options for OpSource arguments: without a version (or without OpSource itself), with a version but C language instead of CPP, and with a version and CPP source language. |
Thanks for the report. Lets go step by step.
While I was writing the comment Slava has posted one more, so:
Not a crash, but unresolved symbol after supposed linking with builtin library, which is failure in device's compiler, and following that:
How? We don't know the target device and with a further extend its compiler and this compiler's implementation details (dependency on opencl.version metadata is an implementation detail). I agree, that if we generate opencl.version metadata, then it's value should be not 0.0, but something more valid. What I'm not sure about is that if we must to generate it and what would be the default value. Probably better examination of https://registry.khronos.org/SPIR/specs/spir_spec-2.0.pdf is required. @svenvh may be you have some ideas? |
I wonder if we should avoid relying on metadata at all, and instead tell the translator through a command line option what to do? See also #792. |
Definitely! It should be indeed looking like OpenCL's |
One quick pointer here. 80dfd86 does not rely on OpSource. It relies on metadata provided in LLVM IR. I think we should avoid relying on OpSource during reverse translation. But we can rely on LLVM metadata for forward translation. If we stop relying on metadata during forward translation, we have to revert/fix the entire OCLToSPIRV flow which is predicated by checking if source is OpenCL C. |
+1 for using -cl-std like option instead of metadata. |
Yes, I'd also vote for this option. |
|
At the moment, generation of device-dependent OpenCL program binary from a valid SPIR-V input crashes in two slightly different, but related cases:
ocl-ver-crash.ll
:After creating SPIR-V by
llvm-as ocl-ver-crash.ll -o - | llvm-spirv --spirv-ext=+SPV_EXT_shader_atomic_float_min_max -o ocl-ver-crash.spv
and running AOT byopencl-aot ocl-ver-crash.spv --device=cpu --cmd=build
we observe a crash of the build followed by this confusing diagnostics:Let's look into LLVM IR that is produced from the problematic SPIRV (
llvm-spirv ocl-ver-crash.spv -r -o - | llvm-dis -o ocl-ver-crash-to-run.ll
):There is a correct declaration and call to
@_Z10atomic_maxPU3AS4Vff
, however, generation of the device-dependent OpenCL program binary fails due to the wrong OpenCL version:As a side note, this invalid version would not be a problem for AOT/JIT when using the atomic buildin with global address space, but for generic address space evidently there are requirements to the version.
We can see from
ocl-ver-crash.spv
contents why Translator creates a wrong OpenCL version:Such an input along with a logic of SPIRVToLLVM::transSourceLanguage():
SPIRV-LLVM-Translator/lib/SPIRV/SPIRVReader.cpp
Line 4823 in e1f7ebe
lead to 0.0 OpenCL version.
This is the first part of the issue. It has a clear path forward how to fix it, by not allowing producing invalid OpenCL version in SPIRVToLLVM::transSourceLanguage().
ocl-ver-crash-ver1.0.spv
:Translator generates a valid OpenCL 1.0 version this time, and refers to
OpenCL_C
source language (that is3
in!1 = !{i32 3, i32 100000}
):Running
opencl-aot ocl-ver-crash-ver1.0.spv --device=cpu --cmd=build
we get the same diagnostic, as before, no changes to better:Let's get back to SPIRV input and set OpSource using CPP rather than C source language, and create a new SPIRV input
ocl-ver-crash-ver1.0CPP.spv
:Output LLVM IR after
llvm-spirv ocl-ver-crash-ver1.0CPP.spv -r -o - | llvm-dis -o ocl-ver-crash-ver1.0CPP.ll
is:The only change is a reference to CPP language (value
4
):Let's try the new version of SPIRV input (
ocl-ver-crash-ver1.0CPP.spv
). Nowopencl-aot ocl-ver-crash-ver1.0CPP.spv --device=cpu --cmd=build
executes successfully, the crash has gone.This chain of input SPIRV transformation shows a dependency of the issue on C/C++ source language, independent from OpenCL version issue, suggests C on mangling vs. C++ mangling, and it also may (or may not) be related to #2513.
Note, that there is no user-defined mangling neither in the input LLVM IR, where builtin is referred to as
@_Z21__spirv_AtomicFMaxEXT
, not in generated SPIR-V code, where it's transformed toOpAtomicFMaxEXT
. This points out to the second issue in Translator that needs further investigation and maybe further discussion.FYI @MrSidims @svenvh @asudarsa
The text was updated successfully, but these errors were encountered: