Cuda device functions in header files not compiled correctly #276

Closed
grueyg opened this issue Sep 24, 2024 · 2 comments

Comments

@grueyg

grueyg commented Sep 24, 2024

Hello, @cloudhan.

Thank you for your contributions to rules_cuda.

I am trying to link a CUDA library into my project, but the build fails because some of the library's header files use CUDA device functions (e.g., __umul64hi). GCC does not recognize these functions when it compiles the headers.

As I am not very familiar with Bazel and rules_cuda, I couldn't find a way to instruct Bazel to use nvcc for compiling these headers. I am unsure if I have misunderstood the cause of the errors.

Here is a minimal example that reproduces the issue:

main.cc:

#include <iostream>
#include "cuda_example.cuh"

int main() {
    std::cout << "Calling CUDA static library function..." << std::endl;

    const int n = 5; 
    uint64_t operand1[n] = {123456789012345, 987654321098765, 112233445566778, 998877665544332, 123456789012345};
    uint64_t operand2[n] = {543210987654321, 112233445566778, 998877665544332, 123456789012345, 987654321098765};
    uint64_t result[n] = {0}; 

    test_umul64hi(operand1, operand2, result, n);

    std::cout << "Results of __umul64hi computation:" << std::endl;
    for (int i = 0; i < n; ++i) {
        std::cout << "result[" << i << "] = " << result[i] << std::endl;
    }

    return 0;
}

BUILD file for main:

cc_binary(
    name = "main",
    srcs = ["main.cc"],
    deps = [
        "//third_party/cuda_example:cuda_example_lib",
    ],
)

cuda_example.cuh:

#include <cuda_runtime_api.h>
#include <stdint.h>

#ifndef CUDA_EXAMPLE_H
#define CUDA_EXAMPLE_H

extern "C" void test_umul64hi(uint64_t* operand1, uint64_t* operand2, uint64_t* result, int n);

__forceinline__ __device__ uint64_t multiply_high_uint64(uint64_t operand1, uint64_t operand2) {
    return __umul64hi(operand1, operand2);
}

__global__ void test_umul64hi_kernel(uint64_t* operand1, uint64_t* operand2, uint64_t* result, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        result[idx] = multiply_high_uint64(operand1[idx], operand2[idx]);
    }
}

#endif  // CUDA_EXAMPLE_H

BUILD file for cuda_example:

cc_library(
    name = "cuda_example_lib",
    hdrs = ["include/cuda_example.cuh"],
    srcs = ["lib/libcuda_example.a"],
    includes = ["include"], 
    linkstatic = True,
    visibility = ["//visibility:public"], 
    deps = [
        "@local_cuda//:cuda_runtime"
    ]
)

When I run the build command:

bazel build //src:main --@rules_cuda//cuda:enable=True

I encounter the following errors:

INFO: Analyzed target //src:main (0 packages loaded, 0 targets configured).
ERROR: /home/user/project/bazel-cuda-test/src/BUILD.bazel:1:10: Compiling src/main.cc failed: (Exit 1): 
gcc failed: error executing CppCompile command (from target //src:main) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++14' -MD -MF ... (remaining 32 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
In file included from src/main.cc:2:
third_party/cuda_example/include/cuda_example.cuh: In function 'uint64_t multiply_high_uint64(uint64_t, uint64_t)':
third_party/cuda_example/include/cuda_example.cuh:10:12: error: '__umul64hi' was not declared in this scope; did you mean 'test_umul64hi'?
   10 |     return __umul64hi(operand1, operand2);
      |            ^~~~~~~~~~
      |            test_umul64hi
third_party/cuda_example/include/cuda_example.cuh: In function 'void test_umul64hi_kernel(uint64_t*, uint64_t*, uint64_t*, int)':
third_party/cuda_example/include/cuda_example.cuh:14:15: error: 'blockIdx' was not declared in this scope
   14 |     int idx = blockIdx.x * blockDim.x + threadIdx.x;
      |               ^~~~~~~~
third_party/cuda_example/include/cuda_example.cuh:14:28: error: 'blockDim' was not declared in this scope
   14 |     int idx = blockIdx.x * blockDim.x + threadIdx.x;
      |                            ^~~~~~~~
third_party/cuda_example/include/cuda_example.cuh:14:41: error: 'threadIdx' was not declared in this scope
   14 |     int idx = blockIdx.x * blockDim.x + threadIdx.x;
      |                                         ^~~~~~~~~
Target //src:main failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.661s, Critical Path: 0.48s
INFO: 3 processes: 3 internal.
ERROR: Build did NOT complete successfully

I have found that the header files are being compiled by GCC rather than nvcc, which seems to be the root cause of the errors. One direct but inefficient workaround would be to wrap the device code in each header in #ifdef __CUDACC__ ... #endif, but my library contains a large number of such headers, making manual modifications impractical.
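For illustration, a minimal sketch of that guard workaround, assuming the header otherwise keeps its current contents. `__CUDACC__` is the macro nvcc defines while compiling CUDA source, so a host compiler such as GCC skips the guarded block and sees only the `extern "C"` declaration. The original `#include <cuda_runtime_api.h>` is left out so the sketch compiles without a CUDA toolkit, and `device_code_visible` is a hypothetical helper added only to make the effect of the guard observable.

```cpp
// cuda_example.cuh (sketch): device code is hidden from host-only compilers.
#ifndef CUDA_EXAMPLE_H
#define CUDA_EXAMPLE_H

#include <stdint.h>

// Host-callable entry point; visible to both nvcc and GCC.
extern "C" void test_umul64hi(uint64_t* operand1, uint64_t* operand2, uint64_t* result, int n);

#ifdef __CUDACC__  // defined by nvcc when it compiles CUDA source
__forceinline__ __device__ uint64_t multiply_high_uint64(uint64_t operand1, uint64_t operand2) {
    return __umul64hi(operand1, operand2);
}

__global__ void test_umul64hi_kernel(uint64_t* operand1, uint64_t* operand2, uint64_t* result, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        result[idx] = multiply_high_uint64(operand1[idx], operand2[idx]);
    }
}
#endif  // __CUDACC__

// Hypothetical helper (not in the original header): reports whether the
// guarded device code was part of this translation unit.
inline bool device_code_visible() {
#ifdef __CUDACC__
    return true;
#else
    return false;
#endif
}

#endif  // CUDA_EXAMPLE_H
```

With GCC, `device_code_visible()` returns false and the intrinsics are never parsed. The guard has to be repeated in every header that mixes host and device code, which is the scaling problem described above.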

Could you please provide guidance on how to resolve this issue? Is there a way to ensure that .cuh files are processed by nvcc when using Bazel? Or could the error have a different cause?

Thank you for your help!

@cloudhan
Collaborator

You have actually leaked CUDA-specific implementation details into a C++ translation unit.

In main.cc, you #include "cuda_example.cuh" which IMO is a very bad practice.
You should either

  • make it main.cu and compile it as a cuda_binary, or
  • move the cuda_example.cuh-related implementation into a cuda_library and only use its C/C++ interface.
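The second option can be sketched roughly as follows, assuming a hypothetical host-only header named `cuda_example_api.h` that declares just `test_umul64hi`, while the `__device__` and `__global__` code stays in files compiled by nvcc. `main.cc` then includes only this header, so GCC never sees a CUDA intrinsic. To keep the sketch runnable without a GPU, it includes a CPU stand-in for `test_umul64hi`. The stand-in is an assumption for illustration only, relies on the GCC/Clang `unsigned __int128` extension, and would be replaced by the real implementation from `libcuda_example.a`. `multiply_high_on_host` is likewise a hypothetical helper.

```cpp
// --- cuda_example_api.h (hypothetical host-only header) ---
// Declares only the C interface; no __device__ or __global__ code here.
#include <stdint.h>

extern "C" void test_umul64hi(uint64_t* operand1, uint64_t* operand2, uint64_t* result, int n);

// --- CPU stand-in (illustration only) ---
// In the real project this symbol comes from the CUDA library; this version
// computes the same high 64 bits of each 128-bit product on the host.
extern "C" void test_umul64hi(uint64_t* operand1, uint64_t* operand2, uint64_t* result, int n) {
    for (int i = 0; i < n; ++i) {
        unsigned __int128 product =
            static_cast<unsigned __int128>(operand1[i]) * operand2[i];
        result[i] = static_cast<uint64_t>(product >> 64);
    }
}

// --- main.cc side ---
// Host code calls through the C interface and needs nothing from the .cuh.
inline uint64_t multiply_high_on_host(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    test_umul64hi(&a, &b, &result, 1);
    return result;
}
```

The point of the split is that the C++ target depends only on a plain declaration, so it does not matter which compiler builds `main.cc`.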

@grueyg
Author

grueyg commented Sep 24, 2024

@cloudhan, sorry for my late reply. I followed your suggestions and successfully resolved the issue in my project. Both methods worked well.

Thanks again!

@grueyg grueyg closed this as completed Sep 24, 2024