Skip to content

Commit

Permalink
Merge pull request #228 from NVIDIA/ksimpson/update_nvjitlink_test
Browse files Browse the repository at this point in the history
update the nvjitlink bindings test
  • Loading branch information
leofang authored Nov 28, 2024
2 parents fdc76e8 + 30d65ed commit fd71ced
Show file tree
Hide file tree
Showing 6 changed files with 147 additions and 49 deletions.
10 changes: 5 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# CUDA-Python
# cuda-python

CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It consists of multiple components:

Expand All @@ -7,21 +7,21 @@ CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It c
* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): Pythonic exposure of CUB cooperative algorithms
* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): Pythonic exposure of Thrust parallel algorithms

For access to NVIDIA Math Libraries, please refer to [nvmath-python](https://docs.nvidia.com/cuda/nvmath-python/latest).
For access to NVIDIA CPU & GPU Math Libraries, please refer to [nvmath-python](https://docs.nvidia.com/cuda/nvmath-python/latest).

CUDA Python is currently undergoing an overhaul to improve existing and bring up new components. All of the previously available functionalities from the cuda-python package will continue to be available, please refer to the [cuda.bindings](https://nvidia.github.io/cuda-python/cuda-bindings/latest) documentation for installation guide and further detail.

## CUDA-Python as a metapackage
## cuda-python as a metapackage

CUDA-Python is structured to become a metapackage that contains a collection of subpackages. Each subpackage is versioned independently, allowing installation of each component as needed.
`cuda-python` is being re-structured to become a metapackage that contains a collection of subpackages. Each subpackage is versioned independently, allowing installation of each component as needed.

### Subpackage: `cuda.core`

The `cuda.core` package offers idiomatic, pythonic access to CUDA Runtime and other functionalities.

The goals are to

1. Provide **idiomatic (pythonic)** access to CUDA Driver/Runtime
1. Provide **idiomatic ("pythonic")** access to CUDA Driver, Runtime, and JIT compiler toolchain
2. Focus on **developer productivity** by ensuring end-to-end CUDA development can be performed quickly and entirely in Python
3. **Avoid homegrown** Python abstractions for CUDA for new Python GPU libraries starting from scratch
4. **Ease** developer **burden of maintaining** and catching up with latest CUDA features
Expand Down
32 changes: 29 additions & 3 deletions cuda_bindings/README.md
Original file line number Diff line number Diff line change
@@ -1,13 +1,15 @@
# `cuda.bindings`: Low-level CUDA interfaces

CUDA Python is a standard set of low-level interfaces, providing full coverage of and access to the CUDA host APIs from Python. Checkout the [Overview](https://nvidia.github.io/cuda-python/cuda-bindings/latest/overview.html) for the workflow and performance results.
`cuda.bindings` is a standard set of low-level interfaces, providing full coverage of and access to the CUDA host APIs from Python. Checkout the [Overview](https://nvidia.github.io/cuda-python/cuda-bindings/latest/overview.html) for the workflow and performance results.

`cuda.bindings` is a subpackage of `cuda-python`.

## Installing

CUDA Python can be installed from:

* PYPI
* Conda (nvidia channel)
* PyPI
* Conda (conda-forge/nvidia channels)
* Source builds

Differences between these options are described in [Installation](https://nvidia.github.io/cuda-python/cuda-bindings/latest/install.html) documentation. Each package guarantees minor version compatibility.
Expand All @@ -31,6 +33,30 @@ Source builds work for multiple Python versions, however pre-build PyPI and Cond

* Python 3.9 to 3.12

## Developing

We use `pre-commit` to manage various tools to help development and ensure consistency.
```shell
pip install pre-commit
```

### Code linting

Run this command before checking in the code changes
```shell
pre-commit run -a --show-diff-on-failure
```
to ensure the code formatting is in line of the requirements (as listed in [`pyproject.toml`](./pyproject.toml)).

### Code signing

This repository implements a security check to prevent the CI system from running untrusted code. A part of the
security check consists of checking if the git commits are signed. See
[here](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/faqs/#why-did-i-receive-a-comment-that-my-pull-request-requires-additional-validation)
and
[here](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification)
for more details, including how to sign your commits.

## Testing

Latest dependencies can be found in [requirements.txt](https://github.com/NVIDIA/cuda-python/blob/main/cuda_bindings/requirements.txt).
Expand Down
112 changes: 76 additions & 36 deletions cuda_bindings/tests/test_nvjitlink.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,22 @@

import pytest

from cuda.bindings import nvjitlink
from cuda.bindings import nvjitlink, nvrtc

ptx_kernel = """
.version 8.5
.target sm_90
# Establish a handful of compatible architectures and PTX versions to test with
ARCHITECTURES = ["sm_60", "sm_75", "sm_80", "sm_90"]
PTX_VERSIONS = ["5.0", "6.4", "7.0", "8.5"]


def ptx_header(version, arch):
return f"""
.version {version}
.target {arch}
.address_size 64
"""


ptx_kernel = """
.visible .entry _Z6kernelPi(
.param .u64 _Z6kernelPi_param_0
)
Expand All @@ -28,18 +37,40 @@
"""

minimal_ptx_kernel = """
.version 8.5
.target sm_90
.address_size 64
.func _MinimalKernel()
{
ret;
}
"""

ptx_kernel_bytes = ptx_kernel.encode("utf-8")
minimal_ptx_kernel_bytes = minimal_ptx_kernel.encode("utf-8")
ptx_kernel_bytes = [
(ptx_header(version, arch) + ptx_kernel).encode("utf-8") for version, arch in zip(PTX_VERSIONS, ARCHITECTURES)
]
minimal_ptx_kernel_bytes = [
(ptx_header(version, arch) + minimal_ptx_kernel).encode("utf-8")
for version, arch in zip(PTX_VERSIONS, ARCHITECTURES)
]


# create a valid LTOIR input for testing
@pytest.fixture
def get_dummy_ltoir():
def CHECK_NVRTC(err):
if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
raise RuntimeError(f"Nvrtc Error: {err}")

empty_cplusplus_kernel = "__global__ void A() {}"
err, program_handle = nvrtc.nvrtcCreateProgram(empty_cplusplus_kernel.encode(), b"", 0, [], [])
CHECK_NVRTC(err)
nvrtc.nvrtcCompileProgram(program_handle, 1, [b"-dlto"])
err, size = nvrtc.nvrtcGetLTOIRSize(program_handle)
CHECK_NVRTC(err)
empty_kernel_ltoir = b" " * size
(err,) = nvrtc.nvrtcGetLTOIR(program_handle, empty_kernel_ltoir)
CHECK_NVRTC(err)
(err,) = nvrtc.nvrtcDestroyProgram(program_handle)
CHECK_NVRTC(err)
return empty_kernel_ltoir


def test_unrecognized_option_error():
Expand All @@ -52,39 +83,41 @@ def test_invalid_arch_error():
nvjitlink.create(1, ["-arch=sm_XX"])


def test_create_and_destroy():
handle = nvjitlink.create(1, ["-arch=sm_53"])
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_create_and_destroy(option):
handle = nvjitlink.create(1, [f"-arch={option}"])
assert handle != 0
nvjitlink.destroy(handle)


def test_complete_empty():
handle = nvjitlink.create(1, ["-arch=sm_90"])
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_complete_empty(option):
handle = nvjitlink.create(1, [f"-arch={option}"])
nvjitlink.complete(handle)
nvjitlink.destroy(handle)


def test_add_data():
handle = nvjitlink.create(1, ["-arch=sm_90"])
nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_kernel_bytes, len(ptx_kernel_bytes), "test_data")
nvjitlink.add_data(
handle, nvjitlink.InputType.ANY, minimal_ptx_kernel_bytes, len(minimal_ptx_kernel_bytes), "minimal_test_data"
)
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_add_data(option, ptx_bytes):
handle = nvjitlink.create(1, [f"-arch={option}"])
nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_bytes, len(ptx_bytes), "test_data")
nvjitlink.complete(handle)
nvjitlink.destroy(handle)


def test_add_file(tmp_path):
handle = nvjitlink.create(1, ["-arch=sm_90"])
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_add_file(option, ptx_bytes, tmp_path):
handle = nvjitlink.create(1, [f"-arch={option}"])
file_path = tmp_path / "test_file.cubin"
file_path.write_bytes(ptx_kernel_bytes)
file_path.write_bytes(ptx_bytes)
nvjitlink.add_file(handle, nvjitlink.InputType.ANY, str(file_path))
nvjitlink.complete(handle)
nvjitlink.destroy(handle)


def test_get_error_log():
handle = nvjitlink.create(1, ["-arch=sm_90"])
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_get_error_log(option):
handle = nvjitlink.create(1, [f"-arch={option}"])
nvjitlink.complete(handle)
log_size = nvjitlink.get_error_log_size(handle)
log = bytearray(log_size)
Expand All @@ -93,9 +126,10 @@ def test_get_error_log():
nvjitlink.destroy(handle)


def test_get_info_log():
handle = nvjitlink.create(1, ["-arch=sm_90"])
nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_kernel_bytes, len(ptx_kernel_bytes), "test_data")
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_get_info_log(option, ptx_bytes):
handle = nvjitlink.create(1, [f"-arch={option}"])
nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_bytes, len(ptx_bytes), "test_data")
nvjitlink.complete(handle)
log_size = nvjitlink.get_info_log_size(handle)
log = bytearray(log_size)
Expand All @@ -104,9 +138,10 @@ def test_get_info_log():
nvjitlink.destroy(handle)


def test_get_linked_cubin():
handle = nvjitlink.create(1, ["-arch=sm_90"])
nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_kernel_bytes, len(ptx_kernel_bytes), "test_data")
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_get_linked_cubin(option, ptx_bytes):
handle = nvjitlink.create(1, [f"-arch={option}"])
nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_bytes, len(ptx_bytes), "test_data")
nvjitlink.complete(handle)
cubin_size = nvjitlink.get_linked_cubin_size(handle)
cubin = bytearray(cubin_size)
Expand All @@ -115,11 +150,16 @@ def test_get_linked_cubin():
nvjitlink.destroy(handle)


def test_get_linked_ptx():
# TODO improve this test to call get_linked_ptx without this error
handle = nvjitlink.create(2, ["-arch=sm_90", "-lto"])
with pytest.raises(nvjitlink.nvJitLinkError, match="ERROR_NVVM_COMPILE"):
nvjitlink.complete(handle)
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_get_linked_ptx(option, get_dummy_ltoir):
handle = nvjitlink.create(3, [f"-arch={option}", "-lto", "-ptx"])
nvjitlink.add_data(handle, nvjitlink.InputType.LTOIR, get_dummy_ltoir, len(get_dummy_ltoir), "test_data")
nvjitlink.complete(handle)
ptx_size = nvjitlink.get_linked_ptx_size(handle)
ptx = bytearray(ptx_size)
nvjitlink.get_linked_ptx(handle, ptx)
assert len(ptx) == ptx_size
nvjitlink.destroy(handle)


def test_package_version():
Expand Down
36 changes: 35 additions & 1 deletion cuda_core/README.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,43 @@
# `cuda.core`: (experimental) pythonic CUDA module

Currently under active development. To build from source, just do:
Currently under active developmen; see [the documentation](https://nvidia.github.io/cuda-python/cuda-core/latest/) for more details.

## Installing

TO build from source, just do:
```shell
$ git clone https://github.com/NVIDIA/cuda-python
$ cd cuda-python/cuda_core # move to the directory where this README locates
$ pip install .
```
For now `cuda-python` is a required dependency.

## Developing

We use `pre-commit` to manage various tools to help development and ensure consistency.
```shell
pip install pre-commit
```

### Code linting

Run this command before checking in the code changes
```shell
pre-commit run -a --show-diff-on-failure
```
to ensure the code formatting is in line of the requirements (as listed in [`pyproject.toml`](./pyproject.toml)).

### Code signing

This repository implements a security check to prevent the CI system from running untrusted code. A part of the
security check consists of checking if the git commits are signed. See
[here](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/faqs/#why-did-i-receive-a-comment-that-my-pull-request-requires-additional-validation)
and
[here](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification)
for more details, including how to sign your commits.

## Testing

To run these tests:
* `python -m pytest tests/` against local builds
* `pytest tests/` against installed packages
4 changes: 1 addition & 3 deletions cuda_core/cuda/core/experimental/_stream.py
Original file line number Diff line number Diff line change
Expand Up @@ -211,9 +211,7 @@ def wait(self, event_or_stream: Union[Event, Stream]):
try:
stream = Stream._init(event_or_stream)
except Exception as e:
raise ValueError(
"only an Event, Stream, or object supporting __cuda_stream__ can be waited"
) from e
raise ValueError("only an Event, Stream, or object supporting __cuda_stream__ can be waited") from e
else:
stream = event_or_stream
event = handle_return(cuda.cuEventCreate(cuda.CUevent_flags.CU_EVENT_DISABLE_TIMING))
Expand Down
2 changes: 1 addition & 1 deletion cuda_python/docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ multiple components:
- `cuda.cooperative`_: Pythonic exposure of CUB cooperative algorithms
- `cuda.parallel`_: Pythonic exposure of Thrust parallel algorithms

For access to NVIDIA Math Libraries, please refer to `nvmath-python`_.
For access to NVIDIA CPU & GPU Math Libraries, please refer to `nvmath-python`_.

.. _nvmath-python: https://docs.nvidia.com/cuda/nvmath-python/latest

Expand Down

0 comments on commit fd71ced

Please sign in to comment.