Skip to content

Commit

Permalink
WIP
Browse files Browse the repository at this point in the history
  • Loading branch information
neon60 committed May 25, 2024
1 parent 0228789 commit 0abb1a9
Show file tree
Hide file tree
Showing 3 changed files with 25 additions and 16 deletions.
9 changes: 9 additions & 0 deletions .wordlist.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@ AQL
builtins
Builtins
clr
cuCtx
cuDNN
EIGEN
enqueue
enqueues
embeded
Expand All @@ -14,15 +17,21 @@ GPGPU
hardcoded
hipcc
hipother
hcBLAS
icc
Interoperation
interoperate
IPC
latencies
LUID
Malloc
malloc
multicore
NDRange
Numa
Nsight
PTX
rocTX
RTC
SIMT
SYCL
Expand Down
4 changes: 2 additions & 2 deletions docs/how-to/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ Both HIP and CUDA are dialects of C++, and thus porting between them is relative
Both dialects support templates, classes, lambdas, and other C++ constructs.
As one example, the hipify-perl tool was originally a Perl script that used simple text conversions from CUDA to HIP.
HIP and CUDA provide similar math library calls as well. In summary, the HIP philosophy was to make the HIP language close enough to CUDA that the porting effort is relatively simple.
This reduces the potential for error, and also makes it easy to automate the translation. HIP's goal is to quickly get the ported program running on both platforms with little manual intervention, so that the programmer can focus on performance optimizations.
This reduces the potential for error, and also makes it easy to automate the translation. HIP goal is to quickly get the ported program running on both platforms with little manual intervention, so that the programmer can focus on performance optimizations.

There have been several tools that have attempted to convert CUDA into OpenCL, such as CU2CL. OpenCL is a C99-based kernel language (rather than C++) and also does not support single-source compilation.
As a result, the OpenCL syntax is different from CUDA, and the porting tools have to perform some heroic transformations to bridge this gap.
Expand Down Expand Up @@ -361,7 +361,7 @@ Due to different working mechanisms on operating systems like Windows vs Linux,
## Does HIP support LUID?

Starting ROCm 6.0, HIP runtime supports Locally Unique Identifier (LUID).
This feature enables the local physical device(s) to interoperate with other devices. For example, DX12.
This feature enables the local physical device(s) to interoperate with other devices. For example, DirectX 12.

HIP runtime sets device LUID properties so the driver can query LUID to identify each device for interoperability.

Expand Down
28 changes: 14 additions & 14 deletions docs/how-to/hip_rtc.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,8 +40,8 @@ hiprtcCreateProgram(&prog, // HIPRTC program
&header_names[0]); // Name of header files
```

hiprtcCreateProgram API also allows you to add headers which can be included in your rtc program.
For online compilation, the compiler pre-defines HIP device API functions, HIP specific types and macros for device compilation, but does not include standard C/C++ headers by default. Users can only include header files provided to hiprtcCreateProgram.
`hiprtcCreateProgram` API also allows you to add headers which can be included in your RTC program.
For online compilation, the compiler pre-defines HIP device API functions, HIP specific types and macros for device compilation, but does not include standard C/C++ headers by default. Users can only include header files provided to `hiprtcCreateProgram`.

After associating the kernel string with `hiprtcProgram`, you can now compile this program using:

Expand All @@ -51,7 +51,7 @@ hiprtcCompileProgram(prog, // hiprtcProgram
options); // Clang Options [Supported Clang Options](clang_options.md)
```
hiprtcCompileProgram returns a status value which can be converted to string via `hiprtcGetErrorString`. If compilation is successful, hiprtcCompileProgram will return `HIPRTC_SUCCESS`.
`hiprtcCompileProgram` returns a status value which can be converted to string via `hiprtcGetErrorString`. If compilation is successful, `hiprtcCompileProgram` will return `HIPRTC_SUCCESS`.
If the compilation fails, you can look up the logs via:
Expand Down Expand Up @@ -231,7 +231,7 @@ HIPRTC provides a few HIPRTC specific flags
* `--gpu-architecture` : This flag can guide the code object generation for a specific gpu arch. Example: `--gpu-architecture=gfx906:sramecc+:xnack-`, its equivalent to `--offload-arch`.
* This option is compulsory if compilation is done on a system without AMD GPUs supported by HIP runtime.
* Otherwise, HIPRTC will load the hip runtime and gather the current device and its architecture info and use it as option.
* `-fgpu-rdc` : This flag when provided during the hiprtcCompileProgram generates the bitcode (HIPRTC doesn't convert this bitcode into ISA and binary). This bitcode can later be fetched using hiprtcGetBitcode and `hiprtcGetBitcodeSize` APIs.
* `-fgpu-rdc` : This flag when provided during the `hiprtcCompileProgram` generates the bitcode (HIPRTC doesn't convert this bitcode into ISA and binary). This bitcode can later be fetched using hiprtcGetBitcode and `hiprtcGetBitcodeSize` APIs.

### Bitcode

Expand Down Expand Up @@ -283,7 +283,7 @@ hiprtcLinkCreate( num_options, // number of options
&rtc_link_state ); // HIPRTC link state created upon success
```
Following which, the bitcode data can be added to this link instance via hiprtcLinkAddData (if the data is present as a string) or `hiprtcLinkAddFile` (if the data is present as a file) with the appropriate input type according to the data or the bitcode used.
Following which, the bitcode data can be added to this link instance via `hiprtcLinkAddData` (if the data is present as a string) or `hiprtcLinkAddFile` (if the data is present as a file) with the appropriate input type according to the data or the bitcode used.
```cpp
hiprtcLinkAddData(rtc_link_state, // HIPRTC link state
Expand Down Expand Up @@ -321,7 +321,7 @@ hipModuleLoadData(&module, binary);
#### Note
* The compiled binary must be loaded before HIPRTC link instance is destroyed using the hiprtcLinkDestroy API.
* The compiled binary must be loaded before HIPRTC link instance is destroyed using the `hiprtcLinkDestroy` API.
```cpp
hiprtcLinkDestroy(rtc_link_state);
Expand All @@ -331,7 +331,7 @@ hiprtcLinkDestroy(rtc_link_state);

### Input Types

HIPRTC provides `hiprtcJITInputType` enumeration type which defines the input types accepted by the Linker APIs. Here are the enum values of `hiprtcJITInputType`. However only the input types `HIPRTC_JIT_INPUT_LLVM_BITCODE`, `HIPRTC_JIT_INPUT_LLVM_BUNDLED_BITCODE` and `HIPRTC_JIT_INPUT_LLVM_ARCHIVES_OF_BUNDLED_BITCODE` are supported currently.
HIPRTC provides `hiprtcJITInputType` enumeration type which defines the input types accepted by the Linker APIs. Here are the `enum` values of `hiprtcJITInputType`. However only the input types `HIPRTC_JIT_INPUT_LLVM_BITCODE`, `HIPRTC_JIT_INPUT_LLVM_BUNDLED_BITCODE` and `HIPRTC_JIT_INPUT_LLVM_ARCHIVES_OF_BUNDLED_BITCODE` are supported currently.

`HIPRTC_JIT_INPUT_LLVM_BITCODE` can be used to load both LLVM bitcode or LLVM IR assembly code. However, `HIPRTC_JIT_INPUT_LLVM_BUNDLED_BITCODE` and `HIPRTC_JIT_INPUT_LLVM_ARCHIVES_OF_BUNDLED_BITCODE` are only for bundled bitcode and archive of bundled bitcode.

Expand Down Expand Up @@ -376,9 +376,9 @@ hiprtcLinkCreate(2, jit_options.data(), (void**)lopts, &linkstate);
## Error Handling
HIPRTC defines the `hiprtcResult` enumeration type and a function hiprtcGetErrorString for API call error handling. `hiprtcResult` enum defines the API result codes. HIPRTC APIs return `hiprtcResult` to indicate the call result. hiprtcGetErrorString function returns a string describing the given `hiprtcResult` code, e.g., HIPRTC_SUCCESS to "HIPRTC_SUCCESS". For unrecognized enumeration values, it returns "Invalid HIPRTC error code".
HIPRTC defines the `hiprtcResult` enumeration type and a function `hiprtcGetErrorString` for API call error handling. `hiprtcResult` `enum` defines the API result codes. HIPRTC APIs return `hiprtcResult` to indicate the call result. `hiprtcGetErrorString` function returns a string describing the given `hiprtcResult` code, e.g., HIPRTC_SUCCESS to "HIPRTC_SUCCESS". For unrecognized enumeration values, it returns "Invalid HIPRTC error code".
`hiprtcResult` enum supported values and the hiprtcGetErrorString usage are mentioned below.
`hiprtcResult` `enum` supported values and the `hiprtcGetErrorString` usage are mentioned below.
```cpp
HIPRTC_SUCCESS = 0,
Expand Down Expand Up @@ -416,7 +416,7 @@ Currently, it returns hardcoded value. This should be implemented to return HIP

HIPRTC mangles the `__global__` function names and names of `__device__` and `__constant__` variables. If the generated binary is being loaded using the HIP Runtime API, the kernel function or `__device__/__constant__` variable must be looked up by name, but this is very hard when the name has been mangled. To overcome this, HIPRTC provides API functions that map `__global__` function or `__device__/__constant__` variable names in the source to the mangled names present in the generated binary.

The two APIs `hiprtcAddNameExpression` and hiprtcGetLoweredName provide this functionality. First, a 'name expression' string denoting the address for the `__global__` function or `__device__/__constant__` variable is provided to `hiprtcAddNameExpression`. Then, the program is compiled with hiprtcCompileProgram. During compilation, HIPRTC will parse the name expression string as a C++ constant expression at the end of the user program. Finally, the function hiprtcGetLoweredName is called with the original name expression and it returns a pointer to the lowered name. The lowered name can be used to refer to the kernel or variable in the HIP Runtime API.
The two APIs `hiprtcAddNameExpression` and `hiprtcGetLoweredName` provide this functionality. First, a 'name expression' string denoting the address for the `__global__` function or `__device__/__constant__` variable is provided to `hiprtcAddNameExpression`. Then, the program is compiled with `hiprtcCompileProgram`. During compilation, HIPRTC will parse the name expression string as a C++ constant expression at the end of the user program. Finally, the function `hiprtcGetLoweredName` is called with the original name expression and it returns a pointer to the lowered name. The lowered name can be used to refer to the kernel or variable in the HIP Runtime API.

### Note

Expand Down Expand Up @@ -444,7 +444,7 @@ __global__ void f3(int *result) { *result = sizeof(T); }
)"};
```

hiprtcAddNameExpression is called with various name expressions referring to the address of `__global__` functions and `__device__/__constant__` variables.
`hiprtcAddNameExpression` is called with various name expressions referring to the address of `__global__` functions and `__device__/__constant__` variables.

```cpp
kernel_name_vec.push_back("&f1");
Expand Down Expand Up @@ -497,15 +497,15 @@ HIPRTC follows the below versioning.

* Linux
* HIPRTC follows the same versioning as HIP runtime library.
* The `so` name field for the shared library is set to MAJOR version. For example, for HIP 5.3 the `so` name is set to 5 (hiprtc.so.5).
* The `so` name field for the shared library is set to MAJOR version. For example, for HIP 5.3 the `so` name is set to 5 (`hiprtc.so.5`).
* Windows
* HIPRTC dll is named as `hiprtcXXYY.dll` where XX is MAJOR version and YY is MINOR version. For example, for HIP 5.3 the name is hiprtc0503.dll.
* HIPRTC dll is named as `hiprtcXXYY.dll` where XX is MAJOR version and YY is MINOR version. For example, for HIP 5.3 the name is `hiprtc0503.dll`.

## HIP header support

* Added HIPRTC support for all the hip common header files such as library_types.h, hip_math_constants.h, hip_complex.h, math_functions.h, surface_types.h etc. from 6.1. HIPRTC users need not include any HIP macros or constants explicitly in their header files. All of these should get included via HIPRTC builtins when the app links to HIPRTC library.

## Deprecation notice

* Currently HIPRTC APIs are separated from HIP APIs and HIPRTC is available as a separate library `libhiprtc.so`/`libhiprtc.dll`. But on Linux, HIPRTC symbols are also present in `libhipamd64.so` in order to support the existing applications. Gradually, these symbols will be removed from HIP library and applications using HIPRTC will be required to explicitly link to HIPRTC library. However, on Windows hiprtc.dll must be used as the `hipamd64.dll` doesn't contain the HIPRTC symbols.
* Currently HIPRTC APIs are separated from HIP APIs and HIPRTC is available as a separate library `libhiprtc.so`/`libhiprtc.dll`. But on Linux, HIPRTC symbols are also present in `libhipamd64.so` in order to support the existing applications. Gradually, these symbols will be removed from HIP library and applications using HIPRTC will be required to explicitly link to HIPRTC library. However, on Windows `hiprtc.dll` must be used as the `hipamd64.dll` doesn't contain the HIPRTC symbols.
* Data types such as `uint32_t`, `uint64_t`, `int32_t`, `int64_t` defined in std namespace in HIPRTC are deprecated earlier and are being removed from ROCm release 6.1 since these can conflict with the standard C++ data types. These data types are now prefixed with `__hip__`, e.g. `__hip_uint32_t`. Applications previously using `std::uint32_t` or similar types can use `__hip_` prefixed types to avoid conflicts with standard std namespace or application can have their own definitions for these types. Also, type_traits templates previously defined in std namespace are moved to `__hip_internal` namespace as implementation details.

0 comments on commit 0abb1a9

Please sign in to comment.