Commit

WIP
neon60 committed May 25, 2024
1 parent 3aaa0e5 commit c5a5021
Showing 2 changed files with 44 additions and 40 deletions.
6 changes: 5 additions & 1 deletion .wordlist.txt
@@ -43,15 +43,19 @@ Malloc
malloc
multicore
NDRange
nonnegative
Numa
Nsight
preprocessor
PTX
queryable
representable
rocTX
RTC
scalarizing
SIMT
structs
SYCL
typedefs
typedefs
WinGDB
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
78 changes: 39 additions & 39 deletions docs/how-to/debugging.rst
@@ -273,7 +273,7 @@ HIP environment variable summary
Here are some of the more commonly used environment variables:

.. # COMMENT: The following lines define a break for use in the table below.
.. |br| raw:: html
.. |break| raw:: html

<br />

@@ -284,80 +284,80 @@ Here are some of the more commonly used environment variables:
- **Usage**

* - AMD_LOG_LEVEL
|br| Enable HIP log on different Level
|break| Enable HIP log on different Level
- 0
- 0: Disable log.
|br| 1: Enable log on error level
|br| 2: Enable log on warning and below levels
|br| 0x3: Enable log on information and below levels
|br| 0x4: Decode and display AQL packets
|break| 1: Enable log on error level
|break| 2: Enable log on warning and below levels
|break| 0x3: Enable log on information and below levels
|break| 0x4: Decode and display AQL packets

* - AMD_LOG_MASK
|br| Enable HIP log on different Level
|break| Enable HIP log on different Level
- 0x7FFFFFFF
- 0x1: Log API calls
|br| 0x02: Kernel and Copy Commands and Barriers
|br| 0x4: Synchronization and waiting for commands to finish
|br| 0x8: Enable log on information and below levels
|br| 0x20: Queue commands and queue contents
|br| 0x40: Signal creation, allocation, pool
|br| 0x80: Locks and thread-safety code
|br| 0x100: Copy debug
|br| 0x200: Detailed copy debug
|br| 0x400: Resource allocation, performance-impacting events
|br| 0x800: Initialization and shutdown
|br| 0x1000: Misc debug, not yet classified
|br| 0x2000: Show raw bytes of AQL packet
|br| 0x4000: Show code creation debug
|br| 0x8000: More detailed command info, including barrier commands
|br| 0x10000: Log message location
|br| 0xFFFFFFFF: Log always even mask flag is zero
|break| 0x02: Kernel and Copy Commands and Barriers
|break| 0x4: Synchronization and waiting for commands to finish
|break| 0x8: Enable log on information and below levels
|break| 0x20: Queue commands and queue contents
|break| 0x40: Signal creation, allocation, pool
|break| 0x80: Locks and thread-safety code
|break| 0x100: Copy debug
|break| 0x200: Detailed copy debug
|break| 0x400: Resource allocation, performance-impacting events
|break| 0x800: Initialization and shutdown
|break| 0x1000: Misc debug, not yet classified
|break| 0x2000: Show raw bytes of AQL packet
|break| 0x4000: Show code creation debug
|break| 0x8000: More detailed command info, including barrier commands
|break| 0x10000: Log message location
|break| 0xFFFFFFFF: Log always even mask flag is zero

* - HIP_LAUNCH_BLOCKING
|br| Used for serialization on kernel execution.
|break| Used for serialization on kernel execution.
- 0
- 0: Disable. Kernel executes normally.
|br| 1: Enable. Serializes kernel enqueue, behaves the same as AMD_SERIALIZE_KERNEL.
|break| 1: Enable. Serializes kernel enqueue, behaves the same as AMD_SERIALIZE_KERNEL.

* - HIP_VISIBLE_DEVICES (or CUDA_VISIBLE_DEVICES)
|br| Only devices whose index is present in the sequence are visible to HIP
|break| Only devices whose index is present in the sequence are visible to HIP
-
- 0,1,2: Depending on the number of devices on the system

* - GPU_DUMP_CODE_OBJECT
|br| Dump code object
|break| Dump code object
- 0
- 0: Disable
|br| 1: Enable
|break| 1: Enable

* - AMD_SERIALIZE_KERNEL
|br| Serialize kernel enqueue
|break| Serialize kernel enqueue
- 0
- 1: Wait for completion before enqueue
|br| 2: Wait for completion after enqueue
|br| 3: Both
|break| 2: Wait for completion after enqueue
|break| 3: Both

* - AMD_SERIALIZE_COPY
|br| Serialize copies
|break| Serialize copies
- 0
- 1: Wait for completion before enqueue
|br| 2: Wait for completion after enqueue
|br| 3: Both
|break| 2: Wait for completion after enqueue
|break| 3: Both

* - HIP_HOST_COHERENT
|br| Coherent memory in hipHostMalloc
|break| Coherent memory in hipHostMalloc
- 0
- 0: memory is not coherent between host and GPU
|br| 1: memory is coherent with host
|break| 1: memory is coherent with host

* - AMD_DIRECT_DISPATCH
|br| Enable direct kernel dispatch (Currently for Linux; under development for Windows)
|break| Enable direct kernel dispatch (Currently for Linux; under development for Windows)
- 1
- 0: Disable
|br| 1: Enable
|break| 1: Enable

* - GPU_MAX_HW_QUEUES
|br| The maximum number of hardware queues allocated per device
|break| The maximum number of hardware queues allocated per device
- 4
- The variable controls how many independent hardware queues HIP runtime can create per process,
per device. If an application allocates more HIP streams than this number, then HIP runtime reuses
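
The ``AMD_LOG_MASK`` rows in the diff above are bit flags, so several log categories can be selected by OR-ing their values together. The following is a minimal sketch of that arithmetic, not part of the commit. The constant names and the ``build_hip_log_env`` helper are hypothetical, and it assumes the runtime accepts a hexadecimal string for ``AMD_LOG_MASK``, as the table's notation suggests.

```python
import os

# Bit values copied from the AMD_LOG_MASK rows in the table above.
# The constant names are hypothetical; only the numeric values come from the table.
LOG_API_CALLS = 0x1            # "Log API calls"
LOG_KERNEL_COPY_BARRIER = 0x2  # "Kernel and Copy Commands and Barriers"
LOG_SYNC_WAIT = 0x4            # "Synchronization and waiting for commands to finish"


def build_hip_log_env(level, mask, base_env=None):
    """Return a copy of the environment with the HIP logging variables set.

    Assumption: the runtime parses AMD_LOG_MASK given in hexadecimal.
    If it does not, pass str(mask) instead of hex(mask).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["AMD_LOG_LEVEL"] = str(level)
    env["AMD_LOG_MASK"] = hex(mask)
    return env


# Select three categories by OR-ing their bits together.
mask = LOG_API_CALLS | LOG_KERNEL_COPY_BARRIER | LOG_SYNC_WAIT

# Level 3 corresponds to "information and below" in the AMD_LOG_LEVEL row.
env = build_hip_log_env(level=3, mask=mask, base_env={})
print(env)
```

The resulting dictionary could then be passed as the ``env`` argument of ``subprocess.run`` when launching a HIP application, for example a hypothetical ``./my_hip_app`` binary.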
