This issue documents how NVML can be used to get events from the GPU:
All calls to NVML have to be enclosed in a pair of nvmlInit_v2() and nvmlShutdown() calls.
To interface with a specific card, you need an nvmlDevice_t handle. A handle for a specific device can be retrieved in multiple ways, for example by its PCI bus ID (nvmlDeviceGetHandleByPciBusId_v2()).
Samples are read with nvmlDeviceGetProcessUtilization.
The last parameter to nvmlDeviceGetProcessUtilization is a timestamp: only samples with a timestamp larger than it (in this case, last_readout) are returned.
First, nvmlDeviceGetProcessUtilization is called with the sample buffer pointer set to NULL. This updates samples_count with the number of samples that can be read. After that, a buffer of that size is allocated and the function is called again to actually read the samples.
Sample data includes the PID of the executing process and utilization values for the decoder, encoder, framebuffer memory and SM (compute) units (in percent, 0-100).
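The timestamp handling around last_readout can be sketched as below. This is a minimal sketch, not code from the issue: `sample_t` is a hypothetical stand-in that mirrors the fields of NVML's nvmlProcessUtilizationSample_t so that the example compiles without nvml.h, and `keep_new_samples` is an illustrative helper name. It re-checks the timestamps of an already filled buffer on the caller's side and advances last_readout for the next readout.

```c
#include <stddef.h>

/* Hypothetical stand-in mirroring the fields of NVML's
 * nvmlProcessUtilizationSample_t, so the sketch builds without nvml.h. */
typedef struct {
    unsigned int pid;              /* process that produced the sample */
    unsigned long long timeStamp;  /* time at which the sample was taken */
    unsigned int smUtil;           /* SM (compute) utilization in percent */
    unsigned int memUtil;          /* framebuffer memory utilization in percent */
    unsigned int encUtil;          /* encoder utilization in percent */
    unsigned int decUtil;          /* decoder utilization in percent */
} sample_t;

/* Compact `samples` in place so that only entries with a timestamp strictly
 * greater than *last_readout remain, then advance *last_readout to the newest
 * timestamp that was kept. Returns the number of entries kept. */
static size_t keep_new_samples(sample_t *samples, size_t count,
                               unsigned long long *last_readout)
{
    unsigned long long newest = *last_readout;
    size_t kept = 0;

    for (size_t i = 0; i < count; ++i) {
        if (samples[i].timeStamp > *last_readout) {
            if (samples[i].timeStamp > newest)
                newest = samples[i].timeStamp;
            samples[kept++] = samples[i];  /* kept <= i, so this never overwrites unread data */
        }
    }
    *last_readout = newest;
    return kept;
}
```

In real code the buffer would be filled by the second nvmlDeviceGetProcessUtilization call, and the updated last_readout would be passed as the last parameter of the next call.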
This feature is not supported on Kepler or Ampere cards (So no luck with Taurus' A100s).
The sample rate cannot be controlled by software: "Each sample period may be between 1 second and 1/6 second, depending on the product being queried."
nvmlDeviceGetSamples can be used in the same vein to get device-wide samples, for example for the memory and compute clock speeds.
There is also a big zoo of different getters for a wide array of information. It might be worthwhile to investigate whether shorter, or at least user-controlled, sampling periods can be achieved by manually polling these getters.
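A manual polling loop with a user-chosen period could look roughly like the sketch below. It assumes a POSIX system (clock_gettime and clock_nanosleep); `poll_fn`, `poll_at_fixed_period` and `count_polls` are hypothetical names introduced for illustration. In practice the callback would wrap one of the NVML getters. Whether NVML actually returns fresh values at a rate faster than its internal sample period is exactly the open question and is not answered by this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical callback type: performs one readout (for example by calling
 * an NVML getter) and returns nonzero to stop polling early. */
typedef int (*poll_fn)(void *ctx);

/* Call `fn` once per `period_ns` nanoseconds, at most `max_polls` times.
 * `period_ns` is assumed to be positive. Deadlines are absolute, so time
 * spent inside `fn` does not accumulate as drift.
 * Returns the number of calls made. */
static unsigned long poll_at_fixed_period(poll_fn fn, void *ctx,
                                          long period_ns,
                                          unsigned long max_polls)
{
    struct timespec next;
    unsigned long calls = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    while (calls < max_polls) {
        ++calls;
        if (fn(ctx) != 0 || calls == max_polls)
            break;

        /* Advance the absolute deadline by one period and normalize it. */
        next.tv_nsec += period_ns;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec += 1;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return calls;
}

/* Trivial example callback: counts how often it was polled. */
static int count_polls(void *ctx)
{
    unsigned long *counter = ctx;
    ++*counter;
    return 0;  /* never request an early stop */
}
```

For example, `poll_at_fixed_period(count_polls, &counter, 1000000L, 3)` polls three times, one millisecond apart.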
Note that the readout of nvmlDeviceGetProcessUtilization has some quirks, as documented in a comment in the nvtop code.