Model Analyzer

The Triton Model Analyzer is a tool that uses Performance Analyzer to send requests to your model while measuring GPU memory and compute utilization. It is especially useful for characterizing your model's GPU memory requirements under different batching and model instance configurations. Once you have this GPU memory usage information, you can decide more intelligently how to combine multiple models on the same GPU while staying within the GPU's memory capacity.
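As a rough illustration of that last step, the sketch below takes per-model GPU memory figures and groups the models onto GPUs so that no group exceeds the available memory. It is not part of Model Analyzer: the function `pack_models`, the model names, the memory numbers, and the headroom value are all hypothetical, and the grouping uses a simple first-fit-decreasing heuristic chosen for illustration. In practice you would substitute the peak memory values reported for the batching and instance configuration you intend to deploy.

```python
from typing import Dict, List


def pack_models(
    model_mem_mib: Dict[str, int],
    gpu_capacity_mib: int,
    headroom_mib: int = 0,
) -> List[List[str]]:
    """Group models onto GPUs with a first-fit-decreasing heuristic.

    model_mem_mib maps a model name to its measured peak GPU memory (MiB).
    headroom_mib is memory deliberately left unused on each GPU.
    Returns one list of model names per GPU.
    """
    usable = gpu_capacity_mib - headroom_mib
    free_per_gpu: List[int] = []      # remaining usable memory on each GPU
    placement: List[List[str]] = []   # model names assigned to each GPU

    # Place the largest models first; this tends to waste less memory.
    by_size = sorted(model_mem_mib.items(), key=lambda kv: kv[1], reverse=True)
    for name, mem in by_size:
        if mem > usable:
            raise ValueError(f"{name} needs {mem} MiB, more than {usable} MiB usable")
        for i, free in enumerate(free_per_gpu):
            if mem <= free:
                free_per_gpu[i] -= mem
                placement[i].append(name)
                break
        else:
            # No existing GPU has room, so allocate another one.
            free_per_gpu.append(usable - mem)
            placement.append([name])
    return placement


# Hypothetical measurements (MiB); replace with values from your own profiling.
measured = {"bert_base": 6000, "yolo": 4500, "resnet50": 3000, "tiny": 1000}
placement = pack_models(measured, gpu_capacity_mib=8192, headroom_mib=512)
```

With these made-up numbers the four models fit on two 8 GiB GPUs. The headroom parameter is there because memory use measured under one load pattern may not be the worst case, so leaving some margin is a cautious default rather than a requirement.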

For more detailed examples and explanations of using Model Analyzer, see:

Model Analyzer Conceptual Guide
Maximizing Deep Learning Inference Performance with NVIDIA Model Analyzer
