Slow performance of Categorify operation on Triton Inference Server #1885

Open · rahuljantwal-8451 opened this issue on Oct 3, 2024 · 0 comments

Description

When running an NVTabular workflow with Categorify operations on Triton Inference Server, performance degrades significantly for high-cardinality data.

Environment

  • Merlin TensorFlow container, release 23.12

Steps to Reproduce

  1. Generate a high-cardinality dataset using generate_dataset.py
  2. Process the dataset with NVTabular using process_dataset.py
  3. Export the NVTabular workflow as a Triton ensemble using export_ensemble.py (steps 1-3 are sketched below)
  4. Run the Triton server:
     tritonserver --model-repository=./ensemble/
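
The linked scripts are not reproduced in this issue; below is a minimal sketch of steps 1-3, assuming a single int64 categorical column named item_id (the column name, row count, and output paths are illustrative placeholders, not the actual script contents):

```python
import numpy as np
import pandas as pd

import nvtabular as nvt
from nvtabular import ops
from merlin.systems.dag import Ensemble
from merlin.systems.dag.ops.workflow import TransformWorkflow

# Step 1: generate a high-cardinality dataset (sizes are placeholders).
n_rows = 10_000_000
cardinality = 50_000_000  # sweep: 50, 5k, 5M, 50M
pd.DataFrame(
    {"item_id": np.random.randint(0, cardinality, size=n_rows, dtype=np.int64)}
).to_parquet("data.parquet")

# Step 2: fit a Categorify workflow, write the processed dataset, save the workflow.
features = ["item_id"] >> ops.Categorify()
workflow = nvt.Workflow(features)
dataset = nvt.Dataset("data.parquet")
workflow.fit(dataset)
workflow.transform(dataset).to_parquet("processed/")
workflow.save("workflow/")

# Step 3: wrap the fitted workflow in a Triton ensemble and export it.
pipeline = workflow.input_schema.column_names >> TransformWorkflow(workflow)
ensemble = Ensemble(pipeline, workflow.input_schema)
ensemble.export("./ensemble")
```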

Expected Behavior

The Categorify operation should perform efficiently, with category data being cached between requests, resulting in performance similar to that observed in a Jupyter notebook environment.
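
For comparison, the Jupyter numbers in the table below correspond to an in-process transform with the fitted workflow already loaded, roughly along these lines (a sketch, not the actual notebook code; paths and batch size are placeholders):

```python
import pandas as pd
import nvtabular as nvt

# Load the fitted workflow saved by the processing step.
workflow = nvt.Workflow.load("workflow/")
batch = pd.read_parquet("data.parquet").head(1024)  # placeholder batch size

# Repeated calls reuse the category mappings already resident in memory,
# so only the first transform pays the cost of loading them.
out = workflow.transform(nvt.Dataset(batch)).compute()
```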

Actual Behavior

The Categorify operation is slow, with each request taking as long as the first request, suggesting that category data is not being effectively cached between requests.

Results

Below are the results from the benchmarking script encode.sh (a rough Python equivalent is sketched after the table):

| Cardinality | Triton Ensemble (TransformWorkflow) | Jupyter |
| ----------- | ----------------------------------- | ------- |
| 50          | 30 ms                               | 38 ms   |
| 5k          | 30 ms                               | 43 ms   |
| 5M          | 1270 ms                             | 88.8 ms |
| 50M         | 15833 ms                            | 550 ms  |
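
encode.sh itself is not shown here; a rough Python equivalent of the per-request timing, assuming the exported ensemble is served under the placeholder name ensemble_model on Triton's default gRPC port:

```python
import time

import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

# One random batch; the same request is repeated to see whether
# later requests get faster (i.e. whether categories are cached).
batch = np.random.randint(0, 50_000_000, size=(1024, 1), dtype=np.int64)
inputs = [grpcclient.InferInput("item_id", list(batch.shape), "INT64")]
inputs[0].set_data_from_numpy(batch)

for i in range(5):
    start = time.perf_counter()
    client.infer("ensemble_model", inputs)  # placeholder model name
    print(f"request {i}: {(time.perf_counter() - start) * 1e3:.1f} ms")
```

If category data were cached, requests after the first should be markedly faster; the numbers above suggest every request pays the full cost.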
rahuljantwal-8451 added the bug label on Oct 3, 2024