Adding various GPU execution providers #173

Merged
merged 4 commits into from
Dec 16, 2024

@deven96 deven96 commented Dec 16, 2024

Following the execution provider documentation at https://ort.pyke.io/perf/execution-providers, this PR adds support for various GPU execution providers:

  • CoreML (Used by Apple Machines >= M1)
  • TensorRT
  • CUDA
  • DirectML (Used by Windows devices)

Execution providers are loaded by ort::init in the order in which they are provided. Whenever an execution provider is unavailable (either on the entire platform or for a particular operation), the next available execution provider is used. If none is available, the default CPU provider is used.
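
The fallback order described above can be modelled with a small, self-contained sketch. This is not the `ort` API: the `Provider` enum and `select_provider` function are hypothetical names introduced only for illustration, and availability is passed in as a closure rather than queried from the runtime. It also only models whole-provider selection, not the per-operation fallback.

```rust
/// Hypothetical stand-in for an execution provider; NOT a type from `ort`.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    CoreMl,
    TensorRt,
    Cuda,
    DirectMl,
    Cpu,
}

/// Returns the first provider in `preferred` for which `is_available`
/// returns true. Falls back to `Provider::Cpu` when none qualifies,
/// mirroring the "default CPU" behaviour described in the PR text.
#[allow(dead_code)]
fn select_provider(
    preferred: &[Provider],
    is_available: impl Fn(Provider) -> bool,
) -> Provider {
    preferred
        .iter()
        .copied()
        .find(|&p| is_available(p))
        .unwrap_or(Provider::Cpu)
}
```

With the order `[TensorRt, Cuda, CoreMl]` on a machine where only CoreML is usable, `select_provider` would skip the first two entries and return `CoreMl`. With nothing available it would return `Cpu`.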

This has only been tested with CoreML on my Mac.

Without --features coreml in build command

Screenshot 2024-12-16 at 11 59 32

With --features coreml in build command

Screenshot 2024-12-16 at 11 26 39

This also pins the ort-sys dependency to =2.0.0-rc.8; otherwise we run into this issue.

github-actions bot commented Dec 16, 2024

Test Results

157 tests   157 ✅  2m 24s ⏱️
  8 suites    0 💤
  2 files      0 ❌

Results for commit e4e8696.

♻️ This comment has been updated with latest results.

github-actions bot commented Dec 16, 2024

Benchmark Results

group                                                        main                                   pr
-----                                                        ----                                   --
store_batch_insertion_without_predicates/size_100            1.00    715.6±3.21µs        ? ?/sec    1.00    716.5±3.20µs        ? ?/sec
store_batch_insertion_without_predicates/size_1000           1.12      8.2±0.04ms        ? ?/sec    1.00      7.3±0.07ms        ? ?/sec
store_batch_insertion_without_predicates/size_10000          1.11     85.2±0.88ms        ? ?/sec    1.00     76.7±3.12ms        ? ?/sec
store_batch_insertion_without_predicates/size_100000         1.00   748.0±82.09ms        ? ?/sec    1.05   782.7±83.29ms        ? ?/sec
store_retrieval_no_condition/size_100                        1.00      2.3±0.00ms        ? ?/sec    1.00      2.3±0.01ms        ? ?/sec
store_retrieval_no_condition/size_1000                       1.00     16.8±0.06ms        ? ?/sec    1.01     16.9±0.07ms        ? ?/sec
store_retrieval_no_condition/size_10000                      1.00    167.1±0.79ms        ? ?/sec    1.00    167.8±1.01ms        ? ?/sec
store_retrieval_no_condition/size_100000                     1.00   1678.5±8.42ms        ? ?/sec    1.00   1681.0±3.08ms        ? ?/sec
store_retrieval_non_linear_kdtree/size_100                   1.00      2.2±0.04ms        ? ?/sec    1.00      2.2±0.04ms        ? ?/sec
store_retrieval_non_linear_kdtree/size_1000                  1.01     16.1±0.26ms        ? ?/sec    1.00     15.9±0.12ms        ? ?/sec
store_retrieval_non_linear_kdtree/size_10000                 1.00    158.5±0.66ms        ? ?/sec    1.01    160.4±1.76ms        ? ?/sec
store_retrieval_non_linear_kdtree/size_100000                1.00   1600.0±7.18ms        ? ?/sec    1.01  1614.7±11.25ms        ? ?/sec
store_sequential_insertion_without_predicates/size_100       1.03  1508.9±27.40µs        ? ?/sec    1.00  1471.2±33.42µs        ? ?/sec
store_sequential_insertion_without_predicates/size_1000      1.00     14.8±0.31ms        ? ?/sec    1.01     14.9±0.34ms        ? ?/sec
store_sequential_insertion_without_predicates/size_10000     1.00    154.6±3.33ms        ? ?/sec    1.03    159.7±3.61ms        ? ?/sec
store_sequential_insertion_without_predicates/size_100000    1.00   1547.9±4.98ms        ? ?/sec    1.01  1559.1±14.70ms        ? ?/sec

deven96 force-pushed the deven/execution-providers branch 2 times, most recently from b737ad0 to 8fe42b2, on December 16, 2024 11:04
deven96 force-pushed the deven/execution-providers branch from 8fe42b2 to b3ac046 on December 16, 2024 11:43
deven96 force-pushed the deven/execution-providers branch from b3ac046 to e4e8696 on December 16, 2024 11:48
deven96 merged commit 4ed8654 into main on Dec 16, 2024 (5 checks passed)
deven96 deleted the deven/execution-providers branch on December 16, 2024 12:37