Set correct cuda.current_device for multi-device onnx performance bench (#115670)

Summary:
Otherwise `torch.cuda.synchronize()` waits on a different device from the one that
runs the PyTorch model, which leads to incorrect performance numbers.

X-link: pytorch/pytorch#115670
Approved by: https://github.com/thiagocrepaldi

Reviewed By: jeanschmidt

Differential Revision: D52244270

fbshipit-source-id: 3b4133041ceec63aa6f5b72b840434d468345da7
BowenBao authored and facebook-github-bot committed Dec 18, 2023
1 parent b7418c3 commit 1594d15
Showing 1 changed file with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion userbenchmark/dynamo/dynamobench/common.py
@@ -789,6 +789,11 @@ def timed_onnx(model, onnx_model: OnnxModel, inputs):
if should_randomize_input
else example_inputs
)
if torch.cuda.device_count() > 1:
# Manually set torch.cuda.current_device so that torch.cuda.synchronize() waits on the intended device.
# When there is more than one cuda device, the first is used for pytorch eager
# and the second is used for onnx ort.
torch.cuda.set_device(0)
timings[rep, 0], expected_output = timed(
model,
model_iter_fn,
@@ -797,7 +802,11 @@ def timed_onnx(model, onnx_model: OnnxModel, inputs):
times=times,
collect_outputs=args.collect_outputs,
)

if torch.cuda.device_count() > 1:
# Manually set torch.cuda.current_device so that torch.cuda.synchronize() waits on the intended device.
# When there is more than one cuda device, the first is used for pytorch eager
# and the second is used for onnx ort.
torch.cuda.set_device(1)
timings[rep, 1], actual_output = timed_onnx(model, onnx_model, inputs)

pvalue = ttest_ind(timings[:, 0], timings[:, 1]).pvalue
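The pitfall this commit fixes is that `torch.cuda.synchronize()` only waits for work on the *current* device, so timing a model that runs on another device can stop the clock before its kernels finish. The helper below is a minimal sketch of the pattern (names like `timed` and the warmup/rep counts are illustrative, not the benchmark's actual code); it selects the target device before synchronizing and falls back to a no-op sync on CPU-only machines.

```python
import time

import torch


def timed(fn, device_index: int, warmup: int = 3, reps: int = 10) -> float:
    """Return the average wall-clock time of fn(), synchronizing on device_index.

    torch.cuda.synchronize() waits only on the current device, so we must
    call torch.cuda.set_device(device_index) first; otherwise still-running
    kernels on the target device would be excluded from the measurement.
    """
    if torch.cuda.is_available() and torch.cuda.device_count() > device_index:
        torch.cuda.set_device(device_index)
        sync = torch.cuda.synchronize
    else:
        # CPU fallback so the sketch also runs without CUDA.
        sync = lambda: None

    for _ in range(warmup):
        fn()
    sync()  # drain warmup work before starting the clock

    start = time.perf_counter()
    for _ in range(reps):
        fn()
    sync()  # wait for all kernels on the selected device before reading the clock
    return (time.perf_counter() - start) / reps
```

In the benchmark's two-device setup, eager timing would call this with device index 0 and the ONNX Runtime timing with index 1, mirroring the two `torch.cuda.set_device` calls in the diff above.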
