CC: @zewenli98, @narendasan, @laikhtewari

**Overview**

We compare the difference in performance between the PyTorch-decomposed and non-decomposed versions of `torch.ops.aten.linear.default`.

**Not Torch-Decomposed**

```
graph():
%l_x_ : torch.Tensor [num_users=1] = placeholder[target=l_x_]
%l_y_ : torch.Tensor [num_users=1] = placeholder[target=l_y_]
%l_z_ : torch.Tensor [num_users=1] = placeholder[target=l_z_]
%linear_default : [num_users=1] = call_function[target=torch.ops.aten.linear.default](args = (%l_x_, %l_y_, %l_z_), kwargs = {})
return linear_default
```

**Torch-Decomposed**

```
graph():
%arg0_1 : [num_users=1] = placeholder[target=arg0_1]
%arg1_1 : [num_users=1] = placeholder[target=arg1_1]
%arg2_1 : [num_users=1] = placeholder[target=arg2_1]
%view : [num_users=1] = call_function[target=torch.ops.aten.view.default](args = (%arg0_1, [128, 32]), kwargs = {})
%permute : [num_users=1] = call_function[target=torch.ops.aten.permute.default](args = (%arg1_1, [1, 0]), kwargs = {})
%mul : [num_users=1] = call_function[target=torch.ops.aten.mul.Tensor](args = (%arg2_1, 1), kwargs = {})
%mm : [num_users=1] = call_function[target=torch.ops.aten.mm.default](args = (%view, %permute), kwargs = {})
%mul_1 : [num_users=1] = call_function[target=torch.ops.aten.mul.Tensor](args = (%mm, 1), kwargs = {})
%add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%mul, %mul_1), kwargs = {})
%view_1 : [num_users=1] = call_function[target=torch.ops.aten.view.default](args = (%add, [4, 32, 64]), kwargs = {})
return (view_1,)
```
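To make the comparison concrete: the two graphs are numerically equivalent, and the decomposed form can be replayed op by op. A minimal sketch verifying this (shapes taken from the graphs above; the scalar multiplies by 1 come from the `addmm` decomposition with `alpha = beta = 1`):

```python
import torch

x = torch.rand(4, 32, 32)  # input
y = torch.rand(64, 32)     # weight
z = torch.rand(64)         # bias

# Reference: the un-decomposed op
ref = torch.ops.aten.linear.default(x, y, z)

# Replay the Torch-Decomposed graph step by step
view = torch.ops.aten.view.default(x, [128, 32])     # flatten batch dims
permute = torch.ops.aten.permute.default(y, [1, 0])  # transpose weight
mul = torch.ops.aten.mul.Tensor(z, 1)                # bias * beta (beta = 1)
mm = torch.ops.aten.mm.default(view, permute)
mul_1 = torch.ops.aten.mul.Tensor(mm, 1)             # matmul * alpha (alpha = 1)
add = torch.ops.aten.add.Tensor(mul, mul_1)
out = torch.ops.aten.view.default(add, [4, 32, 64])  # restore batch dims

torch.testing.assert_close(out, ref)
```

The extra reshapes and no-op scalar multiplies are the overhead whose cost is being compared here.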
**Methods**

Note: For the performance comparison, we used the PyTorch model below and disabled the `aten.linear` lowering:

```python
import torch
import torch_tensorrt


class Linear(torch.nn.Module):
    def forward(self, x, y, z):
        return torch.ops.aten.linear.default(x, y, z)


opt_model = torch.compile(
    Linear().cuda(),
    backend="torch_tensorrt",
    options={"debug": True, "min_block_size": 1, "optimization_level": 5},
)
inputs = [torch.rand((4, 32, 32)).cuda(), torch.rand((64, 32)).cuda(), torch.rand((64,)).cuda()]
opt_model(*inputs)
```
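For the latency numbers, a timing harness along the following lines can be used; this is a sketch using CUDA-event timing, not necessarily the exact script behind the results below:

```python
import torch


def benchmark(model, inputs, warmup=10, iters=100):
    """Median CUDA latency of model(*inputs) in milliseconds."""
    for _ in range(warmup):
        model(*inputs)
    torch.cuda.synchronize()
    times = []
    for _ in range(iters):
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        model(*inputs)
        end.record()
        torch.cuda.synchronize()
        times.append(start.elapsed_time(end))
    times.sort()
    return times[len(times) // 2]


print(f"median latency: {benchmark(opt_model, inputs):.3f} ms")
```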
**Results**

**Recommendation**
**Replies**

- [going to move this to discussions]
- @gs-olive Do you have any data on compilation times? Ideally for level 3 without lowering, level 3 with lowering, and level 5 without lowering.
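As a starting point for that, here is a sketch of how per-configuration compilation time could be measured, reusing the `Linear` module and `inputs` from the Methods section. The exact flag for toggling the linear lowering is not shown in this thread, so only the optimization levels are swept:

```python
import time

import torch
import torch_tensorrt  # noqa: F401  (registers the "torch_tensorrt" backend)

# Linear and inputs as defined in the Methods section above.
# Compile time is approximated as the wall-clock time of the first call,
# which is when torch.compile actually builds the TensorRT engine.
for level in (3, 5):
    torch._dynamo.reset()  # drop cached graphs so each config recompiles
    model = torch.compile(
        Linear().cuda(),
        backend="torch_tensorrt",
        options={"min_block_size": 1, "optimization_level": level},
    )
    start = time.perf_counter()
    model(*inputs)  # first call triggers compilation
    print(f"optimization_level={level}: {time.perf_counter() - start:.2f} s")
```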