Add DLRM Model Computational Graph #1532

Open · wants to merge 2 commits into base: repo-refactor
Conversation

@easyeasydev (Collaborator) commented Oct 31, 2024

Description of changes:

This PR adds the computational graph of the DLRM model.

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:

  • Closes #


@easyeasydev (Collaborator, Author) left a comment

Reviewable status: 0 of 7 files reviewed, 2 unresolved discussions (waiting on @lockshaw and @reyna-abhyankar)


lib/models/src/models/dlrm/dlrm.cc line 44 at r1 (raw file):

}

tensor_guid_t create_dlrm_mlp(ComputationGraphBuilder &cgb,

This MLP implementation follows the original FlexFlow implementation at https://github.com/flexflow/FlexFlow/blob/inference/examples/cpp/DLRM/dlrm.cc#L44. I'm not sure whether there was a specific reason for doing it this way in the past; we can decide what to do here. Torchrec doesn't have equivalent logic for setting up the initializers.
@lockshaw


lib/models/src/models/dlrm/dlrm.cc line 145 at r1 (raw file):

      /*mlp_layers=*/config.dense_arch_layer_sizes);

  std::vector<tensor_guid_t> emb_outputs;

As suggested, I tried this:

  std::vector<tensor_guid_t> emb_outputs = transform(
      zip(config.embedding_size, sparse_inputs),
      [&](std::pair<int, tensor_guid_t> &combined_pair) -> tensor_guid_t {
        return create_dlrm_sparse_embedding_network(
            /*cgb=*/cgb,
            /*config=*/config,
            /*input=*/combined_pair.second,
            /*input_dim=*/combined_pair.first,
            /*output_dim=*/config.embedding_dim);
      });

But I couldn't get this to compile using transform and zip...

@easyeasydev (Collaborator, Author) left a comment

Reviewable status: 0 of 7 files reviewed, 3 unresolved discussions (waiting on @lockshaw and @reyna-abhyankar)


lib/models/src/models/dlrm/dlrm.cc line 128 at r1 (raw file):

  // Create input tensors
  std::vector<tensor_guid_t> sparse_inputs(

This had the following review comment before:

Quote:
I think this current implementation makes sparse_inputs a vector of the same tensor_guid_t over and over again, rather than being a vector of identically-shaped tensors. Was that intentional? If not, then maybe the below code would be better?

std::vector<tensor_guid_t> sparse_inputs =
    repeat(config.embedding_size.size(), [&]() {
      return create_input_tensor({config.batch_size, config.embedding_bag_size},
                                 DataType::INT64);
    });

The intention was to create multiple tensors as the inputs. I'm not sure whether this vector-construction syntax makes all inputs refer to a single tensor; I would have thought create_input_tensor returns a separate tensor for each item in the vector.

codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 95.69892% with 4 lines in your changes missing coverage. Please review.

Project coverage is 78.12%. Comparing base (1d5140d) to head (aca2810).

Files with missing lines Patch % Lines
bin/export-model-arch/src/export_model_arch.cc 0.00% 2 Missing ⚠️
lib/models/src/models/dlrm/dlrm.cc 97.01% 2 Missing ⚠️
Additional details and impacted files
@@                Coverage Diff                @@
##           repo-refactor    #1532      +/-   ##
=================================================
- Coverage          78.16%   78.12%   -0.04%     
=================================================
  Files                860      862       +2     
  Lines              27994    28280     +286     
  Branches             770      774       +4     
=================================================
+ Hits               21881    22095     +214     
- Misses              6113     6185      +72     
Flag Coverage Δ
unittests 78.12% <95.69%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...computation_graph_series_parallel_decomposition.cc 98.80% <100.00%> (+0.06%) ⬆️
lib/models/test/src/models/dlrm/dlrm.cc 100.00% <100.00%> (ø)
lib/pcg/src/pcg/computation_graph_builder.cc 79.41% <100.00%> (+4.60%) ⬆️
bin/export-model-arch/src/export_model_arch.cc 0.00% <0.00%> (ø)
lib/models/src/models/dlrm/dlrm.cc 97.01% <97.01%> (ø)

... and 113 files with indirect coverage changes
