Add DLRM Model Computational Graph #1532

Open · wants to merge 2 commits into base: repo-refactor
Conversation

@easyeasydev (Collaborator) commented Oct 31, 2024

Description of changes:

This PR adds the computational graph of the DLRM model.

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:

  • Closes #


@easyeasydev (Collaborator, Author) left a comment

Reviewable status: 0 of 7 files reviewed, 2 unresolved discussions (waiting on @lockshaw and @reyna-abhyankar)


lib/models/src/models/dlrm/dlrm.cc line 44 at r1 (raw file):

}

tensor_guid_t create_dlrm_mlp(ComputationGraphBuilder &cgb,

This MLP implementation follows the original FlexFlow implementation at https://github.com/flexflow/FlexFlow/blob/inference/examples/cpp/DLRM/dlrm.cc#L44. I'm not sure whether there was a specific reason for doing it this way in the past; we can decide what to do here. Torchrec doesn't have equivalent logic for setting up the initializers.
@lockshaw


lib/models/src/models/dlrm/dlrm.cc line 145 at r1 (raw file):

      /*mlp_layers=*/config.dense_arch_layer_sizes);

  std::vector<tensor_guid_t> emb_outputs;

As suggested, I tried this:

  std::vector<tensor_guid_t> emb_outputs = transform(
      zip(config.embedding_size, sparse_inputs),
      [&](std::pair<int, tensor_guid_t> &combined_pair) -> tensor_guid_t {
        return create_dlrm_sparse_embedding_network(
            /*cgb=*/cgb,
            /*config=*/config,
            /*input=*/combined_pair.second,
            /*input_dim=*/combined_pair.first,
            /*output_dim=*/config.embedding_dim);
      });

But I couldn't get this to compile using transform and zip...

@easyeasydev (Collaborator, Author) left a comment

Reviewable status: 0 of 7 files reviewed, 3 unresolved discussions (waiting on @lockshaw and @reyna-abhyankar)


lib/models/src/models/dlrm/dlrm.cc line 128 at r1 (raw file):

  // Create input tensors
  std::vector<tensor_guid_t> sparse_inputs(

This had the following review comment before:

Quote:
I think this current implementation makes sparse_inputs a vector of the same tensor_guid_t over and over again, rather than being a vector of identically-shaped tensors. Was that intentional? If not, then maybe the below code would be better?

std::vector<tensor_guid_t> sparse_inputs =
    repeat(config.embedding_size.size(), [&]() {
      return create_input_tensor({config.batch_size, config.embedding_bag_size},
                                 DataType::INT64);
    });

The intention was to create multiple tensors as the inputs. I'm not sure whether this vector-construction syntax makes all inputs refer to a single tensor; I would have thought create_input_tensor returns a separate tensor for each item in the vector.

codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 95.69892% with 4 lines in your changes missing coverage. Please review.

Project coverage is 78.12%. Comparing base (1d5140d) to head (aca2810).

Files with missing lines Patch % Lines
bin/export-model-arch/src/export_model_arch.cc 0.00% 2 Missing ⚠️
lib/models/src/models/dlrm/dlrm.cc 97.01% 2 Missing ⚠️
Additional details and impacted files
@@                Coverage Diff                @@
##           repo-refactor    #1532      +/-   ##
=================================================
- Coverage          78.16%   78.12%   -0.04%     
=================================================
  Files                860      862       +2     
  Lines              27994    28280     +286     
  Branches             770      774       +4     
=================================================
+ Hits               21881    22095     +214     
- Misses              6113     6185      +72     
Flag Coverage Δ
unittests 78.12% <95.69%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...computation_graph_series_parallel_decomposition.cc 98.80% <100.00%> (+0.06%) ⬆️
lib/models/test/src/models/dlrm/dlrm.cc 100.00% <100.00%> (ø)
lib/pcg/src/pcg/computation_graph_builder.cc 79.41% <100.00%> (+4.60%) ⬆️
bin/export-model-arch/src/export_model_arch.cc 0.00% <0.00%> (ø)
lib/models/src/models/dlrm/dlrm.cc 97.01% <97.01%> (ø)

... and 113 files with indirect coverage changes
