Problem

Generating the top-k predictions from a trained retrieval model currently requires the user to follow several manual steps, which is not straightforward. In addition, the top-k recommender is a ModelBlock that cannot be saved / re-loaded.
Goals

Decouple the local top-k prediction and top-k evaluation from the retrieval contrastive learning task.
Convert the retrieval models to a top-k recommender model: Matrix Factorization, Two-Tower, and YouTubeDNN.
Follow the Keras analogy for the top-k recommender, where the user can call .predict, .evaluate, and .save, and re-load the model.
ItemRecommender should support different top-k strategies: brute-force, streaming, ANN, or any user-specific top-k strategy.
Ensure retrieval experiments and CI performance tests return the same level of performance with the new Retrieval API.
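The Keras analogy above could look roughly like the following in user code. This is a minimal NumPy sketch; the class name TopKRecommender, the identity query encoder, and the save format are illustrative assumptions, not the actual Merlin API:

```python
import numpy as np

class TopKRecommender:
    """Illustrative sketch of the proposed model: query encoder + top-k layer,
    with a Keras-style predict/save surface. (Names are assumptions, not the
    final Merlin API.)"""

    def __init__(self, query_encoder, item_embeddings, k=10):
        self.query_encoder = query_encoder      # callable: query features -> embeddings
        self.item_embeddings = item_embeddings  # (num_items, dim) pre-trained candidates
        self.k = k

    def predict(self, queries):
        # Encode the queries, score every candidate, return top-k scores and ids.
        q = self.query_encoder(queries)                     # (batch, dim)
        scores = q @ self.item_embeddings.T                 # dot-product scoring
        top_ids = np.argsort(-scores, axis=1)[:, : self.k]  # (batch, k)
        top_scores = np.take_along_axis(scores, top_ids, axis=1)
        return top_scores, top_ids

    def save(self, path):
        # Persist the candidate index so the model can be re-loaded later.
        np.savez(path, item_embeddings=self.item_embeddings, k=self.k)

# Usage: identity encoder over 5 candidate items of dimension 2.
recommender = TopKRecommender(lambda x: x, np.eye(5)[:, :2], k=2)
scores, ids = recommender.predict(np.array([[1.0, 0.0]]))
# Item 0 matches the query exactly, so it ranks first.
```

The point of the sketch is the call pattern, not the implementation: prediction is decoupled from the training task, and the saved artifact is only the "useful" serving part.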
Starting Point:
Definition: the top-k recommender is a model composed of a query encoder and a top-k layer.
Prerequisites of the top-k recommender:
Predict method: returns the top-k items (scores and ids) for a given query (user).
Evaluate method: computes ranking metrics over a dataset of users/queries.
batch_predict method: returns a dataset with the top-k items for a dataset of users/queries.
save method: the top-k model is the 'useful' part of the retrieval pipeline, as it is the one that generates predictions for the external endpoint. The user needs to save this model and re-load it later for evaluation or local prediction.
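As a sketch of what the evaluate method could compute, here is recall@k over a batch of queries with known positives. The helper name and shapes are assumptions for illustration, not Merlin code:

```python
import numpy as np

def recall_at_k(top_ids, true_ids):
    """Fraction of queries whose true item appears in the predicted top-k.
    top_ids: (num_queries, k) predicted item ids; true_ids: (num_queries,)."""
    hits = (top_ids == true_ids[:, None]).any(axis=1)
    return hits.mean()

# Two queries: the first query's true item (3) is in its top-k, the second's (4) is not.
top_ids = np.array([[3, 1], [0, 2]])
true_ids = np.array([3, 4])
print(recall_at_k(top_ids, true_ids))  # 0.5
```

Because the metric only needs the predicted ids, evaluation can run on the saved top-k model alone, without the full retrieval training pipeline.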
Supporting different top-k strategies
Arguments of the top-k layer:
A cut-off k
The dataset of candidates: pre-trained item embeddings
The method index_from_dataset: sets the index for the top-k search.
The method score: the distance metric used to compute the score between the query and the item embeddings (default: dot product).
Call method:
Takes the query embeddings as input.
Defines the logic of how to retrieve the top-k items. The scope of this first work is brute-force.
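A minimal brute-force version of this layer could look like the following. The method names mirror those listed above (index_from_dataset, score, call), but the body is a NumPy stand-in for illustration, not the actual Merlin implementation:

```python
import numpy as np

class BruteForceTopK:
    """Sketch of a brute-force top-k layer: exhaustive dot-product scoring
    over all candidates. (NumPy stand-in, not the Merlin implementation.)"""

    def __init__(self, k=10):
        self.k = k
        self._candidates = None  # (num_items, dim) pre-trained item embeddings
        self._ids = None         # (num_items,) item ids

    def index_from_dataset(self, ids, embeddings):
        # Set the index (candidates) used for the top-k search.
        self._ids = np.asarray(ids)
        self._candidates = np.asarray(embeddings)
        return self

    def score(self, queries):
        # Default distance metric: dot product between queries and candidates.
        return queries @ self._candidates.T

    def __call__(self, query_embeddings):
        # Score all candidates, then keep the k best per query.
        scores = self.score(query_embeddings)          # (batch, num_items)
        k = min(self.k, scores.shape[1])
        part = np.argpartition(-scores, k - 1, axis=1)[:, :k]
        part_scores = np.take_along_axis(scores, part, axis=1)
        order = np.argsort(-part_scores, axis=1)       # sort only the k kept
        top_idx = np.take_along_axis(part, order, axis=1)
        return (np.take_along_axis(scores, top_idx, axis=1),
                self._ids[top_idx])

# Usage: 3 candidate items (ids 10, 11, 12) with identity embeddings.
layer = BruteForceTopK(k=2).index_from_dataset([10, 11, 12], np.eye(3))
top_scores, top_item_ids = layer(np.array([[0.0, 1.0, 0.5]]))
# Item 11 scores highest (1.0), then item 12 (0.5).
```

Keeping score as a separate method is what makes other strategies (streaming, ANN, user-specific) drop-in replacements: only the indexing and retrieval logic changes, not the layer's call contract.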
Open questions:
Do we need to re-train the new recommender model with the pre-trained item embeddings (e.g., to convert a two-tower model to a YouTubeDNN-like model)? This can be done outside of the top-k recommender class; we should keep the top-k recommender simple so it can support different top-k strategies.
Should we define the Top-k layer as a sub-class of the CategoricalPrediction block?
Implementation starting points:
--> main...tf/retrieval-models
--> https://github.com/NVIDIA-Merlin/models/pull/663/files