fix broken torchbench mimo_cmf_30x
Summary:
The torchbench `mimo_cmf_30x` test has been broken since 05/22.
https://pxl.cl/2S3Fg

## fix 1. refresh eval data or skip model export
Here is an example of a failed job complaining about expired data:
> MlDataComponentValidatorWrapper (data_config.ml_data_config): No available partitions found for given query filter. Namespace = ad_delivery. Table = ctr_mbl_feed_model_af_cd_async_ai_cd_30_neg_ds_md. filterClause = pipeline = 'ctr_mbl_feed_model_af_cd_async_ai_cd_30_neg_ds_md' AND ds = '2023-03-09'.

We already call `analyzer.refresh_dataset()`, but the error persists because we never called `analyzer.refresh_eval_dataset()`. The 'no partition found' error is raised in `self._serialize_inference_model`.

https://www.internalfb.com/code/fbsource/[c953aa6a0a497851da8ec4f4361d0202dd5c33f7]/fbcode/dper3/dper3_models/ads_ranking/base_models/mimo_nn/mimo_pytorch_model_builder_base.py?lines=1512-1524
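A minimal sketch of the failure mode described above, assuming the training and eval datasets each pin their own `ds` partition. `StubAnalyzer`, `export_succeeds`, the `latest_ds` argument, and the `available` set are hypothetical stand-ins for illustration only. Only the method names `refresh_dataset`, `refresh_eval_dataset`, and the serialization step come from this commit, and the real signatures may differ.

```python
class StubAnalyzer:
    """Hypothetical stand-in, NOT the real dper3 analyzer."""

    def __init__(self, stale_ds: str) -> None:
        # Both datasets start out pinned to the same expired partition.
        self.train_ds = stale_ds
        self.eval_ds = stale_ds

    def refresh_dataset(self, latest_ds: str) -> None:
        # Assumed behavior: only the training partition is updated.
        self.train_ds = latest_ds

    def refresh_eval_dataset(self, latest_ds: str) -> None:
        # Assumed behavior: only the eval partition is updated.
        self.eval_ds = latest_ds

    def serialize_inference_model(self, available: set) -> str:
        # Assumed behavior: export validates the eval partition.
        if self.eval_ds not in available:
            raise LookupError(
                f"No available partitions found for ds = '{self.eval_ds}'"
            )
        return "serialized"


def export_succeeds(analyzer: StubAnalyzer, available: set) -> bool:
    """Return True if the (stubbed) export step finds its eval partition."""
    try:
        analyzer.serialize_inference_model(available)
        return True
    except LookupError:
        return False


available = {"2023-07-01"}            # hypothetical live partition
analyzer = StubAnalyzer("2023-03-09")  # expired ds from the error above

analyzer.refresh_dataset("2023-07-01")
only_train_refreshed = export_succeeds(analyzer, available)

analyzer.refresh_eval_dataset("2023-07-01")
both_refreshed = export_succeeds(analyzer, available)
```

In this sketch, refreshing only the training dataset leaves the export step pointing at the expired eval partition, which matches the symptom reported in the failed job.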

~~furthermore, we realized even `export_mode` can be None in torchbench.  similar trick has been adopted before to speed up model instantiation for testing purpose https://www.internalfb.com/diff/D46180124?dst_version_fbid=100400963077454&transaction_fbid=6056114577843800~~

~~## fix 2. use_synthetic_data = True~~
~~It doesn't work here.  TODO. address it later.~~

## fix 3. simplify branching of getting model
The model and data are now ALWAYS generated at run time, so that changes to the model instantiation code are caught too.

Reviewed By: bertmaher, xuzhao9

Differential Revision: D47157906

fbshipit-source-id: d7423e7329185d368361e7b40ef2ba5fa9f926c6
dshi7 authored and facebook-github-bot committed Jul 10, 2023
1 parent 16e235e commit 48a8ef6
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion torchbenchmark/util/experiment/instantiator.py
@@ -10,7 +10,7 @@
from torchbenchmark.util.model import BenchmarkModel
from torchbenchmark import _list_model_paths, load_model_by_name, ModelTask

-WORKER_TIMEOUT = 1800 # seconds
+WORKER_TIMEOUT = 3600 # seconds
BS_FIELD_NAME = "batch_size"

@dataclasses.dataclass