
unitxt datasets can't be downloaded from LMEval with latest unitxt #367

Open
ruivieira opened this issue Nov 22, 2024 · 0 comments
Labels
kind/bug Something isn't working lm-eval Issues related to LM-Eval
Milestone

Comments


Using an LMEvalJob CR such as

apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob-sample
spec:
  # suspend: true
  model: hf
  modelArgs:
  - name: pretrained
    value: google/flan-t5-base 
  taskList:
    taskRecipes:
    - card:
        name: "cards.wnli" 
      template: "templates.classification.multi_class.relation.default" 
  logSamples: true

results in the following error in the LMEval pod:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/app-root/src/lm_eval/__main__.py", line 461, in <module>
    cli_evaluate()
  File "/opt/app-root/src/lm_eval/__main__.py", line 382, in cli_evaluate
    results = evaluator.simple_evaluate(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/lm_eval/utils.py", line 397, in _wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/lm_eval/evaluator.py", line 235, in simple_evaluate
    task_dict = get_task_dict(tasks, task_manager)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/lm_eval/tasks/__init__.py", line 618, in get_task_dict
    task_name_from_string_dict = task_manager.load_task_or_group(
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/lm_eval/tasks/__init__.py", line 414, in load_task_or_group
    collections.ChainMap(*map(self._load_individual_task_or_group, task_list))
  File "/opt/app-root/src/lm_eval/tasks/__init__.py", line 314, in _load_individual_task_or_group
    return _load_task(task_config, task=name_or_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/lm_eval/tasks/__init__.py", line 273, in _load_task
    task_object = config["class"](config=config)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/lm_eval/tasks/unitxt/task.py", line 45, in __init__
    super().__init__(
  File "/opt/app-root/src/lm_eval/api/task.py", line 818, in __init__
    self.download(self.config.dataset_kwargs)
  File "/opt/app-root/src/lm_eval/api/task.py", line 925, in download
    self.dataset = datasets.load_dataset(
                   ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/.local/lib/python3.11/site-packages/datasets/load.py", line 2132, in load_dataset
    builder_instance = load_dataset_builder(
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/.local/lib/python3.11/site-packages/datasets/load.py", line 1888, in load_dataset_builder
    builder_cls = get_dataset_builder_class(dataset_module, dataset_name=dataset_name)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/.local/lib/python3.11/site-packages/datasets/load.py", line 248, in get_dataset_builder_class
    builder_cls = import_main_class(dataset_module.module_path)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/.local/lib/python3.11/site-packages/datasets/load.py", line 169, in import_main_class
    module = importlib.import_module(module_path)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/opt/app-root/src/hf_home/modules/datasets_modules/datasets/unitxt--data/f69d27e92178bb054c1b718942189e6a25f1ab595789d872054c748fca805e01/data.py", line 6, in <module>
    from .api import __file__ as _
  File "/opt/app-root/src/hf_home/modules/datasets_modules/datasets/unitxt--data/f69d27e92178bb054c1b718942189e6a25f1ab595789d872054c748fca805e01/api.py", line 7, in <module>
    from .artifact import fetch_artifact
  File "/opt/app-root/src/hf_home/modules/datasets_modules/datasets/unitxt--data/f69d27e92178bb054c1b718942189e6a25f1ab595789d872054c748fca805e01/artifact.py", line 18, in <module>
    from .error_utils import Documentation, UnitxtError, UnitxtWarning
  File "/opt/app-root/src/hf_home/modules/datasets_modules/datasets/unitxt--data/f69d27e92178bb054c1b718942189e6a25f1ab595789d872054c748fca805e01/error_utils.py", line 3, in <module>
    from .logging_utils import get_logger
  File "/opt/app-root/src/hf_home/modules/datasets_modules/datasets/unitxt--data/f69d27e92178bb054c1b718942189e6a25f1ab595789d872054c748fca805e01/logging_utils.py", line 7, in <module>
    from .settings_utils import get_settings
  File "/opt/app-root/src/hf_home/modules/datasets_modules/datasets/unitxt--data/f69d27e92178bb054c1b718942189e6a25f1ab595789d872054c748fca805e01/settings_utils.py", line 164, in <module>
    constants.package_dir = os.path.dirname(unitxt_pkg.origin)
                                            ^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'origin'
2024-11-21T10:33:47Z	INFO	driver	update status: job completed	{"state": {"state":"Complete","reason":"Failed","message":"exit status 1"}}
2024-11-21T10:33:52Z	ERROR	driver	Driver.Run failed	{"error": "exit status 1"}
main.main
	/go/src/github.com/trustyai-explainability/trustyai-service-operator/cmd/lmes_driver/main.go:125
runtime.main
	/usr/lib/golang/src/runtime/proc.go:267

This is caused by the new check added at https://github.com/IBM/unitxt/blob/8a7ad4c550ea36e52944323716da66714e78f663/src/unitxt/settings_utils.py#L164: unitxt is not explicitly installed in the LMEval image, so the lookup for the `unitxt` package fails and dereferencing its `origin` raises the `AttributeError` above.
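The failure mode can be sketched in isolation: `importlib.util.find_spec` returns `None` for a top-level package that is not installed, and dereferencing `.origin` on that result raises the same `AttributeError` seen in the traceback. The module name below is a deliberately nonexistent placeholder:

```python
import importlib.util

# find_spec returns None when the named package is not importable,
# mirroring the pod's state: the unitxt HF dataset module was
# downloaded, but unitxt itself is absent from site-packages.
spec = importlib.util.find_spec("some_package_that_is_not_installed")
assert spec is None

try:
    spec.origin  # the same dereference settings_utils.py performs
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'origin'
```

A likely workaround, pending a proper fix, would be to install unitxt explicitly in the LMEval image (e.g. `pip install unitxt`) so that the package lookup resolves.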

@ruivieira ruivieira added kind/bug Something isn't working lm-eval Issues related to LM-Eval labels Nov 22, 2024
@ruivieira ruivieira added this to the LM-Eval milestone Nov 22, 2024
@ruivieira ruivieira moved this from Todo to In Progress in TrustyAI planning Nov 22, 2024