[BUG: utils.validate_data has inconsistent use of tabs and spaces in indentation, which causes a crash] #95

Open
Hackerbone opened this issue Aug 26, 2024 · 0 comments
Labels
bug Something isn't working

Comments


Python Version

(venv) (base) admin@testbench:~/mistral-finetune$ python -VV
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]

Pip Freeze

absl-py==2.1.0
annotated-types==0.7.0
attrs==24.2.0
certifi==2024.7.4
charset-normalizer==3.3.2
docstring_parser==0.16
filelock==3.15.4
fire==0.6.0
fsspec==2024.6.1
grpcio==1.66.0
idna==3.8
Jinja2==3.1.4
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
Markdown==3.7
MarkupSafe==2.1.5
mistral_common==1.3.4
mpmath==1.3.0
networkx==3.3
numpy==2.1.0
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.6.20
nvidia-nvtx-cu12==12.1.105
packaging==24.1
protobuf==5.27.3
pydantic==2.8.2
pydantic_core==2.20.1
PyYAML==6.0.2
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
rpds-py==0.20.0
safetensors==0.4.4
sentencepiece==0.2.0
simple_parsing==0.1.5
six==1.16.0
sympy==1.13.2
tensorboard==2.17.1
tensorboard-data-server==0.7.2
termcolor==2.4.0
tiktoken==0.7.0
torch==2.2.0
tqdm==4.66.5
triton==2.2.0
typing_extensions==4.12.2
urllib3==2.2.2
Werkzeug==3.0.4
xformers==0.0.24

Reproduction Steps

  1. Clone the repository
  2. Change directory to mistral-finetune
  3. Try running the validate script - python -m utils.validate_data --train_yaml example/7B.yaml

Output:

(venv) (base) admin@testbench:~/mistral-finetune$ python -m utils.validate_data --train_yaml example/7B.yaml 
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/opt/conda/lib/python3.10/runpy.py", line 157, in _get_module_details
    code = loader.get_code(mod_name)
  File "<frozen importlib._bootstrap_external>", line 1017, in get_code
  File "<frozen importlib._bootstrap_external>", line 947, in source_to_code
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/admin/mistral-finetune/utils/validate_data.py", line 113
    else:
TabError: inconsistent use of tabs and spaces in indentation

There is an error on line 113 of utils/validate_data.py caused by inconsistent use of tabs and spaces in the indentation.
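
For anyone hitting this before a fix is merged, Python's standard-library tabnanny module can pinpoint the offending lines. A minimal sketch, assuming the mistral-finetune checkout is the current working directory (equivalent to running python -m tabnanny -v utils/validate_data.py):

# Report ambiguous tab/space indentation in the file that crashes.
import tabnanny

tabnanny.verbose = 1  # print per-line diagnostics instead of staying silent
tabnanny.check("utils/validate_data.py")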

Expected Behavior

The script should run to completion instead of crashing on the mixed-indentation else statement in utils/validate_data.py.

Additional Context

No response

Suggested Solutions

The simple fix is to correct the indentation so the file parses cleanly. I will open a PR for this and link it here.

From this:

        if params_config["dim"] == 4096 and params_config.get("moe") is None:
            model_id = "open-mistral-7b"
        elif params_config["dim"] == 4096 and params_config.get("moe") is not None:
            model_id = "open-mixtral-8x7b"
        elif params_config["dim"] == 6144:
            model_id = "open-mixtral-8x22b"
        elif params_config["dim"] == 12288:
            model_id = "mistral-large-latest"
        elif params_config["dim"] == 5120:
            model_id = "open-mistral-nemo"
    else:
            raise ValueError("Provided model folder seems incorrect.")
    else:
        model_id = train_args.model_id_or_path

To this:

        if params_config["dim"] == 4096 and params_config.get("moe") is None:
            model_id = "open-mistral-7b"
        elif params_config["dim"] == 4096 and params_config.get("moe") is not None:
            model_id = "open-mixtral-8x7b"
        elif params_config["dim"] == 6144:
            model_id = "open-mixtral-8x22b"
        elif params_config["dim"] == 12288:
            model_id = "mistral-large-latest"
        elif params_config["dim"] == 5120:
            model_id = "open-mistral-nemo"
        else:
            raise ValueError("Provided model folder seems incorrect.")
    else:
        model_id = train_args.model_id_or_path
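
After applying the indentation fix, the file can be byte-compiled to confirm the TabError is gone before re-running the validation script. A minimal check, assuming the same checkout layout as in the reproduction steps:

# Byte-compile the fixed module; doraise=True raises py_compile.PyCompileError
# if the TabError (or any other syntax error) is still present.
import py_compile

py_compile.compile("utils/validate_data.py", doraise=True)
print("utils/validate_data.py compiles cleanly")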