
Pytorch Model Exporting Issue: MLIR Verification Failed #24

Open
saienduri opened this issue Jun 4, 2024 · 1 comment

Comments

@saienduri (Contributor)

saienduri commented Jun 4, 2024

With the latest torch (2.4) and iree-turbine, many of our models hit the MLIR verification failure below during the export stage (aot.export).

Instructions to reproduce this error:

Follow the setup instructions here, including the "Turbine Mode" instructions: https://github.com/nod-ai/SHARK-TestSuite/blob/main/e2eshark/README.md.

Then run the following command from the SHARK-TestSuite/e2eshark directory (this example runs only the bert model; change the --tests flag based on the model you want to test):

HF_TOKEN=<your_hf_token> python3.11 ./run.py \
          -r ./test-turbine \
          --report \
          --cachedir ~/huggingface_cache \
          --mode turbine \
          -g models \
          --postprocess \
          -v \
          --tests pytorch/models/bert-large-uncased

You can find the debug artifacts in SHARK-TestSuite/e2eshark/test-turbine/pytorch/models/<model_name>. For example, the model-run.log file there describes the error in more detail, and the MLIR generated for the model that failed verification is written to /tmp/turbine_module_builder_error.mlir.
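For reference, the failing call in each runmodel.py boils down to something like the sketch below (the model class, checkpoint, and input shape are illustrative, not the exact test configuration):

# Minimal sketch of the failing export path. The model and input are
# placeholders; runmodel.py does the equivalent using E2ESHARK_CHECK["input"].
import torch
from transformers import AutoModelForMaskedLM
import shark_turbine.aot as aot

model = AutoModelForMaskedLM.from_pretrained("bert-large-uncased")
model.eval()
example_input = torch.randint(0, 30522, (1, 128))  # token ids, illustrative shape

# aot.export imports the model to MLIR and then verifies the module;
# with torch 2.4 that verification step raises the MLIRError shown below.
module = aot.export(model, example_input)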
 
Models:
pytorch/models/vicuna-13b-v1.3
pytorch/models/llama2-7b-GPTQ
pytorch/models/mobilebert-uncased
pytorch/models/miniLM-L12-H384-uncased
pytorch/models/bert-large-uncased
pytorch/models/gpt2-xl
pytorch/models/phi-2
pytorch/models/phi-1_5
pytorch/models/bge-base-en-v1.5
pytorch/models/llama2-7b-hf
pytorch/models/gpt2

Traceback (most recent call last):
  File "/home/nod/sai/SHARK-TestSuite/e2eshark/test-turbine/pytorch/models/vicuna-13b-v1.3/runmodel.py", line 131, in <module>
    module = aot.export(model, E2ESHARK_CHECK["input"])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/exporter.py", line 304, in export
    cm = TransformedModule(context=context, import_to="import")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/compiled_module.py", line 654, in __new__
    module_builder.finalize_construct()
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/support/ir_utils.py", line 215, in finalize_construct
    self.module_op.verify()
iree.compiler._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py":1183:0: 'torch.aten.slice.Tensor' op operand #0 must be Any Torch tensor type, but got '!torch.none'
 note: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py":1183:0: see current operation: %168 = "torch.aten.slice.Tensor"(%163, %164, %165, %166, %167) : (!torch.none, !torch.int, !torch.int, !torch.int, !torch.int) -> !torch.vtensor<[8,128],f32>

Models:
pytorch/models/beit-base-patch16-224-pt22k-ft22k

Traceback (most recent call last):
  File "/home/nod/sai/SHARK-TestSuite/e2eshark/test-turbine/pytorch/models/beit-base-patch16-224-pt22k-ft22k/runmodel.py", line 110, in <module>
    module = aot.export(model, E2ESHARK_CHECK["input"])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/exporter.py", line 304, in export
    cm = TransformedModule(context=context, import_to="import")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/compiled_module.py", line 654, in __new__
    module_builder.finalize_construct()
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/support/ir_utils.py", line 215, in finalize_construct
    self.module_op.verify()
iree.compiler._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/beit/modeling_beit.py":875:0: 'torch.aten.view' op operand #0 must be Any Torch tensor type, but got '!torch.none'
 note: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/beit/modeling_beit.py":875:0: see current operation: %189 = "torch.aten.view"(%186, %188) : (!torch.none, !torch.list<int>) -> !torch.vtensor<[38809],si64>
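Note that both tracebacks fail in the same way: the op's first operand is imported as !torch.none where a tensor is expected, so this looks like a single import-stage issue surfacing across models rather than something model-specific.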
@widiba03304

Is there any update on this? I also need help with this error.
