Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exp run fails when launched via Experiments tab, but runs via command line #1815

Closed
alex000kim opened this issue Jun 2, 2022 · 33 comments
Closed
Assignees
Labels
A: extension Area: core extension triage

Comments

@alex000kim
Copy link

See https://www.loom.com/share/bbefd727e9044467b56cd1891d9e4452

When I launch an experiment via the Experiments tab, it fails because of an import error.
Launching the same experiment via the command line works fine.
It looks like there's a difference in environment variables.
Is it possible that the extension is not using my virtual environment?
I am using pipenv to manage the virtual env in this project.

@alex000kim alex000kim changed the title Exp run fails when launched via Experiments tab, but run via command line Exp run fails when launched via Experiments tab, but runs via command line Jun 2, 2022
@maxagin
Copy link
Contributor

maxagin commented Jun 2, 2022

@shcheklein I think we already have a ticket for this issue. Right? This is something that happened to me also and we have talked about this last week.

@mattseddon
Copy link
Member

Relates to #1791.

A few questions:

  1. Is DVC installed in the virtual environment or globally?
  2. Do you have the Python extension installed?
  3. Have you gone through the "setup the workspace" wizard?

@alex000kim
Copy link
Author

Because that project needs a GPU, I am using this TPI repo https://github.com/iterative/tpi-remote-vscode to setup remote interpreter in VSCode.

  1. In the virtual environment
  2. Yes
  3. No

@alex000kim
Copy link
Author

Also, I suspect this issue may have something to do with this:
https://github.com/iterative/magnetic-tiles-defect/blob/main/README.md?plain=1#L85-L86
I.e. even if the extension is using the correct python interpreter (from the virtual env), it's not activating/exporting everything that's part of that virtual env.

@mattseddon
Copy link
Member

Yep, I can recreate in a terminal if I do not use those setup steps:

~/magnetic-tiles-defect main !1 ❯ dvc exp run                            0.02508s  magnetic-tiles-defect-f4NuKytM  3.0.0 13:33:25
Running stage 'check_packages':                                                                                                        
> pipenv run pip freeze > requirements.txt
Courtesy Notice: Pipenv found itself running within a virtual environment, so it will automatically use that environment, instead of creating its own for any project. You can set PIPENV_IGNORE_VIRTUALENVS=1 to force pipenv to ignore that environment and create its own instead. You can set PIPENV_VERBOSITY=-1 to suppress this warning.
                                                                                                                                       
Running stage 'data_load':
> python src/stages/data_load.py --config=params.yaml
Traceback (most recent call last):
  File "/Users/mattseddon/PP/magnetic-tiles-defect/src/stages/data_load.py", line 4, in <module>
    from src.data_utils import dataset_prep
ModuleNotFoundError: No module named 'src'
ERROR: failed to reproduce 'data_load': failed to run: python src/stages/data_load.py --config=params.yaml, exited with 1

@alex000kim
Copy link
Author

Right. Why don't we activate the entire virtual env when running dvc via the extension? (instead of using only the interpreter)

@mattseddon
Copy link
Member

I can fix this particular project by adding PYTHONPATH to the env variables in our child process:

image

But I am not 100% as to whether or not that will break other projects. I can look into it but not today.

@alex000kim
Copy link
Author

Thanks!
Out of curiosity, are there particular reasons why we can't/shouldn't export everything that's part of a virtual environment into the child process?

@mattseddon
Copy link
Member

Right. Why don't we activate the entire virtual env when running dvc via the extension? (instead of using only the interpreter)

We made a decision a long time ago not to try and recreate the logic in the Python extension which is used to "auto-activate" virtual environments.
There is an ongoing here: microsoft/vscode-python#11039 (comment) where they have further advocated for using the path to the interpreter instead of activating the environment. However, we know that does not meet all of our needs.

Thanks!
Out of curiosity, are there particular reasons why we can't/shouldn't export everything that's part of a virtual environment into the child process?

Currently, the Python extension does not actually expose the activation details. We don't know the details of the expected activation file.

@alex000kim
Copy link
Author

Got it, thanks!
For now, I can use sys.path.append to work around the issue.

@mattseddon mattseddon added bug Something isn't working A: extension Area: core extension labels Jun 2, 2022
@mattseddon
Copy link
Member

For the record:

I've been looking into this further. There is a "VS Code" way to set the PYTHONPATH for specific projects. We could potentially hook into that mechanism if it would be something that you'd be interested in/think would be useful. I will need to research more to see what is possible/exposed.

LMK what you think.

@shcheklein
Copy link
Member

@alex000kim if you run a Terminal inside VS Code and do which dvc - what does it return?

As far as I understand pipenv in this case is responsible to activate it.

Looking into this

Courtesy Notice: Pipenv found itself running within a virtual environment, so it will automatically use that environment, instead of creating its own for any project. You can set PIPENV_IGNORE_VIRTUALENVS=1 to force pipenv to ignore that environment and create its own instead. You can set PIPENV_VERBOSITY=-1 to suppress this warning.

@alex000kim Can it be the case that you also have .venv or .env in the root of the project?

@alex000kim
Copy link
Author

if you run a Terminal inside VS Code and do which dvc - what does it return?

/home/ubuntu/.local/share/virtualenvs/magnetic-tiles-defect-clR5l9w5/bin/dvc i.e. location in my pipenv virtual env.

Can it be the case that you also have .venv or .env in the root of the project?

Nope

As far as I understand pipenv in this case is responsible to activate it.

Yes. And this is exactly what happens when I run dvc from the command line.
However, the extension uses only the interpreter from the virtual env. That's the core of the issue.

What @mattseddon is suggesting (i.e. using .env to set environment variables) could be a good solution!
However, after playing with it for 5 min (i.e. creating .env with PYTHONPATH value + setting "python.envFile": "${workspaceFolder}/.env"), I couldn't get VSCode to export values from .env.
I may be missing something, and need to look into it more.

@alex000kim
Copy link
Author

Another byproduct of this issue.
I have a pipeline stage that freezes my pipenv virtual env.

stages:
  check_packages:
    cmd: pipenv run pip freeze > requirements.txt
    always_changed: true
    outs:
      - requirements.txt
...

And because the extension is running dvc exp run outside of the environment, I get these warnings:

Running: dvc exp run --run-all

Running stage 'check_packages':
> pipenv run pip freeze > requirements.txt
Creating a virtualenv for this project…
Using /home/ubuntu/.virtualenvs/magnetic-tiles-defect-clR5l9w5/bin/python3 (3.8.10) to create virtualenv…
created virtual environment CPython3.8.10.final.0-64 in 117ms
  creator CPython3Posix(dest=/home/ubuntu/.local/share/virtualenvs/tmpx6eibpu2-HTVzh22Z, clear=False, global=False)
  seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator

Virtualenv location: /home/ubuntu/.local/share/virtualenvs/tmpx6eibpu2-HTVzh22Z
WARNING: 'requirements.txt' is empty.
Updating lock file 'dvc.lock'

Other stages of the pipeline depend on requirements.txt. So if I run exps via the extension requirements.txt is empty, but if run them via command line, it's not empty.
Thus, the same experiments that are run in two different ways aren't reproducable.
And that's a problem...

@shcheklein
Copy link
Member

thanks @alex000kim !

It feels I still don't understand how pipenv works in general. Could you please clarify this step-by-step:

i.e. location in my pipenv virtual env.

how do you usually run dvc in your CLI? how do you activate the environment?

However, the extension uses only the interpreter from the virtual env.

either I'm missing something, or it is supposed to be this way. That's why you have to wrap into pipenv every command. The wrapper itself activates it. No?

@alex000kim
Copy link
Author

I think one of the major differences between (pip + venv) VS pipenv is that the latter drops you into a new subprocess, and not just the same terminal shell with a different environment.

how do you usually run dvc in your CLI? how do you activate the environment?

$ pipenv install # run this once to install deps from Pipfile
$ pipenv shell # spawns a new shell subprocess
(project_venv) $ dvc exp run

See: https://pipenv.pypa.io/en/latest/basics/#example-pipenv-workflow

@mattseddon mattseddon self-assigned this Jun 3, 2022
@shcheklein
Copy link
Member

shcheklein commented Jun 5, 2022

Okay, I see a few things confused me.

  1. pipenv run pip freeze - it should be just pip freeze for this project, right? since, you do pipenv shell
  2. for me it even doesn't initialize DVC (that what I would expect if MS Python doesn't support pipenv for example). The whole extension is inactive. How did you manage to initialize the extension? Solved by manually switching MS Python to pipenv.

Yep, I can recreate in a terminal if I do not use those setup steps:

@mattseddon how made it work at all? manually did the wizard part?

For me I had to manually set pipenv using this https://code.visualstudio.com/docs/python/environments#_select-and-activate-an-environment

@shcheklein
Copy link
Member

@alex000kim removing pipenv run and manually switching Python extension to pipenv (by default it picks brew Python 3 for me) https://code.visualstudio.com/docs/python/environments#_select-and-activate-an-environment fixed it completely:

>>>> DVC Terminal >>>>

Running: dvc exp run

Running stage 'check_packages':
> pip freeze > requirements.txt

Running stage 'data_load':
> python src/stages/data_load.py --config=params.yaml
100%|██████████| 392/392 [00:00<00:00, 494.60it/s]
Updating lock file 'dvc.lock'

Running stage 'data_split':
> python src/stages/data_split.py --config=params.yaml
Updating lock file 'dvc.lock'

Running stage 'train':
> python src/stages/train.py --config=params.yaml
/Users/ivan/.local/share/virtualenvs/magnetic-tiles-defect-4MZ-4MEx/lib/python3.9/site-packages/torch/_tensor.py:1142: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  ret = func(*args, **kwargs)

The only thing is that it's stuck on

> python src/stages/train.py --config=params.yaml
/Users/ivan/.local/share/virtualenvs/magnetic-tiles-defect-4MZ-4MEx/lib/python3.9/site-packages/torch/_tensor.py:1142: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  ret = func(*args, **kwargs)
INFO:dvclive:Overriding _path with value provided by DVC: training_metrics
INFO:dvclive:Report path (if generated): /Users/ivan/Projects/magnetic-tiles-defect/training_metrics_dvc_plots/index.html
epoch     train_loss  valid_loss  foreground_acc  jaccard_coeff  dice_multi  time    
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)

but it looks like it's a separate issue related to the project itself.

Also, a different bug is here - #1837 that should be addressed.

@shcheklein shcheklein added triage and removed bug Something isn't working labels Jun 5, 2022
@alex000kim
Copy link
Author

removing pipenv run and manually switching Python extension to pipenv ...

Oh, good point! So this would solve this particular issue. The question of how to approach env variables from virtual envs is still open.

but it looks like it's a separate issue related to the project itself.

Yeah, I think that's something to do with pytorch running in MacOS (maybe even M1 specific)

@shcheklein
Copy link
Member

The question of how to approach env variables from virtual envs is still open.

Hmm, either I'm missing something or it's not an issue. For me after removing pipenv from the command fixed both - terminal and extension (button). I can run from both w/o applying any special tricks. MS Python deals with pipenv properly in my case.

@alex000kim
Copy link
Author

alex000kim commented Jun 5, 2022

I believe it works now because of a recent workaround I made (specifically to address this issue): iterative/magnetic-tiles-defect@8ccc5fb

Before, I used to set PYTHONPATH as part of the virtual env activation script:
iterative/magnetic-tiles-defect@357a0a1
The extension did not pick up this env var.

@shcheklein
Copy link
Member

Okay, but it's the same if I do it in a regular terminal (even outside VS Code):

pipenv shell
python src/stages/data_load.py --config=params.yaml

gives me:

Traceback (most recent call last):
  File "/Users/ivan/Projects/magnetic-tiles-defect/src/stages/data_load.py", line 8, in <module>
    from src.data_utils import dataset_prep
ModuleNotFoundError: No module named 'src'

If looks to me like a general Python path problem, right?

https://stackoverflow.com/questions/1054271/how-to-import-a-python-class-that-is-in-a-directory-above

@alex000kim
Copy link
Author

alex000kim commented Jun 5, 2022

Yes, it's a general python problem, but if you run:

echo "export PYTHONPATH=$PWD" >> $VIRTUAL_ENV/bin/activate
source $VIRTUAL_ENV/bin/activate

you address the problem without having to resort to the hack of doing:

import sys
sys.path.append(<dir_path>)

The right PYTHONPATH becomes part of the virtual env, i.e you always get the right PYTHONPATH as long as you are inside your virtual env.

We do this in several places (including teaching this as part of the online course):

@shcheklein
Copy link
Member

Sounds like a hack, not sure virtualenv file made to be modified. I think I would either use relative paths in imports (I think should be possible, right?). It will be easier to use projects I think. Need to do a bit of reserach here ...

But, I understand the problem finally, and ideally it should be working no matter if we use it in the demo projects or not, since VS Code terminal supports it.

@mattseddon could you please point me to the code that runs DVC commands? how does it interact with MS Python extension?

@mattseddon
Copy link
Member

There is a fair bit of logic there. I would start in extension/src/cli/options.ts.

This is next on my list of things to fix.

I have not tested yet but I am hoping that when pipenv is involved we can use the same run command as we should be using for conda. I.e for running experiments we should be able to simply call pipenv run dvc exp run. This should hook us into the correct virtual environment and do the same thing as sourcing the activate file. I would only do this for running experiments (and leave the rest of the logic as is) because there is an overhead to starting up the environment/pipenv.

quick test:

Remove sys hack from project and then run

❯ echo $PWD                    
/magnetic-tiles-defect
❯ echo "PYTHONPATH=$PWD" > .env 
❯ pipenv run dvc exp run  
Loading .env environment variables...
Running stage 'check_packages':
> pipenv run pip freeze > requirements.txt
Loading .env environment variables...

Running stage 'data_load':
> python src/stages/data_load.py --config=params.yaml

...

Seems to work!

We should be able to update the setup wizard to ask the user which package manager they use (conda, virtualenv or venv, or pipenv) and adjust the way we call Python accordingly 👍🏻.

@mattseddon
Copy link
Member

Automatic loading of .env in pipenv: https://pipenv.pypa.io/en/latest/advanced/#automatic-loading-of-env

@shcheklein
Copy link
Member

Wrapping up everything with pipenv makes sense. In that case I would expect that any of the hacks / solutions - .env, modifying the activate script, changing Python files, packaging code properly, etc - would work. We don't need focus specifically on .env.

@shcheklein
Copy link
Member

There is a fair bit of logic there. I would start in extension/src/cli/options.ts

Do I get it right, but it's not even pipenv specific - it's the same problem exists for any virtualenv - we don't activate it? Is it possible to get the same logic as MS Python is using to activate different envs before running dvc exp at least?

@mattseddon
Copy link
Member

mattseddon commented Jun 6, 2022

There is a fair bit of logic there. I would start in extension/src/cli/options.ts

Do I get it right, but it's not even pipenv specific - it's the same problem exists for any virtualenv - we don't activate it? Is it possible to get the same logic as MS Python is using to activate different envs before running dvc exp at least?

From #1815 (comment)

Right. Why don't we activate the entire virtual env when running dvc via the extension? (instead of using only the interpreter)

We made a decision a long time ago not to try and recreate the logic in the Python extension which is used to "auto-activate" virtual environments. There is an ongoing here: microsoft/vscode-python#11039 (comment) where they have further advocated for using the path to the interpreter instead of activating the environment. However, we know that does not meet all of our needs.

Currently, the Python extension does not actually expose the activation details. We don't know the details of the expected activation file.

For now I will

Hack around with the interpreter path that we are given specifically for the runner and source this before running dvc exp run (and whatever the windows equivalent is). I will have to change that subprocess for the runner back to a shell so that source is available. I will also need to test/fix the way we run conda. I will do this under #1791 and document the current rules for running commands.

Alternative

Go back to sending commands directly to our own integrated terminal as we did back in 2020:

Screen.Recording.2021-01-22.at.10.29.04.am.mov

We can reliably wait for the terminal to be activated based on this hack but moving back to the integrated terminal would bring its own set of problems we'd need to workaround.

See #51 (comment).

Notes:

There are a lot of outstanding issues around virtual environments and package managers within the Python extension:

microsoft/vscode-python#8870
microsoft/vscode-python#8770
microsoft/vscode-python#16988
microsoft/vscode#99878
microsoft/vscode-python#4581
microsoft/vscode-python#13327
microsoft/vscode-python#15818
microsoft/vscode-python#4013

All of them seem to come back to microsoft/vscode-python#11039 as the grand plan for solving everything. I have already commented on the issue and I will continue to watch to see how it develops.

Pipenv specific details:
The python extension moved away from running pipenv shell because it breaks their debugger, installing tools and because activation is slow. (microsoft/vscode-python#15745 & microsoft/vscode-python#16988)
pipenv run does not source extra environment variables provided to the activate script.
pipenv shell does but will not work in a child process.
source-ing the activate script will not automatically load .env files whereas pipenv run and pipenv shell does.

@shcheklein
Copy link
Member

Thanks, @mattseddon . Awesome research and summary. I think we should probably close this specific ticket and consolidate all this info in one place? There is nothing that unique to this situation. We can summarize some workarounds that work here in the description for the community.

Regarding the solution - how much of an effort it would be to copy-paste the logic from MS Python - is it open source?

@mattseddon
Copy link
Member

Regarding the solution - how much of an effort it would be to copy-paste the logic from MS Python - is it open source?

Yes, it is open source. It would be a significant amount of effort. Even finding the specific part of the code is difficult.

I will raise an issue to have the activation command added to their API.

@mattseddon
Copy link
Member

Closing this in favour of #1791.

Proposal for a workaround that we can implement is here: #1791 (comment)

@mattseddon
Copy link
Member

@alex000kim I have found another workaround for this issue.

  1. Close all instances of VS Code.
  2. Open a terminal and navigate to magnetic-tiles-defect.
  3. export PYTHONPATH=$PWD && code .

This should put the PYTHONPATH variable into the VS Code session's environment which we pick up.

Hope this helps as an interim solution (better than sys.path.append).

I will keep working on it 👍🏻.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: extension Area: core extension triage
Projects
None yet
Development

No branches or pull requests

4 participants