Where is python? #3365

Closed
1 task done
klmcwhirter opened this issue Dec 31, 2024 · 5 comments · Fixed by #3367
Labels
⭐ enhancement Improvements for existing features

Comments

@klmcwhirter

klmcwhirter commented Dec 31, 2024

Feature Description

The app should provide a deterministic, long-term way to specify the Python interpreter to use in venvs.

Problem and Solution

There is currently a single practical global config item that drives python interpreter discovery:
python.providers = ['venv', 'path', 'asdf', 'pyenv', 'rye', 'winreg', 'macos']

But I have dozens of Python projects in my src dir that all use pdm and need different versions of Python.

When I back up the src dir, I intentionally delete all .venv dirs because they should be reproducible.

When I get back to a project, I run a pdm create task that I define everywhere, which looks like this:

create.shell = "pdm venv remove -y in-project; pdm install  --no-editable"

The reason for the remove is Python upgrades. Removing the venv first lets this task serve multiple purposes.

The problem is that when the .venv is implicitly recreated, I get mixed results depending on the project context.

If the first item in the list of resolvers was 'use an entry in pyproject.toml' that would probably cover all my use cases.

e.g.,

[tool.pdm.python]
path = "~/.local/bin/python3.14"

or

[tool.pdm.python]
path = "/opt/python/bin/python"

I realize this stinks because of the duplication of the version info. But if I need to do this, well, it is on me to get it right.

In the situation I find myself right now, there is NO executable named python that will match the constraint:
requires-python = "==3.14.*"

Additional Context

I tried setting .pdm-python to the path of the interpreter to use, since that file survives the lifecycle of the venv. But after recreating the venv I end up in PEP 582 mode: no .venv, but a __pypackages__ dir instead, even though python.use_venv = True is set. In my mind this is a bug because it is counter to what the docs say. The docs are also unclear on how to disable PEP 582; they only discuss how to enable it.

I need to be able to recreate the .venv, perhaps months later, after a backup.

As to my answer about contributing below ... yes, I have decades of experience with Python and contributing to FOSS projects. But almost no experience with the pdm source. With some guidance (read pair programming / testing) I might be able to make it work.

Are you willing to contribute to the development of this feature?

  • Yes, I am willing to contribute to the development of this feature.
@klmcwhirter klmcwhirter added the ⭐ enhancement Improvements for existing features label Dec 31, 2024
@frostming frostming linked a pull request Jan 2, 2025 that will close this issue
2 tasks
@klmcwhirter
Author

Is it common for you guys to close requests for enhancements without doing a review with the requester?

I believe this needs to be re-opened as the main requirement was missed.

Python interpreter discovery needs to be deterministic AND not implicit via a version number. We already have that in pyproject.toml.

The example I provided you was:

  • "python" in the PATH is a symlink to /usr/bin/python3.13
  • I need 3.14 for testing purposes but there is no "python" in any dir in PATH that will satisfy requires-python = "==3.14.*". It needs to look explicitly for the absolute path I provide. I cannot predict what naming convention might be used for a cache of various versions of python3.14 (a1, a2, etc.) for different platforms. There will almost certainly be multiple projects being tested on the same server(s) at the same time, in varying states of verifying different versions.

We must have control of the absolute path to the interpreter. Even pdm init takes a path to an interpreter and then just throws that information away.

Specifying a version in an env var or in a file does not provide anything new.

I guess we will continue to manage the symlinks in .venv/bin ourselves.

But it sure would be good if we could correct this before it goes further. It is much less expensive now, vs later.

Thoughts? I am in an information vacuum here. I was completely blind-sided by this closure without anyone reaching out to me at all.

Because this was closed I can only assume that someone got in too big of a hurry and missed my main requirement.

Please help me understand.

@frostming frostming reopened this Jan 2, 2025
@frostming
Collaborator

frostming commented Jan 2, 2025

  • I made this improvement because I think it is reasonable and in line with the community's conventions, not necessarily to accurately solve your problem (though I did indeed get some inspiration from this issue). If you think I linked incorrectly, I have already reopened it.
  • Writing absolute paths in a config file, especially pyproject.toml, makes it unshareable, which is not acceptable. We should look for other solutions. The main problem is that your desired interpreter is not discoverable by PDM's Python finder, so why not make it discoverable?
  • Most of the time, PDM works by creating an in-project venv from whatever interpreter you pick, so as long as the version and architecture are the same, the venv should behave identically whether the (base) interpreter comes from path A or B.
  • If the precise interpreter path does matter that much, a possible solution is to use script hooks, like the following:
    [tool.pdm.scripts]
    post_init = "pdm venv create -f /opt/python/bin/python"
    This command will be run immediately after pdm init and -f means to overwrite the existing one created by pdm init, if any.

@klmcwhirter
Author

All that is fair. I just would have preferred an opportunity to discuss before this was closed.

Another approach would be to provide a hint to the resolver as to the executable name (e.g., 'python3.14-a1' as well as 'python'). Or - this will be harder I think, but for completeness, support a list of "well-known executable names".

I say that would be harder, because developing a standard like that would need to be as complete as possible. On x86_64 Linux there will be 32-bit and 64-bit variants of the various versions.

It would be much more straightforward to expect a project to provide a list of executable names to interrogate for meeting the requires-python constraint.

It is totally fair to push back on the full path. In my case, it is just a company preference to have each team "control their own destiny explicitly". But absolute path is not the only way - it is just the way to which we are accustomed.

Leaving this open for now to get it out of the way of the release you are trying to get out is the right choice.

We have a workaround in place now. I just do not want younger devs to get comfortable mutating a venv dir outside of the tool we are using.

Thanks for responding so quickly.

Let me do some digging once I can free up some time to get familiar with the code base. I do realize that there are multiple resolvers.

Perhaps, worst-case scenario, we could write our own custom resolver and configure it to be used. But I barely know what I am saying.

I gladly gave up poetry for pdm because you guys apparently ran into the same thing we did and designed pdm to run in its own venv! But until now I have not needed to pay close attention to the code.

Thanks again. Let's leave this open for now please.

@klmcwhirter
Author

These are just some notes to guide my further exploration. I am capturing them here to be transparent and to solicit early feedback.

What is an environment vs environment type?

An environment at my company is a collection of infrastructure, services and software (some deployed, some installed, some SaaS) devoted to running a particular version of a product being developed, under test, and eventually in production.

A server is associated with an environment type - dev, QA, integrated test, performance test or production. Environment types may have different storage levels/policies, network architecture, etc. - think about how performance test and production environments need to be similar, whereas the other environment types are usually designed to support fast-changing functional requirements.

A server may have many environments, but all of the same environment type.

Note that most servers do not have a network route to the internet because of security policy. Package resolution (e.g., python, maven, gradle, Nuget, npm, etc.) happens against a product like Sonatype Nexus Repository Manager.

What are the characteristics of a python installation for an environment?

  • python version
  • interpreter type - CPython, PyPy, etc.
  • compile time feature set, optimizations, other internal patches, etc.
    • internal version identifier of those sets of customizations
  • installation path (sometimes associated with a specific environment or set of related environments)

Environment Software Resolution

Paths to software components are never implicit; always explicit. We cannot afford any magic that could cost the project money chasing down issues caused by invalid environment setup. Validation of each environment is automated and can consist of a dozen pages of (what would be manual) steps covering multiple servers, cloud service envs, etc.

Our most common approach (works whether environment is in VM, Kubernetes, etc.) is to use a set of environment variables to build up the absolute path to each executable, config file, data directory, db, service urls, etc.

A path to a set of binaries might be something like:

/storage-root/project-1234b/team-function/team-name/experiment-name/env-name/bin

e.g.,
/opt3/new-regulation/accounting/acs/improve-stp/qa4/bin
/opt3/new-regulation/accounting/acs/improve-stp/bin
/opt3/new-regulation/accounting/acs/bin
/opt3/new-regulation/accounting/bin

And in that bin dir there might be 5 python executables for different parts of the project whose software is versioned independently.
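
A minimal sketch of how such a bin path could be assembled from environment variables. The variable names in `SEGMENT_VARS` are hypothetical stand-ins for the segments of the layout shown above, and failing loudly on an unset variable is an assumption meant to mirror the "never implicit" policy.

```python
import os
from collections.abc import Mapping
from pathlib import PurePosixPath

# Hypothetical variable names, one per path segment of the layout above.
SEGMENT_VARS = (
    "STORAGE_ROOT", "PROJECT", "TEAM_FUNCTION",
    "TEAM_NAME", "EXPERIMENT_NAME", "ENV_NAME",
)


def bin_dir(env: Mapping[str, str] = os.environ) -> PurePosixPath:
    """Build the absolute bin directory, refusing to guess any missing segment."""
    missing = [name for name in SEGMENT_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"unset environment variables: {', '.join(missing)}")
    return PurePosixPath(*(env[name] for name in SEGMENT_VARS), "bin")


qa4 = bin_dir({
    "STORAGE_ROOT": "/opt3", "PROJECT": "new-regulation",
    "TEAM_FUNCTION": "accounting", "TEAM_NAME": "acs",
    "EXPERIMENT_NAME": "improve-stp", "ENV_NAME": "qa4",
})
```

Passing the mapping explicitly keeps the function testable; by default it reads `os.environ`.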

Summary

All that is to explain why a full path to a python executable would be the most straightforward way for an application to tell a tool like pdm what it needs.

Standard Python version resolution is simply not sufficient.

I will do some digging into the pdm source over the coming weeks. But I suspect that if a full path specifier is not aligned with the pdm-project policies I will need to find a custom solution that utilizes the Open-Closed principle - such as the custom resolver to which I alluded above; if that is possible.

@klmcwhirter
Author

klmcwhirter commented Jan 3, 2025

OK. I have confirmed that your suggestion will meet our needs. Thank you. That is precisely why I filed this enhancement request.

The path passed into pdm venv create ends up in .venv/pyvenv.cfg. And if I need to recreate the venv per your suggestion, the env vars I need are encoded in a script in pyproject.toml, which is version controlled with the rest of the app(s) source, including where the needed vars are set.
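
For anyone who wants to check which base interpreter a recreated venv points at, a small sketch that parses the `key = value` lines of pyvenv.cfg is below. The exact keys depend on the backend that created the venv (for example `home`, `version` or `version_info`), so the sample text is only an assumption for illustration, and `parse_pyvenv_cfg` is an illustrative name.

```python
def parse_pyvenv_cfg(text: str) -> dict[str, str]:
    """Parse the 'key = value' lines of a pyvenv.cfg file into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines that have no '='
            settings[key.strip()] = value.strip()
    return settings


# Sample content (assumed for illustration; real files vary by backend).
sample = """\
home = /opt/python/bin
include-system-site-packages = false
version = 3.14.0
"""
cfg = parse_pyvenv_cfg(sample)
```

Against a real project this would be called as `parse_pyvenv_cfg(Path(".venv/pyvenv.cfg").read_text())` after importing `Path` from `pathlib`.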

Here is my POC. This repo also uses a devcontainer for those interested --> freethreading_python.

Here is how I ended up implementing the create script:

[tool.pdm.scripts]
# The ${HOME} env var represents any set of env vars needed to construct the path to python.
# The point is that information is encoded here and is version controlled so the venv creation is repeatable.
_create_venv.shell = "pdm venv create -f ${HOME}/.local/bin/python3.14.0a3"
# This would not typically be needed. It just happens that both executables are needed for this specific project.
_create_symlinks.shell = "ln -s ${HOME}/.local/bin/python3.14.0a3t .venv/bin/python3.14.0a3t"
create.composite = ["_create_venv", "_create_symlinks"]
create.help = "Create the .venv and needed symlinks"

As such, I am closing this ER.

2 participants