Additional python packaging documentation, version correction #9405
@rouault I downloaded the source distribution (direct link to @vilhelmen I have owed the project some additional documentation on the topic of I think your changes need some additional detail, which I have so far failed to provide to the project. As noted above, I'm surprised that the PyPI package does not reflect these changes yet, so I have not verified my suggestions. The The The commands below should possibly use the

Users wanting to install
$ GDAL_PYTHON_BINDINGS_WITHOUT_NUMPY= pip install "gdal"

Users wanting to install
$ pip install numpy
$ pip install "gdal[numpy]"

Users can verify that raster support has been installed with:
$ python3 -c "from osgeo import gdal_array"

Vector support should always be enabled. It is also possible for users wanting raster support to build a version of
# Assume Numpy is already installed
# No vector support
$ pip install gdal
# Oops I need vector support. Let me uninstall.
$ pip uninstall gdal
# And reinstall. This picks up the previously built version of 'gdal' that was built without vector support.
$ pip install numpy
$ pip install "gdal[numpy]"
will still result in an install that does not have raster support. To get out of this situation, users must instruct
$ pip install gdal
# Users can use this command
$ pip install gdal --force-reinstall --no-cache
# Or they can explicitly uninstall and reinstall GDAL - the '--no-cache' flag is _very_ important.
$ pip uninstall gdal
$ pip install gdal --no-cache

A major downside to this setup is that any package depending on
$ pip install numpy
$ pip install isofit

In an ideal world users would be able to explicitly define which portions of GDAL they want with
$ pip install gdal[ogr,gdal]

These problems are really hard to solve, and any solution would require the GDAL maintainers to take on an even larger burden than the current setup.
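The one-line check quoted above (`from osgeo import gdal_array`) can also be scripted, which is handy in CI or a container health check. This is only a minimal sketch and not part of the GDAL bindings: `can_import` is a hypothetical helper name. It attempts the import and reports whether it succeeded.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if `module_name` imports cleanly, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # ImportError also covers ModuleNotFoundError (its subclass) and
        # failures to load a compiled extension module.
        return False
    return True


if __name__ == "__main__":
    # osgeo.gdal_array is the module the comment above imports to check
    # for NumPy-based raster support.
    for name in ("osgeo", "osgeo.gdal_array"):
        print(f"{name}: {'ok' if can_import(name) else 'missing'}")
```

If `osgeo` imports but `osgeo.gdal_array` does not, that matches the cached-build situation described above, where a forced reinstall without the pip cache is the suggested way out.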
swig/python/README.rst (outdated)
::
    pip install wheel setuptools>=67
For at least newer (very new?) versions of pip, I don't think wheel has to be explicitly installed. I think it is automatically installed as part of the build process?
Looking at the setuptools changelog, it isn't really clear to me what feature in v67 is required for installing GDAL. Maybe this?
#3790: Bump vendored version of packaging to 23.0 (pyparsing is no longer required and was removed). As a consequence, users will experience a more strict parsing of requirements. Specifications that don’t comply with PEP 440 and PEP 508 will result in build errors.
Uninstalled wheel and confirmed it's still necessary as of (seemingly) latest pip 24.0:
Running command Getting requirements to build wheel
WARNING: numpy not available! Array support will not be enabled
I'm not up to date with the bleeding edge of packaging, but I was still under the impression that wheel was necessary for anything compiled/non-sdist.
I think the venvs based off system python are secretly finding their way to wheel from the system install, which is why it makes it look like setuptools and wheel aren't needed, but I haven't done any tracing to verify.
All I can say for setuptools is that I tried setuptools==64 through setuptools==69, and 67.0 made it work.
FWIW, here's a trimmed down Dockerfile of what I'm working with:
ARG BASE_CONTAINER=ghcr.io/osgeo/gdal:ubuntu-full-3.8.4
FROM $BASE_CONTAINER

# Base tools and gdal stuff
# The base64 blob decodes to an apt preferences entry that pins packages from
# the deadsnakes PPA (release o=LP-PPA-deadsnakes) at Pin-Priority 600.
RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get update && apt-get install -y \
        ca-certificates \
        nano tmux htop less git jq wget \
        gcc g++ make cmake \
        software-properties-common \
    && apt-add-repository -yn ppa:deadsnakes/ppa \
    && apt-get purge --autoremove -y software-properties-common \
    && echo 'UGFja2FnZTogKgpQaW46IHJlbGVhc2Ugbz1MUC1QUEEtZGVhZHNuYWtlcwpQaW4tUHJpb3JpdHk6IDYwMAo=' | base64 -d | tee /etc/apt/preferences.d/snakes_prefer \
    && apt-get update \
    && apt-get install -y python3.12 python3.12-dev python3.12-distutils python3.12-venv \
    && update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.12 10 \
    && python3.12 -m ensurepip --upgrade --default-pip \
    && apt-get install -y geotiff-bin \
    && rm -rf /var/lib/apt/lists/*

# NO GDAL YOU'RE BAD. Reinstall it! May need to blast numpy tbh.
RUN pip3 install --no-cache --upgrade 'setuptools>=67' 'pip>=23' wheel 'numpy>=1.24' \
    && pip3 uninstall -y GDAL \
    && pip3 install --no-cache --no-cache-dir --upgrade 'GDAL[numpy]=='"$(gdal-config --version)"
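The last line of the Dockerfile pins the Python bindings to whatever version `gdal-config --version` reports. As a rough sketch of the same idea outside a shell, the requirement string could be built as shown below. `gdal_requirement` and `installed_libgdal_version` are hypothetical helper names, and the second one assumes `gdal-config` is on `PATH`.

```python
import subprocess


def gdal_requirement(libgdal_version: str, with_numpy: bool = True) -> str:
    """Build a pip requirement pinning the bindings to the libgdal version."""
    extras = "[numpy]" if with_numpy else ""
    return f"GDAL{extras}=={libgdal_version.strip()}"


def installed_libgdal_version() -> str:
    """Ask gdal-config for the libgdal version (assumes it is on PATH)."""
    result = subprocess.run(
        ["gdal-config", "--version"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if gdal-config fails
    )
    return result.stdout.strip()
```

For example, `gdal_requirement("3.8.4")` yields `GDAL[numpy]==3.8.4`, which is the string the Dockerfile passes to `pip3 install` when the base image ships libgdal 3.8.4.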
I'm not up to date with the bleeding edge of packaging, but I was still under the impression that wheel was necessary for anything compiled/non-sdist.
Historically I don't think this is the case, but I'm more than willing to be wrong. All of the different configurations in the current packaging world are super confusing.
I looked through the pip source code, and this message:
Running command Getting requirements to build wheel
means that pip is installing wheel in the build environment. I couldn't figure out exactly how, though.
Doing some more experimenting, it does seem like wheel is required to install the gdal source distribution. I would have expected the version of wheel that pip seems to install in the build environment to satisfy this requirement, but I guess not.
So, yes, it seems like your install command here is correct.
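Since the thread concludes that setuptools 67 or newer and wheel need to be present before installing the gdal source distribution, a pre-flight check along these lines may save a failed build. It is a sketch under that assumption only: `major_version` and `build_prereq_problems` are hypothetical names, and the threshold of 67 comes from the experiments reported above rather than from any documented requirement.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '67.0.0'."""
    return int(version.split(".")[0])


def build_prereq_problems(min_setuptools_major: int = 67) -> list:
    """List problems with the build prerequisites discussed in this thread."""
    problems = []
    try:
        found = metadata.version("setuptools")
        if major_version(found) < min_setuptools_major:
            problems.append(
                f"setuptools {found} is older than {min_setuptools_major}"
            )
    except metadata.PackageNotFoundError:
        problems.append("setuptools is not installed")
    try:
        metadata.version("wheel")
    except metadata.PackageNotFoundError:
        problems.append("wheel is not installed")
    return problems


if __name__ == "__main__":
    for problem in build_prereq_problems():
        print("WARNING:", problem)
```

The check only looks at the environment it runs in. It says nothing about what pip places in an isolated build environment, which is the part neither commenter could pin down.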
Yes, those changes have been done only on the master / 3.9.0dev branch. The 3.8.x point releases are done from a separate maintenance branch, and I didn't feel #8926 was appropriate for backport.
@rouault Ah thanks for the context!

@vilhelmen in that case, my general explanation is the same, but users installing Users wanting only vector support should use:
$ pip install gdal
and users wanting raster + vector support should use:
$ pip install numpy
$ pip install gdal
Users wanting raster support can still verify their
$ python3 -c "from osgeo import gdal_array"
I updated my previous comment with a note about how to get out of a situation where
Throw out section in Unix because it doesn't really say anything
Ty for the notes @pl-kevinwurster
@pl-kevinwurster Thanks for the additional notes. I've lost many hours to python packaging and pip caching, so I feel your pain. I've fleshed out the pip instructions so they should actually be useful to people now (thoughts and prayers to Conda users).
I've only eyeballed the RST in my text editor so I'm not 100% on the formatting. Is there an easy way to check it from CI or point the docs page to this commit? Looks like the CI HTML zip artifact doesn't contain the new page?
Out of scope for this PR, and I know it would be a pain to verify, but how certain are we on numpy 1.0.0?
At this point the Have you seen people trying to install
You've got a good point. If someone is working out of a hard 1.0.0 requirement, they should already know they're in for a bad time. Just wanted to bring it up since we're touching the file.
@vilhelmen Numpy 1.0 was released 18 years ago. The There are some differences based on CPU architecture, but generally the oldest version of Numpy compatible with Python 3.8 is
Given how we build the bindings, I have some doubts that our CI actually exercises this new pyproject.toml requires entry. But it would be safe to indicate that as the minimum version.
The gdal_array module makes very primitive use of Numpy, so it could potentially work with very ancient versions (and I actually tested a couple of weeks ago against numpy 2.0dev and things seemed to work fine too), but yes, it is hard to be sure how old, as in practice people test against fairly recent ones.
swig/python/README.rst (outdated)
To install with numpy support, you need to require the optional numpy component:
In order to enable raster support, libgdal and its development headers must be installed as well as the Python packages numpy, setuptools, and wheel.
"raster support". Well GDAL without numpy does support raster, it is just that it is not very convenient if you need to inspect array values, but you can for example script raster conversion with gdal.Translate() without requiring numpy. Maybe use "numpy-based raster support" ? (here and below). The sentence "The base GDAL package contains support for OGR, OSR, and GDAL vectors:" above should also be reworked, as it gives the impression that there's no raster support at all
swig/python/README.rst (outdated)
    pip install gdal[numpy]=="$(gdal-config --version).*"

Users can verify that reaster support has been installed with:
Suggested change:
- Users can verify that reaster support has been installed with:
+ Users can verify that raster support has been installed with:
What does this PR do?
Updates docs on building numpy support. I've been maintaining this SO answer on how to fix it for a year now and I just ran into a new problem installing the numpy bindings. I have tested it: wheel and setuptools cannot be installed alongside gdal[numpy]; they must be installed beforehand.
There is also a tweak to the version of setuptools in the pyproject.toml, which I determined through experimentation. Pip, or setuptools, or whatever completely ignores this string (happens in my packages, too), but at least it's accurate now. On its own, pip will think >=48 is good enough, which is incredibly wrong for multiple reasons. I'm unaware of a fix; I had the issue back in setuptools 64.
What are related issues/pull requests?
None
Tasklist
Environment
ghcr.io/osgeo/gdal:ubuntu-full-3.8.3 with Python 3.12.1
GDAL 3.8.4 via Homebrew and Python 3.12.2 on Mac