Fix pip wheel build jobs #2969
Conversation
🔗 Helpful Links 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/2969
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 New Failures as of commit b5c293c with merge base d3326a2. The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Attempting a wheel build in https://github.com/pytorch/executorch/actions/runs/8636383613, triggered from https://github.com/pytorch/executorch/actions/workflows/build-wheels-linux.yml by pointing at this branch.

EDIT: I do see at least

It looks like we're currently using the unmodified pytorch/pytorch script to build our pip packages, so I can't just edit it directly to try things out. The action scripts also live in a different repo, which means I can't tweak them as part of this PR: https://github.com/pytorch/test-infra/blob/main/.github/actions/setup-binary-builds/action.yml

That action assumes that setting various env vars will affect the version, using logic like in https://github.com/pytorch/pytorch/blob/main/tools/generate_torch_version.py -- so we may need to duplicate some of that logic. Though some parts, like adding
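For context, the kind of logic that version script implements looks roughly like the sketch below. This is a hedged sketch only; the `BUILD_VERSION` env var name and the fallback version are assumptions for illustration, not what the PR actually wires up.

```python
# Hedged sketch of git-derived versioning in the spirit of
# generate_torch_version.py; BUILD_VERSION and the fallback are assumptions.
import os
import subprocess


def get_version(fallback: str = "0.1.0") -> str:
    # An explicit version from the CI environment wins, if set.
    explicit = os.environ.get("BUILD_VERSION")
    if explicit:
        return explicit
    # Otherwise derive a dev version from the current git commit.
    try:
        sha = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], text=True
        ).strip()
        return f"{fallback}.dev0+git{sha}"
    except Exception:
        return fallback
```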
Do you really need `requires` like setuptools and wheel?
For TorchFix, building wheels works just fine without those: https://github.com/pytorch-labs/torchfix/blob/main/pyproject.toml
You probably also want `requires-python = ">=3.9"` or something like that to specify the minimum Python version.
Also, in the future you should add project info like description, license, URL, etc. - again, see https://github.com/pytorch-labs/torchfix/blob/main/pyproject.toml for an example.
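A minimal sketch of what those suggestions could look like in pyproject.toml; every field value below is an illustrative placeholder rather than something decided in this PR:

```toml
# Illustrative pyproject.toml sketch; values are placeholders.
[build-system]
requires = ["setuptools", "wheel"]   # possibly droppable, per the comment above
build-backend = "setuptools.build_meta"

[project]
name = "executorch"
version = "0.0.0"                    # stub; real versioning handled elsewhere
description = "TODO"                 # project info to fill in later
requires-python = ">=3.9"            # suggested minimum Python version
```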
I tried building in a totally empty environment locally, and it works for me:
Built just fine; in the CI job, it would have failed during the

And I was able to install the wheel:
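For reference, a clean-environment check along these lines might look like the sketch below; the paths are placeholders, and this is not the (stripped) output from the comment above:

```bash
# Hypothetical reproduction of the empty-environment test; paths are placeholders.
python3 -m venv /tmp/et-venv
source /tmp/et-venv/bin/activate
pip install --upgrade pip setuptools wheel   # build requirements, installed manually
cd ~/src/executorch                          # placeholder checkout location
python setup.py bdist_wheel                  # build the wheel
pip install dist/*.whl                       # confirm it installs into the clean venv
```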
Force-pushed from dec3465 to 4213aff
Trying another run at https://github.com/pytorch/executorch/actions/runs/8655047528 after updating

EDIT: This was failing because it only tried building for Python 3.8, which ET isn't compatible with. Huy added the ciflow/binaries/all label to this PR to try all Python versions, and I started a new job at https://github.com/pytorch/executorch/actions/runs/8655199802
Got past the build deps; now failing with

Huy says they are indeed running as root inside the Docker image.
Force-pushed from f453b6e to 4a05874
Most wheel builds are green after changing the workflow to not fetch recursive submodules: https://github.com/pytorch/executorch/actions/runs/8668397609/job/23773398885?pr=2969

Zipped wheels are available at https://github.com/pytorch/executorch/actions/runs/8668397609?pr=2969#artifacts

I created pytorch/test-infra#5095 to add that option to upstream test-infra, but worst case we can stick with this workflow fork in the ET release branch.
Force-pushed from 078ec4b to 786ffd2
These pip dependencies need to be present to build the pip wheel. Also, change the version to a stub that looks less like a real version, until we can hook up the logic to get the version from the git repo state.
Manually install build requirements because `python setup.py bdist_wheel` does not install them.
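Roughly, the workaround amounts to the following sketch; the actual requirement list comes from the project's pyproject.toml, not from this text:

```bash
# Sketch: bdist_wheel does not resolve [build-system].requires on its own,
# so install those packages explicitly before invoking it.
pip install setuptools wheel    # plus any other entries under [build-system].requires
python setup.py bdist_wheel
```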
setup.py is sometimes run as root in docker containers. buck2 doesn't allow running as root unless $HOME is owned by root or does not exist. So temporarily undefine it while configuring cmake, which runs buck2 to get some source lists. Also, the buck2 daemon can sometimes get stuck on the CI workers. Try killing it before starting the build, ignoring any failures.
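In shell terms, the two workarounds are roughly the following; the real logic lives in setup.py, and the exact cmake invocation here is a placeholder:

```bash
# Sketch only; setup.py performs the equivalent steps internally.
buck2 kill || true                      # kill a possibly-stuck daemon, ignoring failures
env -u HOME cmake -S . -B cmake-out     # configure with $HOME undefined so buck2 tolerates root
```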
Some CI jobs can fail with "OS file watch limit reached" when running buck2. This section should reduce the number of files that it tries to watch.
Change the build-wheels workflow to only fetch the first layer of submodules. ExecuTorch only needs the first layer of submodules to build its pip package, but the `build_wheels_*.yaml` workflows will recursively fetch all submodules by default. Fetching all submodules can also cause `buck2` to fail because it will try to watch too many files. This change makes `buck2` work on the CI runners, speeds up the jobs, and reduces disk/network usage.
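In git terms, the difference is roughly:

```bash
# What the workflow did by default (all nested submodules):
git submodule update --init --recursive

# What this change fetches instead (first layer only):
git submodule update --init
```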
Force-pushed from 62e7d06 to 4f31c17
The latest version turns on pybindings and XNNPACK everywhere, and Core ML/MPS on macOS.

Linux build failing at https://github.com/pytorch/executorch/actions/runs/8676736127/job/23791536165?pr=2969#step:14:142

macOS build failing in Core ML at https://github.com/pytorch/executorch/actions/runs/8676736126/job/23791536162?pr=2969#step:12:249

And similar for
Always build the pybindings when building the pip wheel. Always link in XNNPACK. On macOS, also link in MPS. Core ML can't build on the worker machine, though, because the version of macOS is too old; Core ML requires some features introduced in macOS 10.15.
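As a hedged illustration of the kind of configuration this implies; the option names below are assumptions for illustration, not necessarily the exact switches the build uses:

```bash
# Hypothetical flag names; the real switches are set by setup.py / the CI environment.
cmake -S . -B cmake-out \
  -DEXECUTORCH_BUILD_PYBIND=ON \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_MPS=ON   # macOS only; Core ML is skipped because it needs macOS 10.15+ features
```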
Force-pushed from 4f31c17 to 96ab9cc
Latest wheel jobs generally look good. Even if they don't all work, some of them do, which is an improvement over the current state.
Passing the `std::` functions directly to unary_ufunc_realhb_to_bool can cause "error: cannot resolve overloaded function ‘isinf’ based on conversion to type ‘torch::executor::FunctionRef<bool(double)>’" in some compilation environments. Might be because these functions can be templatized, or because they became constexpr in C++23.
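A minimal, self-contained illustration of the failure mode and the usual fix (this is not the ExecuTorch code itself): naming `std::isinf` directly hands the compiler an overload set, which cannot be converted to a single function-reference type, whereas a lambda pins down one concrete signature.

```cpp
#include <cmath>

// Stand-in for a FunctionRef<bool(double)>-style templated parameter.
template <typename Fn>
bool apply(Fn fn, double x) {
  return fn(x);
}

bool demo(double x) {
  // apply(std::isinf, x);  // can fail to compile: 'isinf' names an overload set
  return apply([](double v) { return std::isinf(v); }, x);  // one concrete overload
}
```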
Force-pushed from fe5ccd2 to b5c293c
Thanks for all the detailed comments! Really helpful to understand what the code does.
Did we test that MPS/XNNPACK are actually linked and registered?
Good point @cccclai! Testing xnnpack pybindings:
Used hexedit to show that the .pte file does have XNNPACK delegate entries:
Then tried loading the program:
This shows that it successfully loaded the program and all of its methods, which means it was able to resolve the XNNPACK backend entries. Also verified that the pybindings .so contains some XNNPACK-related symbols:
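For reference, a check along those lines could look like this (using nm/strings rather than hexedit); the file names are placeholders for whatever the wheel actually installs and for the exported model used above:

```bash
# Placeholder file names; adjust to the installed pybindings .so and the exported .pte.
nm -D _portable_lib.cpython-310-x86_64-linux-gnu.so | grep -i xnnpack | head
strings mv2_xnnpack.pte | grep -i xnnpack    # look for XNNPACK delegate entries in the program
```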
I am merging this directly into release/0.2 and will later cherry-pick these changes into main.