Commit

Merge branch 'main' into dorotat/use-ngc-pytest
Signed-off-by: Dorota Toczydlowska <115542912+dorotat-nv@users.noreply.github.com>
dorotat-nv authored Jan 7, 2025
2 parents 7032a0c + dca314e commit a1e4dac
Showing 82 changed files with 3,942 additions and 1,463 deletions.
2 changes: 1 addition & 1 deletion .devcontainer/devcontainer.json
@@ -24,7 +24,7 @@
"NUMBA_CACHE_DIR": "/tmp/"
},
"postCreateCommand": "./.devcontainer/postCreateCommand.sh",
"remoteUser": "bionemo",
"remoteUser": "ubuntu",
"customizations": {
"vscode": {
"extensions": [
59 changes: 24 additions & 35 deletions .github/pull_request_template.md
@@ -1,45 +1,34 @@
(**NOTE:** _**delete** these instructional lines as you fill-out this PR template_)
### Description
<!-- Provide a detailed description of the changes in this PR -->

(**NOTE:** _template is designed to be filled-in and used as the **squashed commit message for the entire PR**. _Italicized text_ is intended to be deleted as you fill in this template. Use the text between the `---`)
### Type of changes
<!-- Mark the relevant option with an [x] -->

---
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Refactor
- [ ] Documentation update
- [ ] Other (please describe):

_High level summary of changes. Try to keep this as short and informative as possible: less is more._
### CI Pipeline Configuration
Configure CI behavior by checking relevant boxes below. This will automatically apply labels.

_Describe your changes. You can be more detailed and descriptive here. If it is a code change, Be sure to answer:_
- _What is changing?_
- _What is the new or fixed functionality?_
- _Why or when would someone want to use these changes?_
- _How can someone use these changes?_
---
- [ ] [SKIP_CI](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/user-guide/contributing/contributing.md#skip_ci) - Skip all continuous integration tests
- [ ] [INCLUDE_NOTEBOOKS_TESTS](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/user-guide/contributing/contributing.md#include_notebooks_tests) - Execute notebook validation tests in pytest

## Summary
_High level summary of changes. Try to keep this as short and informative as possible: less is more._
> [!NOTE]
> By default, the notebooks validation tests are skipped unless explicitly enabled.
## Details
_Describe your changes. You can be more detailed and descriptive here._

## Usage
_How does a user interact with the changed code?_
### Usage
<!--- How does a user interact with the changed code -->
```python
python -m your.new.module -and -all -options
```

## Testing
_How do you prove that your code behaves the way you claim?_

Tests for these changes can be run via:
```shell
pytest -v tests/your/new/or/existing/test_functions.py::test_function
TODO: Add code snippet
```

### Pre-submit Checklist
<!--- Ensure all items are completed before submitting -->

(**NOTE:** _also **delete** this checklist as you fill-out this PR template_)

**Most of the changes** to files with extensions `*.py`, `*.yaml`, `*.yml`, `Dockerfile*` or `requirements.txt` **DO REQUIRE both `pytest-` and `jet-` CI stages**.

- [ ] Did you review the [Before your PR is "Ready for review" section](https://github.com/NVIDIA/bionemo-framework/-/blob/dev/CONTRIBUTING.md?ref_type=heads#before-pr-ready) before asking for review?
- [ ] Did you make sure your changes have tests? Did you test your changes locally?
- [ ] Can you add [the `SKIP_CI` label](https://github.com/NVIDIA/bionemo-framework/-/blob/dev/CONTRIBUTING.md?ref_type=heads#skip-ci) to your PR?
- [ ] Can you add [the `PYTEST_NOT_REQUIRED` label](https://github.com/NVIDIA/bionemo-framework/-/blob/dev/CONTRIBUTING.md?ref_type=heads#skip-pytest) to your PR?
- [ ] Can you add [the `JET_NOT_REQUIRED` label](https://github.com/NVIDIA/bionemo-framework/-/blob/dev/CONTRIBUTING.md?ref_type=heads#skip-jet) to your PR?
- [ ] I have tested these changes locally
- [ ] I have updated the documentation accordingly
- [ ] I have added/updated tests as needed
- [ ] All existing tests pass successfully
48 changes: 48 additions & 0 deletions .github/workflows/pr-labels.yml
@@ -0,0 +1,48 @@
name: PR Label Management

on:
pull_request:
types: [opened, edited, synchronize]

jobs:
manage-labels:
runs-on: ubuntu-latest
permissions:
pull-requests: write

steps:
- name: Check PR body and manage labels
uses: actions/github-script@v6
with:
script: |
const prBody = context.payload.pull_request.body;
const labelChecks = {
'SKIP_CI': /\[x\]\s*\[SKIP_CI\]/i,
'INCLUDE_NOTEBOOKS_TESTS': /\[x\]\s*\[INCLUDE_NOTEBOOKS_TESTS\]/i
};
const currentLabels = new Set(
context.payload.pull_request.labels.map(label => label.name)
);
for (const [label, pattern] of Object.entries(labelChecks)) {
const shouldHaveLabel = pattern.test(prBody);
const hasLabel = currentLabels.has(label);
if (shouldHaveLabel && !hasLabel) {
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.payload.pull_request.number,
labels: [label]
});
} else if (!shouldHaveLabel && hasLabel) {
await github.rest.issues.removeLabel({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.payload.pull_request.number,
name: label
});
}
}
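
Review note: the label-sync logic in this workflow can be checked offline. The following is a minimal sketch, assuming a straight translation of the two `labelChecks` regexes from JavaScript to Python; `plan_label_changes` is a hypothetical helper name and is not part of the repository. It returns which labels the workflow would add or remove for a given PR body, without calling the GitHub API.

```python
import re

# Same patterns as the workflow's labelChecks, translated from JavaScript to Python.
LABEL_CHECKS = {
    "SKIP_CI": re.compile(r"\[x\]\s*\[SKIP_CI\]", re.IGNORECASE),
    "INCLUDE_NOTEBOOKS_TESTS": re.compile(
        r"\[x\]\s*\[INCLUDE_NOTEBOOKS_TESTS\]", re.IGNORECASE
    ),
}


def plan_label_changes(pr_body, current_labels):
    """Return (labels_to_add, labels_to_remove) for a PR body and its current labels."""
    body = pr_body or ""  # treat a missing PR description as empty text
    to_add, to_remove = [], []
    for label, pattern in LABEL_CHECKS.items():
        should_have = bool(pattern.search(body))
        has = label in current_labels
        if should_have and not has:
            to_add.append(label)
        elif not should_have and has:
            to_remove.append(label)
    return to_add, to_remove


# Hypothetical PR body: SKIP_CI is ticked, INCLUDE_NOTEBOOKS_TESTS is not.
example_body = (
    "- [x] [SKIP_CI](https://example.invalid/skip_ci) - Skip all CI tests\n"
    "- [ ] [INCLUDE_NOTEBOOKS_TESTS](https://example.invalid/nb) - Run notebooks\n"
)
example_plan = plan_label_changes(example_body, {"INCLUDE_NOTEBOOKS_TESTS"})
```

Because the patterns require the bracketed label name immediately after the checkbox, the template's link text (`[SKIP_CI](...)`) is what the match keys on; rewording that link text in the template would silently disable the automation.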
1 change: 0 additions & 1 deletion .gitignore
@@ -2,7 +2,6 @@
docs/site/
*.nemo
protein/
singlecell/
results/

# Local configs
2 changes: 1 addition & 1 deletion 3rdparty/Megatron-LM
Submodule Megatron-LM updated 226 files
2 changes: 1 addition & 1 deletion 3rdparty/NeMo
Submodule NeMo updated 372 files
2 changes: 1 addition & 1 deletion CODEOWNERS
@@ -93,4 +93,4 @@ sub-packages/bionemo-geneformer @jstjohn @malcolmgreaves @skothenhill-nv

sub-packages/bionemo-scdl @jstjohn @malcolmgreaves @polinabinder1 @skothenhill-nv

sub-packages/bionemo-noodles @skothenhill-nv @malcolmgreaves @jstjohn @edawson
sub-packages/bionemo-noodles @skothenhill-nv @malcolmgreaves @jstjohn @edawson @cspades
90 changes: 48 additions & 42 deletions Dockerfile
@@ -1,12 +1,18 @@
# Base image with apex and transformer engine, but without NeMo or Megatron-LM.
ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:24.02-py3
# Note that the core NeMo docker container is defined here:
# https://gitlab-master.nvidia.com/dl/JoC/nemo-ci/-/blob/main/llm_train/Dockerfile.train
# with settings that get defined/injected from this config:
# https://gitlab-master.nvidia.com/dl/JoC/nemo-ci/-/blob/main/.gitlab-ci.yml
# We should keep versions in our container up to date to ensure that we get the latest tested perf improvements and
# training loss curves from NeMo.
ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:24.12-py3

FROM rust:1.82.0 as rust-env
FROM rust:1.82.0 AS rust-env

RUN rustup set profile minimal && \
rustup install 1.82.0 && \
rustup target add x86_64-unknown-linux-gnu && \
rustup default 1.82.0
rustup install 1.82.0 && \
rustup target add x86_64-unknown-linux-gnu && \
rustup default 1.82.0

FROM ${BASE_IMAGE} AS bionemo2-base

@@ -25,7 +31,7 @@ RUN git clone https://github.com/NVIDIA/apex.git && \
--config-settings "--build-option=--cpp_ext --cuda_ext --fast_layer_norm --distributed_adam --deprecated_fused_adam --group_norm"

# Transformer Engine pre-1.7.0. 1.7 standardizes the meaning of bits in the attention mask to match
ARG TE_COMMIT=c27ee60ec746210bcea4ec33958dbbff06706506
ARG TE_COMMIT=2215fa5c7557b66034068816020f9f611019e457
RUN git clone https://github.com/NVIDIA/TransformerEngine.git && \
cd TransformerEngine && \
git fetch origin ${TE_COMMIT} && \
@@ -49,11 +55,11 @@ RUN apt-get install -y gnupg
# Check the nemo dependency for causal conv1d and make sure this checkout
# tag matches. If not, update the tag in the following line.
RUN CAUSAL_CONV1D_FORCE_BUILD=TRUE pip --disable-pip-version-check --no-cache-dir install \
git+https://github.com/Dao-AILab/causal-conv1d.git@v1.2.0.post2
git+https://github.com/Dao-AILab/causal-conv1d.git@v1.2.2.post1

# Mamba dependancy installation
RUN pip --disable-pip-version-check --no-cache-dir install \
git+https://github.com/state-spaces/mamba.git@v2.0.3
git+https://github.com/state-spaces/mamba.git@v2.2.2

RUN pip install hatchling # needed to install nemo-run
ARG NEMU_RUN_TAG=34259bd3e752fef94045a9a019e4aaf62bd11ce2
@@ -67,30 +73,23 @@ RUN rm -rf /build

# Addressing Security Scan Vulnerabilities
RUN rm -rf /opt/pytorch/pytorch/third_party/onnx
RUN apt-get update && \
apt-get install -y openssh-client=1:8.9p1-3ubuntu0.10 && \
rm -rf /var/lib/apt/lists/*
RUN apt purge -y libslurm37 libpmi2-0 && \
apt autoremove -y
RUN source /usr/local/nvm/nvm.sh && \
NODE_VER=$(nvm current) && \
nvm deactivate && \
nvm uninstall $NODE_VER && \
sed -i "/NVM/d" /root/.bashrc && \
sed -i "/nvm.sh/d" /etc/bash.bashrc


# Use UV to install python packages from the workspace. This just installs packages into the system's python
# environment, and does not use the current uv.lock file.
# environment, and does not use the current uv.lock file. Note that with python 3.12, we now need to set
# UV_BREAK_SYSTEM_PACKAGES, since the pytorch base image has made the decision not to use a virtual environment and UV
# does not respect the PIP_BREAK_SYSTEM_PACKAGES environment variable set in the base dockerfile.
COPY --from=ghcr.io/astral-sh/uv:0.4.25 /uv /usr/local/bin/uv
ENV UV_LINK_MODE=copy \
UV_COMPILE_BYTECODE=1 \
UV_PYTHON_DOWNLOADS=never \
UV_SYSTEM_PYTHON=true
UV_SYSTEM_PYTHON=true \
UV_NO_CACHE=1 \
UV_BREAK_SYSTEM_PACKAGES=1

# Install the bionemo-geomtric requirements ahead of copying over the rest of the repo, so that we can cache their
# Install the bionemo-geometric requirements ahead of copying over the rest of the repo, so that we can cache their
# installation. These involve building some torch extensions, so they can take a while to install.
RUN --mount=type=bind,source=./sub-packages/bionemo-geometric/requirements.txt,target=/requirements-pyg.txt \
--mount=type=cache,id=uv-cache,target=/root/.cache,sharing=locked \
uv pip install --no-build-isolation -r /requirements-pyg.txt

WORKDIR /workspace/bionemo2
@@ -107,19 +106,32 @@ ENV PATH="/usr/local/cargo/bin:/usr/local/rustup/bin:${PATH}"
ENV RUSTUP_HOME="/usr/local/rustup"

# Note, we need to mount the .git folder here so that setuptools-scm is able to fetch git tag for version.
# Includes a hack to install tensorstore 0.1.45, which doesn't distribute a pypi wheel for python 3.12, and the metadata
# in the source distribution doesn't match the expected pypi version.
RUN --mount=type=bind,source=./.git,target=./.git \
--mount=type=bind,source=./requirements-test.txt,target=/requirements-test.txt \
--mount=type=bind,source=./requirements-cve.txt,target=/requirements-cve.txt \
<<EOF
set -eo pipefail
uv pip install maturin --no-build-isolation && uv pip install --no-build-isolation \
uv pip install maturin --no-build-isolation

pip install --use-deprecated=legacy-resolver --no-build-isolation \
tensorstore==0.1.45
sed -i 's/^Version: 0\.0\.0$/Version: 0.1.45/' \
/usr/local/lib/python3.12/dist-packages/tensorstore-0.0.0.dist-info/METADATA
mv /usr/local/lib/python3.12/dist-packages/tensorstore-0.0.0.dist-info \
/usr/local/lib/python3.12/dist-packages/tensorstore-0.1.45.dist-info

uv pip install --no-build-isolation \
./3rdparty/* \
./sub-packages/bionemo-* \
-r /requirements-cve.txt \
-r /requirements-test.txt

rm -rf ./3rdparty
rm -rf /tmp/*
rm -rf ./sub-packages/bionemo-noodles/target
rm -rf /root/.cache/*
EOF

# In the devcontainer image, we just copy over the finished `dist-packages` folder from the build image back into the
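
Review note: the tensorstore workaround in the hunk above patches installed package metadata in place with `sed` and `mv`. As a minimal sketch of what those two commands accomplish, assuming the standard `<name>-<version>.dist-info/METADATA` layout, the same fix looks like this in Python. `fix_dist_info_version` is a hypothetical helper, not code from the repository, and the demonstration runs against a temporary directory rather than a real `dist-packages`.

```python
import tempfile
from pathlib import Path


def fix_dist_info_version(site_dir, name, wrong, right):
    """Rewrite the Version field in METADATA and rename the dist-info directory to match."""
    old_dir = Path(site_dir) / f"{name}-{wrong}.dist-info"
    new_dir = Path(site_dir) / f"{name}-{right}.dist-info"
    metadata = old_dir / "METADATA"
    lines = metadata.read_text().splitlines()
    # Only touch the exact "Version: <wrong>" line, like the anchored sed expression.
    fixed = [f"Version: {right}" if line == f"Version: {wrong}" else line for line in lines]
    metadata.write_text("\n".join(fixed) + "\n")
    old_dir.rename(new_dir)
    return new_dir


# Demonstration against a throwaway directory with made-up metadata contents.
with tempfile.TemporaryDirectory() as tmp:
    info = Path(tmp) / "tensorstore-0.0.0.dist-info"
    info.mkdir()
    (info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: tensorstore\nVersion: 0.0.0\n"
    )
    fixed_dir = fix_dist_info_version(tmp, "tensorstore", "0.0.0", "0.1.45")
    result_name = fixed_dir.name
    result_metadata = (fixed_dir / "METADATA").read_text()
```

Both the directory name and the `Version:` field have to change together, since resolvers read the version from the metadata while the directory name is what identifies the installed distribution on disk.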
@@ -138,35 +150,32 @@ apt-get install -qyy \
rm -rf /tmp/* /var/tmp/*
EOF

# Create a non-root user to use inside a devcontainer.
ARG USERNAME=bionemo
ARG USER_UID=1000
ARG USER_GID=$USER_UID
RUN groupadd --gid $USER_GID $USERNAME \
&& useradd --uid $USER_UID --gid $USER_GID -m $USERNAME \
&& echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
# Use a non-root user to use inside a devcontainer (with ubuntu 23 and later, we can use the default ubuntu user).
ARG USERNAME=ubuntu
RUN echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
&& chmod 0440 /etc/sudoers.d/$USERNAME

# Here we delete the dist-packages directory from the pytorch base image, and copy over the dist-packages directory from
# the build image. This ensures we have all the necessary dependencies installed (megatron, nemo, etc.).
RUN <<EOF
set -eo pipefail
rm -rf /usr/local/lib/python3.10/dist-packages
mkdir -p /usr/local/lib/python3.10/dist-packages
chmod 777 /usr/local/lib/python3.10/dist-packages
rm -rf /usr/local/lib/python3.12/dist-packages
mkdir -p /usr/local/lib/python3.12/dist-packages
chmod 777 /usr/local/lib/python3.12/dist-packages
chmod 777 /usr/local/bin
EOF

USER $USERNAME

COPY --from=bionemo2-base --chown=$USERNAME:$USERNAME --chmod=777 \
/usr/local/lib/python3.10/dist-packages /usr/local/lib/python3.10/dist-packages
/usr/local/lib/python3.12/dist-packages /usr/local/lib/python3.12/dist-packages

COPY --from=ghcr.io/astral-sh/uv:0.4.25 /uv /usr/local/bin/uv
ENV UV_LINK_MODE=copy \
UV_COMPILE_BYTECODE=0 \
UV_PYTHON_DOWNLOADS=never \
UV_SYSTEM_PYTHON=true
UV_SYSTEM_PYTHON=true \
UV_BREAK_SYSTEM_PACKAGES=1

# Bring in the rust toolchain, as maturin is a dependency listed in requirements-dev
COPY --from=rust-env /usr/local/cargo /usr/local/cargo
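
Review note: every `python3.10` to `python3.12` edit in this file follows from the base-image bump, because the `dist-packages` path embeds the interpreter's major.minor version. The sketch below only illustrates that relationship, assuming the Debian/Ubuntu-style layout the Dockerfile paths already use; `dist_packages_dir` is a hypothetical helper and does not exist in the repository.

```python
import sys


def dist_packages_dir(version_info=sys.version_info, prefix="/usr/local"):
    """Build the Debian-style dist-packages path for a given interpreter version."""
    return f"{prefix}/lib/python{version_info[0]}.{version_info[1]}/dist-packages"


# The two paths that appear in this diff, before and after the base-image bump.
old_path = dist_packages_dir((3, 10))
new_path = dist_packages_dir((3, 12))
```

A future base-image upgrade that changes the Python minor version would need the same set of path edits again, so it may be worth grepping the Dockerfile for the versioned path when bumping `BASE_IMAGE`.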
@@ -184,7 +193,7 @@ EOF

RUN <<EOF
set -eo pipefail
rm -rf /usr/local/lib/python3.10/dist-packages/bionemo*
rm -rf /usr/local/lib/python3.12/dist-packages/bionemo*
pip uninstall -y nemo_toolkit megatron_core
EOF

@@ -208,9 +217,6 @@ COPY --from=rust-env /usr/local/rustup /usr/local/rustup
ENV PATH="/usr/local/cargo/bin:/usr/local/rustup/bin:${PATH}"
ENV RUSTUP_HOME="/usr/local/rustup"

RUN uv pip uninstall maturin
RUN uv pip install maturin --no-build-isolation

RUN <<EOF
set -eo pipefail
find . -name __pycache__ -type d -print | xargs rm -rf
@@ -220,8 +226,8 @@ for sub in ./3rdparty/* ./sub-packages/bionemo-*; do
done
EOF

# Since the entire repo is owned by root, swithcing username for development breaks things.
ARG USERNAME=bionemo
# Since the entire repo is owned by root, switching username for development breaks things.
ARG USERNAME=ubuntu
RUN chown $USERNAME:$USERNAME -R /workspace/bionemo2/
USER $USERNAME
