Merge pull request #233 from latchbio/aidan/dind+bugfix

Aidan/dind+bugfix
latchbio · Feb 21, 2023 · d8da2cc
2 parents 470de08 + 2564b1c
commit d8da2cc
Showing 15 changed files with 219 additions and 82 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -18,6 +18,18 @@ Types of changes
 
 ### Fixed
 
+* Internal state file should be automatically created when running `latch register` and `latch develop`
+
+### Added
+
+* `latch init`: Docker in Docker template workflow
+* `latch init`: Docker base image
+* Small, medium, and large tasks use the [Sysbox runtime](https://github.com/nestybox/sysbox) to run Docker and other system software within task containers
+
+## 2.13.1 - 2023-02-17
+
+### Fixed
+
 * Add latch/latch_cli/services/init/common to pypi release
 
 ## 2.13.0 - 2023-02-17

diff --git a/docs/source/api/latch_cli.rst b/docs/source/api/latch_cli.rst
@@ -57,18 +57,18 @@ latch\_cli.tui module
    :undoc-members:
    :show-inheritance:
 
-latch\_cli.types module
+latch\_cli.utils module
 -----------------------
 
-.. automodule:: latch_cli.types
+.. automodule:: latch_cli.utils
    :members:
    :undoc-members:
    :show-inheritance:
 
-latch\_cli.utils module
------------------------
+latch\_cli.workflow\_config module
+----------------------------------
 
-.. automodule:: latch_cli.utils
+.. automodule:: latch_cli.workflow_config
    :members:
    :undoc-members:
    :show-inheritance:

diff --git a/docs/source/basics/defining_environment.md b/docs/source/basics/defining_environment.md
@@ -2,11 +2,7 @@
 
 Workflow code is rarely free of dependencies. It may require python or system packages or make use of environment variables. For example, a task that downloads compressed reference data from AWS S3 will need the `aws-cli` and `unzip` [APT](https://en.wikipedia.org/wiki/APT_(software)) packages, then use the `pyyaml` python package to read the included metadata.
 
-Latch manages the execution environment of a workflow using Dockerfiles. A Dockerfile specifies the environment in which the workflow runs.
-
-Latch has three [base images](https://docs.docker.com/build/building/base-images/), one without hardware acceleration drivers, one with CUDA drivers, and one with OPENCL drivers. To use the CUDA or OPENCL base image, use the `--cuda` or `--opencl` flags when running `latch init`.
-
-The workflow environment is encapsulated in [a Docker container](https://en.wikipedia.org/wiki/Docker_(software)), which is created from a recipe defined in [a text document called Dockerfile.](https://docs.docker.com/engine/reference/builder/) In most cases these recipes are repetitive and unnecessarily complicated, so Latch will automatically generate one using conventional dependency lists and heuristics. To use a handwritten Dockerfile, [run the eject command](#ejecting-auto-generation).
+The workflow environment is encapsulated in [a Docker container](https://en.wikipedia.org/wiki/Docker_(software)), which is created from a recipe defined in [a text document named Dockerfile](https://docs.docker.com/engine/reference/builder/). Latch provides [four baseline environments](../subcommands.md#base-image--b) from which each Latch workflow inherits. In most cases, modifying the `Dockerfile` manually is unnecessary, so Latch will automatically generate one using conventional dependency lists and heuristics. To use a handwritten Dockerfile, [run the eject command](#ejecting-auto-generation).
 
 ## Automatic Dockerfile Generation
 
@@ -258,9 +254,9 @@ To exclude files from the build use [a `.dockerignore`.](https://docs.docker.com
 
 The default `.dockerignore` includes files auto-generated by Latch.
 
-## Docker Limitations
+## GPU Task Limitations
 
-Docker containers have limitations on some system administration functions. While each workflow runs as a genuine `root` user (UID 0), commands that require [kernel capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html) will fail with "Permission denied". This includes `mount` and `chroot` among others.
+Commands that require certain [kernel capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html) will fail with "Permission denied" in GPU tasks (`small-gpu-task`, `large-gpu-task`). This includes `mount` and `chroot` among others.
 
 ---
 

diff --git a/docs/source/basics/environment/docker_recipes.md b/docs/source/basics/environment/docker_recipes.md
@@ -9,3 +9,7 @@ RUN curl -L https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.4.4/bowt
     unzip bowtie2-2.4.4.zip &&\
     mv bowtie2-2.4.4-linux-x86_64 bowtie2
 ```
+
+## Run Docker in Docker
+
+Use [`--base-image docker` with `latch init`](../../subcommands.md#latch-init) to start from a base workflow environment that includes Docker. An example of running a containerized `bowtie2` aligner in a Latch workflow can be generated with `latch init --template docker my_bowtie2_example`.
diff --git a/docs/source/subcommands.md b/docs/source/subcommands.md
@@ -20,13 +20,15 @@ One of `r`, `conda`, `subprocess`, `empty`. If not provided, user will be prompt
 
 Generate a Dockerfile for the workflow instead of relying on [auto-generation.](basics/defining_environment.md#automatic-dockerfile-generation)
 
-#### `--cuda`
+#### `--base-image`, `-b`
 
-Make cuda drivers available to task code.
+Each environment is built on one of the following base distributions:
+- `default` with no additional dependencies
+- `cuda` with Nvidia CUDA/cuDNN (cuda 11.4.2, cudnn 8) drivers
+- `opencl` with OpenCL (ubuntu 18.04) drivers
+- `docker` with the Docker daemon
 
-#### `--opencl`
-
-Make opencl drivers available to task code.
+Only one option can be given at a time. If the option is omitted or set to `default`, only the minimum packages needed to execute the workflow are installed.
 
 ## `latch register`
 

diff --git a/latch/resources/tasks.py b/latch/resources/tasks.py
@@ -87,7 +87,9 @@ def _get_large_pod() -> Pod:
     primary_container.resources = resources
 
     return Pod(
+        annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
         pod_spec=V1PodSpec(
+            runtime_class_name="sysbox-runc",
             containers=[primary_container],
             tolerations=[
                 V1Toleration(effect="NoSchedule", key="ng", value="cpu-96-spot")
@@ -108,7 +110,9 @@ def _get_medium_pod() -> Pod:
     primary_container.resources = resources
 
     return Pod(
+        annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
         pod_spec=V1PodSpec(
+            runtime_class_name="sysbox-runc",
             containers=[primary_container],
             tolerations=[
                 V1Toleration(effect="NoSchedule", key="ng", value="cpu-32-spot")
@@ -129,7 +133,9 @@ def _get_small_pod() -> Pod:
     primary_container.resources = resources
 
     return Pod(
+        annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
         pod_spec=V1PodSpec(
+            runtime_class_name="sysbox-runc",
             containers=[primary_container],
         ),
         primary_container_name="primary",
@@ -289,14 +295,18 @@ def custom_task(cpu: int, memory: int):
     primary_container.resources = resources
     if cpu < 48 and memory < 128:
         task_config = Pod(
+            annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
             pod_spec=V1PodSpec(
+                runtime_class_name="sysbox-runc",
                 containers=[primary_container],
             ),
             primary_container_name="primary",
         )
     elif cpu < 96 and memory < 180:
         task_config = Pod(
+            annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
             pod_spec=V1PodSpec(
+                runtime_class_name="sysbox-runc",
                 containers=[primary_container],
                 tolerations=[
                     V1Toleration(effect="NoSchedule", key="ng", value="cpu-96-spot")
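Review note: the two `custom_task` branches visible in this hunk can be summarized with plain dictionaries. This is a minimal sketch, not the SDK's code. `custom_pod_settings` is a hypothetical helper, the dictionaries stand in for the `Pod` and `V1PodSpec` objects, and requests beyond the second threshold raise because the diff does not show how they are handled.

```python
from typing import Any, Dict

# Values copied from the hunk above.
SYSBOX_ANNOTATIONS = {"io.kubernetes.cri-o.userns-mode": "auto:size=65536"}
SYSBOX_RUNTIME = "sysbox-runc"


def custom_pod_settings(cpu: int, memory: int) -> Dict[str, Any]:
    """Hypothetical helper mirroring the two branches shown in the diff."""
    settings: Dict[str, Any] = {
        "annotations": dict(SYSBOX_ANNOTATIONS),
        "runtime_class_name": SYSBOX_RUNTIME,
        "tolerations": [],
    }
    if cpu < 48 and memory < 128:
        # First branch: Sysbox runtime, no tolerations.
        return settings
    if cpu < 96 and memory < 180:
        # Second branch: Sysbox runtime plus the cpu-96-spot toleration.
        settings["tolerations"].append(
            {"effect": "NoSchedule", "key": "ng", "value": "cpu-96-spot"}
        )
        return settings
    # Assumption: larger requests are handled by code outside this hunk.
    raise NotImplementedError("branch not shown in the diff")


print(custom_pod_settings(8, 32))
print(custom_pod_settings(64, 150))
```

Both visible branches receive the same annotation and runtime class; only the toleration differs.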

diff --git a/latch/types/directory.py b/latch/types/directory.py
@@ -148,7 +148,6 @@ def to_python_value(
                 "Casting from Pathlike to LatchDir is currently not supported."
             )
 
-
         while get_origin(expected_python_type) == Annotated:
             expected_python_type = get_args(expected_python_type)[0]
 

diff --git a/latch_cli/constants.py b/latch_cli/constants.py
@@ -7,7 +7,7 @@
 @dataclass(frozen=True)
 class LatchConstants:
     base_image: str = (
-        "812206152185.dkr.ecr.us-west-2.amazonaws.com/latch-base:ace9-main"
+        "812206152185.dkr.ecr.us-west-2.amazonaws.com/latch-base:9c8f-main"
     )
 
     mib: int = 2**20

diff --git a/latch_cli/docker_utils/__init__.py b/latch_cli/docker_utils/__init__.py
@@ -6,10 +6,11 @@
 from textwrap import dedent
 from typing import List
 
+import click
 import yaml
 
 from latch_cli.constants import latch_constants
-from latch_cli.types import LatchWorkflowConfig
+from latch_cli.workflow_config import LatchWorkflowConfig, create_and_write_config
 
 
 class DockerCmdBlockOrder(str, Enum):
@@ -197,14 +198,17 @@ def generate_dockerfile(pkg_root: Path, outfile: Path) -> None:
 
     print("Generating Dockerfile")
     try:
-        with open(pkg_root / latch_constants.pkg_config) as f:
+        with (pkg_root / latch_constants.pkg_config).open("r") as f:
             config: LatchWorkflowConfig = LatchWorkflowConfig(**json.load(f))
             print("  - base image:", config.base_image)
             print("  - latch version:", config.latch_version)
     except FileNotFoundError as e:
-        raise RuntimeError(
-            "Could not find a .latch/config file in the supplied directory. If your workflow was created prior to release 2.13.0, you may need to run `latch init` to generate a .latch/config file."
-        ) from e
+        print(
+            "Could not find a .latch/config file in the supplied directory. Creating configuration"
+        )
+        create_and_write_config(pkg_root)
+        with (pkg_root / latch_constants.pkg_config).open("r") as f:
+            config: LatchWorkflowConfig = LatchWorkflowConfig(**json.load(f))
 
     with outfile.open("w") as f:
         f.write("\n".join(get_prologue(config)) + "\n\n")
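Review note: the new fallback in `generate_dockerfile` follows a load-or-create pattern. The sketch below reproduces that pattern with the standard library only. `WorkflowConfig`, `write_default_config`, and the placeholder default values are hypothetical stand-ins for `LatchWorkflowConfig` and `create_and_write_config`, whose real contents are not shown in this diff.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path

# Relative path taken from the ".latch/config" message in the diff.
CONFIG_RELPATH = Path(".latch") / "config"


@dataclass(frozen=True)
class WorkflowConfig:
    """Hypothetical stand-in for LatchWorkflowConfig."""

    base_image: str
    latch_version: str


def write_default_config(pkg_root: Path) -> None:
    """Hypothetical stand-in for create_and_write_config."""
    path = pkg_root / CONFIG_RELPATH
    path.parent.mkdir(parents=True, exist_ok=True)
    # Placeholder values; the real defaults are not part of this diff.
    default = WorkflowConfig(
        base_image="example-base:placeholder", latch_version="0.0.0"
    )
    path.write_text(json.dumps(asdict(default)))


def load_or_create_config(pkg_root: Path) -> WorkflowConfig:
    """Read the config, creating it first if the file is missing."""
    path = pkg_root / CONFIG_RELPATH
    try:
        with path.open("r") as f:
            return WorkflowConfig(**json.load(f))
    except FileNotFoundError:
        write_default_config(pkg_root)
        with path.open("r") as f:
            return WorkflowConfig(**json.load(f))
```

In the hunk above, the `except` branch does not print the base image or latch version after creating the file, so those two status lines appear only when the config already exists. Returning the config from one helper, as sketched here, would let the caller print them in both cases.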

diff --git a/latch_cli/main.py b/latch_cli/main.py
@@ -12,6 +12,7 @@
 from latch_cli.exceptions.handler import CrashHandler
 from latch_cli.services.init.init import template_flag_to_option
 from latch_cli.utils import get_latest_package_version, get_local_package_version
+from latch_cli.workflow_config import BaseImageOptions
 
 latch_cli.click_utils.patch()
 
@@ -161,23 +162,20 @@ def login(connection: Optional[str]):
     default=False,
 )
 @click.option(
-    "--cuda",
-    help="Create a user editable Dockerfile for this workflow.",
-    is_flag=True,
-    default=False,
-)
-@click.option(
-    "--opencl",
-    help="Create a user editable Dockerfile for this workflow.",
-    is_flag=True,
-    default=False,
+    "--base-image",
+    "-b",
+    help="Which base image to use for the Dockerfile.",
+    type=click.Choice(
+        list(BaseImageOptions._member_names_),
+        case_sensitive=False,
+    ),
+    default="default",
 )
 def init(
     pkg_name: str,
     template: Optional[str] = None,
     dockerfile: bool = False,
-    cuda: bool = False,
-    opencl: bool = False,
+    base_image: str = "default",
 ):
     """Initialize boilerplate for local workflow code."""
 
@@ -186,7 +184,7 @@ def init(
 
     from latch_cli.services.init import init
 
-    created = init(pkg_name, template, dockerfile, cuda, opencl)
+    created = init(pkg_name, template, dockerfile, base_image)
     if created:
         click.secho(f"Created a latch workflow in `{pkg_name}`", fg="green")
         click.secho("Run", fg="green")
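Review note: the `--base-image` option builds its choices from `BaseImageOptions._member_names_`, a private `Enum` attribute. The sketch below shows the public equivalent, `__members__`, together with a case-insensitive lookup similar in spirit to `case_sensitive=False`. The enum here is hypothetical: its member names come from the documented choices (`default`, `cuda`, `opencl`, `docker`), and its values are placeholders because the real definition is not in this diff.

```python
from enum import Enum
from typing import List


class BaseImageOptions(Enum):
    """Hypothetical mirror of the enum imported in main.py."""

    default = "default"
    cuda = "cuda"
    opencl = "opencl"
    docker = "docker"


def base_image_choices() -> List[str]:
    # __members__ is the public mapping of member names to members.
    return list(BaseImageOptions.__members__)


def parse_base_image(raw: str = "default") -> BaseImageOptions:
    """Resolve a user-supplied name without regard to case."""
    wanted = raw.lower()
    for name, member in BaseImageOptions.__members__.items():
        if name.lower() == wanted:
            return member
    raise ValueError(
        f"invalid base image {raw!r}; choose from {base_image_choices()}"
    )


print(base_image_choices())
print(parse_base_image("CUDA"))
```

For an enum without aliases, `list(BaseImageOptions.__members__)` yields the same names as `_member_names_`.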

diff --git a/latch_cli/services/init/example_docker/assemble.py b/latch_cli/services/init/example_docker/assemble.py
@@ -0,0 +1,63 @@
+import subprocess
+from pathlib import Path
+
+from latch import small_task
+from latch.functions.messages import message
+from latch.types import LatchFile, LatchOutputDir
+
+
+@small_task
+def assembly_task(
+    read1: LatchFile, read2: LatchFile, output_directory: LatchOutputDir
+) -> LatchFile:
+
+    outdir = Path("/root/outputs").resolve()
+    outdir.mkdir(exist_ok=True)
+
+    bowtie2_cmd = [
+        "docker",
+        "run",
+        "--user",
+        "root",
+        "--env",
+        "BOWTIE2_INDEXES=/reference",
+        "--mount",
+        "type=bind,source=/root/reference,target=/reference",
+        "--mount",
+        f"type=bind,source={read1.local_path},target=/r1.fq",
+        "--mount",
+        f"type=bind,source={read2.local_path},target=/r2.fq",
+        "--mount",
+        f"type=bind,source={outdir},target=/outputs",
+        "biocontainers/bowtie2:v2.4.1_cv1",
+        "bowtie2",
+        "--local",
+        "--very-sensitive-local",
+        "-x",
+        "wuhan",
+        "-1",
+        "/r1.fq",
+        "-2",
+        "/r2.fq",
+        "-S",
+        "/outputs/covid_assembly.sam",
+    ]
+
+    try:
+        # When using shell=True, we pass the entire command as a single string as
+        # opposed to a list since the shell will parse the string into a list
+        # using its own rules.
+        subprocess.run(" ".join(bowtie2_cmd), shell=True, check=True)
+    except subprocess.CalledProcessError as e:
+        message(
+            "error",
+            {"title": "Bowtie2 Failed", "body": f"Error: {str(e)}"},
+        )
+        raise e
+
+    # intended output path of the file in Latch console, constructed from
+    # the user provided output directory
+    output_location = f"{output_directory.remote_directory}/covid_assembly.sam"
+    local_sam_file = outdir / "covid_assembly.sam"
+
+    return LatchFile(str(local_sam_file), output_location)
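Review note: the template joins the argument list with spaces and runs it with `shell=True`, so a local path containing a space or shell metacharacter would be split or reinterpreted by the shell. A possible alternative, sketched below under the assumption that no shell features are needed, is to keep the command as a list. `docker_run_cmd` is a hypothetical helper, and the example paths are made up for illustration.

```python
import shlex
from typing import Dict, List


def docker_run_cmd(
    image: str,
    command: List[str],
    binds: Dict[str, str],
    env: Dict[str, str],
) -> List[str]:
    """Build a `docker run` argument list with bind mounts (hypothetical helper)."""
    cmd = ["docker", "run", "--user", "root"]
    for key, value in env.items():
        cmd += ["--env", f"{key}={value}"]
    for source, target in binds.items():
        cmd += ["--mount", f"type=bind,source={source},target={target}"]
    cmd.append(image)
    cmd += command
    return cmd


cmd = docker_run_cmd(
    image="biocontainers/bowtie2:v2.4.1_cv1",
    command=["bowtie2", "-x", "wuhan", "-1", "/r1.fq", "-2", "/r2.fq",
             "-S", "/outputs/covid_assembly.sam"],
    # The space in the first source path is deliberate.
    binds={"/root/my reads/r1.fq": "/r1.fq", "/root/reference": "/reference"},
    env={"BOWTIE2_INDEXES": "/reference"},
)

# shlex.join quotes each argument, which is handy for logging the command.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` without `shell=True` hands each list element to Docker as a single argument, so the path with a space stays intact.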