Skip to content

Commit

Permalink
feat(reliability): integrate the ryuk container for better container …
Browse files Browse the repository at this point in the history
…cleanup (testcontainers#314)

> [!NOTE]
> Editor's note from @totallyzen:
> After a thorough discussion between @santi @alexanderankin, @kiview
and @totallyzen on the
[slack](https://testcontainers.slack.com/archives/C04SRG5AXNU/p1710156743640249)
we've decided this isn't a breaking change in api, but a modification in
behaviour. Therefore not worth a 5.0 but we'll release it under 4.x
> If this did end up breaking your workflow, come talk to us about your
use-case!

**What are you trying to do?**

Use Ryuk as the default resource cleanup strategy.

**Why should it be done this way?**

The current implementation of tc-python does not handle container
lifecycle management the same way as other TC implementations. In this
PR I introduce Ryuk to be the default resource cleanup strategy, in
order to be better aligned with other TC implementations.

Ryuk is enabled by default, but can be disabled with the
`TESTCONTAINERS_RYUK_DISABLED` env variable (which follows the behavior
of other TC implementations). Ryuk behavior can further be altered with
the `TESTCONTAINERS_RYUK_PRIVILEGED`, `RYUK_CONTAINER_IMAGE` and
`TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE` (also following the same
conventions from other implementations). Documentation of these env
variables is added to the README in the root of the repo.

The implementation uses a singleton container that starts an instance of
Ryuk and stores it on class / module level, so that it is reused within
a process. This follows the same pattern as in other languages, and Ryuk
behaviour itself is implemented the exact same way as other
implementations.

**BREAKING CHANGE**

`__del__` support is now removed, in order to better align with other TC
implementations. From the comments in the `__del__` implementation, it
seems like this was added as a final attempt to cleanup resources on
exit/garbage collection. This leaves three ways to cleanup resources,
which has better defined behaviors and follows modern patterns for
resource management in Python:

Method 1: Context Manager
Init/cleanup through a context manager (__enter__/__exit__):
```
with DockerContainer("postgres") as container:
    # some logic here

# end of context: container is killed by `__exit__` method
```

Method 2: Manual start() and stop()
```
container = DockerContainer("postgres").start()
# some logic here
container.stop()
```

Method 3: Ryuk
```
container = DockerContainer("postgres").start()
# some logic here

# You forget to .stop() the container, but Ryuk kills it for you 10 seconds after your process exits.
```

_Why remove `__del__`?_ According to the previous maintainer of the
repo, it has been causing “[a bunch of
issues](https://github.com/testcontainers/testcontainers-python/pull/314#discussion_r1185321083)”,
which I have personally experienced while using TC in a Django app, due
to the automatic GC behavior when no longer referencing the container
with a variable. E.g. if you instantiate the container in a method, only
returning the connection string, the Python garbage collector will
automatically call `__del__` on the instance at the end of the function,
thus killing your container. This leads to clunky workarounds like
having to store a reference to the container in a module-level variable,
or always having to return a reference to the container from the
function creating the container. In addition, the gc behaviour is not
consistent across Python implementations, making the reliance on
`__del__` flaky at best.

Also, having the __del__ method cleanup your container prevents us from
implementing `with_reuse()` (which is implemented in other TC
implementations) in the future, as a process exit would always instantly
kill the container, preventing us to use it in another process before
Ryuk reaps it.

**Next steps**

Once this PR is accepted, my plan is to implement the `with_reuse()`
functionality seen in other implementations, to enable faster / instant
usage of existing containers. This is very useful in simple testing
scenarios or local development workflows using hot reload behaviour. The
`with_reuse()` API requires the removal of `__del__` cleanup, as
otherwise the container would not be available for reuse due to the GC
reaping the container as soon as the process exits.

**Other changes**
- Adds “x-tc-sid=SESSION_ID” header to the underlying Docker API client
with the value of the current session ID (created on module init), in
order to enable Testcontainers Cloud to operate in “Turbo mode”
testcontainers#314 (comment)
- Adds labels `org.testcontainers.lang=python` and
`org.testcontainers.session-id=SESSION_ID`to the containers created by
TC
- As mentioned above, the env variables TESTCONTAINERS_RYUK_DISABLED,
TESTCONTAINERS_RYUK_PRIVILEGED, RYUK_CONTAINER_IMAGE and
TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE are now used for customizing
tc-python behavior.

---------

Co-authored-by: Andre Hofmeister <9199345+HofmeisterAn@users.noreply.github.com>
Co-authored-by: Balint Bartha <39852431+totallyzen@users.noreply.github.com>
  • Loading branch information
3 people authored and bstrausser committed Mar 30, 2024
1 parent ae7b163 commit fdadee4
Show file tree
Hide file tree
Showing 8 changed files with 153 additions and 33 deletions.
15 changes: 15 additions & 0 deletions INDEX.rst
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,21 @@ When trying to launch a testcontainer from within a Docker container, e.g., in c
1. The container has to provide a docker client installation. Either use an image that has docker pre-installed (e.g. the `official docker images <https://hub.docker.com/_/docker>`_) or install the client from within the `Dockerfile` specification.
2. The container has to have access to the docker daemon which can be achieved by mounting `/var/run/docker.sock` or setting the `DOCKER_HOST` environment variable as part of your `docker run` command.

Configuration
-------------

+-------------------------------------------+-------------------------------+------------------------------------------+
| Env Variable | Example | Description |
+===========================================+===============================+==========================================+
| ``TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE`` | ``/var/run/docker.sock`` | Path to Docker's socket used by ryuk |
+-------------------------------------------+-------------------------------+------------------------------------------+
| ``TESTCONTAINERS_RYUK_PRIVILEGED`` | ``false`` | Run ryuk as a privileged container |
+-------------------------------------------+-------------------------------+------------------------------------------+
| ``TESTCONTAINERS_RYUK_DISABLED`` | ``false`` | Disable ryuk |
+-------------------------------------------+-------------------------------+------------------------------------------+
| ``RYUK_CONTAINER_IMAGE`` | ``testcontainers/ryuk:0.5.1`` | Custom image for ryuk |
+-------------------------------------------+-------------------------------+------------------------------------------+

Development and Contributing
----------------------------

Expand Down
9 changes: 9 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,3 +22,12 @@ For more information, see [the docs][readthedocs].
```

The snippet above will spin up a postgres database in a container. The `get_connection_url()` convenience method returns a `sqlalchemy` compatible url we use to connect to the database and retrieve the database version.

## Configuration

| Env Variable | Example | Description |
| ----------------------------------------- | ----------------------------- | ---------------------------------------- |
| `TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE` | `/var/run/docker.sock` | Path to Docker's socket used by ryuk |
| `TESTCONTAINERS_RYUK_PRIVILEGED` | `false` | Run ryuk as a privileged container |
| `TESTCONTAINERS_RYUK_DISABLED` | `false` | Disable ryuk |
| `RYUK_CONTAINER_IMAGE` | `testcontainers/ryuk:0.5.1` | Custom image for ryuk |
5 changes: 5 additions & 0 deletions core/testcontainers/core/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,8 @@
MAX_TRIES = int(environ.get("TC_MAX_TRIES", 120))
SLEEP_TIME = int(environ.get("TC_POOLING_INTERVAL", 1))
TIMEOUT = MAX_TRIES * SLEEP_TIME

RYUK_IMAGE: str = environ.get("RYUK_CONTAINER_IMAGE", "testcontainers/ryuk:0.5.1")
RYUK_PRIVILEGED: bool = environ.get("TESTCONTAINERS_RYUK_PRIVILEGED", "false") == "true"
RYUK_DISABLED: bool = environ.get("TESTCONTAINERS_RYUK_DISABLED", "false") == "true"
RYUK_DOCKER_SOCKET: str = environ.get("TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE", "/var/run/docker.sock")
91 changes: 72 additions & 19 deletions core/testcontainers/core/container.py
Original file line number Diff line number Diff line change
@@ -1,13 +1,16 @@
import contextlib
from platform import system
from typing import Optional

from docker.models.containers import Container
from socket import socket
from typing import TYPE_CHECKING, Optional

from testcontainers.core.config import RYUK_DISABLED, RYUK_DOCKER_SOCKET, RYUK_IMAGE, RYUK_PRIVILEGED
from testcontainers.core.docker_client import DockerClient
from testcontainers.core.exceptions import ContainerStartException
from testcontainers.core.labels import LABEL_SESSION_ID, SESSION_ID
from testcontainers.core.utils import inside_container, is_arm, setup_logger
from testcontainers.core.waiting_utils import wait_container_is_ready
from testcontainers.core.waiting_utils import wait_container_is_ready, wait_for_logs

if TYPE_CHECKING:
from docker.models.containers import Container

logger = setup_logger(__name__)

Expand All @@ -25,7 +28,12 @@ class DockerContainer:
... delay = wait_for_logs(container, "Hello from Docker!")
"""

def __init__(self, image: str, docker_client_kw: Optional[dict] = None, **kwargs) -> None:
def __init__(
self,
image: str,
docker_client_kw: Optional[dict] = None,
**kwargs,
) -> None:
self.env = {}
self.ports = {}
self.volumes = {}
Expand Down Expand Up @@ -58,7 +66,10 @@ def maybe_emulate_amd64(self) -> "DockerContainer":
return self.with_kwargs(platform="linux/amd64")
return self

def start(self) -> "DockerContainer":
def start(self):
if not RYUK_DISABLED and self.image != RYUK_IMAGE:
logger.debug("Creating Ryuk container")
Reaper.get_instance()
logger.info("Pulling image %s", self.image)
docker_client = self.get_docker_client()
self._container = docker_client.run(
Expand All @@ -69,7 +80,7 @@ def start(self) -> "DockerContainer":
ports=self.ports,
name=self._name,
volumes=self.volumes,
**self._kwargs
**self._kwargs,
)
logger.info("Container started: %s", self._container.short_id)
return self
Expand All @@ -78,21 +89,12 @@ def stop(self, force=True, delete_volume=True) -> None:
self._container.remove(force=force, v=delete_volume)
self.get_docker_client().client.close()

def __enter__(self) -> "DockerContainer":
def __enter__(self):
return self.start()

def __exit__(self, exc_type, exc_val, exc_tb) -> None:
self.stop()

def __del__(self) -> None:
"""
__del__ runs when Python attempts to garbage collect the object.
In case of leaky test design, we still attempt to clean up the container.
"""
with contextlib.suppress(Exception):
if self._container is not None:
self.stop()

def get_container_host_ip(self) -> str:
# infer from docker host
host = self.get_docker_client().host()
Expand Down Expand Up @@ -140,7 +142,7 @@ def with_volume_mapping(self, host: str, container: str, mode: str = "ro") -> "D
self.volumes[host] = mapping
return self

def get_wrapped_container(self) -> Container:
def get_wrapped_container(self) -> "Container":
return self._container

def get_docker_client(self) -> DockerClient:
Expand All @@ -155,3 +157,54 @@ def exec(self, command) -> tuple[int, str]:
if not self._container:
raise ContainerStartException("Container should be started before executing a command")
return self._container.exec_run(command)


class Reaper:
_instance: "Optional[Reaper]" = None
_container: Optional[DockerContainer] = None
_socket: Optional[socket] = None

@classmethod
def get_instance(cls) -> "Reaper":
if not Reaper._instance:
Reaper._instance = Reaper._create_instance()

return Reaper._instance

@classmethod
def delete_instance(cls) -> None:
if Reaper._socket is not None:
Reaper._socket.close()
Reaper._socket = None

if Reaper._container is not None:
Reaper._container.stop()
Reaper._container = None

if Reaper._instance is not None:
Reaper._instance = None

@classmethod
def _create_instance(cls) -> "Reaper":
logger.debug(f"Creating new Reaper for session: {SESSION_ID}")

Reaper._container = (
DockerContainer(RYUK_IMAGE)
.with_name(f"testcontainers-ryuk-{SESSION_ID}")
.with_exposed_ports(8080)
.with_volume_mapping(RYUK_DOCKER_SOCKET, "/var/run/docker.sock", "rw")
.with_kwargs(privileged=RYUK_PRIVILEGED)
.start()
)
wait_for_logs(Reaper._container, r".* Started!")

container_host = Reaper._container.get_container_host_ip()
container_port = int(Reaper._container.get_exposed_port(8080))

Reaper._socket = socket()
Reaper._socket.connect((container_host, container_port))
Reaper._socket.send(f"label={LABEL_SESSION_ID}={SESSION_ID}\r\n".encode())

Reaper._instance = Reaper()

return Reaper._instance
19 changes: 5 additions & 14 deletions core/testcontainers/core/docker_client.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import atexit
import functools as ft
import os
import urllib
Expand All @@ -19,10 +18,10 @@
from typing import Optional, Union

import docker
from docker.errors import NotFound
from docker.models.containers import Container, ContainerCollection

from .utils import default_gateway_ip, inside_container, setup_logger
from testcontainers.core.labels import SESSION_ID, create_labels
from testcontainers.core.utils import default_gateway_ip, inside_container, setup_logger

LOGGER = setup_logger(__name__)
TC_FILE = ".testcontainers.properties"
Expand All @@ -42,6 +41,7 @@ def __init__(self, **kwargs) -> None:
self.client = docker.DockerClient(base_url=docker_host)
else:
self.client = docker.from_env(**kwargs)
self.client.api.headers["x-tc-sid"] = SESSION_ID

@ft.wraps(ContainerCollection.run)
def run(
Expand All @@ -50,6 +50,7 @@ def run(
command: Optional[Union[str, list[str]]] = None,
environment: Optional[dict] = None,
ports: Optional[dict] = None,
labels: Optional[dict[str, str]] = None,
detach: bool = False,
stdout: bool = True,
stderr: bool = False,
Expand All @@ -65,10 +66,9 @@ def run(
detach=detach,
environment=environment,
ports=ports,
labels=create_labels(image, labels),
**kwargs,
)
if detach:
atexit.register(_stop_container, container)
return container

def port(self, container_id: str, port: int) -> int:
Expand Down Expand Up @@ -145,12 +145,3 @@ def read_tc_properties() -> dict[str, str]:
tuples = [line.split("=") for line in contents.readlines() if "=" in line]
settings = {**settings, **{item[0].strip(): item[1].strip() for item in tuples}}
return settings


def _stop_container(container: Container) -> None:
try:
container.stop()
except NotFound:
pass
except Exception as ex:
LOGGER.warning("failed to shut down container %s with image %s: %s", container.id, container.image, ex)
20 changes: 20 additions & 0 deletions core/testcontainers/core/labels.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
from typing import Optional
from uuid import uuid4

from testcontainers.core.config import RYUK_IMAGE

SESSION_ID: str = str(uuid4())
LABEL_SESSION_ID = "org.testcontainers.session-id"
LABEL_LANG = "org.testcontainers.lang"


def create_labels(image: str, labels: Optional[dict[str, str]]) -> dict[str, str]:
if labels is None:
labels = {}
labels[LABEL_LANG] = "python"

if image == RYUK_IMAGE:
return labels

labels[LABEL_SESSION_ID] = SESSION_ID
return labels
24 changes: 24 additions & 0 deletions core/tests/test_ryuk.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
from testcontainers.core import container
from testcontainers.core.container import Reaper
from testcontainers.core.container import DockerContainer
from testcontainers.core.waiting_utils import wait_for_logs


def test_wait_for_reaper():
container = DockerContainer("hello-world").start()
wait_for_logs(container, "Hello from Docker!")

assert Reaper._socket is not None
Reaper._socket.close()

assert Reaper._container is not None
wait_for_logs(Reaper._container, r".* Removed \d .*", timeout=30)

Reaper.delete_instance()


def test_container_without_ryuk(monkeypatch):
monkeypatch.setattr(container, "RYUK_DISABLED", True)
with DockerContainer("hello-world") as cont:
wait_for_logs(cont, "Hello from Docker!")
assert Reaper._instance is None
3 changes: 3 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -199,6 +199,9 @@ ignore = [
"INP001"
]

[tool.ruff.lint.pyupgrade]
keep-runtime-typing = true

[tool.ruff.lint.flake8-type-checking]
strict = true

Expand Down

0 comments on commit fdadee4

Please sign in to comment.