Merge pull request #260 from cityofaustin/mike/19797_maintenance_tasks
Update readme and update Airflow version
mddilley authored Dec 18, 2024
2 parents a355323 + 8cb7f8a commit 36ffbae
Showing 5 changed files with 58 additions and 46 deletions.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
-FROM apache/airflow:2.9.0
+FROM apache/airflow:2.10.3

USER root
RUN apt-get update
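After bumping the base image, a quick hedged check that the rebuilt stack is actually running the new version — `airflow-webserver` is the service name in the upstream compose example; adjust if this stack names it differently:

```shell
# Print the Airflow version from inside the running webserver container.
docker compose exec airflow-webserver airflow version
```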
76 changes: 49 additions & 27 deletions README.md
@@ -14,17 +14,23 @@ The stack is composed of:

## Table of Contents

-- [Getting Started](#getting-started)
-- [Developing a new DAG](#developing-a-new-dag)
-  - [Tags](#tags)
-- [Moving to production](#moving-to-production)
-- [Utilities](#utilities)
-  - [1Password utility](#1password-utility)
-  - [Slack operator utility](#slack-operator-utility)
-- [Useful Commands](#useful-commands)
-- [Updating the stack](#updating-the-stack)
-- [HAProxy and SSL](#haproxy-and-ssl)
-  - [HAProxy operation](#haproxy-operation)
+- [DTS Airflow](#dts-airflow)
+  - [Table of Contents](#table-of-contents)
+  - [Getting Started](#getting-started)
+  - [Developing a new DAG](#developing-a-new-dag)
+    - [Tags](#tags)
+  - [Moving to production](#moving-to-production)
+  - [Utilities](#utilities)
+    - [1Password utility](#1password-utility)
+    - [Slack operator utility](#slack-operator-utility)
+  - [Useful Commands](#useful-commands)
+  - [Updating the stack](#updating-the-stack)
+    - [Update Process](#update-process)
+      - [Testing a new Airflow version](#testing-a-new-airflow-version)
+      - [Update the production stack after merge](#update-the-production-stack-after-merge)
+  - [HAProxy and SSL](#haproxy-and-ssl)
+    - [HAProxy operation](#haproxy-operation)
+  - [Ideas](#ideas)

## Getting Started

@@ -51,7 +57,7 @@ DOCKER_HUB_TOKEN=<A Docker Hub access token assigned specifically to you>
3. Start the Docker stack (optionally use the `-d` flag to run containers in the background):

```bash
-$ docker compose up -d
+docker compose up -d
```

4. Log in to the dashboard at `http://localhost:8080` using the username and password set in your `.env` file.
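For reference, a hedged sketch of a `.env` file. The variable names come from this README and the repo's `docker-compose.yaml`; the `_AIRFLOW_WWW_USER_*` names are assumed from the upstream Airflow compose example, and all values are placeholders:

```shell
# .env sketch — placeholder values only
_AIRFLOW_WWW_USER_USERNAME=admin                     # dashboard login (name assumed from upstream example)
_AIRFLOW_WWW_USER_PASSWORD=change-me
_AIRFLOW__CORE__FERNET_KEY=<generated Fernet key>
_AIRFLOW__WEBSERVER__BASE_URL=http://localhost:8080
OP_API_TOKEN=<1Password Connect API token>
OP_CONNECT=<1Password Connect host>
OP_VAULT_ID=<1Password vault id>
DOCKER_HUB_TOKEN=<a Docker Hub access token assigned specifically to you>
```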
@@ -101,7 +107,7 @@ exit;
The production Airflow deployment uses a second Docker compose file which provides haproxy configuration overrides. To start the production docker compose stack, you must load both files in order:

```shell
-$ docker compose -f docker-compose.yaml -f docker-compose-production.yaml up -d
+docker compose -f docker-compose.yaml -f docker-compose-production.yaml up -d
```
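Compose merges the files left to right, so keys in `docker-compose-production.yaml` override or extend the base file. A hedged sketch of the shape such an override can take — the `haproxy` service and values here are illustrative assumptions, not the repo's actual file:

```yaml
# docker-compose-production.yaml — illustrative shape only
services:
  haproxy:
    ports:
      - "443:443"                           # expose TLS in production (assumed)
    volumes:
      - ./haproxy/ssl:/etc/ssl/private:ro   # hypothetical certificate path
```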

## Utilities
@@ -151,26 +157,28 @@ The Slack operator utility makes use of the integration between the Airflow and

To configure the Slack operator in your local instance, from the Airflow UI go to **Admin** > **Connections** and choose **Slack API** as the **connection type**. You can find the remaining settings in 1Password under the **Airflow - Slack Bot** item.

**To test the Slack operator locally**, see the DAG named `test_slack_notifier`.
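For orientation, a minimal sketch of how a DAG wires in these notifiers. `task_fail_slack_alert` comes from `dags/utils/slack_operator.py` (shown below); the DAG id, schedule, and task are hypothetical, and the import path assumes the `dags/` folder is on `sys.path`, as is standard in Airflow:

```python
# Sketch: attach the Slack failure callback to every task in a DAG.
from datetime import datetime

from airflow.decorators import dag, task

from utils.slack_operator import task_fail_slack_alert  # repo utility


@dag(
    dag_id="example_with_slack_alerts",  # hypothetical DAG id
    schedule=None,
    start_date=datetime(2024, 1, 1),
    catchup=False,
    # Every task failure in this DAG posts to Slack via the shared callback.
    default_args={"on_failure_callback": task_fail_slack_alert},
)
def example_with_slack_alerts():
    @task
    def do_work():
        raise RuntimeError("boom")  # force a failure to exercise the notifier

    do_work()


example_with_slack_alerts()
```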

## Useful Commands

- 🐚 Get a shell on a worker, for example:

```shell
-$ docker exec -it airflow-airflow-worker-1 bash
+docker exec -it airflow-airflow-worker-1 bash
```

- ⛔ Stop all containers and execute this to reset your local database.
  - Do not run in production unless you feel really great about your backups.
  - This will reset your DAG run history and each DAG's on/off switch state.

```shell
-$ docker compose down --volumes --remove-orphans
+docker compose down --volumes --remove-orphans
```
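After wiping the volumes, the metadata database has to be re-initialized before the webserver will start. In the upstream compose layout this is handled by a one-shot init service; the name `airflow-init` is assumed from that example:

```shell
# Re-create the Airflow metadata DB and default user after a volume wipe.
docker compose up airflow-init
```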

Start the production docker compose stack with haproxy overrides:

```shell
-$ docker compose -f docker-compose.yaml -f docker-compose-production.yaml up -d
+docker compose -f docker-compose.yaml -f docker-compose-production.yaml up -d
```
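However the stack is started, a quick health check is cheap:

```shell
# All services should report running/healthy.
docker compose ps
```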

## Updating the stack
Expand All @@ -181,26 +189,40 @@ Follow these steps to update the Airflow docker step. Reasons for doing this inc
- Modifying the `Dockerfile` to upgrade the Airflow version
- Modifying the haproxy configuration

-#### Update Process
+### Update Process

#### Testing a new Airflow version

- Read the "Significant Changes" sections of the Airflow release notes between the versions in question: https://github.com/apache/airflow/releases/
  - Apache Airflow is a very active project, and these release notes are pretty dense. A regular update cadence helps keep the task of updating Airflow from becoming an "information overload" job.
- Snap a backup of the Airflow PostgreSQL database
- Create a local branch with the [Dockerfile](./Dockerfile) modified to the version you intend to test
- In the [docker-compose.yaml](./docker-compose.yaml), replace `image: atddocker/atd-airflow:production` with `build: .`
- Build the Docker images locally:
  ```shell
  docker compose build --no-cache
  ```
- Bring up the services, check the logs for errors, and see that everything runs as expected:
  ```shell
  docker compose up
  ```
- Check that you can reach the Airflow dashboard at `http://localhost:8080`
- If any updates affect the Slack notifier, see the [instructions to test it](#slack-operator-utility)
- Bring down the services:
  ```shell
  docker compose down
  ```
- In the [docker-compose.yaml](./docker-compose.yaml), switch `build: .` back to `image: atddocker/atd-airflow:production`
- Push your branch and create a PR for review


#### Update the production stack after merge

- After review and merge, snap a backup of the production Airflow PostgreSQL database
  - You shouldn't need it, but it can't hurt.
  - The following command requires that the stack being updated is running.
  - The string `postgres` in the following command denotes the `docker compose` service name, not the `postgres` system database that is present on all Postgres servers. The target database is set via the environment variable `PGDATABASE`.
  - `docker compose exec -t -e PGUSER=airflow -e PGPASSWORD=airflow -e PGDATABASE=airflow postgres pg_dump > DB_backup.sql` (see the restore sketch after this list)
- Stop the Airflow stack
  - `docker compose stop`
- Compare the `docker-compose.yaml` file in a way that is easily sharable with the team if needed
  - Start a new, blank gist at https://gist.github.com/
  - Copy the source code of the older version, for example: https://raw.githubusercontent.com/apache/airflow/2.5.3/docs/apache-airflow/howto/docker-compose/docker-compose.yaml
  - Paste that into your gist and save it. Make it public if you want to demonstrate the diff to anyone.
  - Copy the source code of the newer, target version and replace the contents of the file in your gist. An example URL would be: https://raw.githubusercontent.com/apache/airflow/2.6.1/docs/apache-airflow/howto/docker-compose/docker-compose.yaml
  - Look at the revisions of this gist and find the most recent one. This diff represents the changes from the older to the newer version of the upstream `docker-compose.yaml` file. For example (2.5.3 to 2.6.1): https://gist.github.com/frankhereford/c844d0674e9ad13ece8e2354c657854e/revisions
  - Consider each change; generally, you'll want to apply these changes to the local `docker-compose.yaml` file.
- Update the `FROM` line in the `Dockerfile` at the top of the repo to the target version.
- Update the comments in the docker-compose file that reference the version number (there are two).
- Pull the updates using the instructions in the [Moving to production section](#moving-to-production)
- Build the core docker images
  - `docker compose build`
- Build the `airflow-cli` image, which the Airflow team keeps in its own profile
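For completeness, a hedged sketch of restoring the `DB_backup.sql` dump taken above. It assumes the same `postgres` compose service and credentials as the backup command; verify against your stack before relying on it:

```shell
# Restore the Airflow metadata DB from the dump created above.
# -T disables the pseudo-TTY so stdin redirection works.
docker compose exec -T -e PGUSER=airflow -e PGPASSWORD=airflow -e PGDATABASE=airflow postgres psql < DB_backup.sql
```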
17 changes: 4 additions & 13 deletions dags/utils/slack_operator.py
@@ -15,7 +15,6 @@ def get_central_time_exec_data(context):


def task_fail_slack_alert_critical(context):
-    slack_webhook_token = BaseHook.get_connection(SLACK_CONN_ID).password
    slack_msg = """
        <!channel> :red_circle: Critical Failure
        *Task*: {task}
@@ -25,22 +24,19 @@ def task_fail_slack_alert_critical(context):
""".format(
task=context.get("task_instance").task_id,
dag=context.get("task_instance").dag_id,
ti=context.get("task_instance"),
exec_date=get_central_time_exec_data(context),
log_url=context.get("task_instance").log_url,
)
failed_alert = SlackWebhookOperator(
task_id="slack_critical_failure",
http_conn_id="slack",
webhook_token=slack_webhook_token,
slack_webhook_conn_id=SLACK_CONN_ID,
message=slack_msg,
username="airflow",
)
return failed_alert.execute(context=context)


def task_fail_slack_alert(context):
-    slack_webhook_token = BaseHook.get_connection(SLACK_CONN_ID).password
    slack_msg = """
        :red_circle: Task Failed.
        *Task*: {task}
@@ -50,39 +46,34 @@ def task_fail_slack_alert(context):
""".format(
task=context.get("task_instance").task_id,
dag=context.get("task_instance").dag_id,
ti=context.get("task_instance"),
exec_date=get_central_time_exec_data(context),
log_url=context.get("task_instance").log_url,
)
failed_alert = SlackWebhookOperator(
task_id="slack_failure",
http_conn_id="slack",
webhook_token=slack_webhook_token,
slack_webhook_conn_id=SLACK_CONN_ID,
message=slack_msg,
username="airflow",
)
return failed_alert.execute(context=context)


def task_success_slack_alert(context):
-    slack_webhook_token = BaseHook.get_connection(SLACK_CONN_ID).password
    slack_msg = """
-        :white_check_mark: Task Sucessfully Completed.
+        :white_check_mark: Task Successfully Completed.
        *Task*: {task}
        *Dag*: {dag}
        *Execution Time*: {exec_date}
        *Log Url*: {log_url}
    """.format(
        task=context.get("task_instance").task_id,
        dag=context.get("task_instance").dag_id,
-        ti=context.get("task_instance"),
        exec_date=get_central_time_exec_data(context),
        log_url=context.get("task_instance").log_url,
    )
    success_alert = SlackWebhookOperator(
        task_id="slack_success",
-        http_conn_id="slack",
-        webhook_token=slack_webhook_token,
+        slack_webhook_conn_id=SLACK_CONN_ID,
        message=slack_msg,
        username="airflow",
    )
7 changes: 4 additions & 3 deletions docker-compose.yaml
@@ -1,16 +1,17 @@
-# Docker compose based on official Airflow example, avaialble here:
+# Docker compose based on official Airflow example, available here:
# https://airflow.apache.org/docs/apache-airflow/2.6.1/docker-compose.yaml
# and documented here:
# https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#running-airflow-in-docker
x-airflow-common: &airflow-common
# In order to add custom dependencies or upgrade provider packages you can use your extended image.
# Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
-# and uncomment the "build" line below, Then run `docker-compose build` to build the images.
+# and uncomment the "build" line below, Then run `docker compose build` to build the images.
# build: .
image: atddocker/atd-airflow:production
environment: &airflow-common-env
AIRFLOW__CORE__FERNET_KEY: ${_AIRFLOW__CORE__FERNET_KEY}
AIRFLOW__WEBSERVER__BASE_URL: ${_AIRFLOW__WEBSERVER__BASE_URL:-http://localhost:8080}
-# You need to explictly allow .env vars into the airflow containers's environments here
+# You need to explicitly allow .env vars into the airflow containers's environments here
OP_API_TOKEN: ${OP_API_TOKEN}
OP_CONNECT: ${OP_CONNECT}
OP_VAULT_ID: ${OP_VAULT_ID}
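The `x-airflow-common: &airflow-common` block above is a YAML anchor: each Airflow service merges it instead of repeating the shared image, environment, and volumes. A hedged sketch of the consumption pattern, with service names taken from the upstream example:

```yaml
# Sketch: services reuse the shared block via YAML merge keys.
services:
  airflow-webserver:
    <<: *airflow-common   # merge shared image/env/volumes
    command: webserver
  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
```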
2 changes: 0 additions & 2 deletions requirements.txt
@@ -1,3 +1 @@
onepasswordconnectsdk==1.*
-docker==6.*
-apache-airflow-providers-slack==7.3.*
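The `docker` and Slack provider pins were dropped here, presumably because the Airflow 2.10.3 base image already bundles compatible providers. A hedged way to verify what actually ships in the image (worker service name assumed from the upstream compose example):

```shell
# List bundled provider/client versions inside a running container.
docker compose exec airflow-worker pip list | grep -iE "slack|docker"
```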
