Add docker release to the full release process for final releases #1004
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,42 @@ | ||
ARG OPENJDK_VERSION=8 | ||
FROM eclipse-temurin:${OPENJDK_VERSION}-jre | ||
|
||
ARG BUILD_DATE | ||
ARG SPARK_VERSION=3.3.2 | ||
ARG HADOOP_VERSION=3 | ||
|
||
LABEL org.label-schema.name="Apache Spark ${SPARK_VERSION}" \ | ||
org.label-schema.build-date=$BUILD_DATE \ | ||
org.label-schema.version=$SPARK_VERSION | ||
|
||
ENV SPARK_HOME /usr/spark | ||
ENV PATH="/usr/spark/bin:/usr/spark/sbin:${PATH}" | ||
|
||
RUN apt-get update && \ | ||
apt-get install -y wget netcat procps libpostgresql-jdbc-java && \ | ||
wget -q "http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" && \ | ||
tar xzf "spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" && \ | ||
rm "spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" && \ | ||
mv "spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}" /usr/spark && \ | ||
ln -s /usr/share/java/postgresql-jdbc4.jar /usr/spark/jars/postgresql-jdbc4.jar && \ | ||
apt-get remove -y wget && \ | ||
apt-get autoremove -y && \ | ||
apt-get clean | ||
|
||
COPY entrypoint.sh /scripts/ | ||
RUN chmod +x /scripts/entrypoint.sh | ||
|
||
ENTRYPOINT ["/scripts/entrypoint.sh"] | ||
CMD ["--help"] | ||
# this image gets published to GHCR for production use | ||
ARG py_version=3.11.2 | ||
|
||
FROM python:$py_version-slim-bullseye as base | ||
|
||
RUN apt-get update \ | ||
&& apt-get dist-upgrade -y \ | ||
&& apt-get install -y --no-install-recommends \ | ||
build-essential=12.9 \ | ||
ca-certificates=20210119 \ | ||
gcc=4:10.2.1-1 \ | ||
git=1:2.30.2-1+deb11u2 \ | ||
libpq-dev=13.14-0+deb11u1 \ | ||
libsasl2-dev=2.1.27+dfsg-2.1+deb11u1 \ | ||
make=4.3-4.1 \ | ||
openssh-client=1:8.4p1-5+deb11u3 \ | ||
python-dev-is-python2=2.7.18-9 \ | ||
software-properties-common=0.96.20.2-2.1 \ | ||
unixodbc-dev=2.3.6-0.1+b1 \ | ||
&& apt-get clean \ | ||
&& rm -rf \ | ||
/var/lib/apt/lists/* \ | ||
/tmp/* \ | ||
/var/tmp/* | ||
|
||
ENV PYTHONIOENCODING=utf-8 | ||
ENV LANG=C.UTF-8 | ||
|
||
RUN python -m pip install --upgrade "pip==24.0" "setuptools==69.2.0" "wheel==0.43.0" --no-cache-dir | ||
|
||
|
||
FROM base as dbt-spark | ||
Check failure on line 32 in docker/Dockerfile, Wiz Inc. (266a8a9c32) / Wiz IaC Scanner: Missing User Instruction
|
||
|
||
ARG commit_ref=main | ||
ARG extras=all | ||
|
||
HEALTHCHECK CMD dbt --version || exit 1 | ||
|
||
WORKDIR /usr/app/dbt/ | ||
ENTRYPOINT ["dbt"] | ||
|
||
RUN python -m pip install --no-cache-dir "dbt-spark[${extras}] @ git+https://github.com/dbt-labs/dbt-spark@${commit_ref}" | ||
Check warning on line 42 in docker/Dockerfile, Wiz Inc. (266a8a9c32) / Wiz IaC Scanner: Unpinned Package Version in Pip Install
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# Docker for dbt | ||
`Dockerfile` is suitable for building dbt Docker images locally or for use with CI/CD to automate populating a container registry. | ||
|
||
## Building an image: | ||
This Dockerfile can create images for the following target: `dbt-spark` | ||
|
||
In order to build a new image, run the following docker command. | ||
```shell | ||
docker build --tag <your_image_name> --target dbt-spark <path/to/dockerfile> | ||
``` | ||
--- | ||
> **Note:** Docker must be configured to use [BuildKit](https://docs.docker.com/develop/develop-images/build_enhancements/) in order for images to build properly! | ||
|
||
--- | ||
|
||
By default the image will be populated with the latest version of `dbt-spark` on `main`. | ||
If you need to use a different version you can specify it by git ref using the `--build-arg` flag: | ||
```shell | ||
docker build --tag <your_image_name> \ | ||
--target dbt-spark \ | ||
--build-arg commit_ref=<commit_ref> \ | ||
<path/to/dockerfile> | ||
``` | ||
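The choice between the default ref and a pinned one can be scripted with a small wrapper that assembles the `docker build` arguments and prints them for review. This is a minimal sketch, not part of this PR: the function name `dbt_spark_build_cmd` and its positional parameters are hypothetical; only `--tag`, `--target dbt-spark`, and `--build-arg commit_ref=<commit_ref>` come from the text above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): prints the docker build
# command for a given image name, build context, and optional git ref.
dbt_spark_build_cmd() {
    image_name="$1"      # value for --tag
    context_dir="$2"     # directory that contains the Dockerfile
    commit_ref="${3:-}"  # optional; empty means "use the Dockerfile default (main)"

    cmd="docker build --tag ${image_name} --target dbt-spark"
    if [ -n "${commit_ref}" ]; then
        cmd="${cmd} --build-arg commit_ref=${commit_ref}"
    fi
    printf '%s %s\n' "${cmd}" "${context_dir}"
}

# Print the commands rather than running them, so they can be inspected first.
dbt_spark_build_cmd my-dbt .
dbt_spark_build_cmd my-other-dbt . v1.0.0b1
```

Printing instead of executing keeps the sketch safe to run on a machine without Docker; piping the output to `sh` would run the build.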
|
||
### Examples: | ||
To build an image named "my-dbt" that supports Spark using the latest `dbt-spark` on `main`: | ||
```shell | ||
cd dbt-spark/docker | ||
docker build --tag my-dbt --target dbt-spark . | ||
``` | ||
|
||
To build an image named "my-other-dbt" that supports Spark using adapter version 1.0.0b1: | ||
```shell | ||
cd dbt-spark/docker | ||
docker build \ | ||
--tag my-other-dbt \ | ||
--target dbt-spark \ | ||
--build-arg commit_ref=v1.0.0b1 \ | ||
. | ||
``` | ||
|
||
## Special cases | ||
There are a few special cases worth noting: | ||
* The `dbt-spark` database adapter can be installed with one of three extras: `PyHive`, `ODBC`, or the default `all`. | ||
To override the default, use the `--build-arg` flag with the value `extras=<extras_name>`. | ||
See the [docs](https://docs.getdbt.com/reference/warehouse-profiles/spark-profile) for more information. | ||
```shell | ||
docker build --tag my_dbt \ | ||
--target dbt-spark \ | ||
--build-arg commit_ref=v1.0.0b1 \ | ||
--build-arg extras=PyHive \ | ||
<path/to/dockerfile> | ||
``` | ||
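A typo in the extras name would only surface when `pip install` runs inside the build, so a script may want to check the value first. The following is a minimal sketch under that assumption: the function name `extras_build_arg` is hypothetical, and it accepts only the exact spellings listed above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of this repository): print the
# --build-arg flag for a known extras name, or fail for anything else.
extras_build_arg() {
    case "$1" in
        PyHive|ODBC|all)
            printf '%s\n' "--build-arg extras=$1"
            ;;
        *)
            echo "unknown extras: $1" >&2
            return 1
            ;;
    esac
}

# Example: emit the flag for the PyHive extras.
extras_build_arg PyHive
```

The printed flag can be appended to the `docker build` invocation shown above.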
|
||
## Running an image in a container: | ||
The `ENTRYPOINT` for this Dockerfile is the command `dbt` so you can bind-mount your project to `/usr/app` and use dbt as normal: | ||
```shell | ||
docker run \ | ||
--network=host \ | ||
--mount type=bind,source=path/to/project,target=/usr/app \ | ||
--mount type=bind,source=path/to/profiles.yml,target=/root/.dbt/profiles.yml \ | ||
my-dbt \ | ||
ls | ||
``` | ||
--- | ||
**Notes:** | ||
* Bind-mount sources _must_ be absolute paths | ||
* You may need to adjust the Docker networking settings depending on the specifics of your data warehouse/database host. | ||
|
||
--- |
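Because bind-mount sources must be absolute, a relative project path has to be resolved before it is passed to `docker run`. One way to do that in plain POSIX shell is sketched below; the helper name `abs_dir` is hypothetical, and the mount flag is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): turn a directory path,
# relative or absolute, into an absolute one suitable for --mount source=...
abs_dir() {
    # cd runs in a subshell so the caller's working directory is unchanged;
    # CDPATH is cleared so cd neither searches other directories nor prints extra output.
    (CDPATH= cd -- "$1" 2>/dev/null && pwd)
}

project_dir="$(abs_dir .)"
# Print the mount flag instead of running docker, so it can be checked first.
echo "--mount type=bind,source=${project_dir},target=/usr/app"
```

The function returns a non-zero status when the directory does not exist, which lets a calling script stop before invoking Docker.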
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
ARG OPENJDK_VERSION=8 | ||
FROM eclipse-temurin:${OPENJDK_VERSION}-jre | ||
Check failure on line 2 in docker/spark.Dockerfile, Wiz Inc. (266a8a9c32) / Wiz IaC Scanner: Missing User Instruction
Check warning on line 2 in docker/spark.Dockerfile, Wiz Inc. (266a8a9c32) / Wiz IaC Scanner: Apt Get Install Pin Version Not Defined (reported four times)
Check notice on line 2 in docker/spark.Dockerfile, Wiz Inc. (266a8a9c32) / Wiz IaC Scanner: Healthcheck Instruction Missing
|
||
|
||
ARG BUILD_DATE | ||
ARG SPARK_VERSION=3.3.2 | ||
ARG HADOOP_VERSION=3 | ||
|
||
LABEL org.label-schema.name="Apache Spark ${SPARK_VERSION}" \ | ||
org.label-schema.build-date=$BUILD_DATE \ | ||
org.label-schema.version=$SPARK_VERSION | ||
|
||
ENV SPARK_HOME /usr/spark | ||
ENV PATH="/usr/spark/bin:/usr/spark/sbin:${PATH}" | ||
|
||
RUN apt-get update && \ | ||
Check notice on line 15 in docker/spark.Dockerfile, Wiz Inc. (266a8a9c32) / Wiz IaC Scanner: APT-GET Not Avoiding Additional Packages
|
||
apt-get install -y wget netcat procps libpostgresql-jdbc-java && \ | ||
wget -q "http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" && \ | ||
tar xzf "spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" && \ | ||
rm "spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" && \ | ||
mv "spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}" /usr/spark && \ | ||
ln -s /usr/share/java/postgresql-jdbc4.jar /usr/spark/jars/postgresql-jdbc4.jar && \ | ||
apt-get remove -y wget && \ | ||
apt-get autoremove -y && \ | ||
apt-get clean | ||
|
||
COPY entrypoint.sh /scripts/ | ||
RUN chmod +x /scripts/entrypoint.sh | ||
|
||
ENTRYPOINT ["/scripts/entrypoint.sh"] | ||
CMD ["--help"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# - VERY rudimentary test script to run latest + specific branch image builds and test them all by running `--version` | ||
|
||
# TODO: create a real test suite | ||
set -e | ||
|
||
printf '\n\n' | ||
echo "####################################" | ||
echo "##### Testing dbt-spark latest #####" | ||
echo "####################################" | ||
|
||
docker build --tag dbt-spark --target dbt-spark docker | ||
docker run dbt-spark --version | ||
|
||
printf '\n\n' | ||
echo "#####################################" | ||
echo "##### Testing dbt-spark-1.0.0b1 #####" | ||
echo "#####################################" | ||
|
||
docker build --tag dbt-spark-1.0.0b1 --target dbt-spark --build-arg commit_ref=v1.0.0b1 docker | ||
docker run dbt-spark-1.0.0b1 --version |
All of this moved whole cloth to `spark.Dockerfile`.