feat(build): spark 3.5.2 #42

Merged
merged 15 commits into from
Sep 11, 2024

@Tingweiftw Tingweiftw commented Aug 27, 2024

See CHANGELOG.md for details

@@ -40,57 +40,25 @@ TERM=xterm-color ./dev/make-distribution.sh \
${HIVE_INSTALL_FLAG:+"-Phive"} \
-DskipTests

SPARK_MAJOR_VERSION="$(echo "${SPARK_VERSION}" | cut -d '.' -f1)"
Author

Not needed anymore, as we no longer use any Spark 2.y.z version.


if [[ ${SPARK_MAJOR_VERSION} -eq 2 && ${SPARK_MINOR_VERSION} -eq 4 ]]; then # 2.4.z
Author

Not needed anymore, as we no longer use any Spark 2.y.z version.


if [[ ${SPARK_MAJOR_VERSION} -eq 3 && ${SPARK_MINOR_VERSION} -ge 4 ]]; then # >=3.4
# From Spark v3.4.0 onwards, openjdk is not the preferred base image source as it is
# deprecated and taken over by eclipse-temurin. slim-buster variants are not available
# on eclipse-temurin at the moment.
IMAGE_VARIANT="jre"
Author

jre-8 no longer works properly with the current setup because it has been upgraded to Ubuntu 22, and the Python in that version raises an error when users install Python packages globally. However, the Dockerfile packaged in Spark still installs packages globally (see the previous GH Actions logs here):

[image]
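
For context, here is a minimal sketch of how one might check for the condition described in the comment above. PEP 668 lets a distribution mark a Python installation as externally managed by placing an `EXTERNALLY-MANAGED` file in the interpreter's stdlib directory, and pip then refuses to install into that environment by default. The helper name `is_externally_managed` is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch only: detect the PEP 668 "externally managed" marker.
# is_externally_managed is a hypothetical helper, not part of this repo.

# Returns 0 if the given stdlib directory contains the marker file.
is_externally_managed() {
  local stdlib_dir="$1"
  [[ -f "${stdlib_dir}/EXTERNALLY-MANAGED" ]]
}

# Usage, assuming python3 is on PATH inside the image being checked.
if command -v python3 >/dev/null 2>&1; then
  stdlib_dir="$(python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))' 2>/dev/null || true)"
  if is_externally_managed "${stdlib_dir}"; then
    echo "marker found: global pip installs are likely to be refused"
  else
    echo "no marker found"
  fi
fi
```

Running this inside a candidate base image before building could show early whether the global installs done by Spark's bundled Dockerfile are likely to fail.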

SPARK_LABEL="${SPARK_VERSION}"
TAG_NAME="${SELF_VERSION}_${SPARK_LABEL}_hadoop-${HADOOP_VERSION}_scala-${SCALA_VERSION}_java-${JAVA_VERSION}"

# ./bin/docker-image-tool.sh \
Author

R is not used anymore.


docker tag "${IMAGE_NAME}:${TAG_NAME}" "${IMAGE_ORG}/${IMAGE_NAME}:${TAG_NAME}"
Author

Only use image tags that include the self version, to avoid overwriting previously working images when rebuilding with new base JRE images.
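
The reasoning above can be sketched as follows, assuming the `TAG_NAME` format shown in the diff. The function name `build_tag_name` and the version values are made up for illustration. Because the self version is part of the tag, a rebuild under a new self version gets a new tag and leaves the older image's tag alone.

```shell
#!/usr/bin/env bash
# Sketch only: build_tag_name is a hypothetical helper that mirrors the
# TAG_NAME format used in the build script.
build_tag_name() {
  local self_version="$1" spark_version="$2" hadoop_version="$3"
  local scala_version="$4" java_version="$5"
  printf '%s_%s_hadoop-%s_scala-%s_java-%s\n' \
    "${self_version}" "${spark_version}" "${hadoop_version}" \
    "${scala_version}" "${java_version}"
}

# Illustrative values only; the real ones come from the build matrix.
old_tag="$(build_tag_name v3 3.5.1 3.3.6 2.12 17)"
new_tag="$(build_tag_name v4 3.5.1 3.3.6 2.12 17)"

echo "${old_tag}"  # v3_3.5.1_hadoop-3.3.6_scala-2.12_java-17
echo "${new_tag}"  # v4_3.5.1_hadoop-3.3.6_scala-2.12_java-17
```

A tag without the self version would be identical for both builds, so pushing the second would replace the first.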

@Tingweiftw Tingweiftw requested a review from tyng94 September 9, 2024 08:29
- Add Spark 3.5.1
- Add Hadoop 3.3.6
- Add support for Java 17 for Spark 3.5.1
- Fix Ubuntu-based images to use the `jre-focal` variant instead of `jre`, which was recently upgraded to Ubuntu Jammy (v22.y.z), causing system-level Python package installation to fail due to [PEP 668](https://issues.apache.org/jira/browse/SPARK-49068)

@tyng94 tyng94 Sep 10, 2024

lol, I'm pretty sure this guy (Chao Sun) who reported the issue was a Singaporean colleague at my previous company

Author

Time to drop him a message and share how to fix it.

push-images.sh Outdated

docker tag "${IMAGE_NAME}:${TAG_NAME}" "${IMAGE_ORG}/${IMAGE_NAME}:${TAG_NAME}"
docker tag "${IMAGE_NAME}:${TAG_NAME}" "test_${IMAGE_ORG}/${IMAGE_NAME}:${TAG_NAME}"

Do you need to revert the `test_` prefix?

Author

Ah, thanks for the catch: 6c00c0e

@Tingweiftw Tingweiftw merged commit daba04b into master Sep 11, 2024
32 checks passed
@Tingweiftw Tingweiftw deleted the feat--build-spark-3.5.2 branch September 11, 2024 00:53