This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Dockerfile publishing #566

Open
foxish opened this issue Nov 29, 2017 · 3 comments

Comments

@foxish
Member

foxish commented Nov 29, 2017

From the ongoing thread on Docker images at http://apache-spark-developers-list.1001551.n3.nabble.com/Publishing-official-docker-images-for-KubernetesSchedulerBackend-td22928.html

Currently, we have a wide array of Dockerfiles that are all based on spark-base, with minor customizations. There is some discussion about publishing those.

Our high-level goal, I think, as articulated on the thread, is that we publish canonical images that serve both as a complete image for most Spark applications and as a stable substrate on which to build customizations for the subset of applications that need them.

Thoughts? Comments?
cc/ @apache-spark-on-k8s/contributors @felixcheung @tnachen

@foxish
Member Author

foxish commented Nov 29, 2017

There are some things we can do to simplify/unify some of those images (for example, by moving the CMD into the scheduler backend code). I'm unsure what we would gain by doing that, since the images aren't particularly k8s-specific at this time anyway, and one could in theory set the right env vars and reuse those images.

@erikerlandson
Member

As I mentioned in the SIG meeting discussion, I think moving the CMD back into the scheduler code is not a good idea. For one thing, it would prevent users from customizing the CMD in their own container images.

The strategy of unifying a spark-base image with Mesos seems like a good one. I would expect any other deviations (kube, Mesos, or anything else) to be relatively thin modifications of spark-base.

@felixcheung

I think it makes sense to have one official Spark image.

As of now, I don't see anything in k8s spark-base that is specific to k8s.

Mesos does have a Dockerfile (but no image in the release) in the Spark codebase, and it uses a Mesosphere-specific base image (FROM). So, all in all, we might not be able to say we are replacing/releasing one image or Dockerfile for all cluster types / resource managers, but at least we could say this new image or Dockerfile is not specific to k8s use.

As for the mail thread, I think we could re-articulate the built-in capability for customization with Docker, with spark-base serving as the base image.
