[docker] Add jupyter based demo of single colo batch push workflow #1271
This adds a new interactive demo and explanation of a typical Venice batch push workflow. It walks through downloading a dataset from Hugging Face, using Spark to convert the Parquet file format to Avro, preparing a Venice store, and then using Spark and VPJ to push to the Venice cluster, all from the Jupyter notebook.
Looks great, and it works perfectly—thanks so much, @ZacAttack! I left a few comments, and once those are addressed, it’s good to go. 🚀
LGTM! Thanks, @ZacAttack!
[docker] Add jupyter based demo of single colo batch push workflow
This adds a new interactive demo and explanation of a typical Venice batch push workflow. It walks through downloading a dataset from Hugging Face, using Spark to convert the Parquet file format to Avro, preparing a Venice store, and then using Spark and VPJ to push to the Venice cluster, all from the Jupyter notebook.
To run this demo, do the following:
Build the Image
From the repository root directory, run:
./docker/build-venice-docker-images.sh
Run and compose the containers
Once the build finishes, start the containers with Docker Compose:
docker compose -f ./docker/docker-compose-single-dc-setup.yaml up -d
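Before moving on, it can help to confirm that the Jupyter container actually started. The following is a minimal sketch, not part of this PR: `container_listed` is a hypothetical helper that checks whether a container name appears in a list of names read from stdin, and it assumes the container is named exactly `venice-client-jupyter`.

```shell
# Hypothetical helper (not part of the PR): succeed if the container name
# given as $1 appears as a whole line in the list of names read from stdin.
container_listed() {
  grep -Fxq -- "$1"
}
```

One way to use it is `docker ps --format '{{.Names}}' | container_listed venice-client-jupyter && echo "running"`. If nothing is printed, `docker compose -f ./docker/docker-compose-single-dc-setup.yaml ps` shows the state of each service.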
Connect to Jupyter
To get the notebook URL, read the logs of the running venice-client-jupyter container. How you do this depends on your environment. In the Docker Desktop app, navigate to the list of running containers and click the venice-client-jupyter container. The log view shows a link that looks something like:
http://127.0.0.1:8888/lab?token=<some token string>
Open this link in your browser to reach the Jupyter notebook UI. In the file explorer on the left, double-click the file called Venice_Demo.ipynb. From there, you can read and run the tutorial. Have fun!
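In environments without Docker Desktop, the token link can also be pulled from the container logs on the command line. This is a rough sketch rather than something shipped in this PR: `extract_jupyter_url` is a hypothetical helper, and it assumes the log line has the `http://127.0.0.1:<port>/lab?token=<token>` shape shown above with an alphanumeric token.

```shell
# Hypothetical helper (not part of the PR): read log text on stdin and print
# the most recent Jupyter link that carries a token.
# Assumes the URL shape http://127.0.0.1:<port>/lab?token=<alphanumeric token>.
extract_jupyter_url() {
  grep -Eo 'http://127\.0\.0\.1:[0-9]+/lab\?token=[A-Za-z0-9]+' | tail -n 1
}
```

A typical invocation would be `docker logs venice-client-jupyter 2>&1 | extract_jupyter_url`. The `2>&1` is there because Jupyter usually writes its startup messages to stderr. If no link is printed, the server may still be starting.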
Resolves #XXX
How was this PR tested?
Does this PR introduce any user-facing changes?