This is the starter code for both the course and the project for Data Streaming with Spark

udacity/nd029-c2-apache-spark-and-spark-streaming-starter

Windows Users

It is HIGHLY recommended that you install the Windows 10 October 2020 Update: https://support.microsoft.com/en-us/windows/get-the-windows-10-october-2020-update-7d20e88c-0568-483a-37bc-c3885390d212

You will then want to install the latest version of Docker on Windows: https://docs.docker.com/docker-for-windows/install/

Using Docker for your Exercises

You will need to use Docker to run the exercises on your own computer. You can find Docker for your operating system here: https://docs.docker.com/get-docker/

It is recommended that you configure Docker to use up to 2 cores and 6 GB of your host memory for the course workspace. If you run other Docker workloads at the same time as the workspace, allow for their CPU and memory needs as well.
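To confirm what Docker has actually been given, you can ask the daemon directly. The sketch below is only an illustration, not part of the course materials: the helper name `bytes_to_mib` is made up for this example, and the memory figure Docker reports may come out slightly below the value you configured.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to whole MiB (integer division).
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Query the Docker daemon only if it is installed and reachable.
if docker info >/dev/null 2>&1; then
  cpus=$(docker info --format '{{.NCPU}}')
  mem_bytes=$(docker info --format '{{.MemTotal}}')
  echo "Docker CPUs:   $cpus (course suggests up to 2)"
  echo "Docker memory: $(bytes_to_mib "$mem_bytes") MiB (course suggests up to 6 GB)"
else
  echo "Docker does not appear to be running; start it and try again."
fi
```

If the numbers are well below the recommendation, adjust the resource limits in your Docker settings before starting the workspace.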

The docker-compose file at the root of the repository creates 9 separate containers:

  • Redis
  • Zookeeper (for Kafka)
  • Kafka
  • Banking Simulation
  • Trucking Simulation
  • STEDI (Application used in Final Project)
  • Kafka Connect with Redis Source Connector
  • Spark Master
  • Spark Worker

It also mounts your repository folder into the Spark Master and Spark Worker containers as the volume /home/workspace, making your code changes instantly available to the containers running Spark.

Let's get these containers started!

cd [repositoryfolder]
docker-compose up

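Once startup finishes, you should see 9 running containers listed when you run this command (use a second terminal, since docker-compose up stays attached to the first):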

docker ps
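If you would rather not count the rows by hand, the check can be scripted. This is a rough sketch under a few assumptions: the helper `count_lines` is a name invented for this example, and `docker ps -q` lists every running container on your machine, so the count will be off if you have unrelated containers running.

```shell
#!/bin/sh
# Number of containers the docker-compose file is expected to start.
expected=9

# Hypothetical helper: count the non-empty lines arriving on stdin.
count_lines() {
  grep -c . || true   # grep exits non-zero on zero matches; still prints 0
}

if docker info >/dev/null 2>&1; then
  # docker ps -q prints one container ID per running container.
  running=$(docker ps -q | count_lines)
  if [ "$running" -eq "$expected" ]; then
    echo "All $expected containers are running."
  else
    echo "Found $running running containers, expected $expected."
    echo "Run 'docker ps -a' to look for containers that have exited."
  fi
else
  echo "Docker does not appear to be running; start it and try again."
fi
```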
