If you don't have access to a Spark cluster, sign up for Databricks Community Edition: https://databricks.com/signup/signup-community
- If you're using Databricks (Azure Databricks, Databricks on AWS, or Databricks Community Edition), you can import the tutorial notebook using this URL: https://github.com/colbyford/PyDataCLT_Jan2020/raw/master/Python_to_PySpark.dbc
- If you're using a different type of Spark cluster, you can import the `Python_to_PySpark.ipynb` notebook.
- If you're just following along, you can view the tutorial notebook in your web browser after downloading the `Python_to_PySpark.html` file.
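
If you'd rather fetch a notebook file from a script than through the browser, a minimal sketch might look like the following. It assumes the `.ipynb` and `.html` files sit in the same repository folder as the `.dbc` link above, which this README does not state explicitly. The helper names `notebook_url` and `download_notebook` are hypothetical and exist only for this illustration.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

# Base URL taken from the .dbc link above. That the other notebook files
# live alongside it in the repository is an ASSUMPTION, not a confirmed fact.
RAW_BASE = "https://github.com/colbyford/PyDataCLT_Jan2020/raw/master/"


def notebook_url(filename: str) -> str:
    """Build the raw download URL for a file in the tutorial repository."""
    return RAW_BASE + quote(filename)


def download_notebook(filename: str) -> str:
    """Download the file into the current directory and return its local path."""
    local_path, _headers = urlretrieve(notebook_url(filename), filename)
    return local_path
```

Calling `download_notebook("Python_to_PySpark.ipynb")` should save the notebook next to your script, provided the file exists at the assumed location. You can then import it into your own Spark environment.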
- Databricks Documentation: https://docs.databricks.com/
- Azure Databricks Information: https://docs.microsoft.com/en-us/azure/azure-databricks/
- PySpark API Documentation: https://spark.apache.org/docs/latest/api/python/index.html
- Databricks Academy: https://academy.databricks.com/
- Sparkitecture: https://www.sparkitecture.io/