# 01. Azure Event Hub - Lab instructions

In this lab module, we will learn to publish and consume events from Azure Event Hub with Spark Structured Streaming. The source is the curated crimes dataset in DBFS, and the sink is DBFS in Delta format.

(Screenshot: 1-aeh-preview)




## Unit 1. Provisioning and configuring

### 1.1. Provision Event Hub

(Screenshots: 1-aeh, 3-aeh, 2-aeh)



### 1.2. Create a consumer group within the event hub

(Screenshots: 4-aeh, 5-aeh)



### 1.3. Create a SAS policy for access from Spark

(Screenshots: 11-aeh through 15-aeh)



Capture the connection string; it is needed to configure the Spark connector.
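
Once the connector library from step 1.4 is attached, the captured connection string can be turned into the connector's configuration object. Below is a minimal sketch; the event hub name, consumer group, and other placeholder values are assumptions, and the lab notebook defines the actual names.

```scala
import org.apache.spark.eventhubs.{ConnectionStringBuilder, EventHubsConf, EventPosition}

// Connection string captured from the SAS policy above; all values are placeholders.
val connectionString = ConnectionStringBuilder(
    "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<policy-name>;SharedAccessKey=<key>")
  .setEventHubName("<event-hub-name>")
  .build

// Connector configuration: consumer group from step 1.2, reading from the start of the stream.
val ehConf = EventHubsConf(connectionString)
  .setConsumerGroup("<consumer-group-name>")
  .setStartingPosition(EventPosition.fromStartOfStream)
```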

### 1.4. Attach the Spark connector library hosted on Maven

This step is performed on the Databricks cluster. The Maven coordinates are
`com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.6`; be sure to get the latest version from https://docs.databricks.com/spark/latest/structured-streaming/streaming-event-hubs.html#requirements

(Screenshots: 16-aeh, 17-aeh, 18-aeh)



## Unit 2. Secure credentials

Refer to the notebook for instructions.
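
A common approach on Databricks is to keep the connection string in a secret scope rather than in the notebook. A minimal sketch follows, assuming a hypothetical scope `aeh-lab` and key `conn-string`; the notebook specifies the actual mechanism and names.

```scala
// Hypothetical scope and key names; create them with the Databricks CLI or Secrets API.
val aehConnectionString = dbutils.secrets.get(scope = "aeh-lab", key = "conn-string")

// This value replaces the literal connection string used in the Unit 1.3 sketch.
```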

## Unit 3. Read crime data from DBFS as a stream and publish events to Azure Event Hub with Spark Structured Streaming

We will read the curated Chicago crimes dataset in DBFS as a stream and publish it to Azure Event Hub using Structured Streaming. Follow the instructions in the notebook and execute it step by step.
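
The outline of the publish step looks roughly like the sketch below. The DBFS paths, source format, and column handling are assumptions (the notebook defines the real ones); the key connector requirement is that the payload column be named `body`.

```scala
import org.apache.spark.sql.functions.{col, struct, to_json}

// Read the curated crimes dataset as a stream (hypothetical path and format).
val crimesDF = spark.readStream
  .format("delta")
  .load("/mnt/workshop/curated/crimes")

// Serialize each row to JSON in a column named "body" and publish to the event hub.
val publishQuery = crimesDF
  .select(to_json(struct(crimesDF.columns.map(col): _*)).alias("body"))
  .writeStream
  .format("eventhubs")
  .options(ehConf.toMap)                                                    // ehConf from Unit 1.3
  .option("checkpointLocation", "/mnt/workshop/checkpoints/crimes-publish") // hypothetical path
  .start()
```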

## Unit 4. Consume events from Azure Event Hub

We will consume events from Azure Event Hub using Structured Streaming and sink them to Databricks Delta. Follow the instructions in the notebook and execute it step by step.
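
Roughly, the consume step reads the connector's standard schema (the payload arrives in the binary `body` column) and appends it to a Delta path. The paths below are placeholders; the notebook defines the real locations.

```scala
import spark.implicits._

// Read events from the event hub; cast the binary payload back to a JSON string.
val eventsDF = spark.readStream
  .format("eventhubs")
  .options(ehConf.toMap)                       // ehConf from Unit 1.3
  .load()
  .select($"body".cast("string").alias("crime_event"), $"enqueuedTime")

// Sink the stream to DBFS in Delta format (hypothetical paths).
val consumeQuery = eventsDF.writeStream
  .format("delta")
  .outputMode("append")
  .option("checkpointLocation", "/mnt/workshop/checkpoints/crimes-consume")
  .start("/mnt/workshop/delta/crimes-aeh")
```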

## Unit 5. Query streaming events

We will create an external table over the streaming events and run queries against it. Follow the instructions in the notebook and execute it step by step.
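
This step amounts to registering the Delta sink path from Unit 4 as an external table and querying it with Spark SQL. The table name and path below are placeholders matching the Unit 4 sketch.

```scala
// Register the Delta output of Unit 4 as an external table (hypothetical name and path).
spark.sql("""
  CREATE TABLE IF NOT EXISTS crimes_from_aeh
  USING DELTA
  LOCATION '/mnt/workshop/delta/crimes-aeh'
""")

// Query the table; re-running the query reflects newly arrived events.
display(spark.sql("SELECT COUNT(*) AS event_count FROM crimes_from_aeh"))
```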