Update index.md #4934

Merged 1 commit, Jun 30, 2023
6 changes: 3 additions & 3 deletions src/connections/storage/data-lakes/index.md
@@ -29,7 +29,7 @@ To learn more about Segment Data Lakes, check out the Segment blog post [Introdu

Segment supports Data Lakes hosted on two cloud providers: Amazon Web Services (AWS) and Microsoft Azure. Each cloud provider has a similar system for managing data, but offers different query engines, post-processing systems, and analytics options.

-### How Segment Data Lakes works
+### How Segment Data Lakes (AWS) works

Data Lakes store Segment data in S3 in a read-optimized encoding format (Parquet) which makes the data more accessible and actionable. To help you zero-in on the right data, Data Lakes also creates logical data partitions and event tables, and integrates metadata with existing schema management tools, such as the AWS Glue Data Catalog. The resulting data set is optimized for use with systems like Spark, Athena, EMR, or machine learning vendors like DataBricks or DataRobot.

@@ -49,7 +49,7 @@ Data Lakes store Segment data in ADLS in a read-optimized encoding format (Parqu

For detailed Segment Data Lakes (Azure) setup instructions, see the [Data Lakes setup page](/docs/connections/storage/catalog/data-lakes/).

-### Set up Segment Data Lakes
+### Set up Segment Data Lakes (AWS)

When setting up your data lake using the [Data Lakes catalog page](/docs/connections/storage/catalog/data-lakes/), be sure to consider the EMR and AWS IAM components listed below.

@@ -84,7 +84,7 @@ TODO:
add schema overview (tables/columns generated)
-->

-### Segment Data Lakes schema
+### Segment Data Lakes (AWS) schema

#### S3 partition structure
