Renovate: Repurpose into "CrateDB Ecosystem Catalog"
After all the tutorials have been refactored into the CrateDB Guide,
the enumeration of catalog items became a bit of a lost place.

This improvement concludes the renovation on this end, by effectively
repurposing it into an ecosystem software catalog/gallery, similar to how
others are running them.

Other than this, the patch also adds concise navigation elements to the
top of each page, in order to add gravity towards the tutorial items.
amotl committed Mar 26, 2024
1 parent 6aaabe5 commit 3062b5a
Showing 8 changed files with 204 additions and 57 deletions.
59 changes: 54 additions & 5 deletions docs/connect/df.md
@@ -2,11 +2,36 @@
(dataframes)=
# CrateDB and DataFrame libraries

This documentation section lists DataFrame libraries and frameworks which can
be used together with CrateDB. Hands-on tutorials about them can be found
on the ["connect" section of the CrateDB Guide].
Data frame libraries and frameworks which can
be used together with CrateDB.


:::::{grid} 1 2 2 2
:margin: 4 4 0 0
:padding: 0
:gutter: 2

::::{grid-item-card} {material-outlined}`lightbulb;2em` Tutorials
:link: guide:dataframes
:link-type: ref
Learn how to use CrateDB together with popular open-source data frame
libraries, through hands-on tutorials and code examples.
+++
{tag-info}`Dask` {tag-info}`pandas` {tag-info}`Polars`
::::

::::{grid-item-card} {material-outlined}`read_more;2em` SQLAlchemy
CrateDB's SQLAlchemy dialect implementation provides fundamental infrastructure
for integrations with Dask, pandas, and Polars.
+++
[ORM Guides](inv:guide#orm)
{ref}`ORM Catalog <orm>`
::::

:::::


(dask)=
## Dask

[Dask] is a parallel computing library for analytics with task scheduling.
@@ -31,6 +56,7 @@ the Python libraries that you know and love, like NumPy, pandas, and scikit-lear
```


(pandas)=
## pandas

```{div}
@@ -41,11 +67,34 @@ the Python libraries that you know and love, like NumPy, pandas, and scikit-lear
[pandas] is a fast, powerful, flexible, and easy to use open source data analysis
and manipulation tool, built on top of the Python programming language.

Pandas (stylized as pandas) is a software library written for the Python programming
language for data manipulation and analysis. In particular, it offers data structures
and operations for manipulating numerical tables and time series.

:::{rubric} Data Model
:::
- Pandas is built around data structures called Series and DataFrames. Data for these
collections can be imported from various file formats such as comma-separated values,
JSON, Parquet, SQL database tables or queries, and Microsoft Excel.
- A Series is a 1-dimensional data structure built on top of NumPy's array.
- Pandas includes support for time series, such as the ability to interpolate values
and filter using a range of timestamps.
- By default, a Pandas index is a series of integers ascending from 0, similar to the
indices of Python arrays. However, indices can use any NumPy data type, including
floating point, timestamps, or strings.
- Pandas supports hierarchical indices with multiple values per data point. An index
with this structure, called a "MultiIndex", allows a single DataFrame to represent
multiple dimensions, similar to a pivot table in Microsoft Excel. Each level of a
MultiIndex can be given a unique name.

:::
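
The data model described above can be illustrated with a minimal sketch. The
sample values, column names, and city names are made up for illustration and
are not taken from the CrateDB documentation.

```python
import pandas as pd

# A Series is a one-dimensional labeled array; the default index counts up from 0.
temperatures = pd.Series([21.5, 22.1, 19.8], name="temperature")

# An index can also hold timestamps, which enables filtering by a range of dates.
readings = pd.DataFrame(
    {"temperature": [21.5, 22.1, 19.8, 20.4]},
    index=pd.to_datetime(
        ["2024-03-01", "2024-03-02", "2024-03-03", "2024-03-04"]
    ),
)
first_two_days = readings.loc["2024-03-01":"2024-03-02"]

# A MultiIndex represents several dimensions in a single DataFrame.
# Each level can be given its own name (hypothetical names here).
index = pd.MultiIndex.from_tuples(
    [("berlin", "2024-03-01"), ("berlin", "2024-03-02"), ("vienna", "2024-03-01")],
    names=["city", "day"],
)
by_city = pd.DataFrame({"temperature": [21.5, 22.1, 18.3]}, index=index)

# Selecting on the outer level returns all rows for that city.
berlin = by_city.loc["berlin"]
```

Slicing a timestamp index with `.loc` includes both endpoints, so
`first_two_days` holds the readings for March 1 and March 2.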

```{div}
:style: "clear: both"
```


(polars)=
## Polars

```{div}
@@ -83,7 +132,8 @@ vectorized query engine, it is open source, and written in Rust.
community of developers. Everyone is encouraged to add new features and contribute.
It is free to use under the MIT license.

**Data formats**
:::{rubric} Data formats
:::

Polars supports reading and writing to many common data formats.
This allows you to easily integrate Polars into your existing data stack.
@@ -101,7 +151,6 @@ This allows you to easily integrate Polars into your existing data stack.


[Apache Arrow]: https://arrow.apache.org/
["connect" section of the CrateDB Guide]: inv:guide:*:label#connect
[Dask]: https://www.dask.org/
[Dask DataFrames]: https://docs.dask.org/en/latest/dataframe.html
[Dask Futures]: https://docs.dask.org/en/latest/futures.html
24 changes: 18 additions & 6 deletions docs/connect/orm.md
@@ -1,9 +1,21 @@
(orm)=
# CrateDB and ORM libraries

This documentation section lists ORM libraries and frameworks which can
be used together with CrateDB. Hands-on tutorials about them can be found
on the ["connect" section of the CrateDB Guide].
ORM libraries and frameworks which can
be used together with CrateDB.


::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:orm
:link-type: ref

Learn how to use CrateDB together with popular open-source ORM libraries.
+++
{tag}`ORM` {tag-info}`SQLAlchemy`
::::



## SQLAlchemy
@@ -16,15 +28,15 @@ on the ["connect" section of the CrateDB Guide].
[SQLAlchemy] is the Python SQL toolkit and Object Relational Mapper that
gives application developers the full power and flexibility of SQL.

It plays an important role, because popular Python-based [DataFrame](df.md)
Python-based [DataFrame](df.md)
and [ML](../integrate/ml.md) libraries, and a few [ETL](../integrate/etl.md)
frameworks, are using SQLAlchemy as data abstraction library when connecting to
frameworks, are using SQLAlchemy as database adapter library when connecting to
[RDBMS].

```{div}
:style: "clear: both"
```


["connect" section of the CrateDB Guide]: inv:guide:*:label#connect
[RDBMS]: https://en.wikipedia.org/wiki/RDBMS
[SQLAlchemy]: https://www.sqlalchemy.org/
35 changes: 20 additions & 15 deletions docs/index.md
@@ -1,28 +1,33 @@
(index)=
(catalog)=
(drivers)=
(frameworks)=
(integrations)=

# CrateDB Drivers and Integrations
# CrateDB Ecosystem Catalog

Database drivers, libraries, frameworks, and applications for CrateDB.

## About CrateDB

CrateDB is a distributed and scalable open-source SQL database for storing and
analyzing massive amounts of data in near real-time, even with complex queries.
It is PostgreSQL-compatible, and based on Lucene.

Users are operating CrateDB clusters that store information in the range of
billions of records, and terabytes of data, equally accessible without any
retrieval penalty on data point age.

:::{rubric} About CrateDB
:::
CrateDB is a distributed and scalable open-source SQL database based on Lucene,
with PostgreSQL compatibility.
CrateDB clusters store information in the range of billions of records, and
terabytes of data, and run analytics in near real time, even with complex
queries.
CrateDB can be used for enterprise data warehouse workloads, it
works across clouds and scales with your data.

## Connectivity

This section introduces you to the canonical set of database drivers, client-
and developer-applications, and how to configure them to connect to CrateDB.
Just to name a few, it is about the CrateDB Admin UI, `crash`, `psql`,
DataGrip, and DBeaver applications, the Java/JDBC/Python drivers, the SQLAlchemy
and Flink dialects, and more.
The canonical set of database drivers, client- and developer-applications, and
how to configure them to connect to CrateDB.

Just to name a few, the sections below are about the CrateDB Admin UI, the
Crash CLI terminal program, connecting with PostgreSQL's psql client, the
DataGrip, and DBeaver IDE applications, the Java/JDBC/Python drivers, the
SQLAlchemy and Flink dialects, and more.

::::{grid} 1 2 2 2
:margin: 4 4 0 0
26 changes: 17 additions & 9 deletions docs/integrate/bi.md
@@ -2,9 +2,21 @@
(bi-tools)=
# Business Analytics and Intelligence with CrateDB

This documentation section lists business analytics applications
Business analytics applications
and frameworks, which can be used together with CrateDB.

::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:bi
:link-type: ref

Guidelines about integrating CrateDB with business analytics and intelligence
software.
+++
{tag}`BI` {tag}`DataViz` {tag-success}`PowerBI` {tag-success}`Rill` {tag-success}`Tableau`
::::


(powerbi)=
## Microsoft Power BI
@@ -39,7 +51,7 @@ possible to publish your dashboards, in order to share them with others.

```{div}
:style: "float: right; margin-left: 0.5em"
[![](https://github.com/rilldata/rill/blob/main/docs/static/img/rill-logo-dark.svg){w=180px}](https://www.rilldata.com/)
[![](https://github.com/rilldata/rill/raw/main/docs/static/img/rill-logo-light.svg){w=180px}](https://www.rilldata.com/)
```

[Rill] is an open-source operational BI framework for effortlessly transforming
@@ -57,7 +69,8 @@ This methodology allows for versioning and tracking, thus improving collaboratio
on BI projects using code, which is more efficient and scalable than traditional
BI tools, also breaking down information and knowledge barriers.

**Rill's design principles**
:::{rubric} Rill's design principles
:::

- **Feels good to use** – powered by Sveltekit & DuckDB = conversation-fast, not
wait-ten-seconds-for-result-set fast
@@ -80,24 +93,20 @@ BI tools, also breaking down information and knowledge barriers.
## Tableau

```{div}
:style: "float: right"
:style: "float: right; margin-left: 0.5em"
[![](https://upload.wikimedia.org/wikipedia/en/thumb/0/06/Tableau_logo.svg/500px-Tableau_logo.svg.png?20200509180027){w=180px}](https://www.tableau.com/)
```

[Tableau] is a visual business intelligence and analytics software platform. It expresses
data by translating drag-and-drop actions into data queries through an intuitive interface.

[Connecting to CrateDB from Tableau with JDBC] and [Using CrateDB with Tableau] will
guide you through the process of setting it up correctly with CrateDB.

![](https://cratedb.com/hs-fs/hubfs/08-index.png?width=1536&name=08-index.png){h=200px}

```{seealso}
[CrateDB and Tableau]
```


[Connecting to CrateDB from Tableau with JDBC]: https://cratedb.com/blog/connecting-to-cratedb-from-tableau-with-jdbc
[CrateDB and Tableau]: https://cratedb.com/integrations/cratedb-and-tableau
[CrateDB and Power BI]: https://cratedb.com/integrations/cratedb-and-power-bi
[PostgreSQL ODBC driver]: https://odbc.postgresql.org/
@@ -106,4 +115,3 @@ guide you through the process of setting it up correctly with CrateDB.
[Power Query PostgreSQL connector]: https://learn.microsoft.com/en-us/power-query/connectors/postgresql
[Rill]: https://www.rilldata.com/
[Tableau]: https://www.tableau.com/
[Using CrateDB with Tableau]: https://community.cratedb.com/t/using-cratedb-with-tableau/1192
31 changes: 22 additions & 9 deletions docs/integrate/etl.md
@@ -1,16 +1,33 @@
(etl)=
# ETL with CrateDB

Use ETL / data pipeline applications and frameworks for transferring data in
and out of CrateDB. Corresponding tutorials can be found within the
[CrateDB Guide: Integration Tutorials] section of the documentation.
ETL / data pipeline applications and frameworks for transferring data in
and out of CrateDB.


::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:etl
:link-type: ref

Learn how to integrate CrateDB with popular ETL frameworks and applications.
+++
{tag}`Extract, Transform, Load` {tag}`Data I/O, Import/Export` {tag}`ETL` {tag}`ELT`
::::


(apache-airflow)=
(airflow)=
(astronomer)=
## Apache Airflow / Astronomer

```{div}
:style: "float: right"
[![](https://19927462.fs1.hubspotusercontent-na1.net/hub/19927462/hubfs/Partner%20Logos/392x140/Apache-Airflow-Logo-392x140.png?width=784&height=280&name=Apache-Airflow-Logo-392x140.png){w=180px}](https://airflow.apache.org/)
[![](https://logowik.com/content/uploads/images/astronomer2824.jpg){w=180px}](https://www.astronomer.io/)
```
[Apache Airflow] is an open source software platform to programmatically author,
schedule, and monitor workflows, written in Python.
[Astronomer] offers managed Airflow services on the cloud of your choice, in
@@ -23,18 +40,15 @@ dynamic pipeline generation and on-demand, code-driven pipeline invocation.
Pipeline parametrization is using the powerful Jinja templating engine.
To extend the system, you can define your own operators and extend libraries
to fit the level of abstraction that suits your environment.

```{div}
:style: "float: right"
[![](https://19927462.fs1.hubspotusercontent-na1.net/hub/19927462/hubfs/Partner%20Logos/392x140/Apache-Airflow-Logo-392x140.png?width=784&height=280&name=Apache-Airflow-Logo-392x140.png){w=180px}](https://airflow.apache.org/)
[![](https://logowik.com/content/uploads/images/astronomer2824.jpg){w=180px}](https://www.astronomer.io/)
:style: "clear: both"
```

```{seealso}
[CrateDB and Apache Airflow]
```


:::{dropdown} **Managed Airflow**

```{div}
@@ -334,7 +348,6 @@ an SSIS Catalog database to store, run, and manage packages.
[CrateDB and Apache Kafka]: https://cratedb.com/integrations/cratedb-and-kafka
[CrateDB and Kestra]: https://cratedb.com/integrations/cratedb-and-kestra
[CrateDB and Node-RED]: https://cratedb.com/integrations/cratedb-and-node-red
[CrateDB Guide: Integration Tutorials]: inv:guide:*:label#integrate
[dbt]: https://www.getdbt.com/
[dbt Cloud]: https://www.getdbt.com/product/dbt-cloud/
[Debezium]: https://debezium.io/
28 changes: 20 additions & 8 deletions docs/integrate/metrics.md
@@ -2,9 +2,20 @@
# Monitoring and Metrics with CrateDB

Storing metrics data for the long term is a common need in systems monitoring
scenarios. CrateDB offers corresponding integration adapters. Relevant tutorials
can be found within the [CrateDB Guide: Integration Tutorials] section of the
documentation.
scenarios. CrateDB offers corresponding integration adapters.

::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:metrics
:link-type: ref

Learn how to use CrateDB together with popular metrics collection agents,
brokers, and stores.
+++
{tag}`Logs` {tag}`Metrics` {tag}`Monitoring` {tag}`Telemetry` {tag-info}`Prometheus` {tag-info}`Telegraf`
::::


(prometheus)=
## Prometheus
@@ -21,8 +32,8 @@ Prometheus collects and stores its metrics as time series data, i.e.
metrics information is stored with the timestamp at which it was recorded,
alongside optional key-value pairs called labels.
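
As a rough illustration of this data model, the sketch below represents a time
series as a metric name, a set of labels, and timestamped samples. The class
names `Sample` and `TimeSeries`, the metric name, and the values are
hypothetical and not part of any Prometheus or CrateDB API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sample:
    """One recorded value: a millisecond timestamp and a float value."""
    timestamp_ms: int
    value: float


@dataclass
class TimeSeries:
    """A series identified by its metric name plus optional labels."""
    name: str
    labels: dict[str, str] = field(default_factory=dict)
    samples: list[Sample] = field(default_factory=list)

    def identity(self) -> tuple:
        # Two series with the same name but different labels are distinct,
        # so the identity combines the name with the sorted label pairs.
        return (self.name, tuple(sorted(self.labels.items())))


# Hypothetical metric with two labels and two recorded samples.
series = TimeSeries("http_requests_total", {"method": "GET", "status": "200"})
series.samples.append(Sample(timestamp_ms=1711411200000, value=1027.0))
series.samples.append(Sample(timestamp_ms=1711411215000, value=1031.0))
```

Keying a series by name and labels together is what makes the model
multi-dimensional: the same metric name with `status="500"` would be a separate
series with its own samples.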

**Features**

:::{rubric} Features
:::
Prometheus's main features are:

- a multi-dimensional data model with time series data identified by metric name and key/value pairs
@@ -34,8 +45,8 @@ Prometheus's main features are:
- multiple modes of graphing and dashboarding support


**Remote Endpoints and Storage**

:::{rubric} Remote Endpoints and Storage
:::
The [Prometheus remote endpoints and storage] subsystem, based on its
[remote write] and [remote read] features, makes it possible to transparently
send and receive metric samples. It is primarily intended for long term
@@ -75,7 +86,8 @@ events from databases, systems, and IoT sensors. Telegraf is written in Go
and compiles into a single binary with no external dependencies, and requires
a very minimal memory footprint.

**Overview**
:::{rubric} Overview
:::

- **IoT sensors**: Collect critical stateful data (pressure levels, temperature
levels, etc.) with popular protocols like MQTT, ModBus, OPC-UA, and Kafka.