Renovate: Repurpose into "CrateDB Ecosystem Catalog" #107

Merged 2 commits, Mar 27, 2024

@amotl (Member) commented Mar 26, 2024

About

Since all the tutorials were recently refactored into the CrateDB Guide with GH-82, the enumeration of catalog items here has been left without a clear purpose. Thanks for reporting, @geragray.

Details

Ecosystem Catalog

The patch concludes the renovation on this end by repurposing the documentation section into a (preliminary) ecosystem software catalog/gallery, in the spirit of how others are running them.

Gravity to Tutorials

To remedy a guidance flaw, the patch also adds concise navigation elements to the top of each page within the "Integrations" section. They add gravity towards the corresponding tutorial items, which are now located within The CrateDB Guide.

Two samples of that are outlined below. Feedback is very much welcome.

Preview

References

Comment on lines 2 to 22
(ml-tools)=
# Machine Learning with CrateDB

This documentation section lists machine learning applications and frameworks
which can be used together with CrateDB. Relevant tutorials can be found within
the [CrateDB Guide: Machine Learning Tutorials] section of the documentation.
Machine learning applications and frameworks
which can be used together with CrateDB.

::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:ml
:link-type: ref

Learn how to integrate CrateDB with machine learning frameworks and tools,
for MLOps and Vector database operations.
+++
{tag}`MLOps` {tag}`Vector Store` {tag}`Embeddings`
{tag}`Hybrid Search` {tag}`LLM` {tag}`RAG`
::::


## LangChain
Member Author

There has been a guidance flaw on this page. This has been improved now: the page starts right away by adding navigation gravity towards the tutorials section, which an avid reader may follow immediately. That it is a navigation element becomes obvious at once, because the whole card item is a link.

Otherwise, a reader of general information may just go on consuming the catalog/gallery items, in order to learn more about them.

-- https://crate-clients-tools--107.org.readthedocs.build/en/107/integrate/ml.html

Screenshot

image

Comment on lines -4 to 20
Use dashboard and other data visualization applications and toolkits for
Dashboard and other data visualization applications and toolkits for
visualizing data stored inside CrateDB.

::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:visualization
:link-type: ref

Guidelines about data analysis and visualization with CrateDB.
+++
{tag}`DataViz` {tag}`EDA` {tag}`BI`
::::


(apache-superset)=
(preset)=
Member Author

Another sample of the guidance improvements, which likewise highlights the corresponding tutorials and provides navigation to them.

-- https://crate-clients-tools--107.org.readthedocs.build/en/107/integrate/visualize.html

Screenshot

image

Comment on lines +101 to +126
:::{rubric} scikit-learn
:::
_Machine Learning in Python._

- Simple and efficient tools for predictive data analysis
- Accessible to everybody, and reusable in various contexts
- Built on NumPy, SciPy, and matplotlib

:::{rubric} pandas
:::
_The open source data analysis and manipulation tool._

Pandas is a software library written for the Python programming
language for data manipulation and analysis. In particular, it offers data structures
and operations for manipulating numerical tables and time series.

:::{rubric} Project Jupyter
:::
_Interactive computing across all programming languages._

JupyterLab is the latest web-based interactive development environment for notebooks,
code, and data. Its flexible interface allows users to configure and arrange workflows
in data science, scientific computing, computational journalism, and machine learning.
A modular design invites extensions to expand and enrich functionality.


Member Author

The section about scikit-learn and friends was a bit empty. Thanks for reporting, @surister.

Comment on lines +70 to +88
Pandas (stylized as pandas) is a software library written for the Python programming
language for data manipulation and analysis. In particular, it offers data structures
and operations for manipulating numerical tables and time series.

:::{rubric} Data Model
:::
- Pandas is built around data structures called Series and DataFrames. Data for these
collections can be imported from various file formats such as comma-separated values,
JSON, Parquet, SQL database tables or queries, and Microsoft Excel.
- A Series is a 1-dimensional data structure built on top of NumPy's array.
- Pandas includes support for time series, such as the ability to interpolate values
and filter using a range of timestamps.
- By default, a Pandas index is a series of integers ascending from 0, similar to the
indices of Python arrays. However, indices can use any NumPy data type, including
floating point, timestamps, or strings.
- Pandas supports hierarchical indices with multiple values per data point. An index
with this structure, called a "MultiIndex", allows a single DataFrame to represent
multiple dimensions, similar to a pivot table in Microsoft Excel. Each level of a
MultiIndex can be given a unique name.
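
The data model items above can be illustrated with a minimal sketch. It is not part of the patch; the variable names, sample values, and the `city`/`year` level names are made up for illustration, and it assumes pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

# A Series is a 1-dimensional labelled structure backed by a NumPy array.
# Without an explicit index, labels are integers ascending from 0.
readings = pd.Series([20.5, 21.0, 19.8])

# An index can also hold timestamps, which makes the Series a time series.
stamps = pd.date_range("2024-03-26", periods=4, freq="D")
temperature = pd.Series([10.0, np.nan, np.nan, 16.0], index=stamps)

# Fill the missing values by linear interpolation,
# then filter using a range of timestamps (both ends inclusive).
filled = temperature.interpolate()
window = filled.loc["2024-03-27":"2024-03-28"]

# A MultiIndex lets a single DataFrame represent more than two dimensions.
# Each level carries its own name (hypothetical: "city" and "year").
index = pd.MultiIndex.from_product(
    [["berlin", "vienna"], [2023, 2024]], names=["city", "year"]
)
frame = pd.DataFrame({"sales": [10, 12, 7, 9]}, index=index)

# Selecting on the outer level returns the rows for one city, indexed by year.
vienna = frame.loc["vienna"]
```

Selecting with a tuple, for example `frame.loc[("vienna", 2024), "sales"]`, addresses a single data point across both index levels.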
Member Author

The section about pandas was also a bit thin, so it has been expanded. /cc @surister

@amotl amotl marked this pull request as ready for review March 26, 2024 02:10
Server down or defunct: 500 Server Error: Internal Server Error for url.
After all the tutorials have been refactored into the CrateDB Guide,
the enumeration of catalog items became a bit of a lost place.

This improvement concludes the renovation on this end, by effectively
repurposing it into an ecosystem software catalog/gallery, similar to how
others are running them.

Other than this, the patch also adds concise navigation elements to the
top of each page, in order to add gravity towards the tutorial items.
@amotl amotl requested a review from matkuliak March 26, 2024 02:33
@amotl amotl merged commit 31bab43 into main Mar 27, 2024
4 checks passed
@amotl amotl deleted the renovate-catalog branch March 27, 2024 17:43