Playground for testing front-end visualizations of vector clustering.

nulib/nul-dsp

NUL Data Science Project UI

A NextJS UI application for the NUL Data Science Project.

Public preview branch: https://main.d3nyatpv9uoqqk.amplifyapp.com/

Visualizations

The app features visualizations of vector clustering. The 3D scatterplots are rendered with Plotly.js, and the 2D scatterplot is rendered with D3.js.

The goal is to visualize the clusters of the embeddings and the metadata in a way that is intuitive and informative.

Data

The dataset is a combination of metadata and vector embeddings of that metadata from NUL Digital Collections. (Where does the data come from eventually?)

Notebook

For initial testing, ndjson data is generated by the src/lib/notebooks/convert.ipynb notebook (which can be run inside VSCode in your dev environment). The number of records created and the number of dimensions can be adjusted in the notebook. The default values are:

output_row_count = 50
number_of_dimensions = 3
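
To illustrate what these two settings control, here is a minimal sketch and not the notebook's actual code. It assumes each ndjson line is a JSON object with an "embedding" list (a hypothetical field name), and it uses made-up in-memory data in place of the real source file. Trimming each vector to its first few values is only a stand-in for a real dimensionality reduction. The sample file names suggest the notebook uses t-SNE, which would need an extra library.

```python
import io
import json
import random

output_row_count = 50      # default from the notebook
number_of_dimensions = 3   # default from the notebook


def sample_records(lines, row_count, dimensions, seed=0):
    """Parse ndjson lines, sample up to row_count records, and trim each vector.

    Assumes every line is a JSON object with an "embedding" list
    (hypothetical field name). Keeping the first `dimensions` values is a
    placeholder for a real dimensionality reduction step.
    """
    records = [json.loads(line) for line in lines if line.strip()]
    rng = random.Random(seed)
    chosen = rng.sample(records, min(row_count, len(records)))
    return [
        {**record, "embedding": record["embedding"][:dimensions]}
        for record in chosen
    ]


# Tiny in-memory stand-in for the source vectors file (made-up data).
fake_ndjson = io.StringIO(
    "\n".join(
        json.dumps({"id": i, "embedding": [i * 0.1, i * 0.2, i * 0.3, i * 0.4]})
        for i in range(200)
    )
)

reduced = sample_records(fake_ndjson, output_row_count, number_of_dimensions)
print(len(reduced), len(reduced[0]["embedding"]))  # prints "50 3"
```

Raising output_row_count yields more points per chart, and number_of_dimensions decides whether the output suits the 2D or the 3D scatterplot.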

To access the source vectors file src/lib/data/vectors_full.ndjson (and run the notebook), please contact Brendan or a member of the RDC team for a download link (~700MB).

Some sample output data is included in the src/lib/data directory, to feed chart components:

src/lib/data/vectors_tsne_2d_reduced.json
src/lib/data/vectors_tsne_3d_reduced.json
...

Run the app

pnpm install

pnpm dev

Tests

Tests are configured to run with Playwright.

Headless tests (quick)

pnpm test:e2e

Tests with UI (slower, better visuals)

pnpm test:e2e-ui
