swiss-ai
Popular repositories
- nanotron Public Forked from huggingface/nanotron
Minimalistic large language model 3D-parallelism training
- video2dataset Public Forked from iejMac/video2dataset
Easily create large video datasets from video URLs
Python 1
- Megatron-LLM Public Forked from epfLLM/Megatron-LLM
Distributed trainer for LLMs
Python
- data-PDF-pipeline Public
PDF pipeline for creating training corpora (mainly for the LLM, multimodal, and alignment horizontals)
Python
Repositories
- mmore Public
Massive Multimodal Open RAG & Extraction. A scalable multimodal pipeline for processing, indexing, and querying multimodal documents. Ever needed to take 8000 PDFs, 2000 videos, and 500 spreadsheets and feed them to an LLM as a knowledge base? Well, MMORE is here to help you!
- lighteval Public Forked from huggingface/lighteval
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
- ethel-tutor-eval Public
- ml-4m Public Forked from apple/ml-4m
4M: Massively Multimodal Masked Modeling (NeurIPS 2023 Spotlight)
- data-tooling Public Forked from huggingface/datatrove
Toolset for data preparation and selection in the context of Swiss-AI (forked from DataTrove)
- nanotron-multilingual Public Forked from swiss-ai/nanotron
A copy of nanotron for multilingual training