LLM_info

Useful information about LLMs and their environment is collected here.

Open source projects and frameworks

Serving

  1. vLLM: a fast and easy-to-use library for LLM inference and serving (blog)
  2. RouteLLM: a framework for serving and evaluating LLM routers (paper)
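The core idea behind LLM routers such as RouteLLM is to score each incoming query and send cheap/easy queries to a weak model and hard ones to a strong model, trading cost against quality. A minimal toy sketch of that idea (the scoring heuristic, threshold, and model names below are made up for illustration and are not the RouteLLM API):

```python
# Toy illustration of LLM routing: score query difficulty, pick a model tier.
# The heuristic and model names are invented for this sketch.

def difficulty_score(query: str) -> float:
    """Crude heuristic: longer queries and 'hard' keywords raise the score."""
    keywords = ("prove", "derive", "optimize", "debug")
    score = min(len(query) / 200.0, 1.0)
    score += 0.5 * sum(kw in query.lower() for kw in keywords)
    return min(score, 1.0)

def route(query: str, threshold: float = 0.5) -> str:
    """Send hard queries to the strong (expensive) model, the rest to the weak one."""
    return "strong-model" if difficulty_score(query) >= threshold else "weak-model"

print(route("What is 2 + 2?"))                       # prints "weak-model"
print(route("Prove that the algorithm is optimal"))  # prints "strong-model"
```

A real router replaces the heuristic with a learned classifier trained on preference data; the routing decision itself stays this simple.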

Optimization

  1. DeepSpeed: a deep learning optimization library that makes distributed training easy, efficient, and effective
  2. TVM: a compiler stack for deep learning systems, designed to close the gap between productivity-focused deep learning frameworks and performance- and efficiency-focused hardware backends
  3. FireOptimizer: customizing latency and quality for your production inference workload
  4. GGML: tensor library for machine learning
  5. Medusa: a simple framework that democratizes acceleration techniques for LLM generation with multiple decoding heads
  6. Optimal Brain Compression (OBC): a framework for accurate PTQ and pruning (paper)
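The post-training quantization (PTQ) entries above (GGML, OBC) all build on one primitive: mapping float weights to low-bit integers via a scale factor. A minimal sketch of symmetric 8-bit uniform quantization with a single per-tensor scale (real frameworks use per-group scales, calibration data, and error-compensating updates as in OBC):

```python
# Symmetric uniform quantization sketch: w ~= scale * q,
# with q an integer in [-(2^(b-1) - 1), 2^(b-1) - 1].

def quantize(weights, bits=8):
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax  # one scale for the whole tensor
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [scale * x for x in q]

w = [0.1, -0.5, 0.25, 1.0]
q, s = quantize(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q)  # prints [13, -64, 32, 127]
```

The rounding error is bounded by half the scale, which is why per-group scales (smaller groups, smaller max-abs, smaller scale) recover accuracy at low bit-widths.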

Business-logic over LLM

  1. LangChain: a framework for developing applications powered by large language models (LLMs)

AI Agents and platforms

  1. OPEA: Open Platform for Enterprise AI from Intel (examples in github)
  • GenAIComps: a service-based tool providing microservice components such as LLM, embedding, reranking, and so on
  • GenAIInfra: part of the OPEA containerization and cloud-native suite; enables quick and efficient deployment of GenAIExamples in the cloud
  • GenAIEval: measures service performance metrics such as throughput, latency, and accuracy for GenAIExamples, making it easy to compare performance across hardware configurations

Common information in tutorials and blogs

Optimization concepts

  1. Speculative decoding
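In a nutshell, speculative decoding lets a small draft model propose k tokens cheaply, then the large target model verifies them in a single pass, keeping the longest agreeing prefix. A toy greedy-agreement sketch with deterministic made-up "models" (real implementations, such as Medusa above, verify against the models' probability distributions via rejection sampling, and the target verifies all k drafts in one batched forward pass):

```python
def draft_model(prefix):
    # Toy draft model: cheap but imperfect; it stumbles after "abc".
    guess = "abcxyz"
    return guess[len(prefix) % len(guess)]

def target_model(prefix):
    # Toy target model: always predicts the next letter of the alphabet.
    return chr(ord("a") + len(prefix))

def speculative_step(prefix, k=4):
    """Draft k tokens, keep the prefix the target agrees with, plus one corrected token."""
    drafted, p = [], prefix
    for _ in range(k):
        t = draft_model(p)
        drafted.append(t)
        p += t
    accepted, p = [], prefix
    for t in drafted:
        if target_model(p) == t:              # target verifies the draft token
            accepted.append(t)
            p += t
        else:
            accepted.append(target_model(p))  # replace first mismatch, stop
            break
    return prefix + "".join(accepted)

print(speculative_step(""))  # prints "abcd": 3 drafts accepted + 1 correction
```

Each step costs one target-model pass but can emit up to k+1 tokens, which is where the speedup comes from; in the worst case (every draft rejected) it still emits one correct token.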

API

  1. OpenAI API
  2. Using logprobs from OpenAI
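The logprobs returned by the OpenAI API are natural-log token probabilities: exponentiating one recovers the model's per-token confidence, summing them gives the log-probability of the whole completion, and the negative mean exponentiated gives perplexity. A small sketch with made-up logprob values (not an actual API response):

```python
import math

# Hypothetical per-token logprobs, shaped like what the API returns.
token_logprobs = [-0.05, -0.6931, -1.2]

probs = [math.exp(lp) for lp in token_logprobs]  # per-token confidence
sequence_logprob = sum(token_logprobs)           # log P(whole completion)
perplexity = math.exp(-sequence_logprob / len(token_logprobs))

print([round(p, 3) for p in probs])  # prints [0.951, 0.5, 0.301]
```

This is handy for filtering low-confidence completions or comparing candidate answers by total sequence log-probability.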

Benchmarks

  1. LLM evals and benchmarking

Papers

A raw list of papers sorted by general topics.
A brief description and analysis of the current state, based on the papers, can also be found there.
