LLM Services API is a FastAPI-based application that provides a suite of natural language processing services through a REST API, using machine learning models from Hugging Face's transformers library. The application is designed to run in a Docker container and provides endpoints for text summarization, sentiment analysis, named entity recognition, paraphrasing, keyword extraction, and embedding generation. The entire API is secured with an API key passed in `Bearer <token>` format, ensuring that only authorized users can access the endpoints.
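For example, every request must carry the key in the `Authorization` header. A minimal sketch using `curl`, assuming the server is running locally on port 5000 and `your-key-here` is the key from your `.env` file:

```bash
# All endpoints require the API key in Bearer format (placeholder key shown).
curl -X POST http://localhost:5000/summarize \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"text": "Your text here"}'
```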
The service allows flexibility in model selection through command-line arguments and a configuration file, `models_config.json`, enabling users to specify different Hugging Face models for various NLP tasks. Users can select lightweight models for lower-resource environments or more powerful models for advanced tasks.
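A sketch of what such a configuration might look like; note that the key names here are assumptions inferred from the command-line flags listed below, not a confirmed schema, and the model names are illustrative choices:

```bash
# Illustrative only: these keys are assumptions inferred from the CLI flags,
# not the confirmed models_config.json schema.
cat > models_config.json <<'EOF'
{
  "embedding_model": "all-MiniLM-L6-v2",
  "summarization_model": "facebook/bart-large-cnn",
  "sentiment_model": "distilbert-base-uncased-finetuned-sst-2-english",
  "ner_model": "dbmdz/bert-large-cased-finetuned-conll03-english",
  "paraphrase_model": "t5-base",
  "keyword_model": "all-MiniLM-L6-v2"
}
EOF
```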
**0.0.4**
- Tokenization: Convert input text into a list of token IDs, allowing you to process and manipulate text at the token level; default model `all-MiniLM-L6-v2`.
- Detokenization: Reconstruct the original text from a list of token IDs, allowing you to reverse the tokenization process; default model `all-MiniLM-L6-v2`.
**0.0.3**
- Adaptive Throttling: Implemented an adaptive throttling mechanism that delays requests using the `Retry-After` header when errors are encountered due to high request frequency or processing failures. The delay is dynamically adjusted based on the client's request rate and error occurrences.
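Clients can cooperate with this mechanism by backing off when the header is present. A minimal sketch with `curl`, which honors `Retry-After` on 429/503 responses when `--retry` is enabled (curl 7.66.0 or later); the endpoint and key are placeholders:

```bash
# curl re-attempts transient failures and respects the server's Retry-After
# header on HTTP 429/503 when --retry is used (curl >= 7.66.0).
curl --retry 5 --retry-max-time 120 \
  -X POST http://localhost:5000/embed \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"text": "Your text here"}'
```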
**0.0.2**
- OpenAI-Compatible Embeddings: Provides an endpoint that mimics the OpenAI embedding API, allowing easy integration with existing systems expecting OpenAI-like responses.
- Configurable Model Loading: Customize which Hugging Face NLP models are loaded by providing command-line arguments or configuring the `models_config.json` file. This flexibility allows the application to adapt to different resource environments or use cases.
- Text Summarization: Generate concise summaries of long texts; default model `BART`.
- Sentiment Analysis: Determine the sentiment of text inputs; default model `DistilBERT`.
- Named Entity Recognition (NER): Identify entities within text and sort them by frequency; default model `BERT` (`dbmdz/bert-large-cased-finetuned-conll03-english`).
- Paraphrasing: Rephrase sentences to produce semantically similar outputs; default model `T5`.
- Keyword Extraction: Extract important keywords from text, with customizable output count; default model `KeyBERT`.
- Embedding Generation: Create vector representations of text; default model `SentenceTransformers` (`all-MiniLM-L6-v2`).
- Caching with LRU: Frequently used computations, such as generating embeddings and tokenizations, are cached using the Least Recently Used (LRU) strategy. This reduces response times for repeated requests and enhances overall performance.
- Python 3.7+
- FastAPI
- Uvicorn
- spaCy
- transformers
- sentence-transformers
- keybert
- torch
- python-dotenv (for environment variable management)
To get started with the LLM Services API, follow these steps:
- Clone the Repository:

```bash
git clone https://github.com/samestrin/llm-services-api.git
cd llm-services-api
```
- Create a Virtual Environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```
- Install the Dependencies:

```bash
pip install -r requirements.txt
```
- Download the spaCy Model:

```bash
python -m spacy download en_core_web_sm
```
- Create Your `.env` File:

```bash
echo "API_KEY=your-key-here" > .env
```
- Run the Application Locally:

You can run the application locally in two ways:

- Using Uvicorn (the recommended method for a development or production-like environment):

```bash
uvicorn main:app --reload --port 5000
```

- Using Python (this method allows you to pass command-line arguments for customizing models):

```bash
python main.py --embedding-model all-MiniLM-L6-v2 --summarization-model facebook/bart-large-cnn
```

Replace `--embedding-model` and `--summarization-model` with the models you wish to use. This approach offers flexibility by allowing you to specify different models for various NLP tasks.
```text
-h, --help                                  Show this help message and exit
--embedding-model EMBEDDING_MODEL           Specify embedding model
--summarization-model SUMMARIZATION_MODEL   Specify summarization model
--sentiment-model SENTIMENT_MODEL           Specify sentiment analysis model
--ner-model NER_MODEL                       Specify named entity recognition model
--paraphrase-model PARAPHRASE_MODEL         Specify paraphrasing model
--keyword-model KEYWORD_MODEL               Specify keyword extraction model
```
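Several flags can be combined in a single invocation; the model names below are illustrative choices, not required values:

```bash
# Hypothetical invocation mixing task-specific models.
python main.py \
  --sentiment-model distilbert-base-uncased-finetuned-sst-2-english \
  --ner-model dbmdz/bert-large-cased-finetuned-conll03-english \
  --keyword-model all-MiniLM-L6-v2
```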
To run the application in a Docker container, follow these steps:
- Build the Docker Image:

```bash
docker build -t llm-services-api .
```

- Run the Docker Container:

```bash
docker run -p 5000:5000 llm-services-api
```

The application will be accessible at http://localhost:5000.
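If the container needs the API key from your `.env` file (a reasonable assumption given the Bearer authentication described above), you can pass it through Docker's `--env-file` option:

```bash
# Pass API_KEY (and any other variables) from .env into the container.
docker run -p 5000:5000 --env-file .env llm-services-api
```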
The API provides several endpoints for various NLP tasks. Below is a summary of the available endpoints:
- Endpoint: `/summarize`
- Method: `POST`
- Request Body:

```json
{
  "text": "Your text here"
}
```

- Response:

```json
{
  "summary": "The generated summary of the provided text."
}
```
- Endpoint: `/sentiment`
- Method: `POST`
- Request Body:

```json
{
  "text": "Your text here"
}
```

- Response:

```json
{
  "sentiment": [
    {
      "label": "POSITIVE", # or "NEGATIVE"
      "score": 0.99
    }
  ]
}
```
- Endpoint: `/entities`
- Method: `POST`
- Request Body:

```json
{
  "text": "Your text here"
}
```

- Response:

```json
{
  "entities": [
    {
      "entity": "PERSON",
      "word": "John Doe",
      "frequency": 3
    },
    ...
  ]
}
```
- Endpoint: `/paraphrase`
- Method: `POST`
- Request Body:

```json
{
  "text": "Your text here"
}
```

- Response:

```json
{
  "paraphrased_text": "The paraphrased version of the input text."
}
```
- Endpoint: `/extract_keywords`
- Method: `POST`
- Query Parameters: `num_keywords`: Optional, defaults to 5. Specifies the number of keywords to extract.
- Request Body:

```json
{
  "text": "Your text here"
}
```

- Response:

```json
{
  "keywords": [
    {
      "keyword": "important keyword",
      "score": 0.95
    },
    ...
  ]
}
```
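Since `num_keywords` is a query parameter rather than part of the JSON body, a request for ten keywords might look like this (local server and placeholder key assumed):

```bash
# num_keywords is passed in the URL, not the JSON body.
curl -X POST "http://localhost:5000/extract_keywords?num_keywords=10" \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"text": "Your text here"}'
```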
- Endpoint: `/embed`
- Method: `POST`
- Request Body:

```json
{
  "text": "Your text here"
}
```

- Response:

```json
{
  "embedding": [0.1, 0.2, 0.3, ...] # Array of float numbers representing the text embedding
}
```
- Endpoint: `/v1/embeddings`
- Method: `POST`
- Request Body:

```json
{
  "input": "Your text here",
  "model": "all-MiniLM-L6-v2" # or another supported model
}
```

- Response:

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.006929283495992422, -0.005336422007530928, ...] # Embedding array
    }
  ],
  "model": "all-MiniLM-L6-v2",
  "usage": {
    "prompt_tokens": 5, # Number of tokens in the input
    "total_tokens": 5 # Total number of tokens processed
  }
}
```
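Because this endpoint mimics the OpenAI embeddings API, a client that already targets OpenAI can usually be pointed at it by changing only the base URL and key. For example, with `curl` (local server and placeholder key assumed):

```bash
# Same request shape as OpenAI's /v1/embeddings endpoint.
curl -X POST http://localhost:5000/v1/embeddings \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"input": "Your text here", "model": "all-MiniLM-L6-v2"}'
```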
- Endpoint: `/tokenize`
- Method: `POST`
- Request Body:

```json
{
  "text": "Your text here",
  "model": "all-MiniLM-L6-v2" # Optional, specify a model for tokenization
}
```

- Response:

```json
{
  "tokens": [101, 7592, 999, ...] # Array of token IDs representing the text
}
```

This endpoint allows you to tokenize input text using a specified or default model. If the `model` field is not provided, the default embedding model, `all-MiniLM-L6-v2`, will be used.
- Endpoint: `/detokenize`
- Method: `POST`
- Request Body:

```json
{
  "tokens": [101, 2023, 2003, 2019, 2742, 6251, 2000, 19204, 1012, 102], # List of token IDs
  "model": "all-MiniLM-L6-v2" # Optional, specify a model for detokenization
}
```

- Response:

```json
{
  "text": "This is an example sentence to tokenize." # The reconstructed text
}
```
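Together, the two endpoints form a reversible round trip: token IDs returned by `/tokenize` can be fed straight back to `/detokenize`. A sketch with `curl` and `jq` (both assumed installed; local server and placeholder key as before):

```bash
# Tokenize, then reconstruct the text from the returned token IDs.
TOKENS=$(curl -s -X POST http://localhost:5000/tokenize \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"text": "This is an example sentence to tokenize."}' | jq -c '.tokens')

curl -s -X POST http://localhost:5000/detokenize \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d "{\"tokens\": $TOKENS}"
```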
Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes or improvements.
This project is licensed under the MIT License - see the LICENSE file for details.