Implement MPNet model #363

kozistr · 2024-07-28T17:19:59Z

What does this PR do?

Fixes #250
Fixes #33

Feedback and contributions are welcome!

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@OlivierDehaene OR @Narsil

ramipellumbi · 2024-08-18T02:22:01Z

Unable to run on Metal local install:

Error: Model backend is not healthy

Caused by:
    Metal contiguous affine U8 not implemented

I can try taking a crack at this if desired

kozistr · 2024-08-18T04:35:49Z

Thanks for checking in. That sounds great! I'd really appreciate your input. Please feel free to dive in whenever you are available :)

kozistr · 2024-08-26T06:26:15Z

I just made a small change to temporarily disable support for Metal devices: 13ebffb

ramipellumbi · 2024-08-26T21:22:09Z

Thank you :) I will look into this. Sorry, I have been too busy to get to it lately.

kozistr · 2024-08-27T09:29:28Z

No worries! Take your time :) If you need any help, please feel free to mention me here.

kozistr · 2024-09-22T06:44:39Z

@ramipellumbi I just pushed a commit that fixes the Metal issue: 9b58292

@OlivierDehaene I think this PR is ready for review. The router now starts the MPNet model on a Metal device:

-> % ./target/release/text-embeddings-router --model-id sentence-transformers/all-mpnet-base-v2 --dtype float32 --pooling mean --port 8080
2024-09-22T06:38:19.976274Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "sen*****-************/***-*****-***e-v2", revision: None, tokenization_workers: None, dtype: Some(Float32), pooling: Some(Mean), max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "0.0.0.0", port: 8080, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-09-22T06:38:19.977004Z  INFO hf_hub: /Users/taehyeonjeon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/Users/taehyeonjeon/.cache/huggingface/token"
2024-09-22T06:38:19.981266Z  INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:62: Downloading `config_sentence_transformers.json`
2024-09-22T06:38:19.981292Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:21: Starting download
2024-09-22T06:38:19.981296Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:23: Downloading `config.json`
2024-09-22T06:38:19.981314Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Downloading `tokenizer.json`
2024-09-22T06:38:19.981359Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:328: Downloading `model.safetensors`
2024-09-22T06:38:19.981380Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:32: Model artifacts downloaded in 87.625µs
2024-09-22T06:38:19.989996Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 384
2024-09-22T06:38:19.990076Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 8 tokenization workers
2024-09-22T06:38:20.006653Z  INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
2024-09-22T06:38:20.017132Z  INFO text_embeddings_backend_candle: backends/candle/src/lib.rs:233: Starting MPNet model on Metal(MetalDevice(DeviceId(1)))
2024-09-22T06:38:21.068835Z  INFO text_embeddings_router::http::server: router/src/http/server.rs:1778: Starting HTTP server: 0.0.0.0:8080
2024-09-22T06:38:21.068848Z  INFO text_embeddings_router::http::server: router/src/http/server.rs:1779: Ready
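
The command above starts the router with `--pooling mean`. As a rough illustration of what mean pooling does to per-token embeddings, here is a minimal sketch in plain Python. It is not the router's or the Candle backend's implementation; the function name `mean_pool` and the toy inputs are hypothetical and exist only for this example.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the tokens whose mask value is 1.

    Hypothetical helper for illustration only; padding tokens (mask 0)
    are excluded so they do not skew the sentence embedding.
    """
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask value is required per token")
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    # Column-wise mean over the kept token vectors.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: two real tokens and one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding row is ignored because its mask value is 0, which is why an attention mask matters when batched inputs have different lengths.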

vrdn-23 · 2024-10-25T03:00:56Z

@OlivierDehaene would it be possible to get this PR merged?

kozistr added 6 commits July 29, 2024 02:10

feature: mpnet

3174fe9

update: mpnet

4bb410e

fix: position_ids for a single input

8ad7249

add: mpnet snap

784819a

add: mpnet snaps

3393035

fix: always calculate attention_mask when masking is true

7561550

kozistr marked this pull request as ready for review July 29, 2024 16:43

kozistr added 2 commits July 30, 2024 01:52

docs: MPNet

b536d24

refactor: move relative_attention_bias to encoder

61581e6

kozistr mentioned this pull request Jul 31, 2024

sbert based mpnet model(related issue #33) #250

kozistr added 2 commits August 3, 2024 19:31

fix: MPNetAttention

32fe7ef

fix: bias

f54126c

update: exclude support for metal device

13ebffb

kozistr added 2 commits September 22, 2024 14:58

update: allow metal device

fa1eff3

fix: affine u8 on Metal device

9b58292
