
Using CPU Image without ONNX #388

Open
ramipellumbi opened this issue Aug 22, 2024 · 0 comments

Feature request

TEI 1.5 introduced feat(onnx): add onnx runtime for better CPU perf #328. Since then, there appears to be no way to opt out of the ONNX backend on CPU. Requesting the ability to not use the ONNX backend.

// Current download logic: the backend is chosen at compile time via
// cargo features, so a build with the `ort` feature only ever fetches
// ONNX weights.
pub async fn download_weights(api: &ApiRepo) -> Result<Vec<PathBuf>, ApiError> {
    let model_files = if cfg!(feature = "python") || cfg!(feature = "candle") {
        match download_safetensors(api).await {
            Ok(p) => p,
            Err(_) => {
                tracing::warn!("safetensors weights not found. Using `pytorch_model.bin` instead. Model loading will be significantly slower.");
                tracing::info!("Downloading `pytorch_model.bin`");
                let p = api.get("pytorch_model.bin").await?;
                vec![p]
            }
        }
    } else if cfg!(feature = "ort") {
        tracing::info!("Downloading `model.onnx`");
        match api.get("model.onnx").await {
            Ok(p) => vec![p],
            Err(err) => {
                tracing::warn!("Could not download `model.onnx`: {err}");
                tracing::info!("Downloading `onnx/model.onnx`");
                let p = api.get("onnx/model.onnx").await?;
                vec![p.parent().unwrap().to_path_buf()]
            }
        }
    } else {
        unreachable!()
    };
    Ok(model_files)
}
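Since the branch above is selected by `cfg!` at compile time, one possible opt-out (a minimal sketch only, not TEI's actual API) is a runtime check consulted before taking the `ort` path, falling back to safetensors even in an ONNX-enabled build. The variable name `TEI_DISABLE_ONNX` below is hypothetical:

```rust
/// Hedged sketch: decide whether the ONNX backend should be used, given
/// the value of a hypothetical opt-out variable (e.g. `TEI_DISABLE_ONNX`).
/// `None` means the variable is unset, so current behavior is kept.
fn onnx_enabled(opt_out: Option<&str>) -> bool {
    match opt_out {
        Some(v) => {
            let v = v.trim().to_ascii_lowercase();
            // Setting the variable to "1" or "true" disables ONNX.
            v != "1" && v != "true"
        }
        // Default: keep today's behavior (ONNX on CPU builds with `ort`).
        None => true,
    }
}

fn main() {
    // The download code could then guard the `ort` branch with this check
    // and fall through to the safetensors path when it returns false.
    let opt_out = std::env::var("TEI_DISABLE_ONNX").ok();
    println!("use ONNX backend: {}", onnx_enabled(opt_out.as_deref()));
}
```

Alternatively, the same effect could be had purely at build time by compiling with the `candle` feature instead of `ort`, which is what the first branch of `download_weights` already handles.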

Motivation

Not all models have `.onnx` files posted, and there doesn't seem to be a reason to lock the CPU runtime to ONNX-only models for users who aren't concerned about peak performance.

Your contribution

Happy to open a PR adding an option to opt out of the ONNX runtime and build the old CPU image as it was.
