Wasm / frontend support #84

Open
Isaac-Leonard opened this issue Jun 2, 2024 · 5 comments

Comments

@Isaac-Leonard

Isaac-Leonard commented Jun 2, 2024

I see that there is a Node.js example in the repo, but is it possible to use this on the frontend yet, with TypeScript support?

@robertknight
Owner

There is a WASM build, but you have to build from source as it isn't published to e.g. npm yet. To try this out locally:

  1. Clone the repository and run make wasm
  2. You can try out the JS + WASM demo in https://github.com/robertknight/ocrs/tree/main/js/examples/ocr-node

You can also build for WASI and run with e.g. wasmtime. See line 46 of ocrs/Makefile (at commit fe6de19):

# wasmtime --dir . target/wasm32-wasi/release/ocrs.wasm --detect-model text-detection.rten --rec-model text-recognition.rten ocrs-cli/test-data/why-rust.png

To manage expectations, I will warn you that the WASM build is currently much slower than the native build. This is for two reasons:

  1. The native build is multi-threaded, the WASM build is not
  2. Native SIMD is much faster than WASM SIMD

@Isaac-Leonard
Author

Okay, I've got it working now. I'm not too worried about speed, and I'm getting far more useful results than I was getting from tesseract.js, so I'm pretty happy.
Should I leave this open till it's available on npm?

@robertknight
Owner

Should I leave this open till it's available on npm?

Yes, I think so. There isn't an existing tracking issue for publishing an npm package.

@hbmartin

hbmartin commented Sep 16, 2024

I'm trying to implement this in the browser, but I keep getting the error "Unhandled Promise Rejection: parse error: Range [118622216, 118622220) is out of bounds." when calling setDetectionModel, which hits the throw at if (r1) { throw takeFromExternrefTable0(r0); }. I suspect I may be loading the data from the fetch call incorrectly, but I haven't been able to find the issue.

Steps I took:

  1. rustup target add wasm32-unknown-unknown
  2. cargo install wasm-bindgen-cli
  3. Download .onnx models from https://huggingface.co/robertknight/ocrs/tree/main
  4. HTTP fetch WASM and models and init

Implementation:

async function fetchAsBinary(path: string): Promise<Uint8Array> {
    const response = await fetch(path)
    const data = await response.arrayBuffer()
    return new Uint8Array(data)
}

async function initEngine(): Promise<OcrEngine> {
    // Fetch the WASM binary and wait for the module to finish initializing
    // before constructing any library objects.
    const wasmBinary = await fetchAsBinary("/ocrs_bg.wasm");
    console.log("Loaded wasm")
    console.log(wasmBinary.length)
    await initOcrLib(wasmBinary)

    // Fetch both models in parallel.
    const [detectionModel, recognitionModel] = await Promise.all([
        fetchAsBinary("/text-detection-ssfbcj81.onnx"),
        fetchAsBinary("/text-rec-checkpoint-s52qdbqt.onnx"),
    ]);
    console.log("Loaded detectionModel")
    console.log(detectionModel.length)
    console.log("Loaded recognitionModel")
    console.log(recognitionModel.length)

    // Configure the engine with the loaded models.
    const ocrInit = new OcrEngineInit();
    ocrInit.setDetectionModel(detectionModel);
    ocrInit.setRecognitionModel(recognitionModel);

    const ocrEngine = new OcrEngine(ocrInit);
    return ocrEngine;
}

The length log statements match the size of the models on disk.

@robertknight
Owner

Hello @hbmartin, this library reads models in .rten format rather than .onnx. RTen models are produced from ONNX models using rten-convert. You can find links to the current models in https://github.com/robertknight/ocrs/blob/main/ocrs/examples/download-models.sh. There are models in .rten format on Hugging Face as well, but those require some configuration changes to work with the current version of ocrs. Specifically, some of the thresholds need to be adjusted.
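
For anyone hitting the same parse error, here is a rough sketch of a loader that fails early when the model path is not a .rten file or the HTTP request fails. Note that fetch does not reject on a 404, so the response status is checked explicitly. The names FetchLike, assertRtenPath and fetchModel are made up for this example and are not part of the ocrs API. The fetch function is passed in as a parameter purely so the helper can be exercised with a stub.

```typescript
// Sketch only: `FetchLike`, `assertRtenPath` and `fetchModel` are hypothetical
// helpers for illustration, not part of the ocrs API.

// Minimal shape of the fetch function this sketch relies on, so that it can
// be replaced by a stub in tests.
type FetchLike = (path: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

// Reject model paths that do not end in ".rten" before any bytes are loaded.
function assertRtenPath(path: string): void {
  if (!path.toLowerCase().endsWith(".rten")) {
    throw new Error(
      `Expected a .rten model, got "${path}". Convert ONNX models with rten-convert first.`
    );
  }
}

// Fetch a model as bytes, failing early on a wrong extension or an HTTP error.
async function fetchModel(path: string, fetchFn: FetchLike): Promise<Uint8Array> {
  assertRtenPath(path);
  const response = await fetchFn(path);
  if (!response.ok) {
    throw new Error(`Failed to fetch ${path}: HTTP ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

In the browser this would be called as fetchModel("/text-detection.rten", fetch), with the result passed to setDetectionModel as in the snippet above. The file name here is just the one used in the Makefile comment; use whatever name you saved the model under.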
