Inference GPU requirements #797
May I know how much GPU RAM is required to run inference with the model?

```python
%matplotlib inline
import os
# Let's pick the desired backend
# os.environ['USE_TF'] = '1'
os.environ['USE_TORCH'] = '1'

import matplotlib.pyplot as plt
import torch

from doctr.io import DocumentFile
from doctr.models import ocr_predictor

doc = DocumentFile.from_pdf("IOA.pdf").as_images()
print(f"Number of pages: {len(doc)}")

predictor = ocr_predictor(pretrained=True)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
predictor.half().to(device)
doc[0] = torch.tensor(doc[0]).to(device)

result = predictor(doc)
```

It gives me a CUDA out of memory error.
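For what it's worth, peak GPU usage can be measured directly with PyTorch's built-in memory stats. A minimal sketch, assuming `predictor` and `doc` from the snippet above and a GPU large enough for the pass to complete:

```python
import torch

# Release cached blocks and reset the peak-memory counter before inference
torch.cuda.empty_cache()
torch.cuda.reset_peak_memory_stats()

result = predictor(doc)  # predictor and doc defined in the snippet above

# Report the high-water mark of GPU memory allocated during the pass
peak = torch.cuda.max_memory_allocated()
print(f"Peak GPU memory during inference: {peak / 1024**3:.2f} GiB")
```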
Replies: 1 comment
Hello @kumar-rajwnai 👋

My apologies for the late reply!

This looks more like a bug report, would you mind sharing details about your environment? (by pasting the results of running this script: https://github.com/mindee/doctr/blob/main/scripts/collect_env.py)

Apart from that, early answers/remarks:

- I'd avoid `.half()`, which only switches the model to fp16; use a `torch.cuda.amp.autocast` context instead, as described here: https://pytorch.org/docs/stable/notes/amp_examples.html (see the sketch after this comment)

Here is a snippet that runs on CPU:

```python
%matplotlib inline
import os
# Let's pick the desired backend
os.environ['USE_TORCH'] = '1'

import torch

from doctr.io import DocumentFile
from doctr.models import ocr_predictor

doc = DocumentFile.from_pdf("IOA.pdf").as_images()
print(f"Number of pages: {len(doc)}")
predictor = ocr_predictor(pretrained=True)

# On CPU
result = predictor(doc)
```

For the GPU part, I just opened #808 to make it simpler for users: once merged, you will only have to add …

Hope this helps!
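A minimal sketch of that autocast approach, assuming the predictor behaves as a `torch.nn.Module` (the `.to(device)` call in the question suggests it does) and that a CUDA device is available:

```python
import os
os.environ['USE_TORCH'] = '1'

import torch

from doctr.io import DocumentFile
from doctr.models import ocr_predictor

doc = DocumentFile.from_pdf("IOA.pdf").as_images()
predictor = ocr_predictor(pretrained=True)

if torch.cuda.is_available():
    # Assumption: the predictor is a torch module, so .cuda() moves its weights
    predictor = predictor.cuda()
    # Keep weights in fp32 and let autocast run eligible ops in fp16,
    # rather than forcing the whole model to fp16 with .half()
    with torch.no_grad(), torch.cuda.amp.autocast():
        result = predictor(doc)
else:
    result = predictor(doc)
```

Unlike `.half()`, autocast keeps the weights in fp32 and only casts selected ops to fp16, which avoids the numerical issues a blanket cast can trigger.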