Tensor RT #285
base: master
Conversation
…forms Caffe inference.
…n laptop. DOES NOT compile, convenience commit.
… segfault, need precise step logs or debug.
…ymore, still not working.
```makefile
ifeq ($(DEEP_NET), tensorrt)
COMMON_FLAGS += -DUSE_TENSORRT
endif
endif
```
If DEEP_NET is tensorrt, then the else clause is never reached, hence lines 73-76 are not needed. It looks like the libraries and dirs that are part of the else clause will still be needed for tensorrt, however, so they should be moved up.
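A hypothetical sketch of the suggested restructuring (the variable names and library list below are illustrative assumptions, not the actual Makefile contents): the libraries and include dirs that both back ends need move above the conditional, while only the TensorRT-specific flags stay behind the DEEP_NET check.

```makefile
# Illustrative only: hoist what both back ends need above the conditional,
# so the tensorrt branch no longer depends on the else clause.
LIBRARIES    += caffe
INCLUDE_DIRS += $(CAFFE_DIR)/include

ifeq ($(DEEP_NET), tensorrt)
    COMMON_FLAGS += -DUSE_TENSORRT
    LIBRARIES    += nvinfer nvcaffe_parser
endif
```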
For now, TensorRT is used for the main inference in the middle of a pipeline that still uses Caffe; for example, I use Caffe blobs for input and output. I think it's lines 64-65 that should be removed.
Sorry for the wait, I got caught up in some other work; I'll now try to polish this ASAP.
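To illustrate the "Caffe blobs for input and output" idea, here is a minimal sketch (not the code in this PR; the binding names and the old TensorRT 2/3-era API are assumptions) of handing the GPU pointers of Caffe blobs directly to a TensorRT execution context:

```cpp
#include <caffe/blob.hpp>
#include <NvInfer.h>

// Sketch: run a prebuilt TensorRT engine on data that already lives in
// Caffe blobs, so the rest of the pipeline keeps seeing blobs.
void forwardTensorRT(nvinfer1::ICudaEngine& engine,
                     nvinfer1::IExecutionContext& context,
                     caffe::Blob<float>& inputBlob,
                     caffe::Blob<float>& outputBlob)
{
    // Binding names must match the layer names in the prototxt (assumed here).
    const int inputIndex  = engine.getBindingIndex("image");
    const int outputIndex = engine.getBindingIndex("net_output");

    // Reuse the blobs' device memory directly as TensorRT bindings,
    // so no extra host<->device copies are needed.
    void* bindings[2];
    bindings[inputIndex]  = inputBlob.mutable_gpu_data();
    bindings[outputIndex] = outputBlob.mutable_gpu_data();

    // Synchronous execution with batch size 1 (old TensorRT API).
    context.execute(1, bindings);
}
```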
Thanks @bushibushi, super excited to try this out.
```cpp
class OP_API PoseExtractorTensorRT : public PoseExtractor
{
public:
    PoseExtractorTensorRT(const Point<int>& netInputSize, const Point<int>& netOutputSize, const Point<int>& outputSize, const int scaleNumber,
```
Is the definition and implementation of `std::vector<int> getHeatMapSize() const;` needed here, since we are extending PoseExtractor?
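A minimal sketch of the point being raised (the exact base-class layout is an assumption): if PoseExtractor already stores the heat-map size and exposes it through a getter, the TensorRT subclass inherits that accessor and does not need to redeclare it.

```cpp
#include <vector>

// Assumed shape of the base class: it owns the heat-map size and exposes it.
class PoseExtractor
{
public:
    std::vector<int> getHeatMapSize() const { return mHeatMapSize; }    // inherited as-is
protected:
    std::vector<int> mHeatMapSize;
};

// The TensorRT specialization gets getHeatMapSize() from the base class;
// redeclaring it in the derived header would only duplicate the interface.
class PoseExtractorTensorRT : public PoseExtractor
{
public:
    void forwardPass() { /* TensorRT inference would go here */ }
};
```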
Hi @gineshidalgo99, I had a hard time applying the new PIMPL architecture to my TensorRT versions of netCaffe and poseExtractorCaffe. In the process it seems I broke the pipeline: my TensorRT network forwards GPU data to its output caffe::blob, but nothing is finally displayed. Any clues?
I am also having trouble with the runtime-only knowledge of net I/O dimensions, as TensorRT networks are not really geared for this.
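For readers unfamiliar with the change being discussed, here is a minimal sketch of the PIMPL (pointer-to-implementation) pattern as it would apply to a hypothetical NetTensorRT class (the member names and the ImplNetTensorRT struct are illustrative, not the actual OpenPose code):

```cpp
#include <memory>
#include <string>

// netTensorRT.hpp - the public header stays free of TensorRT/Caffe includes.
class NetTensorRT
{
public:
    NetTensorRT(const std::string& prototxtPath, const std::string& modelPath);
    ~NetTensorRT();                  // must be defined where ImplNetTensorRT is complete
    void forwardPass(const float* inputGpuData) const;
private:
    struct ImplNetTensorRT;          // forward declaration only
    std::unique_ptr<ImplNetTensorRT> upImpl;
};

// netTensorRT.cpp - the heavy dependencies live only here, e.g.:
// struct NetTensorRT::ImplNetTensorRT
// {
//     nvinfer1::ICudaEngine* engine;
//     nvinfer1::IExecutionContext* context;
// };
```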
Sorry for that huge internal change with the PIMPL paradigm and the now flexible output size; unfortunately it was completely required for future modularity and scalability, hence unavoidable :(

For the runtime-only knowledge of net I/O dimensions, I guess it's fine since the purpose of RT is real-time applications (i.e. video/webcam), so fixed I/O dimensions should be fine there. I guess implementing your own …

About forwarding output and getting nothing, that's definitely weird. Do you mean the GPU data is noise? Or its size is 0? Or something else? Maybe forget about GPU rendering (which requires params from the poseExtractor) and instead use CPU rendering and/or no display but JSON saving?

If I did not really answer your question, please ask again!
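As a concrete way to answer the "noise or size 0?" question, here is a small sketch (the blob variable name and expected shape are assumptions) that prints the output blob's shape and a few values; calling cpu_data() after the GPU forward makes Caffe sync the device memory back to the host:

```cpp
#include <caffe/blob.hpp>
#include <algorithm>
#include <iostream>

// Sketch: inspect what the TensorRT forward actually wrote into the output blob.
void inspectOutputBlob(const caffe::Blob<float>& outputBlob)
{
    // Print the shape, e.g. 1 x 57 x h x w for the COCO heat maps + PAFs.
    std::cout << "output shape: " << outputBlob.shape_string()
              << " (count = " << outputBlob.count() << ")" << std::endl;

    // cpu_data() triggers a device-to-host sync of the last GPU write,
    // so these values reflect what the network produced.
    const float* data = outputBlob.cpu_data();
    for (int i = 0; i < std::min(10, outputBlob.count()); ++i)
        std::cout << data[i] << " ";
    std::cout << std::endl;
}
```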
Hello @gineshidalgo99,
It's ok, it makes sense, it just made me regret not having completed my PR sooner!
Well, I'll go for the fixed-size version, with sh launch scripts to create and select the correct prototxt.
This is exactly what I did, but while converting it to PIMPL something went wrong.
I'll check again tonight.
Only thing: where is the info on the input size at runtime now?
It is not directly output, but you can print the size of each element of …
Hi @bushibushi, it looks like this is a pretty tough setup. Do you have any recommended steps to pick up where you left off? I'm trying to run TensorRT with what you have done so far, with a custom net resolution of 512x288.
Hi @gineshidalgo99 @bushibushi, is this issue resolved? I'm really looking forward to Tensor RT inference on the Jetson Nano. Thanks in advance!
Opening this for those wanting to test Tensor RT inference in advance. Still using FP32 in Caffe blobs for now to keep things simple, but still getting a 2x shorter inference time on the Jetson TX2. Did not test on bigger cards.
Documentation is not done yet and usage is still by hand: you have to duplicate the prototxt file, call it pose_deploy_linevec.prototxt_<desired_input_height>x<desired_input_width>, and make these values match the net_resolution used and the values in the new prototxt file.
The first run will take a bit of time to create the serialized TensorRT engine (see the caching sketch below); subsequent runs will be faster.
Branch compilation separation messed things up; if you want to test, check out commit b3ae8ec or the one before the merge.
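To make the "first run builds the engine, later runs are faster" behaviour concrete, here is a minimal sketch of serializing a TensorRT engine to disk and reloading it on later runs (the plan-file path, the buildEngineFromCaffe helper, and the old TensorRT 2/3-era API calls are assumptions, not the code in this branch):

```cpp
#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper that parses the fixed-size prototxt/caffemodel and
// builds the engine; its implementation is omitted here.
nvinfer1::ICudaEngine* buildEngineFromCaffe(nvinfer1::ILogger& logger);

// Sketch: cache a built engine on disk so only the first run pays the
// (slow) optimization cost; later runs just deserialize the plan file.
nvinfer1::ICudaEngine* loadOrBuildEngine(nvinfer1::ILogger& logger,
                                         const std::string& planPath)
{
    // 1. Try to reload a previously serialized engine.
    std::ifstream planFile(planPath, std::ios::binary);
    if (planFile.good())
    {
        std::vector<char> planData((std::istreambuf_iterator<char>(planFile)),
                                   std::istreambuf_iterator<char>());
        auto* runtime = nvinfer1::createInferRuntime(logger);
        return runtime->deserializeCudaEngine(planData.data(), planData.size(), nullptr);
    }

    // 2. Otherwise build it, serialize it, and save it for the next run.
    nvinfer1::ICudaEngine* engine = buildEngineFromCaffe(logger);
    nvinfer1::IHostMemory* plan = engine->serialize();
    std::ofstream out(planPath, std::ios::binary);
    out.write(static_cast<const char*>(plan->data()), plan->size());
    return engine;
}
```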