Add Warp affine #67
Can you add a benchmark for both Python and Rust? I'm measuring pretty much everything I put here.
Added benchmark. Initial results are pretty positive considering the implementation is pretty naive. (I run
The comparison might not be fair to OpenCV and PIL, since ndarray::Zip uses multi-threading via rayon, while I think OpenCV and PIL don't. For an image processing pipeline, single-thread performance might matter more, since multiple processes will be running concurrently.
kornia-py/src/warp.rs
Outdated
    width: new_size.0,
    height: new_size.1,
It should be height, then width.
@gau-nernst for benchmarks, use RUSTFLAGS="-C target-cpu=native" cargo bench read_jpeg so the compiler can optimise better. Regarding parallelism, I'm still exploring what the best strategy is; I mostly use the lib from Rust, where that's not an issue for me. I think OpenCV uses OpenCL by default, which is one way to implement parallelism. In the background I'm also exploring a safe multithreaded PyTorch dataloader for Rust that will be exposed to Python. But I'm happy to investigate and put numbers together to decide on the best strategy; maybe we need different backend implementations depending on the use case.
In the long run, I think that for speed we'll implement the algorithms once CUDA in Rust is officially supported.
Partially addresses #47.
Questions and problems:
Error
- Currently my warp_affine Python function also only works for 3-channel images, since the compiler forces me to add a type annotation when converting `PyImage` -> `Image`. I think this is caused by the image type casting, since your resize function doesn't need an annotation? (I need to cast the image from `u8` to `f32`, since our native interpolation only works for `f32`.)
- Continuing from above, the native interpolation should work for both `u8` and `f32` (and potentially other numeric types like `u16`?). For this to happen, we can create a trait that casts into `f32` (when reading a value from the array) and casts back to the original type (when writing a value to the array). I looked at some existing crates for numeric traits, but none seems to have this specific trait. We can create our own trait for this. There are also `From` and `Into` (need to check overflow/underflow behavior).
- For resizing, it seems like your logic is similar to PyTorch's `align_corners=True`? I think my implementation for warp affine is not quite `align_corners=True`? I need to look into this more and check against other implementations.
- For future extension, we can implement a struct `AffineTransform` so that we can do computations with it more easily (e.g. invert, compose multiple affine transforms, extract rotation/scale parameters). Right now I use a tuple of 6 numbers so that the Python binding doesn't need to depend on the `ndarray` crate.

Some outputs
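To make the casting-trait idea from the questions above more concrete, here is a minimal sketch using only the standard library. The names `PixelCast`, `to_f32`, `from_f32`, and `lerp` are hypothetical and not part of this PR or of any existing crate. It also relies on one assumption about the overflow/underflow question: converting back with a rounding `as` cast is acceptable, since in current Rust a float-to-integer `as` cast saturates at the integer bounds and maps NaN to 0.

```rust
/// Hypothetical trait (not in the PR): read a pixel as f32 for
/// interpolation, then write the f32 result back as the original type.
pub trait PixelCast: Copy {
    fn to_f32(self) -> f32;
    fn from_f32(value: f32) -> Self;
}

impl PixelCast for u8 {
    fn to_f32(self) -> f32 {
        f32::from(self) // lossless widening
    }
    fn from_f32(value: f32) -> Self {
        // `as` saturates: below 0 becomes 0, above 255 becomes 255, NaN becomes 0.
        value.round() as u8
    }
}

impl PixelCast for u16 {
    fn to_f32(self) -> f32 {
        f32::from(self) // lossless widening
    }
    fn from_f32(value: f32) -> Self {
        // Same saturating behavior, clamped to 0..=65535.
        value.round() as u16
    }
}

impl PixelCast for f32 {
    fn to_f32(self) -> f32 {
        self
    }
    fn from_f32(value: f32) -> Self {
        value
    }
}

/// Linear interpolation between two samples, computed in f32 and
/// converted back to the pixel type only when writing the result.
pub fn lerp<T: PixelCast>(a: T, b: T, t: f32) -> T {
    let (a, b) = (a.to_f32(), b.to_f32());
    T::from_f32(a + (b - a) * t)
}
```

With something like this, the interpolation code could be written once against `T: PixelCast` and called as `lerp(10u8, 20u8, 0.5)` or `lerp(0.0f32, 1.0f32, 0.25)`. Whether rounding plus saturation is the right policy for integer outputs is a design choice that would still need to be checked against other implementations.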