Add Warp affine #67

Merged
merged 21 commits into from
Apr 2, 2024

gau-nernst
Contributor

Partially address #47

Questions and problems:

  1. From Python, do we support images with n_channels != 3? The following code fails
import kornia_rs as K
import numpy as np

img = np.random.randint(0, 255, size=(256, 256, 1), dtype=np.uint8)
K.resize(img, (40, 40), "bilinear")

Error

Traceback (most recent call last):
  File "[redacted]/kornia-rs/kornia-py/test.py", line 8, in <module>
    K.resize(img, (40, 40), "bilinear")
Exception: Data length (65536) does not match the image size (196608)
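
For reference, the two numbers in the error are consistent with a fixed 3-channel assumption: the array holds 256 × 256 × 1 = 65536 values, while a 3-channel image of that size needs 256 × 256 × 3 = 196608. A minimal Rust sketch of such a check, using a hypothetical `Img` type with a const-generic channel count (a stand-in for illustration, not the actual kornia-rs `Image`):

```rust
/// Hypothetical stand-in for an image whose channel count `C` is fixed at compile time.
#[derive(Debug)]
pub struct Img<T, const C: usize> {
    pub data: Vec<T>,
    pub height: usize,
    pub width: usize,
}

impl<T, const C: usize> Img<T, C> {
    /// Builds an image, rejecting buffers whose length is not height * width * C.
    pub fn new(data: Vec<T>, height: usize, width: usize) -> Result<Self, String> {
        let expected = height * width * C;
        if data.len() != expected {
            return Err(format!(
                "Data length ({}) does not match the image size ({})",
                data.len(),
                expected
            ));
        }
        Ok(Self { data, height, width })
    }
}
```

With `C` fixed at 3, a single-channel buffer fails in exactly this way. Supporting other channel counts from Python would presumably mean inspecting the array's last dimension before choosing `C`.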

Currently my warp_affine Python function also only works for 3-channel images, since the compiler forces me to add a type annotation when converting PyImage -> Image.

    let image: Image<u8, 3> = Image::from_pyimage(image)
        .map_err(|e| PyErr::new::<pyo3::exceptions::PyException, _>(format!("{}", e)))?;

I think this is caused by the image type casting, since your resize function doesn't need an annotation? (I need to cast the image from u8 to f32 since our native interpolation only works for f32.)

  2. Continuing from above, the native interpolation should work for both u8 and f32 (and potentially other numeric types like u16?). To make this possible, we can create a trait that casts to f32 (when reading a value from the array) and casts back to the original type (when writing a value to the array). I looked at some existing crates for numeric traits, but none seems to have this specific trait, so we can create our own. There are also From and Into (need to check their overflow/underflow behavior).

  3. For resizing, it seems like your logic is similar to PyTorch's align_corners=True? I think my implementation of warp affine is not quite align_corners=True. I need to look into this more and compare against other implementations.
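
The cast-to-f32-and-back trait proposed in the list above could look roughly like this. `PixelCast` and `lerp` are hypothetical names for illustration, not part of kornia-rs. On the `From`/`Into` question: the standard library provides `From<u8> for f32` and `From<u16> for f32`, but no `From<f32> for u8`, so the return trip needs an explicit policy. This sketch rounds and relies on `as`, which saturates on float-to-integer casts.

```rust
/// Hypothetical trait: a pixel type that can round-trip through f32.
pub trait PixelCast: Copy {
    fn to_f32(self) -> f32;
    fn from_f32(v: f32) -> Self;
}

impl PixelCast for u8 {
    fn to_f32(self) -> f32 {
        f32::from(self) // lossless widening
    }
    fn from_f32(v: f32) -> Self {
        // Float-to-int `as` casts saturate at the type bounds and map NaN to 0.
        v.round() as u8
    }
}

impl PixelCast for u16 {
    fn to_f32(self) -> f32 {
        f32::from(self)
    }
    fn from_f32(v: f32) -> Self {
        v.round() as u16
    }
}

impl PixelCast for f32 {
    fn to_f32(self) -> f32 {
        self
    }
    fn from_f32(v: f32) -> Self {
        v
    }
}

/// Linear interpolation computed in f32 and returned in the original pixel type.
pub fn lerp<T: PixelCast>(a: T, b: T, t: f32) -> T {
    let (a, b) = (a.to_f32(), b.to_f32());
    T::from_f32(a + (b - a) * t)
}
```

An interpolation kernel written against `T: PixelCast` would then serve u8, u16 and f32 images without a separate cast step at the call site.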

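For the align_corners question in the list above, the two conventions differ only in how an output index is mapped to a source coordinate. The sketch below follows the formulas commonly documented for PyTorch's interpolation; `src_coord` is a hypothetical helper, and whether either branch matches the resize or warp code in this PR is not established here.

```rust
/// Maps an output index to a fractional source coordinate along one axis.
pub fn src_coord(dst: usize, in_size: usize, out_size: usize, align_corners: bool) -> f32 {
    let (d, n_in, n_out) = (dst as f32, in_size as f32, out_size as f32);
    if align_corners {
        // The first and last samples of input and output coincide.
        if out_size > 1 {
            d * (n_in - 1.0) / (n_out - 1.0)
        } else {
            0.0
        }
    } else {
        // Pixels are treated as unit cells; centers are aligned via a half-pixel shift.
        (d + 0.5) * n_in / n_out - 0.5
    }
}
```

When upsampling 4 pixels to 8, output index 7 maps to 3.0 with `align_corners = true` and to 3.25 with `false`, and index 0 maps to 0.0 and -0.25 respectively, so code using the second convention usually clamps at the border.
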
As a future extension, we can implement an AffineTransform struct so that computations with it become easier (e.g. inverting, composing multiple affine transforms, extracting rotation/scale parameters). Right now I use a tuple of 6 numbers so that the Python binding doesn't need to depend on the ndarray crate.
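
A rough sketch of what such a struct might offer, assuming the six numbers are laid out row-major as [a, b, tx, c, d, ty], as in the tuples used in the example further down. The type and its methods are hypothetical and not part of this PR.

```rust
/// Hypothetical 2x3 affine transform stored row-major: [a, b, tx, c, d, ty].
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AffineTransform {
    pub m: [f32; 6],
}

impl AffineTransform {
    /// Builds the transform from the 6-tuple form used by the Python binding.
    pub fn from_tuple(t: (f32, f32, f32, f32, f32, f32)) -> Self {
        Self { m: [t.0, t.1, t.2, t.3, t.4, t.5] }
    }

    /// Applies the transform to a point (x, y).
    pub fn apply(&self, x: f32, y: f32) -> (f32, f32) {
        let [a, b, tx, c, d, ty] = self.m;
        (a * x + b * y + tx, c * x + d * y + ty)
    }

    /// Returns the inverse, or None if the 2x2 linear part is singular.
    pub fn invert(&self) -> Option<Self> {
        let [a, b, tx, c, d, ty] = self.m;
        let det = a * d - b * c;
        if det == 0.0 {
            return None;
        }
        let (ia, ib, ic, id) = (d / det, -b / det, -c / det, a / det);
        // Inverse translation is -(A^-1 * t).
        Some(Self {
            m: [ia, ib, -(ia * tx + ib * ty), ic, id, -(ic * tx + id * ty)],
        })
    }

    /// Composition: the result applies `other` first, then `self`.
    pub fn compose(&self, other: &Self) -> Self {
        let [a, b, tx, c, d, ty] = self.m;
        let [e, f, ux, g, h, uy] = other.m;
        Self {
            m: [
                a * e + b * g, a * f + b * h, a * ux + b * uy + tx,
                c * e + d * g, c * f + d * h, c * ux + d * uy + ty,
            ],
        }
    }
}
```

Keeping a `from_tuple` constructor would let the Python binding continue to pass a plain tuple of 6 numbers.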

Some outputs

import kornia_rs as K
import cv2

import numpy as np

img_path = "../tests/data/dog.jpeg"
img: np.ndarray = K.read_image_jpeg(img_path)

# check the image properties
assert img.shape == (195, 258, 3)

matrix = (1.0, 0.0, -20.0, 0.0, 1.0, 10.0)  # translation
# matrix = (2.0, 0.0, 0.0, 0.0, 2.0, 0.0)  # scale from (0, 0)
# matrix = tuple(cv2.getRotationMatrix2D((258 / 2, 195 / 2), 45.0, 1.0).flatten())  # rotation around center

img_resized = K.warp_affine(img, matrix, (195, 258), "bilinear")

[3 output images]
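
What a bilinear warp has to do for each output pixel can be sketched as follows: map the output coordinate to a source coordinate, then blend the four surrounding source pixels. This sketch assumes a single-channel f32 image in row-major order, treats integer coordinates as pixel centers, fills with 0 outside the image, and takes a matrix that already maps output to source. None of these choices is confirmed to match the implementation in this PR, and the function names are hypothetical.

```rust
/// Bilinearly samples a row-major, single-channel image at (x, y).
/// Assumes height >= 1 and width >= 1; returns 0.0 outside the image.
pub fn bilinear_sample(img: &[f32], height: usize, width: usize, x: f32, y: f32) -> f32 {
    if x < 0.0 || y < 0.0 || x > (width - 1) as f32 || y > (height - 1) as f32 {
        return 0.0;
    }
    let (x0, y0) = (x.floor() as usize, y.floor() as usize);
    // Clamp the right/bottom neighbors so sampling on the last row or column stays in bounds.
    let (x1, y1) = ((x0 + 1).min(width - 1), (y0 + 1).min(height - 1));
    let (fx, fy) = (x - x0 as f32, y - y0 as f32);
    let at = |r: usize, c: usize| img[r * width + c];
    let top = at(y0, x0) * (1.0 - fx) + at(y0, x1) * fx;
    let bottom = at(y1, x0) * (1.0 - fx) + at(y1, x1) * fx;
    top * (1.0 - fy) + bottom * fy
}

/// Warps `src` into an (out_h, out_w) image. `inv` = [a, b, tx, c, d, ty]
/// maps each output pixel (x, y) to source coordinates.
pub fn warp_affine_sketch(
    src: &[f32],
    height: usize,
    width: usize,
    inv: [f32; 6],
    out_h: usize,
    out_w: usize,
) -> Vec<f32> {
    let mut dst = vec![0.0; out_h * out_w];
    for v in 0..out_h {
        for u in 0..out_w {
            let (x, y) = (u as f32, v as f32);
            let sx = inv[0] * x + inv[1] * y + inv[2];
            let sy = inv[3] * x + inv[4] * y + inv[5];
            dst[v * out_w + u] = bilinear_sample(src, height, width, sx, sy);
        }
    }
    dst
}
```

Whether the six-number matrix passed from Python is meant as the forward or the inverse map is not stated in this thread. For comparison, OpenCV's `warpAffine` takes the forward map by default and inverts it internally unless `WARP_INVERSE_MAP` is set.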

Member

@edgarriba edgarriba left a comment

Can you add a benchmark for both Python and Rust? I'm measuring pretty much everything I put here.

@gau-nernst
Contributor Author

Added a benchmark. Initial results are quite positive considering how naive the implementation is. (I ran maturin develop -r to install the "release" build.)

OpenCV: 0.22 ms
PIL: 0.61 ms
Kornia: 0.50 ms

This comparison might not be fair to OpenCV and PIL, since ndarray::Zip uses multi-threading via rayon, while I think OpenCV and PIL don't. For an image processing pipeline, single-thread performance might matter more, since multiple processes will be running concurrently.

Comment on lines 19 to 20
width: new_size.0,
height: new_size.1,
Member

It should be (height, width).
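
A minimal sketch of the ordering this comment asks for, assuming `new_size` arrives as a (height, width) tuple, which matches the `(195, 258)` passed for the 195 × 258 image in the example above. `Size2d` and `size_from_hw` are hypothetical stand-ins that only mirror the two fields in the quoted snippet.

```rust
/// Hypothetical size type mirroring the fields in the reviewed snippet.
#[derive(Debug, PartialEq)]
pub struct Size2d {
    pub width: usize,
    pub height: usize,
}

/// Builds a size from a (height, width) tuple, the order NumPy shapes use.
pub fn size_from_hw(new_size: (usize, usize)) -> Size2d {
    Size2d {
        height: new_size.0,
        width: new_size.1,
    }
}
```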

@edgarriba
Member

@gau-nernst for benchmarks, use RUSTFLAGS="-C target-cpu=native" cargo bench read_jpeg so the optimiser can do a better job. Regarding parallelism, I'm still exploring what the best strategy is; I mostly use the lib from Rust, where that's not an issue for me. I think OpenCV uses OpenCL by default, which is one way to implement parallelism. In the background I'm also exploring a safe multithreaded PyTorch dataloader for Rust that will be exposed to Python. But I'm happy to investigate and put numbers on it to decide the best strategy; maybe we need different backend implementations depending on the use case.

@edgarriba
Member

In the long run, I think that for speed we'll implement the algorithms in CUDA once it is officially supported in Rust.

@edgarriba edgarriba merged commit 1b75be2 into kornia:main Apr 2, 2024
8 of 9 checks passed
@gau-nernst gau-nernst deleted the warp_affine branch April 2, 2024 22:54
2 participants