Generic media support #539

Merged
merged 68 commits into from
Mar 9, 2022

@zhiltsov-max zhiltsov-max commented Nov 6, 2021

Summary

Depends on #538
Related #675

  • Added DatasetItem.media to replace dedicated members for each media type
  • Added the PointCloud media type
  • Added the media_type() method to Extractors
  • Added merging for all media types. Mixed media types within an item or within a dataset produce an error
  • Datasets can't have mixed media types in items. If this happens, an error is raised (checked during dataset caching/iteration)
  • Datasets can't change their media type through transforms
  • Extractors must report their media type with the media_type() method
  • Added a new media_type argument to Dataset.from_iterable. It is meant to be mandatory, but has a default value of Image for the transition period (to be tracked in Support different media types #675).
  • Deprecated DatasetItem.image, .related_images, .point_cloud, save-images and require_images
  • Added deprecation messages about annotation classes in components.extractor
  • Suppressed Datumaro deprecation messages when using Datumaro from CLI
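
The rule that a dataset holds a single media type, checked during iteration, can be sketched roughly as below. This is a minimal illustration and not Datumaro's implementation: `MediaElement`, `Image`, `PointCloud`, `Item`, `MediaTypeError` and `iter_checked` are hypothetical stand-ins defined here, and allowing items without media is an assumption.

```python
from typing import Iterable, Iterator, Optional, Type


class MediaElement:
    """Stand-in base class for any media payload (hypothetical)."""


class Image(MediaElement):
    pass


class PointCloud(MediaElement):
    pass


class MediaTypeError(Exception):
    """Stand-in for the error raised on a media type mismatch."""


class Item:
    """Minimal stand-in for a dataset item with a single `media` member."""

    def __init__(self, id: str, media: Optional[MediaElement] = None):
        self.id = id
        self.media = media


def iter_checked(
    items: Iterable[Item], media_type: Type[MediaElement] = Image
) -> Iterator[Item]:
    """Yield items lazily, raising as soon as one has an unexpected media type."""
    for item in items:
        # Assumption: items without media are allowed.
        # Subclasses of media_type are accepted by isinstance().
        if item.media is not None and not isinstance(item.media, media_type):
            raise MediaTypeError(
                f"Item {item.id!r}: expected {media_type.__name__}, "
                f"got {type(item.media).__name__}"
            )
        yield item


# A uniform dataset passes the check.
uniform = list(iter_checked([Item("a", Image()), Item("b", Image())]))

# A mixed dataset fails once the offending item is reached.
try:
    list(iter_checked([Item("a", Image()), Item("c", PointCloud())]))
    mixed_rejected = False
except MediaTypeError:
    mixed_rejected = True
```

The default of `Image` for `media_type` mirrors the transition-period default described for Dataset.from_iterable; callers with other media would pass the type explicitly.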

TODO:

  • --save-media in CLI
  • [ ] Media type deduction by path (do we need it at all?)
  • Update formats to the new interface
  • Questions with Dataset domain
    • Is it better to differentiate datasets by domain or media type? By media type
    • Must dataset items use the same media type? Yes
    • How to avoid calling transforms with unsupported media types? Transforms and converters must check input media type
    • How to avoid exporting in formats for datasets with unsupported media types?
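
The answer "transforms and converters must check input media type" could look roughly like the sketch below. All names here (`Source`, `ImageOnlyTransform`, `SUPPORTED_MEDIA`) are hypothetical and only illustrate the idea. The only element taken from this PR is that an extractor-like object reports its type through a media_type() method.

```python
from typing import Iterable, Iterator, Type


class MediaElement:
    """Stand-in base class for media payloads (hypothetical)."""


class Image(MediaElement):
    pass


class PointCloud(MediaElement):
    pass


class MediaTypeError(Exception):
    """Stand-in for the error raised on an unsupported media type."""


class Source:
    """Hypothetical extractor-like object that reports its media type."""

    def __init__(self, media_type: Type[MediaElement], items: Iterable = ()):
        self._media_type = media_type
        self._items = list(items)

    def media_type(self) -> Type[MediaElement]:
        return self._media_type

    def __iter__(self) -> Iterator:
        return iter(self._items)


class ImageOnlyTransform:
    """Hypothetical transform that supports image datasets only."""

    SUPPORTED_MEDIA = Image

    def __init__(self, source: Source):
        # Reject unsupported input up front, before any item is processed.
        if not issubclass(source.media_type(), self.SUPPORTED_MEDIA):
            raise MediaTypeError(
                f"{type(self).__name__} supports {self.SUPPORTED_MEDIA.__name__} "
                f"datasets, got {source.media_type().__name__}"
            )
        self._source = source

    def media_type(self) -> Type[MediaElement]:
        # A transform does not change the media type of its input.
        return self._source.media_type()

    def __iter__(self) -> Iterator:
        return iter(self._source)


accepted = ImageOnlyTransform(Source(Image))

try:
    ImageOnlyTransform(Source(PointCloud))
    rejected = False
except MediaTypeError:
    rejected = True
```

Checking in the constructor means a mismatch surfaces when the pipeline is built instead of partway through iteration. The same pattern would apply to a converter checking its input before export.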

How to test

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

@zhiltsov-max zhiltsov-max changed the title [WIP] [Dependent] Generic media support [WIP] Generic media support Nov 10, 2021
@yasakova-anastasia yasakova-anastasia changed the title [WIP] Generic media support Generic media support Feb 14, 2022
@yasakova-anastasia

@zhiltsov-max, I think the PR is ready for review.

@sizov-kirill sizov-kirill left a comment

These approaches are no longer interchangeable: if a user imports a dataset with the video_frames format, it can't be exported to other formats that use the Image media type. I think we should either mention this in the docs or allow the VideoFrame media type for Image formats.

@zhiltsov-max

@kirill-sizov, VideoFrame is a subclass of Image, so it should work.

sizov-kirill commented Feb 22, 2022

@kirill-sizov, VideoFrame is subclass of Image, so it should be working.

@zhiltsov-max, I meant that, for example, the following code raises an error:

from datumaro.components.dataset import Dataset

dataset = Dataset.import_from('video.mp4', format='video_frames')
dataset.export('./save_dir', 'voc', save_media=True)

datumaro.components.errors.MediaTypeError: Media type is not an image
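
For illustration only, here is one generic way a check can reject a subclass even though "VideoFrame is a subclass of Image" holds. This is not a diagnosis of the actual Datumaro code path behind the error above. The classes and both helper functions are hypothetical stand-ins.

```python
class Image:
    """Stand-in for an image media type (hypothetical)."""


class VideoFrame(Image):
    """Stand-in for a video frame, modelled as a subclass of Image."""


def is_image_exact(media: object) -> bool:
    # Exact type comparison: subclasses such as VideoFrame are rejected.
    return type(media) is Image


def is_image_subclass_aware(media: object) -> bool:
    # isinstance() accepts Image and any of its subclasses.
    return isinstance(media, Image)


frame = VideoFrame()
exact_result = is_image_exact(frame)
aware_result = is_image_subclass_aware(frame)
```

With these stand-ins, `exact_result` is `False` and `aware_result` is `True`, so an exporter using the first style of check would refuse video frames while one using the second would accept them.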

@zhiltsov-max zhiltsov-max merged commit b7d83d5 into develop Mar 9, 2022
@zhiltsov-max zhiltsov-max deleted the zm/generic-media branch March 18, 2022 19:46