MLOps-essi-upc/MLOps-TeamBeans

Dataset card metadata

  • annotations_creators: expert-generated
  • language_creators: expert-generated
  • language: en
  • license: mit
  • multilinguality: monolingual
  • size_categories: 1K<n<10K
  • source_datasets: original
  • task_categories: image-classification
  • task_ids: multi-class-image-classification
  • pretty_name: Beans
  • dataset_info:
      • features:
          • image_file_path (dtype: string)
          • image (dtype: image)
          • labels (dtype: class_label; names: 0 = angular_leaf_spot, 1 = bean_rust, 2 = healthy)
      • splits:
          • train: num_bytes 382110, num_examples 1034
          • validation: num_bytes 49711, num_examples 133
          • test: num_bytes 46584, num_examples 128
      • download_size: 180024906
      • dataset_size: 478405

Dataset Description

Dataset Summary

Bean leaf dataset with images of diseased and healthy leaves. Each image is 500 x 500 RGB. The dataset is balanced across classes. There are 3 classes, 2 of them being diseased leaves and one being healthy:

  • Angular Leaf Spot, which is a bacterial disease caused by Pseudomonas syringae pv. lachrymans.
  • Bean Rust, which is caused by Uromyces phaseoli typica.
  • Healthy

Supported Tasks and Leaderboards

  • image-classification: Given a leaf image, the goal of this task is to predict the disease type (Angular Leaf Spot or Bean Rust), if any.

Languages

English

Dataset Structure

Data Instances

A sample from the training set is provided below:

{
    'image_file_path': '/root/.cache/huggingface/datasets/downloads/extracted/0aaa78294d4bf5114f58547e48d91b7826649919505379a167decb629aa92b0a/train/bean_rust/bean_rust_train.109.jpg',
    'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x500 at 0x16BAA72A4A8>,
    'labels': 1
}
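
The sample path above ends in `train/bean_rust/bean_rust_train.109.jpg`, which suggests that the split and class name appear as the last two directory components. The following is a minimal sketch under that assumption; the layout is inferred from this single sample and may not hold for every file, and `split_and_class_from_path` is a hypothetical helper name, not part of any library.

```python
from pathlib import PurePosixPath


def split_and_class_from_path(image_file_path):
    """Return (split, class_name), assuming a .../<split>/<class>/<file> layout."""
    parts = PurePosixPath(image_file_path).parts
    if len(parts) < 3:
        raise ValueError(f"path too short to contain split and class: {image_file_path!r}")
    # The file name is last; the class directory and split directory precede it.
    return parts[-3], parts[-2]


# Path copied from the sample instance shown above.
sample_path = (
    "/root/.cache/huggingface/datasets/downloads/extracted/"
    "0aaa78294d4bf5114f58547e48d91b7826649919505379a167decb629aa92b0a/"
    "train/bean_rust/bean_rust_train.109.jpg"
)
result = split_and_class_from_path(sample_path)
```

For the sample shown, the class directory name agrees with the integer in the `labels` field once it is looked up in the class label mapping.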

Data Fields

The data instances have the following fields:

  • image_file_path: a string filepath to an image.
  • image: a PIL.Image.Image object containing the image. Accessing the image column, e.g. dataset[0]["image"], automatically decodes the image file. Decoding a large number of image files can take a significant amount of time, so query the sample index before the "image" column: dataset[0]["image"] should always be preferred over dataset["image"][0].
  • labels: an int classification label.
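
The access-order advice for the image field can be illustrated with a toy model. The class below is a hypothetical stand-in that only mimics decode-on-access as described above; it is not the real `datasets` API, and the "decoding" is a placeholder string operation. It counts how many decodes each access pattern triggers on a dataset the size of the train split.

```python
class ToyImageDataset:
    """Hypothetical stand-in that decodes images only when they are accessed."""

    def __init__(self, paths):
        self._paths = list(paths)
        self.decode_calls = 0

    def _decode(self, path):
        # Placeholder for real image decoding; only counts the call.
        self.decode_calls += 1
        return f"<decoded {path}>"

    def __getitem__(self, key):
        if isinstance(key, int):
            # Row access: decode only the requested sample.
            return {"image": self._decode(self._paths[key])}
        if key == "image":
            # Column access: decode every sample in the dataset.
            return [self._decode(p) for p in self._paths]
        raise KeyError(key)


toy = ToyImageDataset(f"img_{i}.jpg" for i in range(1034))

_ = toy[0]["image"]          # index first, then column
row_first_cost = toy.decode_calls

toy.decode_calls = 0
_ = toy["image"][0]          # column first, then index
column_first_cost = toy.decode_calls
```

In this model, indexing the row first decodes a single image, while indexing the column first decodes all 1034 before one is selected.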

Class Label Mappings:

{
  "angular_leaf_spot": 0,
  "bean_rust": 1,
  "healthy": 2,
}
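
Model predictions arrive as integer ids, so the reverse lookup is often needed as well. The sketch below inverts the mapping above; the names `LABEL2ID`, `ID2LABEL`, and `label_name` are illustrative choices and not part of the dataset.

```python
# Mapping copied from the class label table above.
LABEL2ID = {
    "angular_leaf_spot": 0,
    "bean_rust": 1,
    "healthy": 2,
}

# Inverse mapping: integer id -> class name.
ID2LABEL = {label_id: name for name, label_id in LABEL2ID.items()}


def label_name(label_id):
    """Return the class name for an integer label, raising on unknown ids."""
    try:
        return ID2LABEL[label_id]
    except KeyError:
        raise ValueError(f"unknown label id: {label_id!r}") from None
```

For example, `label_name(1)` returns `"bean_rust"`, matching the `'labels': 1` value in the sample instance.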

Data Splits

                 train   validation   test
# of examples    1034    133          128
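
As a quick arithmetic check on the split sizes, the share of each split can be computed from the counts above. This is a small sketch; the variable names are illustrative.

```python
# Example counts copied from the split table above.
SPLIT_SIZES = {"train": 1034, "validation": 133, "test": 128}

total_examples = sum(SPLIT_SIZES.values())

# Fraction of the full dataset that falls in each split.
split_fractions = {
    name: count / total_examples for name, count in SPLIT_SIZES.items()
}

for name, fraction in split_fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} examples ({fraction:.1%})")
```

The counts sum to 1295 examples, which works out to roughly an 80/10/10 split.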

Source Data

The data was sourced from the Hugging Face dataset repository (https://huggingface.co/datasets/beans).

Data Author

@ONLINE {beansdata,
    author="Makerere AI Lab",
    title="Bean disease dataset",
    month="January",
    year="2020",
    url="https://github.com/AI-Lab-Makerere/ibean/"
}
