Implementation of a system for the detection and classification of museum works.
Starting from a dataset of 22 images of Etruscan vases, the images were divided into 5 classes based on the type of vase. Four different transformation pipelines were then applied to each image using the torchvision package of the PyTorch library, resulting in 88 augmented images with dimensions 463x463x3:
Mode 1 | Mode 2 | Mode 3 | Mode 4 |
---|---|---|---|
Resize, CenterCrop, ToTensor | RandomRotation, RandomResizedCrop, RandomHorizontalFlip, ToTensor | Resize, RandomCrop, RandomVerticalFlip, GaussianBlur, ToTensor | Resize, CenterCrop, ColorJitter, ToTensor |
For details of my work, see: Data Augmentation and thesis
The dataset was then annotated with the Yolo_mark tool by drawing bounding boxes around regions of each image and assigning a class label to each box. The annotations of each image are saved in a text file.
The file may be empty or it may contain one or more annotations, one per line. Each annotation has the form 'ID X Y WIDTH HEIGHT', with coordinates and sizes expressed relative to the image dimensions, where:
- ID: indicates the identifier of the object's class. In our case, five classes were defined, with IDs from 0 to 4;
- X: indicates the X co-ordinate of the centre of the object;
- Y: indicates the Y co-ordinate of the centre of the object;
- WIDTH: indicates the width of the object;
- HEIGHT: indicates the height of the object.
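As a minimal sketch of how such a file can be read, the parser below turns the contents of one annotation file into a list of boxes. The names `YoloBox` and `parse_yolo_annotations` are hypothetical, and the range checks assume the usual YOLO convention that the four geometric values are normalised to the interval [0, 1].

```python
from dataclasses import dataclass

NUM_CLASSES = 5  # class IDs 0..4, as defined above


@dataclass(frozen=True)
class YoloBox:
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_annotations(text, num_classes=NUM_CLASSES):
    """Parse the contents of one annotation file into a list of YoloBox."""
    boxes = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:  # blank lines are skipped, so an empty file yields []
            continue
        if len(fields) != 5:
            raise ValueError(f"line {line_no}: expected 5 fields, got {len(fields)}")
        class_id = int(fields[0])
        x, y, w, h = (float(v) for v in fields[1:])
        if not 0 <= class_id < num_classes:
            raise ValueError(f"line {line_no}: class ID {class_id} out of range")
        # Assumption: values are normalised by the image width and height.
        if not all(0.0 <= v <= 1.0 for v in (x, y, w, h)):
            raise ValueError(f"line {line_no}: values must lie in [0, 1]")
        boxes.append(YoloBox(class_id, x, y, w, h))
    return boxes


sample = "0 0.5 0.5 0.25 0.5\n4 0.1 0.2 0.3 0.4\n"
for box in parse_yolo_annotations(sample):
    print(box)
```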
For details of my work, see: Dataset YOLO format and thesis
Once the dataset was obtained, k-fold cross-validation was applied to estimate the accuracy achieved on it. The algorithm was implemented with two libraries:
- Scikit-learn, used to split the dataset into folds. We chose k=5, so that in each round 80% of the images (163) form the training set, while the remaining 20% (41) form the testing set;
- PyTorch, used to train the model on the training set for 500 epochs. At the end of the 500 epochs, the accuracy was evaluated on the testing set, which contains data not seen during training.
Training and evaluation are repeated once per fold, each time holding out a different 20% of the dataset for testing and training on the remaining 80%. With this procedure, the accuracy obtained is 99%.
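The fold rotation described above can be sketched with Scikit-learn's `KFold`. The helper `cross_validate` and the stand-in `record_sizes` are hypothetical; shuffling and the seed are assumptions, since the text only fixes k=5. The actual PyTorch training loop is replaced by a placeholder so that the example only shows how the indices are split.

```python
import numpy as np
from sklearn.model_selection import KFold

N_IMAGES = 204  # 163 training + 41 testing images, as stated above


def cross_validate(n_samples, evaluate_fold, n_splits=5, seed=0):
    """Call evaluate_fold(train_idx, test_idx) once per fold and average the scores."""
    # shuffle=True and the seed are assumptions, not settings taken from the text.
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in kfold.split(np.arange(n_samples)):
        scores.append(evaluate_fold(train_idx, test_idx))
    return float(np.mean(scores))


fold_sizes = []


def record_sizes(train_idx, test_idx):
    """Hypothetical stand-in for 'train for 500 epochs, then return test accuracy'."""
    fold_sizes.append((len(train_idx), len(test_idx)))
    return 0.0  # placeholder score, not a real accuracy


cross_validate(N_IMAGES, record_sizes)
print(fold_sizes)
```

Because 204 is not divisible by 5, four folds hold out 41 images and train on 163, while the last fold holds out 40 and trains on 164.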
For details of my work, see: K-fold train Validation and thesis
Detectron2 provides built-in support for datasets in COCO format, so the labels were converted from YOLO TXT format to COCO JSON format via Roboflow. The dataset in COCO format was then divided for training with Detectron2 as follows:
Set | Number of Images |
---|---|
Training Set | 145 |
Validation Set | 34 |
Testing Set | 25 |
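The conversion itself was done with Roboflow, but the underlying arithmetic is simple enough to sketch: a YOLO box stores a normalised centre and size, while a COCO box stores `[x_min, y_min, width, height]` in pixels. The function names below are hypothetical, and the sketch assumes the COCO `category_id` is kept equal to the YOLO class ID, which a real converter may renumber.

```python
def yolo_to_coco_bbox(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalised YOLO box to COCO [x_min, y_min, width, height] in pixels."""
    box_w = width * img_w
    box_h = height * img_h
    x_min = x_center * img_w - box_w / 2
    y_min = y_center * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


def make_coco_annotation(ann_id, image_id, yolo_line, img_w, img_h):
    """Build one entry of the COCO 'annotations' list from one YOLO text line."""
    class_id, *values = yolo_line.split()
    bbox = yolo_to_coco_bbox(*map(float, values), img_w, img_h)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": int(class_id),  # assumption: IDs are not renumbered
        "bbox": bbox,
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }


annotation = make_coco_annotation(1, 7, "2 0.5 0.5 0.25 0.5", img_w=400, img_h=200)
print(annotation)
```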
For details of my work, see: Dataset COCO format and thesis
From the Detectron2 Model Zoo repository, the Faster R-CNN R-50 FPN 3x model, pre-trained on the COCO dataset, was chosen together with its architecture. Its backbone consists of a 50-layer ResNet with an FPN (Feature Pyramid Network) for feature extraction.
Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | model id |
---|---|---|---|---|---|---|
R50-FPN | 3x | 0.209 | 0.038 | 3.0 | 40.2 | 137849458 |
For details of my work, see: thesis
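```python
# Train configuration using the Detectron2 library
import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator


# COCOEvaluator is an evaluator, not a trainer, so it cannot be trained directly.
# Assumption: a DefaultTrainer subclass that builds a COCOEvaluator was intended.
class CocoTrainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        if output_folder is None:
            output_folder = os.path.join(cfg.OUTPUT_DIR, "inference")
        return COCOEvaluator(dataset_name, cfg, False, output_folder)


cfg = get_cfg()
# Start from the Faster R-CNN R-50 FPN 3x configuration and its COCO weights
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)
cfg.DATALOADER.NUM_WORKERS = 4
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 4
cfg.SOLVER.BASE_LR = 0.001
cfg.SOLVER.MAX_ITER = 500
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5  # the five vase classes
cfg.TEST.EVAL_PERIOD = 100  # evaluate on the validation set every 100 iterations

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = CocoTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```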
For details of my work, see: DatasetVasi_maxiter500_cocoevaluator and thesis
```python
# Test configuration using the Detectron2 library
import os

from detectron2.data import build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# Reuses `cfg` and `trainer` from the training step.
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")  # weights saved at the end of training
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.85  # score threshold used by the predictor
predictor = DefaultPredictor(cfg)

# Compute the COCO metrics on the testing set
evaluator = COCOEvaluator("my_dataset_test", cfg, False, output_dir="./output/")
val_loader = build_detection_test_loader(cfg, "my_dataset_test")
inference_on_dataset(trainer.model, val_loader, evaluator)
```
For details of my work, see: DatasetVasi_maxiter500_cocoevaluator and thesis
The model was evaluated using the COCO metric "AP at IoU=.50:.05:.95", i.e. the average precision of the predicted bounding boxes averaged over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The results were:
- Train: AP = 79.907%
- Test: AP = 81.760%
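To make the metric more concrete, the sketch below computes the IoU (intersection over union) of two boxes and lists the ten thresholds over which the COCO AP is averaged. It is only an illustration of the quantities involved, not the COCOEvaluator implementation; the function names and the corner-based box layout `(x_min, y_min, x_max, y_max)` are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# The thresholds 0.50, 0.55, ..., 0.95 behind "IoU=.50:.05:.95"
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(predicted, ground_truth):
    """Count the thresholds at which a prediction counts as a correct detection."""
    overlap = iou(predicted, ground_truth)
    return sum(overlap >= t for t in IOU_THRESHOLDS)


ground_truth = (0.0, 0.0, 10.0, 10.0)
predicted = (0.0, 0.0, 10.0, 7.2)
print(iou(predicted, ground_truth), thresholds_matched(predicted, ground_truth))
```

A prediction that overlaps the ground truth loosely is counted as correct only at the lower thresholds, which is why this AP is stricter than AP at a single IoU of 0.50.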
For details of my work, see: DatasetVasi_maxiter500_cocoevaluator and thesis