
minkyu-choi07/Temporal-Logic-Video-Dataset

 
 


Temporal Logic Video (TLV) Dataset


Synthetic and real video dataset with temporal logic annotation

NSVS-TL Project Webpage · NSVS-TL Source Code

Overview

The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:

  1. Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
  2. Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.


Dataset Composition

Synthetic Datasets

  • Source: COCO and ImageNet
  • Purpose: Introduce artificial Temporal Logic specifications
  • Generation Method: Image stitching from static datasets

Real-world Datasets

  • Sources: NuScenes and Waymo
  • Purpose: Provide real-world autonomous vehicle scenarios
  • Annotation: Temporal Logic specifications added to existing data

Dataset

Although we provide source code to generate datasets from different types of data sources, we also release dataset v1 as a proof of concept.

Dataset Structure

The v1 data is offered as serialized objects (.pkl files), each containing a set of frames with annotations. You can download it from our dataset repository on Hugging Face.

File Naming Convention

<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl
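
The naming convention can be split back into its fields with a regular expression. The following is a minimal sketch, not part of the repository: `parse_tlv_filename` and `TLVFileName` are hypothetical names, and the characters allowed in each field are an assumption.

```python
import re
from typing import NamedTuple

# Regex mirroring the naming convention above. Which characters each
# field may contain is an assumption made for this sketch.
_TLV_NAME = re.compile(
    r"^(?P<tlv_data_type>.+?):source:(?P<datasource>.+?)"
    r"-number_of_frames:(?P<number_of_frames>\d+)"
    r"-(?P<uuid>.+)\.pkl$"
)


class TLVFileName(NamedTuple):
    tlv_data_type: str
    datasource: str
    number_of_frames: int
    uuid: str


def parse_tlv_filename(name: str) -> TLVFileName:
    """Split a TLV file name into its components (hypothetical helper)."""
    match = _TLV_NAME.match(name)
    if match is None:
        raise ValueError(f"not a TLV file name: {name!r}")
    return TLVFileName(
        tlv_data_type=match["tlv_data_type"],
        datasource=match["datasource"],
        number_of_frames=int(match["number_of_frames"]),
        uuid=match["uuid"],
    )
```

Parsing the name first makes it possible to filter files by source or frame count without unpickling them.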

Object Attributes

Each serialized object contains the following attributes:

  • ground_truth: Boolean indicating whether the dataset contains ground truth labels
  • ltl_formula: Temporal logic formula applied to the dataset
  • proposition: Set of propositions used in ltl_formula
  • number_of_frame: Total number of frames in the dataset
  • frames_of_interest: Frames of interest which satisfy the ltl_formula
  • labels_of_frames: Labels for each frame
  • images_of_frames: Image data for each frame
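
Loading one file and reading these attributes might look like the sketch below. It assumes the `.pkl` files are standard Python pickles; whether each object is a dictionary or a class instance is not stated here, so the hypothetical helper `get_attribute` handles both. If the objects are class instances, the defining package must be importable before unpickling.

```python
import pickle
from pathlib import Path
from typing import Any, Dict, Union

# Attribute names listed in the section above.
TLV_ATTRIBUTES = (
    "ground_truth",
    "ltl_formula",
    "proposition",
    "number_of_frame",
    "frames_of_interest",
    "labels_of_frames",
    "images_of_frames",
)


def load_tlv_object(path: Union[str, Path]) -> Any:
    """Unpickle one TLV file. Only unpickle files from a source you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def get_attribute(obj: Any, name: str) -> Any:
    """Read an attribute from either a dict or a class instance.

    Which of the two the released files use is left open in this sketch.
    """
    if isinstance(obj, dict):
        return obj[name]
    return getattr(obj, name)


def summarize(obj: Any) -> Dict[str, Any]:
    """Collect every listed attribute except the bulky image data."""
    return {
        name: get_attribute(obj, name)
        for name in TLV_ATTRIBUTES
        if name != "images_of_frames"
    }
```

Calling `summarize(load_tlv_object(path))` gives a quick view of the formula, propositions, and frame labels without printing the images.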

You can download the dataset from here. The structure of the dataset is as follows:

tlv-dataset-v1/
├── tlv_real_dataset/
├──── prop1Uprop2/
├──── (prop1&prop2)Uprop3/
├── tlv_synthetic_dataset/
├──── Fprop1/
├──── Gprop1/
├──── prop1&prop2/
├──── prop1Uprop2/
└──── (prop1&prop2)Uprop3/
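
The folder names are LTL formulas built from F (eventually), G (always), & (and), and U (until). As an illustration of what these operators mean over a sequence of frames, the sketch below evaluates them with standard finite-trace semantics on per-frame label sets. All names are hypothetical, and the way the dataset tooling itself evaluates formulas may differ.

```python
from typing import Callable, Sequence, Set

Frame = Set[str]                    # propositions that are true in one frame
Trace = Sequence[Frame]             # one entry per frame, in temporal order
Predicate = Callable[[Frame], bool]


def holds(prop: str) -> Predicate:
    """Frame-level test: is prop among the frame's labels?"""
    return lambda frame: prop in frame


def both(left: Predicate, right: Predicate) -> Predicate:
    """Frame-level conjunction, used for (prop1&prop2)."""
    return lambda frame: left(frame) and right(frame)


def eventually(trace: Trace, pred: Predicate) -> bool:
    """F pred: pred is true in at least one frame."""
    return any(pred(frame) for frame in trace)


def always(trace: Trace, pred: Predicate) -> bool:
    """G pred: pred is true in every frame."""
    return all(pred(frame) for frame in trace)


def until(trace: Trace, left: Predicate, right: Predicate) -> bool:
    """left U right: right becomes true, and left is true in every frame before that."""
    for frame in trace:
        if right(frame):
            return True
        if not left(frame):
            return False
    return False  # right never became true


# Toy trace: prop1 and prop2 co-occur for two frames, then prop3 appears.
trace = [{"prop1", "prop2"}, {"prop1", "prop2"}, {"prop3"}]
satisfied = until(trace, both(holds("prop1"), holds("prop2")), holds("prop3"))
```

On the toy trace, `(prop1&prop2)Uprop3` is satisfied because both propositions hold in every frame before `prop3` first appears.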

Dataset Statistics

  1. Total Number of Frames

Ground Truth TL Specifications         Synthetic TLV Dataset    Real TLV Dataset
                                       COCO      ImageNet       Waymo     NuScenes
Eventually Event A                     -         15,750         -         -
Always Event A                         -         15,750         -         -
Event A And Event B                    31,500    -              -         -
Event A Until Event B                  15,750    15,750         8,736     19,808
(Event A And Event B) Until Event C    5,789     -              7,459     7,459

  2. Total Number of Datasets

Ground Truth TL Specifications         Synthetic TLV Dataset    Real TLV Dataset
                                       COCO      ImageNet       Waymo     NuScenes
Eventually Event A                     -         60             -         -
Always Event A                         -         60             -         -
Event A And Event B                    120       -              -         -
Event A Until Event B                  60        60             45        494
(Event A And Event B) Until Event C    97        -              30        186
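
Dividing the frame totals by the dataset counts gives a rough idea of how long each dataset is. The sketch below only does this arithmetic on the numbers from the two tables; `mean_frames_per_dataset` is a hypothetical helper.

```python
from typing import Dict, Tuple

# (total frames, number of datasets), copied from the two tables above.
STATS: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("Eventually Event A", "ImageNet"): (15_750, 60),
    ("Always Event A", "ImageNet"): (15_750, 60),
    ("Event A And Event B", "COCO"): (31_500, 120),
    ("Event A Until Event B", "COCO"): (15_750, 60),
    ("Event A Until Event B", "ImageNet"): (15_750, 60),
    ("Event A Until Event B", "Waymo"): (8_736, 45),
    ("Event A Until Event B", "NuScenes"): (19_808, 494),
    ("(Event A And Event B) Until Event C", "COCO"): (5_789, 97),
    ("(Event A And Event B) Until Event C", "Waymo"): (7_459, 30),
    ("(Event A And Event B) Until Event C", "NuScenes"): (7_459, 186),
}


def mean_frames_per_dataset(
    stats: Dict[Tuple[str, str], Tuple[int, int]] = STATS,
) -> Dict[Tuple[str, str], float]:
    """Average number of frames per dataset for each table cell."""
    return {key: frames / count for key, (frames, count) in stats.items()}


for (spec, source), mean in mean_frames_per_dataset().items():
    print(f"{spec:<37} {source:<9} {mean:7.1f}")
```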

Installation

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ."[dev, test]"

Prerequisites

  1. ImageNet (ILSVRC 2017):

    ILSVRC/
    ├── Annotations/
    ├── Data/
    ├── ImageSets/
    └── LOC_synset_mapping.txt
    
  2. COCO (2017):

    COCO/
    └── 2017/
        ├── annotations/
        ├── train2017/
        └── val2017/
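
Before running the generators, it can help to confirm that the source datasets match the layouts above. The following sketch uses a hypothetical helper, `missing_entries`, which is not part of the repository; the expected names are taken from the two directory trees.

```python
from pathlib import Path
from typing import Iterable, List, Union

# Expected entries, taken from the directory trees above.
IMAGENET_LAYOUT = ("Annotations", "Data", "ImageSets", "LOC_synset_mapping.txt")
COCO_LAYOUT = ("annotations", "train2017", "val2017")


def missing_entries(root: Union[str, Path], expected: Iterable[str]) -> List[str]:
    """Return the expected files or folders that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]
```

For example, `missing_entries("../ILSVRC", IMAGENET_LAYOUT)` and `missing_entries("../COCO/2017", COCO_LAYOUT)` should both return an empty list when the datasets are in place.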
    

Usage

The configuration options and commands below cover data loading and synthetic data generation.

Data Loader Configuration

  • data_root_dir: Root directory of the dataset
  • mapping_to: Label mapping scheme (default: "coco")
  • save_dir: Output directory for processed data

Synthetic Data Generator Configuration

  • initial_number_of_frame: Starting frame count per video
  • max_number_frame: Maximum frame count per video
  • number_video_per_set_of_frame: Videos to generate per frame set
  • increase_rate: Frame count increment rate
  • ltl_logic: Temporal Logic specification (e.g., "F prop1", "G prop1")
  • save_images: Boolean flag for saving individual frames
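
One plausible reading of the frame-count options is that video lengths step from initial_number_of_frame up to max_number_frame in increments of increase_rate, with number_video_per_set_of_frame videos generated at each length. The sketch below enumerates the resulting lengths under that assumption. The additive step and the example values are assumptions, not documented behavior or defaults, and `frame_schedule` is a hypothetical helper.

```python
from typing import List


def frame_schedule(
    initial_number_of_frame: int,
    max_number_frame: int,
    increase_rate: int,
    number_video_per_set_of_frame: int,
) -> List[int]:
    """Frame count of every generated video, assuming an additive step."""
    if increase_rate <= 0:
        raise ValueError("increase_rate must be positive")
    lengths = range(initial_number_of_frame, max_number_frame + 1, increase_rate)
    return [n for n in lengths for _ in range(number_video_per_set_of_frame)]


# Hypothetical values chosen for illustration, not documented defaults.
schedule = frame_schedule(25, 500, 25, 3)
print(len(schedule), sum(schedule))
```

With these illustrative values the sketch yields 60 videos and 15,750 frames in total, which matches the per-source synthetic totals in the statistics tables. The generator's actual defaults and stepping rule are not stated here.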

Data Generation

COCO Synthetic Data Generation

python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"

ImageNet Synthetic Data Generation

python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"

Note: The ImageNet generator does not support LTL formulae that contain the '&' operator.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Connect with Me

Feel free to connect with me through these professional channels:

LinkedIn Email Google Scholar Website Twitter

Citation

If you find this repo useful, please cite our paper:

@inproceedings{Choi_2024_ECCV,
  author={Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
  title={Towards Neuro-Symbolic Video Understanding},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  month={September},
  year={2024}
}
