Commit 3cf8490: update
minkyu-choi07 committed Jul 6, 2024 (1 parent: 1c01877)

Showing 8 changed files with 133 additions and 171 deletions.
271 changes: 119 additions & 152 deletions README.md

<a name="readme-top"></a>
# Temporal Logic Video (TLV) Dataset

[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![MIT License][license-shield]][license-url]
[![LinkedIn][linkedin-shield]][linkedin-url]

<!-- PROJECT LOGO -->
<br />
<div align="center">
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset"><strong>Explore the docs »</strong></a>
<br />
<br />
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">View Demo</a>
·
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/issues">Report Bug</a>
<a href="https://anoymousu1.github.io/nsvs-anonymous.github.io/">NSVS-TL Project Webpage</a>
·
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">Request Feature</a>
<a href="https://github.com/UTAustin-SwarmLab/Neuro-Symbolic-Video-Search-Temploral-Logic">NSVS-TL Source Code</a>
</p>
</div>

## Overview

The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:

1. Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
2. Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.

## Table of Contents

- [Dataset Composition](#dataset-composition)
- [Dataset (Release)](#dataset)
- [Installation](#installation)
- [Usage](#usage)
- [Data Generation](#data-generation)
- [Contribution Guidelines](#contribution-guidelines)
- [License](#license)
- [Acknowledgments](#acknowledgments)

## Dataset Composition

### Synthetic Datasets
- Source: COCO and ImageNet
- Purpose: Introduce artificial Temporal Logic specifications
- Generation Method: Image stitching from static datasets
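
To make the stitching idea concrete, here is a toy sketch of how frames could be assembled so that an "eventually" specification holds. This is not the repository's actual generator; the function and variable names are hypothetical, and it only illustrates the principle of placing a labeled image into a filler sequence so the resulting video satisfies a TL formula:

```python
import random

def stitch_eventually(filler_images, target_images, num_frames):
    """Toy sketch: build a frame list in which "F target" holds,
    i.e., a target-labeled image eventually appears."""
    frames = [random.choice(filler_images) for _ in range(num_frames)]
    k = random.randrange(num_frames)          # frame where the proposition holds
    frames[k] = random.choice(target_images)  # "F target" is satisfied at frame k
    return frames, [k]                        # frames plus frames_of_interest
```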

### Real-world Datasets
- Sources: NuScenes and Waymo
- Purpose: Provide real-world autonomous vehicle scenarios
- Annotation: Temporal Logic specifications added to existing data

## Dataset
<div align="center">
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">
<img src="images/teaser.png" alt="Logo" width="240" height="240">
</a>
</div>
Although we provide source code to generate datasets from multiple data sources, we also release a v1 dataset as a proof of concept.

### Dataset Structure

The dataset is distributed as serialized objects, each containing a set of frames with annotations.

#### File Naming Convention
`<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl`
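
For example, a file for a 100-frame synthetic video generated from COCO might be named (hypothetical values): `synthetic:source:coco-number_of_frames:100-<uuid>.pkl`.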

#### Object Attributes
Each serialized object contains the following attributes:
- `ground_truth`: Boolean indicating whether the dataset contains ground-truth labels
- `ltl_formula`: Temporal logic formula applied to the dataset
- `proposition`: The set of propositions used in `ltl_formula`
- `number_of_frame`: Total number of frames in the dataset
- `frames_of_interest`: Frames that satisfy `ltl_formula`
- `labels_of_frames`: Labels for each frame
- `images_of_frames`: Image data for each frame
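
As a quick orientation, here is a minimal sketch of how one of these serialized objects could be inspected with Python's `pickle` module. The attribute names come from the list above, but the exact object layout and access style are assumptions, so treat this as illustrative rather than as the release's official API:

```python
import pickle

# Hypothetical filename following the naming convention above.
path = "synthetic:source:coco-number_of_frames:100-<uuid>.pkl"

with open(path, "rb") as f:
    record = pickle.load(f)  # deserialize one TLV data object

# Attribute names taken from the list above; access style is an assumption.
print(record.ltl_formula)         # e.g., "F prop1"
print(record.proposition)         # propositions referenced by the formula
print(record.number_of_frame)     # total frame count
print(record.frames_of_interest)  # frames satisfying the formula
```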

You can download the dataset from here. Each file is a serialized object that follows the naming convention and attribute list above.

## Installation

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ."[dev, test]"
```

### Prerequisites

To generate synthetic datasets from COCO and ImageNet, first download the source data and arrange it as follows:

1. ImageNet (ILSVRC 2017):
```
ILSVRC/
├── Annotations/
├── Data/
├── ImageSets/
└── LOC_synset_mapping.txt
```

2. COCO (2017):
```
COCO/
└── 2017/
├── annotations/
├── train2017/
└── val2017/
```

## Usage

The following subsections describe the configuration options for data loading and synthetic data generation.
### Data Loader Configuration

- `data_root_dir`: Root directory of the dataset
- `mapping_to`: Label mapping scheme (default: "coco")
- `save_dir`: Output directory for processed data

### Synthetic Data Generator Configuration

- `initial_number_of_frame`: Starting frame count per video
- `max_number_frame`: Maximum frame count per video
- `number_video_per_set_of_frame`: Videos to generate per frame set
- `increase_rate`: Frame count increment rate
- `ltl_logic`: Temporal Logic specification (e.g., "F prop1", "G prop1")
- `save_images`: Boolean flag for saving individual frames
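
In standard LTL notation, `F prop1` means that prop1 eventually holds in at least one frame, while `G prop1` means that prop1 holds in every frame. As an illustration, a run combining the loader and generator options above might look like the sketch below; the flag names follow the option lists in this README, but defaults and accepted values should be verified against the run scripts:

```bash
# Illustrative invocation only: verify flag names and defaults in run_scripts/.
python3 run_scripts/run_synthetic_tlv_coco.py \
    --data_root_dir "../COCO/2017" \
    --save_dir "./tlv_output" \
    --initial_number_of_frame 25 \
    --max_number_frame 200 \
    --number_video_per_set_of_frame 3 \
    --increase_rate 25 \
    --ltl_logic "F prop1" \
    --save_images False
```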

## Data Generation

### COCO Synthetic Data Generation

```bash
python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"
```

### ImageNet Synthetic Data Generation

```bash
python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"
```

Note: the ImageNet generator does not support LTL formulae containing the `&` (conjunction) operator, since each ImageNet image carries only a single label; the COCO generator supports conjunctions because COCO images have multiple labels.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgments

- University of Texas at Austin (UT Austin)
- UT Austin Swarm Lab

## Citation
If you find this repo useful, please cite our paper:
```bibtex
@inproceedings{Choi_2024_ECCV,
  author    = {Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
  title     = {Towards Neuro-Symbolic Video Understanding},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  month     = {September},
  year      = {2024}
}
```


<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[contributors-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[forks-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/network/members
[stars-shield]: https://img.shields.io/github/stars/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[stars-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/stargazers
[issues-shield]: https://img.shields.io/github/issues/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[issues-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/issues
[license-shield]: https://img.shields.io/github/license/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[license-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/blob/master/LICENSE.txt
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://www.linkedin.com/in/mchoi07/
6 changes: 0 additions & 6 deletions example.py

This file was deleted.

Binary file modified images/teaser.png
1 change: 0 additions & 1 deletion tests/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion tests/subpackage_name/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion tests/subpackage_name/test_subpackage_module_name.py

This file was deleted.

1 change: 0 additions & 1 deletion tests/test_module_name.py

This file was deleted.

23 changes: 14 additions & 9 deletions tlv_dataset/loader/coco.py
@@ -57,10 +57,12 @@ def load_data(self):
    """
    img_ids = self._coco.getImgIds()
    images = [
        self._image_dir / self._coco.loadImgs(id)[0]["file_name"]
        for id in img_ids
    ]
    annotations = [
        self._coco.loadAnns(self._coco.getAnnIds(imgIds=id))
        for id in img_ids
    ]

    return images, annotations
@@ -76,16 +78,19 @@ def process_data(self) -> TLVRawImage:
    for id in img_ids:
        images.append(
            cv2.imread(
                str(
                    self._image_dir
                    / self._coco.loadImgs(id)[0]["file_name"]
                )
            )[:, :, ::-1]
        )  # Read it as RGB
        annotation = self._coco.loadAnns(self._coco.getAnnIds(imgIds=id))
        labels_per_image = []
        for i in range(len(annotation)):
            labels_per_image.append(
                self._coco.cats[annotation[i]["category_id"]][
                    "name"
                ].replace(" ", "_")
            )
        unique_labels = list(set(labels_per_image))
        if len(unique_labels) == 0:
@@ -133,10 +138,10 @@ def map_data(self, **kwargs) -> any:

# # Example usage:
# coco_loader = COCOImageLoader(
# coco_root_dir_path="/store/datasets/COCO/2017",
# coco_image_source="val",
# )

# # Display a sample image
# coco_loader.display_sample_image(0)
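
For context, here is a minimal sketch of how this loader might be exercised end to end. The constructor arguments mirror the commented example above, and the `process_data` return type comes from the signature in this diff; the import path and dataset location are assumptions:

```python
# Minimal usage sketch; paths and import layout are assumptions.
from tlv_dataset.loader.coco import COCOImageLoader

loader = COCOImageLoader(
    coco_root_dir_path="/store/datasets/COCO/2017",  # placeholder dataset root
    coco_image_source="val",                         # use the val2017 split
)
raw_images = loader.process_data()  # returns a TLVRawImage per the signature above
```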
