Version 0.3.2
Labbeti committed Jan 30, 2023
1 parent fde6aeb commit 08a7ebd
Showing 16 changed files with 417 additions and 270 deletions.
35 changes: 12 additions & 23 deletions .github/workflows/python-package-pip.yaml
@@ -10,40 +10,29 @@ on:

jobs:
build:
runs-on: ubuntu-latest
runs-on: ${{ matrix.os }}

strategy:
matrix:
os: [ubuntu-latest]
python-version: ["3.7", "3.8", "3.9", "3.10"]

steps:
# --- INSTALLATIONS ---
- name: Checkout repository
uses: actions/checkout@v2

- name: Set up Python 3.8
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: 3.8

- name: Load cache of pip dependencies
uses: actions/cache@master
id: cache_requirements
with:
path: ${{ env.pythonLocation }}/lib/python3.8/site-packages/*
key: ${{ runner.os }}-pip-${{ hashFiles('setup.cfg') }}
restore-keys: |
${{ runner.os }}-pip-
${{ runner.os }}-
- name: Install pip dev dependencies + package if needed
if: steps.cache_requirements.outputs.cache-hit != 'true'
run: |
python -m pip install --upgrade pip
python -m pip install -e .[dev]
python-version: ${{ matrix.python-version }}
cache: 'pip'

- name: Install package if needed
if: steps.cache_requirements.outputs.cache-hit == 'true'
- name: Install package
run: |
python -m pip install -e . --no-dependencies
python -m pip install -e .[dev]
- name: Install soundfile
- name: Install soundfile for torchaudio
run: |
# For soundfile dep
sudo apt-get install libsndfile1
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,18 @@

All notable changes to this project will be documented in this file.

## [0.3.2] 2023-01-30
### Added
- `AudioCaps.load_class_labels_indices` to load AudioSet classes map externally.
- Compatibility and tests from Python 3.7 to 3.10.

### Changed
- Attributes in datasets classes are now weakly private.
- Documentation theme and descriptions.

### Fixed
- Workflow badge with Github changes. (https://github.com/badges/shields/issues/8671)

## [0.3.1] 2022-10-31
### Changed
- AudioCaps, Clotho and MACS order are now defined by their order in the corresponding captions CSV files when available.
5 changes: 3 additions & 2 deletions CITATION.cff
@@ -7,6 +7,7 @@ type: software
authors:
- given-names: Etienne
family-names: Labbé
email: labbeti.pub@gmail.com
affiliation: IRIT
orcid: 'https://orcid.org/0000-0002-7219-5463'
repository-code: 'https://github.com/Labbeti/aac-datasets/'
@@ -21,5 +22,5 @@ keywords:
- captioning
- audio-captioning
license: MIT
version: 0.3.1
date-released: '2022-09-28'
version: 0.3.2
date-released: '2023-01-30'
37 changes: 22 additions & 15 deletions README.md
@@ -2,14 +2,14 @@

<div align="center">

# Audio Captioning datasets for Pytorch
# Audio Captioning datasets for PyTorch

<a href="https://www.python.org/"><img alt="Python" src="https://img.shields.io/badge/-Python 3.8+-blue?style=for-the-badge&logo=python&logoColor=white"></a>
<a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/-PyTorch 1.10.1-ee4c2c?style=for-the-badge&logo=pytorch&logoColor=white"></a>
<a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/-PyTorch 1.10.1+-ee4c2c?style=for-the-badge&logo=pytorch&logoColor=white"></a>
<a href="https://black.readthedocs.io/en/stable/"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-black.svg?style=for-the-badge&labelColor=gray"></a>
<a href="https://github.com/Labbeti/aac-datasets/actions"><img alt="Build" src="https://img.shields.io/github/workflow/status/Labbeti/aac-datasets/Python%20package%20using%20Pip/main?style=for-the-badge&logo=github"></a>
<a href="https://github.com/Labbeti/aac-datasets/actions"><img alt="Build" src="https://img.shields.io/github/actions/workflow/status/Labbeti/aac-datasets/python-package-pip.yaml?branch=main&style=for-the-badge&logo=github"></a>

Audio Captioning unofficial datasets source code for **AudioCaps** [[1]](#audiocaps), **Clotho** [[2]](#clotho), and **MACS** [[3]](#macs), designed for Pytorch.
Audio Captioning unofficial datasets source code for **AudioCaps** [[1]](#audiocaps), **Clotho** [[2]](#clotho), and **MACS** [[3]](#macs), designed for PyTorch.

</div>

@@ -25,14 +25,14 @@ pip install aac-datasets
```python
from aac_datasets import Clotho

dataset = Clotho(root=".", subset="dev", download=True)
dataset = Clotho(root=".", download=True)
item = dataset[0]
audio, captions = item["audio"], item["captions"]
# audio: Tensor of shape (n_channels=1, audio_max_size)
# captions: list of str captions
# captions: list of str
```

### Build Pytorch dataloader with Clotho
### Build PyTorch dataloader with Clotho

```python
from torch.utils.data.dataloader import DataLoader
@@ -43,8 +43,8 @@ dataset = Clotho(root=".", download=True)
dataloader = DataLoader(dataset, batch_size=4, collate_fn=BasicCollate())

for batch in dataloader:
# batch["audio"]: list of Tensor of shape (n_channels, audio_size)
# batch["captions"]: list of list of str
# batch["audio"]: list of 4 tensors of shape (n_channels, audio_size)
# batch["captions"]: list of 4 lists of str
...
```
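
To make the batch layout described in the comments above concrete, here is a minimal pure-Python sketch of a collate function that turns a list of per-item dicts into a dict of lists. It is an illustration only: `collate_as_lists` is a hypothetical name, not the library's `BasicCollate` (whose implementation is not shown in this diff), and nested lists stand in for audio tensors.

```python
from typing import Any, Dict, List


def collate_as_lists(items: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Group a list of per-item dicts into one dict of lists, key by key."""
    batch: Dict[str, List[Any]] = {}
    for item in items:
        for key, value in item.items():
            # Values are kept as-is (no padding or stacking), so items of
            # different audio lengths can share a batch.
            batch.setdefault(key, []).append(value)
    return batch


# Toy items (assumed data): nested lists stand in for (n_channels, audio_size) tensors.
items = [
    {"audio": [[0.0, 0.1]], "captions": ["a dog barks", "barking dog"]},
    {"audio": [[0.2, 0.3, 0.4]], "captions": ["rain falls"]},
]
batch = collate_as_lists(items)
```

Keeping each field as a list avoids having to pad audio of different lengths, which matches the "list of tensors" and "list of lists of str" shapes noted in the README comments.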

@@ -68,7 +68,7 @@ Here is the **train** subset statistics for each dataset :
| Nb captions per audio | 1 | 5 | 2-5 |
| Nb captions | 49838 | 19195 | 17275 |
| Total nb words<sup>2</sup> | 402482 | 217362 | 160006 |
| Nb words range<sup>2</sup> | 2-52 | 8-20 | 5-40 |
| Sentence size<sup>2</sup> | 2-52 | 8-20 | 5-40 |

<sup>1</sup> This duration is estimated on the total duration of 46230/49838 files of 126.7h.
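
One plausible reading of this footnote is a linear extrapolation from the files that could be measured to the whole subset. The sketch below works through that arithmetic. The extrapolation method and the function name `extrapolate_duration_h` are assumptions, since this diff does not show how the estimate was actually computed.

```python
def extrapolate_duration_h(measured_h: float, n_measured: int, n_total: int) -> float:
    """Scale a measured duration to the full file count.

    Assumes the unmeasured files have the same mean length as the measured ones.
    """
    return measured_h * n_total / n_measured


# Figures taken from the footnote: 126.7 h measured over 46230 of 49838 files.
estimated_h = extrapolate_duration_h(126.7, 46230, 49838)
print(f"Estimated total duration: {estimated_h:.1f} h")
```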

@@ -99,11 +99,18 @@ AudioCaps.YOUTUBE_DL_PATH = "/my/path/to/youtube_dl"
dataset = AudioCaps(root=".", download=True)
```
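
Before overriding these paths, it can help to check which executables are actually visible on `PATH`. The sketch below uses only the standard library's `shutil.which`. The helper name `find_executables` is hypothetical and not part of aac-datasets.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_executables(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each program name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


found = find_executables(["ffmpeg", "youtube-dl"])
for name, path in found.items():
    print(f"{name}: {path if path else 'not found on PATH'}")
```

A non-`None` path returned here could then be assigned to `AudioCaps.FFMPEG_PATH` or `AudioCaps.YOUTUBE_DL_PATH`, as in the snippet above.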

## Command line download
## Download datasets
To download a dataset, you can pass the `download=True` argument when constructing the dataset.

However, if you want to download datasets separately, you can also use the following command :
```bash
python -m aac_datasets.download --root "./data" clotho --version "v2.1"
aac-datasets-download --root "." clotho --subsets "dev"
```
Or use the corresponding function in the code :
```python
from aac_datasets.download import download_clotho

download_clotho(root=".", subsets=["dev"])
```

## References
@@ -124,11 +131,11 @@ If you use this software, please consider cite it as below :
Labbe_aac-datasets_2022,
author = {Labbé, Etienne},
license = {MIT},
month = {6},
month = {01},
title = {{aac-datasets}},
url = {https://github.com/Labbeti/aac-datasets/},
version = {0.3.1},
year = {2022}
version = {0.3.2},
year = {2023}
}
```

10 changes: 9 additions & 1 deletion docs/conf.py
@@ -66,7 +66,14 @@
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"
html_theme = "press"

html_theme_options = {
"external_links": [
("Github", "https://github.com/Labbeti/aac-datasets"),
("PyPI", "https://pypi.org/project/aac-datasets/"),
],
}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
@@ -86,4 +93,5 @@
intersphinx_mapping = {
"python": ("https://docs.python.org/", None),
"torch": ("https://pytorch.org/docs/master/", None),
"torchaudio": ("https://pytorch.org/audio/stable/", None),
}
27 changes: 15 additions & 12 deletions docs/index.rst
@@ -6,24 +6,27 @@
aac-datasets's documentation
========================================

Audio Captioning unofficial datasets source code for **AudioCaps**, **Clotho**, and **MACS**, designed for PyTorch.

.. toctree::
:maxdepth: 6
:caption: Contents
:maxdepth: 2
:name: maintoc

aac_datasets
installation
usage
package_tree


Indices and tables
==================
.. Indices and tables
.. ==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
.. * :ref:`genindex`
.. * :ref:`modindex`
.. * :ref:`search`
External links
=========================================================
.. External links
.. =========================================================
* `Github link <https://github.com/Labbeti/aac-datasets>`_
* `PyPI link <https://pypi.org/project/aac-datasets/>`_
.. * `Github link <https://github.com/Labbeti/aac-datasets>`_
.. * `PyPI link <https://pypi.org/project/aac-datasets/>`_
37 changes: 37 additions & 0 deletions docs/installation.rst
@@ -0,0 +1,37 @@
Installation
============

Simply run:

.. code-block:: bash

    pip install aac-datasets

Python requirements
###################

The python requirements are automatically installed when using pip on this repository.

.. code-block:: bash

    torch >= 1.10.1
    torchaudio >= 0.10.1
    py7zr >= 0.17.2
    pyyaml >= 6.0
    tqdm >= 4.64.0

External requirements (AudioCaps only)
######################################

The external requirements needed to download **AudioCaps** are **ffmpeg** and **youtube-dl**.
These two programs can be installed on Ubuntu using `sudo apt install ffmpeg youtube-dl`.

You can also override their paths for AudioCaps:

.. code-block:: python

    from aac_datasets import AudioCaps

    AudioCaps.FFMPEG_PATH = "/my/path/to/ffmpeg"
    AudioCaps.YOUTUBE_DL_PATH = "/my/path/to/youtube_dl"
    dataset = AudioCaps(root=".", download=True)
9 changes: 9 additions & 0 deletions docs/package_tree.rst
@@ -0,0 +1,9 @@
.. _modules_file:

Package tree
============

.. toctree::
:maxdepth: 6

aac_datasets
2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -1 +1 @@
sphinx_rtd_theme==0.4.3
sphinx-press-theme==0.8.0