Solution for Task 6 of DCASE2020
This repository contains the code for "Weakly Supervised Automated Audio Captioning via Text Only Training"
PyTorch dataloader for Clotho dataset.
Text-only or language-free training for multimodal tasks (image/audio/video captioning, retrieval, text2image)
[NeurIPS 2023 - ML for Audio Workshop (Oral)] Zero-shot audio captioning with audio-language model guidance and audio context keywords
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
Workshop on Detection and Classification of Acoustic Scenes and Events
DCASE2024 Challenge Task 6 baseline system (Automated Audio Captioning)
IRIT-UPS DCASE 2021 AUDIO CAPTIONING SYSTEM
Fluency ENhanced Sentence-bert Evaluation (FENSE), a metric for audio caption evaluation, and the benchmark datasets AudioCaps-Eval and Clotho-Eval.
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation"
Code for use with the Clotho dataset
Tracking the state of the art and recent results (bibliography) on sound tasks.
Tools for the evaluation of audio captioning.
Official implementation of "Prefix tuning for Automated Audio Captioning" (ICASSP 2023)
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning-results#wuyusong2020_t6
Song Describer is a data collection platform for annotating music with textual descriptions.
Audio Captioning datasets for PyTorch.