Repository of the ISMIR'24 paper "Cue Point Estimation using Object Detection"

ETH-DISCO/cue-detr

CUE-DETR

📜Paper | 🤗Dataset | 🤗Checkpoints

Dataset

🤗Dataset

EDM-CUE contains metadata for almost 5k EDM tracks collected from 4 different DJs. The dataset does not include audio, only references to the training data.

Track Metadata
{
    'id': int,
    'title': str,
    'artists': str,
    'duration': int,        # in seconds
    'genre': [str],
    'key': [str],           # alphanumeric (Camelot)
    'beat_grid': {
        'start_pos': float, # in seconds
        'init_beat': int,   # first beat count
        'bpm': float,
        'time_sig': str
    },
    'cue_pts': [float]      # in seconds
}
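
As a sketch of how the beat_grid fields could be used, the snippet below snaps a cue point to the nearest beat. It assumes a constant tempo and that 'start_pos' is the time of the first beat; the helper name nearest_beat and the sample values are hypothetical and not part of the dataset or the repository.

```python
def nearest_beat(cue_sec: float, beat_grid: dict) -> tuple[int, float]:
    """Return (beat offset from the grid start, beat time in seconds) closest to cue_sec.

    Assumption: constant tempo, and 'start_pos' marks the first beat.
    """
    beat_len = 60.0 / beat_grid['bpm']  # seconds per beat
    offset = round((cue_sec - beat_grid['start_pos']) / beat_len)
    offset = max(offset, 0)  # clamp cues that fall before the grid start
    return offset, beat_grid['start_pos'] + offset * beat_len


# Hypothetical track entry following the metadata schema above
track = {
    'beat_grid': {'start_pos': 0.5, 'init_beat': 1, 'bpm': 120.0, 'time_sig': '4/4'},
    'cue_pts': [16.48, 64.52],
}
snapped = [nearest_beat(cue, track['beat_grid']) for cue in track['cue_pts']]
```

Each entry of snapped pairs a beat offset with the time of that beat, which can be handy when checking how closely annotated cues follow the grid.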

Training Format

  • CUE-DETR expects training data in a modified COCO format: instead of 'bbox' and 'area', the model requires the 'position' of each cue point annotation. The bounding box is computed at runtime with a default width of 21 pixels.

  • preprocessing.py converts audio into power spectrograms and generates the annotation file in the custom COCO format.
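
The box computation described above might look like the following sketch. It assumes the box is centred horizontally on the cue position, spans the full spectrogram height, and uses the COCO [x, y, width, height] layout. The actual logic lives in the repository's model code and may differ. The function name position_to_bbox and the image size are hypothetical.

```python
def position_to_bbox(position: int, img_width: int, img_height: int,
                     box_width: int = 21) -> list[float]:
    """Build a COCO-style [x, y, width, height] box around a cue position.

    Assumptions: the box is centred on the cue, clipped to the image,
    and covers the full image height.
    """
    half = box_width / 2
    x_min = max(position - half, 0.0)
    x_max = min(position + half, float(img_width))
    return [x_min, 0.0, x_max - x_min, float(img_height)]


# Hypothetical 355 x 128 pixel spectrogram with a cue at pixel column 100
bbox = position_to_bbox(100, img_width=355, img_height=128)
area = bbox[2] * bbox[3]  # the 'area' field that standard COCO would store
```

Storing only the position keeps the annotation file independent of the box width, so the width can be changed without regenerating the data.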

Custom COCO Format
data = {
    'images' : [{
        'id': img_id,
        'width': int,
        'height': int,
        'file_name' : filename,
    }],
    'annotations': [{
        'id': annotation_id,
        'image_id': img_id,
        'category_id': 0,
        'position': int # cue position instead of bounding box
    }],
    'categories': [{
        'id': 0,
        'name': 'cue',
        'supercategory' : 'cue'
    }]
}
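
For illustration, an annotation file in this layout could be assembled as in the sketch below. The input structure (a list of dicts with a 'positions' key), the function name build_annotations, and the file names and image sizes are assumptions made for this example; preprocessing.py is the authoritative implementation.

```python
import json


def build_annotations(spectrograms: list[dict]) -> dict:
    """Assemble the custom COCO dict from per-image cue positions.

    Each input dict is assumed (hypothetically) to hold
    'file_name', 'width', 'height' and 'positions' (cue pixel columns).
    """
    data = {
        'images': [],
        'annotations': [],
        'categories': [{'id': 0, 'name': 'cue', 'supercategory': 'cue'}],
    }
    for img_id, spec in enumerate(spectrograms):
        data['images'].append({
            'id': img_id,
            'width': spec['width'],
            'height': spec['height'],
            'file_name': spec['file_name'],
        })
        for pos in spec['positions']:
            data['annotations'].append({
                'id': len(data['annotations']),  # running annotation id
                'image_id': img_id,
                'category_id': 0,
                'position': int(pos),  # cue position instead of bounding box
            })
    return data


# Hypothetical spectrogram images and cue positions
example = build_annotations([
    {'file_name': 'track_0001.png', 'width': 355, 'height': 128, 'positions': [40, 210]},
    {'file_name': 'track_0002.png', 'width': 355, 'height': 128, 'positions': [97]},
])
annotation_json = json.dumps(example, indent=2)
```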

Training

Training uses Weights & Biases (W&B) for logging. Connect to your W&B account by running wandb login in the console, then pass the project name and account as arguments to the training script.

See cue_detr_train.py, cue_detr_data.py and cue_detr_model.py in the model directory.

Dependencies

Python 3.11.9; see requirements.txt for the package dependencies.

Usage / Example Script

🤗Checkpoints

The example script cue_points.py calculates cue points for tracks stored in an audio directory. All calculated cue points are written to _cue_points.txt, which is created in the audio directory. The script can also be run with a local checkpoint from a checkpoint directory. Currently, only mp3 files are supported.

python cue_points.py -t path/to/audio/dir
# Optional arguments:
# -c (path/to/local/checkpoint/dir)
# -s (prediction sensitivity)
# -r (min distance between cues)
# -p (toggle to print cue points)
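
To call the script from Python, for example in a batch job, the command line above can be assembled as in this sketch. Only the flags listed above are used. The wrapper names build_command and run_cue_points are hypothetical, and the value types expected by -s and -r are not documented here, so they are passed through as strings.

```python
import subprocess
import sys
from pathlib import Path


def build_command(audio_dir, checkpoint_dir=None, sensitivity=None,
                  min_distance=None, print_cues=False) -> list[str]:
    """Build the argument list for cue_points.py from the documented flags."""
    cmd = [sys.executable, 'cue_points.py', '-t', str(audio_dir)]
    if checkpoint_dir is not None:
        cmd += ['-c', str(checkpoint_dir)]  # local checkpoint directory
    if sensitivity is not None:
        cmd += ['-s', str(sensitivity)]     # prediction sensitivity
    if min_distance is not None:
        cmd += ['-r', str(min_distance)]    # min distance between cues
    if print_cues:
        cmd.append('-p')                    # toggle to print cue points
    return cmd


def run_cue_points(audio_dir, **options) -> Path:
    """Run the script and return the path of the result file it writes."""
    subprocess.run(build_command(audio_dir, **options), check=True)
    return Path(audio_dir) / '_cue_points.txt'


cmd = build_command('path/to/audio/dir', print_cues=True)
```

run_cue_points is defined but not called here, since it needs the repository checkout and an audio directory to be present.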
