Merge branch 'develop' into feat/multiple-speakers-stats-pool
hbredin authored Sep 15, 2023
2 parents 270ee70 + b660b1e commit ada8fc4
Showing 33 changed files with 1,064 additions and 688 deletions.
26 changes: 20 additions & 6 deletions .faq/suggest.md
@@ -1,3 +1,5 @@
Thank you for your issue.

{%- if questions -%}
{% if questions|length == 1 %}
We found the following entry in the [FAQ]({{ faq_url }}) which you may find helpful:
@@ -9,12 +11,24 @@ We found the following entries in the [FAQ]({{ faq_url }}) which you may find he
- [{{ question.title }}]({{ faq_url }}#{{ question.slug }})
{%- endfor %}

Feel free to close this issue if you found an answer in the FAQ. Otherwise, please give us a little time to review.

{%- else -%}
Thank you for your issue. Give us a little time to review it.

PS. You might want to check the [FAQ]({{ faq_url }}) if you haven't done so already.
You might want to check the [FAQ]({{ faq_url }}) if you haven't done so already.
{%- endif %}

This is an automated reply, generated by [FAQtory](https://github.com/willmcgugan/faqtory)
Feel free to close this issue if you found an answer in the FAQ.

If your issue is a feature request, please read [this](https://xyproblem.info/) first and update your request accordingly, if needed.

If your issue is a bug report, please provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) as a link to a self-contained [Google Colab](https://colab.research.google.com/) notebook containing everything needed to reproduce the bug:
- installation
- data preparation
- model download
- etc.

Providing an MRE will increase your chance of getting an answer from the community (either maintainers or other power users).

Companies relying on `pyannote.audio` in production may contact [me](https://herve.niderb.fr) via email regarding:
* paid scientific consulting around speaker diarization and speech processing in general;
* custom models and tailored features (via the local tech transfer office).

> This is an automated reply, generated by [FAQtory](https://github.com/willmcgugan/faqtory)
4 changes: 3 additions & 1 deletion .github/workflows/new_issue.yml
@@ -14,7 +14,9 @@ jobs:
- name: Install FAQtory
run: pip install FAQtory
- name: Run Suggest
run: faqtory suggest "${{ github.event.issue.title }}" > suggest.md
env:
TITLE: ${{ github.event.issue.title }}
run: faqtory suggest "$TITLE" > suggest.md
- name: Read suggest.md
id: suggest
uses: juliangruber/read-file-action@v1
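
The change above stops expanding `${{ github.event.issue.title }}` directly inside the `run:` script and passes the title through the `TITLE` environment variable instead. The following is a minimal sketch of why that matters, assuming a POSIX `sh` is available. The crafted title and the `run` helper are hypothetical and only illustrate the idea; they are not part of the workflow or of FAQtory.

```python
import os
import subprocess

# Hypothetical issue title crafted to break out of the surrounding double quotes.
title = 'x"; echo INJECTED; echo "'


def run(script, extra_env=None):
    """Run `script` with sh and return its standard output (illustrative helper)."""
    env = dict(os.environ, **(extra_env or {}))
    result = subprocess.run(
        ["sh", "-c", script],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Old pattern: the title is pasted into the script text before the shell parses it,
# so quotes and semicolons in the title become shell syntax.
unsafe_out = run(f'printf "%s\\n" "{title}"')

# New pattern: the script text is fixed and the title only arrives as the value
# of an environment variable, which the shell expands as plain data.
safe_out = run('printf "%s\\n" "$TITLE"', {"TITLE": title})

print(unsafe_out)
print(safe_out)
```

In the first call the embedded `echo INJECTED` runs as its own command. In the second call the title is printed verbatim as a single string.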
1 change: 0 additions & 1 deletion .github/workflows/test.yml
@@ -29,5 +29,4 @@ jobs:
pip install -e .[dev,testing]
- name: Test with pytest
run: |
export PYANNOTE_DATABASE_CONFIG=$GITHUB_WORKSPACE/tests/data/database.yml
pytest
12 changes: 9 additions & 3 deletions CHANGELOG.md
@@ -14,10 +14,10 @@
### Breaking changes

- BREAKING(task): rename `Segmentation` task to `SpeakerDiarization`
- BREAKING(task): remove support for variable chunk duration
- BREAKING(task): remove support for variable chunk duration for segmentation tasks
- BREAKING(pipeline): pipeline defaults to CPU (use `pipeline.to(device)`)
- BREAKING(pipeline): remove `SpeakerSegmentation` pipeline (use `SpeakerDiarization` pipeline)
- BREAKING(pipeline): remove support `FINCHClustering` and `HiddenMarkovModelClustering`
- BREAKING(pipeline): remove support for `FINCHClustering` and `HiddenMarkovModelClustering`
- BREAKING(pipeline): remove `segmentation_duration` parameter from `SpeakerDiarization` pipeline (defaults to `duration` of segmentation model)
- BREAKING(setup): drop support for Python 3.7
- BREAKING(io): channels are now 0-indexed (used to be 1-indexed)
@@ -26,12 +26,17 @@
* replace `Audio()` by `Audio(mono="downmix")`;
* replace `Audio(mono=True)` by `Audio(mono="downmix")`;
* replace `Audio(mono=False)` by `Audio()`.
- BREAKING(model): get rid of (flaky) `Model.introspection`
If, for some weird reason, you wrote some custom code based on that,
you should instead rely on `Model.example_output`.
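
The three `Audio(mono=...)` replacement rules listed above can be summarized as a small lookup. This is only a sketch: `migrate_mono` is a hypothetical helper written for illustration, not a function provided by `pyannote.audio`. It returns the keyword arguments to pass to the new `Audio(...)` call for a given old-style `mono` argument.

```python
# Sentinel meaning "the old code called Audio() without a mono argument".
_UNSET = object()


def migrate_mono(old_mono=_UNSET):
    """Map an old-style `mono` argument to keyword arguments for the new `Audio(...)`.

    Hypothetical helper: it only encodes the three replacement rules from the changelog.
    """
    if old_mono is _UNSET or old_mono is True:
        # Audio() and Audio(mono=True) both become Audio(mono="downmix").
        return {"mono": "downmix"}
    if old_mono is False:
        # Audio(mono=False) becomes Audio().
        return {}
    raise ValueError(f"unexpected old-style mono value: {old_mono!r}")


print(migrate_mono())
print(migrate_mono(True))
print(migrate_mono(False))
```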

### Features and improvements

- feat(task): add support for multi-task models
- feat(pipeline): send pipeline to device with `pipeline.to(device)`
- feat(pipeline): make `segmentation_batch_size` and `embedding_batch_size` mutable in `SpeakerDiarization` pipeline (they now default to `1`)
- feat(task): add [powerset](https://arxiv.org/PLACEHOLDER) support to `SpeakerDiarization` task
- feat(pipeline): add `return_embeddings` option to `SpeakerDiarization` pipeline
- feat(pipeline): add progress hook to pipelines
- feat(pipeline): check version compatibility at load time
- feat(task): add support for label scope in speaker diarization task
@@ -44,6 +49,7 @@
- fix(pipeline): fix reproducibility issue with Ampere CUDA devices
- fix(pipeline): fix support for IOBase audio
- fix(pipeline): fix corner case with no speaker
- fix(train): prevent metadata preparation from happening twice
- improve(task): shorten and improve structure of Tensorboard tags

### Dependencies
@@ -82,7 +88,7 @@

- last release before complete rewriting

## Version 1.0.1 (2018--07-19)
## Version 1.0.1 (2018-07-19)

- fix: fix regression in Precomputed.__call__ (#110, #105)

10 changes: 6 additions & 4 deletions README.md
@@ -1,4 +1,7 @@
# Neural speaker diarization with `pyannote.audio`
> [!IMPORTANT]
> I propose (paid) scientific [consulting services](https://herve.niderb.fr/consulting.html) to companies willing to make the most of their data and open-source speech processing toolkits (and `pyannote` in particular).
# Speaker diarization with `pyannote.audio`

`pyannote.audio` is an open-source toolkit written in Python for speaker diarization. Based on the [PyTorch](https://pytorch.org) machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines.

@@ -126,9 +129,8 @@ pip install -e .[dev,testing]
pre-commit install
```

Tests rely on a set of debugging files available in [`test/data`](test/data) directory.
Set `PYANNOTE_DATABASE_CONFIG` environment variable to `test/data/database.yml` before running tests:
## Test

```bash
PYANNOTE_DATABASE_CONFIG=tests/data/database.yml pytest
pytest
```