Orca test data
Scott Veirs edited this page Jul 23, 2021 · 6 revisions
To enable meaningful comparison of the performance of orca-specific models, this page presents open test sets that are specific to killer whale signals. The highest priority of the AI for Orcas project is classifying signals from the endangered Southern Resident Killer Whales (SRKWs), so test sets for their signals are listed first, starting with calls (with placeholders for whistles and clicks). Orcasound is also interested in classifiers for other common signals in the Salish Sea, so test sets for other species and sources, like Bigg's killer whales, are listed subsequently.
- Prospects for additional Orcasound test data:
- Orcasound candidates for additional rounds of annotation (e.g. through Pod.Cast)
- Current listener log
- High signal:noise ratio
- 27 Sep 2017 -- 1/2? hour of data from the Orcasound Lab node
- 2017 listener log
- Labeled first in Pod.Cast by Scott, Akash, and Prakruti
- Labels verified by Scott in Audacity
- Labels
- Audio data
- Metadata
- Intermediate signal:noise ratio
- 05 Jul 2019 -- 1/2 hour of data from the Orcasound Lab node labeled in Audacity by Scott
- Labels:
- only calls
- AWS CLI access via
aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/classification/killer-whales/southern-residents/20190705/orcasound-lab/test-only/OS_7_05_2019_08_24_00_labels-SV_200210_only_calls.txt .
- other signals -- with start/end times + label in row N ("call," specific stereotyped call ID, or "?" to indicate probable but not 100% certain call); row N+1 starts with \ and then contains lower and upper frequency bounds.
- AWS CLI access via
aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/classification/killer-whales/southern-residents/20190705/orcasound-lab/test-only/OS_7_05_2019_08_24_00_labels-SV_200210_other_signals.txt .
- Audio data -- in WAV format
- AWS CLI access via
aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/classification/killer-whales/southern-residents/20190705/orcasound-lab/test-only/OS_7_05_2019_08_24_00_.wav .
- Metadata
- Low signal:noise ratio
- 14 Nov 2019 -- 2.5 hours of data from the Port Townsend node labeled in Audacity by Scott
- Labels
- Audio data
- Metadata
- Ford-Osborne tape (~1980s)
- A small but authoritative sample: 1-3 examples per call annotated back in the 1980s on a cassette tape by John Ford and later Rich Osborne. The Audacity labels should provide start/end times for at least a few examples of each call type. https://www.orcasound.net/data/processed/SRKW/call-catalog/Ford-osborne-tape-analysis+flac-no-narration/FordOsborneTape-Audacity-analysis.zip
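The Audacity label files listed on this page share the row layout described for the 05 Jul 2019 "other signals" file: a row with start time, end time, and label, optionally followed by a row that starts with \ and holds the lower and upper frequency bounds. The sketch below shows one way to read that layout in Python. It assumes the fields are tab-separated, as in Audacity's standard label export, and that times are in seconds and frequencies in Hz; the `Label` class, the `parse_audacity_labels` function, and the sample values are hypothetical and are not part of any Orcasound tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Label:
    """One annotated region (hypothetical container, not an Orcasound type)."""
    start_s: float
    end_s: float
    text: str
    low_hz: Optional[float] = None   # filled in only if a frequency row follows
    high_hz: Optional[float] = None


def parse_audacity_labels(body: str) -> List[Label]:
    """Parse label-file text, assuming tab-separated fields."""
    labels: List[Label] = []
    for line in body.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if fields[0] == "\\":
            # Row N+1: frequency bounds that belong to the previous label.
            if labels and len(fields) >= 3:
                labels[-1].low_hz = float(fields[1])
                labels[-1].high_hz = float(fields[2])
            continue
        # Row N: start time, end time, and (possibly empty) label text.
        text = fields[2] if len(fields) > 2 else ""
        labels.append(Label(float(fields[0]), float(fields[1]), text))
    return labels


if __name__ == "__main__":
    # Made-up sample rows for illustration only, not taken from the real files.
    sample = "1.5\t2.25\tcall\n\\\t1000.0\t8000.0\n10.0\t11.0\t?\n"
    for label in parse_audacity_labels(sample):
        print(label)
```

To use it on a downloaded file, read the file's contents (for example with `pathlib.Path(...).read_text()`) and pass the resulting string to `parse_audacity_labels`. Labels marked "?" can then be filtered out or kept, depending on how strictly a test set should define a call.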