KSO - Data management

The Koster Seafloor Observatory is an open-source, citizen science and machine learning approach to analyse subsea movies.

KSO Information architecture

The system processes underwater footage and its associated metadata into biologically meaningful information. The underwater media use standard formats (.mp4 or .png), and the associated metadata should be captured in three csv files (“movies”, “sites” and “species”) following the Darwin Core (DwC) standards.
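As a rough illustration of the three-table layout described above, the sketch below looks for one csv file per table in a folder and reports its columns and row count. It is not part of the KSO code base: the function name `summarise_metadata`, the file-name pattern (any csv whose name contains the table name) and the use of the "db_starter" folder as the default location are assumptions made for this example.

```python
import csv
from pathlib import Path

# The three metadata tables named in this README.
EXPECTED_TABLES = ("movies", "sites", "species")


def summarise_metadata(folder):
    """Return {table: (column_names, row_count)}, or None for a missing csv.

    Assumption: each csv file contains its table name somewhere in its
    file name (e.g. "movies_example.csv"). Adjust the pattern as needed.
    """
    root = Path(folder)
    summary = {}
    for table in EXPECTED_TABLES:
        matches = sorted(root.glob(f"*{table}*.csv")) if root.is_dir() else []
        if not matches:
            summary[table] = None  # no csv file found for this table
            continue
        with matches[0].open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            rows = list(reader)
            summary[table] = (list(reader.fieldnames or []), len(rows))
    return summary


if __name__ == "__main__":
    for table, info in summarise_metadata("db_starter").items():
        if info is None:
            print(f"{table}: no csv file found")
        else:
            columns, n_rows = info
            print(f"{table}: {n_rows} rows, columns = {columns}")
```

The first tutorial performs the project's own checks on these files; this snippet only confirms that the files exist and can be parsed.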

Module Overview

This data management module contains scripts and resources to move and process underwater footage and its associated data (e.g. location, date, sampling device).

The system is built around a series of easy-to-use Jupyter notebook tutorials. Each tutorial allows users to perform a specific task of the system (e.g. upload footage to the citizen science platform or analyse the classified data). The notebooks rely on the koster utility functions.

Tutorials

Name	Description	Try it!
1. Check footage and metadata	Check format and contents of footage and sites, media and species csv files
2. Upload new media to the system*	Upload new underwater media to the cloud/server and update the csv files
3. Upload clips to Zooniverse	Prepare original footage and upload short clips to Zooniverse
4. Upload frames to Zooniverse	Extract frames of interest from original footage and upload them to Zooniverse
5. Train ML models	Prepare the training and test data, set model parameters and train models
6. Evaluate ML models	Use ecologically-relevant metrics to test the models
7. Publish ML models	Publish the model to a public repository
8. Analyse Zooniverse classifications	Pull up-to-date classifications from Zooniverse and report summary stats/graphs
9. Download and format Zooniverse classifications	Pull up-to-date classifications from Zooniverse and format them for further analysis	Coming soon
10. Run ML models on footage	Automatically classify new footage	Coming soon

* Project-specific tutorial

Dev Installation

If you want to use the full system (Binder has computing limitations), you will need to download this repository to your local computer or server.

Requirements

Option 1: Local / Cloud Installation

Download this repository

Clone this repository using:

git clone --recurse-submodules https://github.com/ocean-data-factory-sweden/kso-data-management.git

Install dependencies

Navigate to the folder where you have cloned the repository or unzipped the manually downloaded repository.

cd kso-data-management

Then install the requirements by running:

pip install -r requirements.txt
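After the installation finishes, one way to confirm that the listed packages are present is to compare requirements.txt against the installed distributions. The sketch below is an illustration only: the helper names are hypothetical, the name parsing is deliberately simple (it skips pip options and URL-based requirements), and it does not check version constraints.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(lines):
    """Extract bare package names from requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-") or "://" in line:
            continue  # skip blanks, pip options (-r, -e) and URL requirements
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.exists():
        lines = req_file.read_text(encoding="utf-8").splitlines()
        print("Missing packages:", missing_packages(requirement_names(lines)) or "none")
    else:
        print("requirements.txt not found; run this from the repository folder")
```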

Create initial information for the database

If you are starting a new project, you will need to provide information about the underwater footage files, sites and species of interest. You can use a template of the csv files and move the directory to the "db_starter" folder.

Link your footage to the database

You will need files of underwater footage to run this system. You can download some samples and move them to db_starter. You can also store your own files and specify their directory in the tutorials.
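To see which footage files are available in a folder before opening the tutorials, a small listing script can help. The sketch below is an assumption-laden example rather than KSO functionality: `find_footage` is a hypothetical helper, the extensions are the .mp4 and .png formats mentioned in this README, and "db_starter" is used as the default folder.

```python
from pathlib import Path

# Media formats mentioned in this README.
FOOTAGE_EXTENSIONS = {".mp4", ".png"}


def find_footage(folder):
    """Return footage files under folder (searched recursively), sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        return []  # nothing to list if the folder does not exist
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in FOOTAGE_EXTENSIONS
    )


if __name__ == "__main__":
    files = find_footage("db_starter")
    print(f"Found {len(files)} footage file(s)")
    for path in files:
        print(f"  {path} ({path.stat().st_size / 1e6:.1f} MB)")
```

If your footage lives elsewhere, pass that directory to `find_footage` instead.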

Option 2: SNIC Users (VPN required)

Before using Option 2, users should have login credentials and have set up the Chalmers VPN on their local computers.

Information for Windows users: Click here
Information for Mac users: Click here

To use the Jupyter Notebooks within the Alvis HPC cluster, please visit Alvis Portal and login using your SNIC credentials.

Once you have been authorized, click on "Interactive Apps" and then "Jupyter". This opens the server creation options.

Here you can keep the settings as default, apart from the "Number of hours" which you can set to the desired limit. Then choose either Data Management (Runtime (User specified jupyter1.sh)) or Machine Learning (Runtime (User specified jupyter2.sh)) from the Runtime dropdown options.

This will queue a server session using the correct container image. A blue window is shown first. Once the session has started successfully, the window turns green and the button "Connect to Jupyter" appears on the screen. Click this button to launch the Jupyter notebook environment.

Important note: The remaining time for the server is also shown in the green window. If you finish using the notebook server before the allotted time runs out, please select "Delete" so that the resources can be released for use by others within the project.

Citation

If you use this code or its models in your research, please cite:

Anton V, Germishuys J, Bergström P, Lindegarth M, Obst M (2021) An open-source, citizen science and machine learning approach to analyse subsea movies. Biodiversity Data Journal 9: e60548. https://doi.org/10.3897/BDJ.9.e60548

Collaborations/questions

You can find out more about the project at https://www.zooniverse.org/projects/victorav/the-koster-seafloor-observatory.

We are always excited to collaborate and help other marine scientists. Please feel free to contact us with your questions.

Dev instructions

Install conda
Create a new environment (e.g. "new_environment")
Install git and pip (with conda)
Clone the kso repo
pip install ipykernel
python -m ipykernel install --user --name="new_environment"
From the Jupyter notebook, select Kernel > Change kernel and choose the new environment

Before pushing your code to the repo, please run black on the code you have edited. You can install the black package using pip:

pip install black

Troubleshooting

If you experience issues uploading movies to Zooniverse, they might be related to the libmagic package. On Windows, the following commands seem to work:

pip install python-libmagic
pip install python-magic-bin
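To check whether the fix took effect, you can test the import from Python. This is a minimal sketch that assumes the packages above expose a module named `magic`; `magic_status` is a hypothetical helper, not part of the KSO utilities.

```python
import importlib


def magic_status():
    """Report whether a module named `magic` can be imported."""
    try:
        importlib.import_module("magic")
    except (ImportError, OSError) as error:
        # Typically raised when the Python package is absent or when
        # the underlying libmagic library cannot be located.
        return f"magic is not usable: {error}"
    return "magic imported successfully"


if __name__ == "__main__":
    print(magic_status())
```

A successful import does not guarantee that uploads will work, but a failure here points to the installation rather than to Zooniverse.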
