Dorado plugin (#344)
* Introducing a dorado-specific plugin which handles the change from `ont-pyguppy-client-lib` to `ont-pybasecall-client-lib`

* minor code reformat

* Bump version

* Update Changelog

* Add `ont-pybasecall-client-lib` as a dependency under target `dorado`

* Add try/except on import of helper functions and pyclient for basecaller
This is so that pytest collection (the only time both are imported into the same interpreter runtime) doesn't crash.
This requires all tests to only test one, so I've moved all tests to test dorado

* Move tests to test dorado rather than guppy plugin when validating
This is because pytest imports all files, which is the only time that pyguppy-client-lib and pybasecall-client-lib are imported at the same time.
In a real experiment, the plugin choice for base-caller imports the correct module and leaves the other one out of the interpreter runtime.

* Remove dorado, guppy and _read_until_client from coverage reporting as we can't really cover them
They require things like a read_until_api or live base caller to properly test

* Introducing sample rate from minknow and catching sub_read issue.

* Updating documentation for dorado

* Feature/check minknow (#351)

* Initial validation of minknow version and guppy dorado versions

* MinKNOW compatibility check function
Base-caller compatibility check

* Edit insanely long multiline string example of sub tags

* Local updates

* Error checking and minor corrections to address compatibility. We no longer raise RuntimeException when a version issue is encountered. This means that readfish will continue to function with future versions of MinKNOW if compatible.

* Add default when popping `sample_rate` from guppy params
Prevents `KeyError` from being raised if `sample_rate` isn't listed as a `kwarg`

* Remove unused basecaller compatibility function
We now just check in `dorado.py` and warn if there is a mismatch

* Correct docstring, add doctests

* Add more complete docstring to `DIRECTION` enum
Only return Enum variant from `check_compatibility`, not tuple of (bool, variant)

* Suggest opening an issue if a suitable version of readfish doesn't exist

---------

Co-authored-by: Adoni5 <roryjmunro1@gmail.com>

* Add `from __future__ import annotations` for Py38 compatibility

* Deprecation warning on Guppy plugin

* Add check for Write permission on Dorado socket

---------

Co-authored-by: Adoni5 <roryjmunro1@gmail.com>
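
The import guard described in the commit message can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from this commit: `optional_import` and the two `hypothetical_*` module names are made up for the example and stand in for the real Guppy and Dorado client libraries.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Swallow the failure so that importing this file (for example
        # during pytest collection) does not crash when the client
        # library for the other basecaller is absent.
        return None


# Hypothetical module names, used only for illustration.
guppy_client = optional_import("hypothetical_guppy_client")
dorado_client = optional_import("hypothetical_dorado_client")

# A module that is always present imports as normal.
json_module = optional_import("json")
```

A plugin written this way would still need to fail clearly at the point of use if its client library resolved to `None`.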
mattloose and Adoni5 authored May 9, 2024
1 parent 397d215 commit 8e5210e
Showing 29 changed files with 557 additions and 43 deletions.
26 changes: 19 additions & 7 deletions README.md
Expand Up @@ -25,6 +25,8 @@ the read in progress and so direct sequencing capacity towards reads of interest

**This implementation of readfish requires Guppy version >= 6.0.0 and MinKNOW version core >= 5.0.0 . It will not work on earlier versions.**

**Since MinKNOW version core >=5.9.0 and Dorado server version >=7.3.9, Dorado requires an alternate library, `ont-pybasecall-client-lib`. We have introduced a new `dorado` module to handle this.**


The code here has been tested with Guppy in GPU mode using GridION Mk1 and
NVIDIA RTX2080 on live sequencing runs and an NVIDIA GTX1080 using playback
Expand Down Expand Up @@ -121,9 +123,9 @@ conda env create -f development.yml
conda activate readfish_dev
```

| <h2>‼️ Important! </h2> |
|:---------------------------|
| The listed `ont-pyguppy-client-lib` version will probably not match the version installed on your system. To fix this, Please see this [issue](https://github.com/LooseLab/readfish/issues/221#issuecomment-1381529409) |
| <h2>‼️ Important !! </h2> |
|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| MinKNOW is transitioning from Guppy to Dorado. Until MinKNOW version 5.9, both Guppy and Dorado used `ont-pyguppy-client-lib`.<br/>As of MinKNOW version 5.9 and Dorado server version 7.3.9 and greater, Dorado requires an alternate library, `ont-pybasecall-client-lib`.<br/>The listed `ont-pyguppy-client-lib` or `ont-pybasecall-client-lib` version may not match the version installed on your system. To fix this, please see this [issue](https://github.com/LooseLab/readfish/issues/221#issuecomment-1381529409), using the appropriate library. |


[ONT's Guppy GPU](https://community.nanoporetech.com/downloads) should be installed and running as a server.
Expand Down Expand Up @@ -333,8 +335,8 @@ Note: The plots here are generated from running readfish unblock-all on an Apple
<details style="margin-top: 10px">
<summary id="testing-basecalling-and-mapping"><h3 style="display: inline;">Testing base-calling and mapping</h3></summary>

To test selective sequencing you must have access to a
[guppy basecall server](https://community.nanoporetech.com/downloads/guppy/release_notes) (>=6.0.0)
To test selective sequencing you must have access to either a
[guppy basecall server](https://community.nanoporetech.com/downloads/guppy/release_notes) (>=6.0.0) or a [dorado basecall server](https://community.nanoporetech.com/downloads/dorado/release_notes).

and a readfish TOML configuration file.

Expand All @@ -348,6 +350,10 @@ NOTE: guppy and dorado are used here interchangeably as the basecall server. Dor
```toml
[mapper_settings.mappy-rs]
```
1. If on MinKNOW core>=5.9.0 and Dorado server version >=7.3.9, edit the `basecaller` section to read:
```toml
[caller_settings.dorado]
```
1. Modify the `fn_idx_in` field in the file to be the full path to a [minimap2](https://github.com/lh3/minimap2) index of the human genome.

1. Modify the `targets` fields for each condition to reflect the naming convention used in your index. This is the sequence name only, up to but not including any whitespace.
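
Since the example TOMLs state that all other parameters are shared between the two basecallers, switching a configuration to the Dorado plugin amounts to renaming the caller table. A minimal Python sketch of that rename follows; `use_dorado_plugin` and the `GUPPY_TOML` string are hypothetical names for illustration and are not part of readfish.

```python
import re

# Example caller section, copied from the shape used in the example TOMLs.
GUPPY_TOML = '''\
[caller_settings.guppy]
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
address = "ipc:///tmp/.guppy/5555"
'''


def use_dorado_plugin(toml_text: str) -> str:
    """Rename the caller table header from guppy to dorado, leaving keys untouched."""
    return re.sub(
        r"^\[caller_settings\.guppy\]",
        "[caller_settings.dorado]",
        toml_text,
        flags=re.MULTILINE,
    )


DORADO_TOML = use_dorado_plugin(GUPPY_TOML)
print(DORADO_TOML)
```

Editing the header by hand achieves the same result; the sketch only makes explicit that the `config` and `address` keys are left unchanged.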
Expand Down Expand Up @@ -553,13 +559,19 @@ And for our Awesome Logo please checkout out [@tim_bassford](https://twitter.com

<!-- start-changelog -->
# Changelog
## 2024.2.0
1. Add a dorado base-caller which addressed issue [#347](https://github.com/LooseLab/readfish/issues/347) - chiefly in Dorado 7.3.9 ONT have moved to `ont-pybasecall-client-lib`,
and connections from `ont_pyguppy_client_lib` raise `Connection error. ... LOAD_CONFIG. Reply: INVALID_PROTOCOL` [(#344)](https://github.com/LooseLab/readfish/pull/344)
1. Adds version checking for MinKNOW and Guppy/Dorado, logs if not compatible [(#351)](https://github.com/LooseLab/readfish/pull/351)

## 2024.1.0
1. bug fix type for `--wait-on-ready` type and actual function [(#327)](https://github.com/LooseLab/readfish/pull/327), [(#323)](https://github.com/LooseLab/readfish/pull/323)
1. multiple suffix `.mmi` support [(#330)](https://github.com/LooseLab/readfish/pull/330)
1. Change the default `unblock_duration` on the `Analysis` class to use `DEFAULT_UNBLOCK` value defined in `_cli_args.py`. Change type on the Argparser for `--unblock-duration` to float. [(#313)](https://github.com/LooseLab/readfish/pull/313)
1. Big dog Duplex feature - adds ability to select duplex reads that cover a target region. See pull request for details [(#324)](https://github.com/LooseLab/readfish/pull/324)

## 2023.1.1
1. Fix Readme Logo link 🥳 (#296)
1. Fix bug where we had accidentally started requiring barcoded TOMLs to specify a region. Thanks to @jamesemery for catching this. (#299)
1. Correctly handle overriding a decision in internal statistics tracking. (#299)
2. Fix bug where we had accidentally started requiring barcoded TOMLs to specify a region. Thanks to @jamesemery for catching this. (#299)
3. Correctly handle overriding a decision in internal statistics tracking. (#299)
<!-- end-changelog -->
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/barcoded_human.toml
Expand Up @@ -22,9 +22,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_bed_file_selection.toml
Expand Up @@ -14,9 +14,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_chr_depletion.toml
Expand Up @@ -23,9 +23,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_chr_selection.toml
Expand Up @@ -18,9 +18,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Up @@ -19,9 +19,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Up @@ -21,9 +21,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
10 changes: 7 additions & 3 deletions docs/_static/example_tomls/human_csv_file_selection.toml
Expand Up @@ -17,9 +17,13 @@
# - https://looselab.github.io/readfish/toml.html.

[caller_settings.guppy]
# Caller Configuration for Guppy Basecaller
config = "dna_r9.4.1_450bps_fast" # Specify the basecaller configuration
address = "ipc:///tmp/.guppy/5555" # Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
debug_log = "live_reads.fq" # Fastq output for individual reads (Optional, delete line to disable)

[mapper_settings.mappy]
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_minimap2_extra_params.toml
Expand Up @@ -23,9 +23,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_regions_barcoded.toml
Expand Up @@ -22,9 +22,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
2 changes: 1 addition & 1 deletion docs/developers-guide.md
Expand Up @@ -67,7 +67,7 @@ conda env create -f docs/development.yml

## Readfish versioning
Readfish uses [calver](https://calver.org/) for versioning. Specifically the format should be
`YYYY.MINOR.MICRO.Modifier`, where `MINOR` is the feature addiiton, `MICRO` is any hotfix/bugfix, and `Modifier` is the modifier (e.g. `rc` for release candidate, `dev` for development, empty for stable).
`YYYY.MINOR.MICRO.Modifier`, where `MINOR` is the feature addition, `MICRO` is any hotfix/bugfix, and `Modifier` is the modifier (e.g. `rc` for release candidate, `dev` for development, empty for stable).

## Changelog

10 changes: 9 additions & 1 deletion pyproject.toml
Expand Up @@ -54,7 +54,8 @@ dev = ["readfish[all,docs,tests]", "pre-commit"]
mappy = ["mappy"]
mappy-rs = ["mappy-rs >= 0.0.6"]
guppy = ["ont_pyguppy_client_lib"]
all = ["readfish[mappy,mappy-rs,guppy]"]
dorado = ["ont-pybasecall-client-lib"]
all = ["readfish[mappy,mappy-rs,guppy,dorado]"]

[project.urls]
Documentation = "https://looselab.github.io/readfish"
Expand Down Expand Up @@ -83,3 +84,10 @@ markers = [
"alignment: marks tests which rely on loading or using Mappy or Mappy-rs aligners, used to test with both. (deselect with '-m \"not slow\", select with '-k alignment')",
]
addopts = ["-ra", "--doctest-modules", "--ignore=src/readfish/read_until/base.py"]

[tool.coverage.report]
omit = [
"src/readfish/plugins/dorado.py",
"src/readfish/_read_until_client.py",
"src/readfish/plugins/guppy.py",
]
2 changes: 1 addition & 1 deletion src/readfish/__about__.py
@@ -1,4 +1,4 @@
"""__about__.py
Version of the read until software
"""
__version__ = "2024.1.0"
__version__ = "2024.2.0"
103 changes: 103 additions & 0 deletions src/readfish/_compatibility.py
@@ -0,0 +1,103 @@
"""_compatibility.py
Contains utilities for checking readfish compatibility with various versions of MinKNOW.
Checks ranges of `readfish` against the `MinKNOW` version
Attributes:
    LATEST_TESTED (str): The latest tested version of MinKNOW.
    MINKNOW_COMPATIBILITY_RANGE (tuple): The compatibility range of MinKNOW versions for this version of readfish.
    DIRECTION (Enum): An enumeration representing upgrade, downgrade, or no change directions.
"""

from __future__ import annotations

from enum import Enum

from minknow_api.manager import Manager
from packaging.version import parse as parse_version
from packaging.version import Version

LATEST_TESTED = "5.9.7"

# The versions of MinKNOW which this version of readfish can connect to
# Format - (lowest minknow version, highest version of minknow supported as an upper bound)
MINKNOW_COMPATIBILITY_RANGE = (
    Version("5.0.0"),
    Version(LATEST_TESTED),
)


class DIRECTION(Enum):
    """
    Represents the direction in which the version of the readfish software should be changed
    to be compatible with the tested version of an external tool (likely MinKNOW).
    Attributes:
        UPGRADE: Indicates that the readfish software version should be upgraded.
        DOWNGRADE: Indicates that the readfish software version should be downgraded.
        JUST_RIGHT: Indicates that the readfish software version is already compatible
            with the tested version of the external tool.
    """

    UPGRADE = "upgrade"
    DOWNGRADE = "downgrade"
    JUST_RIGHT = "do nothing"


def _get_minknow_version(host: str = "127.0.0.1", port: int | None = None) -> Version:
    """
    Get the version of MinKNOW
    :param host: The host the RPC is listening on, defaults to "127.0.0.1"
    :param port: The port the RPC is listening on, defaults to None
    :return: The version of MinKNOW readfish is connected to
    """
    manager = Manager(host=host, port=port)
    minknow_version = parse_version(manager.core_version)
    return minknow_version


def check_compatibility(
    comparator: Version,
    version_range: tuple[Version, Version] = MINKNOW_COMPATIBILITY_RANGE,
) -> DIRECTION:
    """
    Check the compatibility of a given software version, between a given range,
    inclusive of the right edge.
    :param comparator: Version of the provided software, for example MinKNOW 5.9.7
    :param version_range: A tuple of lowest supported version, highest supported version
    :return: A direction variant indicating if this version of readfish needs to be changed.
    Examples:
        >>> from packaging.version import Version
        >>> check_compatibility(Version("5.9.5"), (Version("5.0.0"), Version("5.9.7")))
        <DIRECTION.JUST_RIGHT: 'do nothing'>
        >>> check_compatibility(Version("5.9.7"), (Version("5.0.0"), Version("5.9.7")))
        <DIRECTION.JUST_RIGHT: 'do nothing'>
        >>> check_compatibility(Version("5.9.8"), (Version("5.0.0"), Version("5.9.7")))
        <DIRECTION.UPGRADE: 'upgrade'>
        >>> check_compatibility(Version("4.9.0"), (Version("5.0.0"), Version("5.9.7")))
        <DIRECTION.DOWNGRADE: 'downgrade'>
        >>> if (action := check_compatibility(Version("6.0.0"), MINKNOW_COMPATIBILITY_RANGE)) in (
        ...     DIRECTION.UPGRADE,
        ...     DIRECTION.DOWNGRADE,
        ... ):
        ...     action
        <DIRECTION.UPGRADE: 'upgrade'>
    """
    (
        lowest_supported_version,
        highest_supported_version,
    ) = version_range
    if comparator < lowest_supported_version:
        return DIRECTION.DOWNGRADE
    return (
        DIRECTION.JUST_RIGHT
        if comparator <= highest_supported_version
        else DIRECTION.UPGRADE
    )
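
The commit message notes that a version mismatch is now reported as a warning rather than raised as an exception. How a caller might act on the returned variant can be sketched as below. This is an illustration under stated assumptions, not the code in `dorado.py`: `Direction`, `parse`, `check` and `report` are hypothetical names, and the tuple-based `parse` is a simplified stdlib stand-in for `packaging.version.Version` that only handles plain dotted integers.

```python
from __future__ import annotations

import logging
from enum import Enum

logger = logging.getLogger("compat_sketch")


class Direction(Enum):
    UPGRADE = "upgrade"
    DOWNGRADE = "downgrade"
    JUST_RIGHT = "do nothing"


def parse(version: str) -> tuple[int, ...]:
    """Simplified stand-in for packaging.version.Version: '5.9.7' -> (5, 9, 7)."""
    return tuple(int(part) for part in version.split("."))


def check(version: str, lowest: str, highest: str) -> Direction:
    """Same comparison as check_compatibility, inclusive of both bounds."""
    if parse(version) < parse(lowest):
        return Direction.DOWNGRADE
    if parse(version) <= parse(highest):
        return Direction.JUST_RIGHT
    return Direction.UPGRADE


def report(
    minknow_version: str, lowest: str = "5.0.0", highest: str = "5.9.7"
) -> Direction:
    """Log a warning on a mismatch and carry on; never raise."""
    action = check(minknow_version, lowest, highest)
    if action is not Direction.JUST_RIGHT:
        logger.warning(
            "MinKNOW %s is outside the tested range %s to %s; consider a readfish %s.",
            minknow_version,
            lowest,
            highest,
            action.value,
        )
    return action
```

Returning the variant rather than raising leaves the decision to continue with the caller, which matches the stated aim of keeping readfish usable with newer MinKNOW releases that happen to be compatible.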
1 change: 1 addition & 0 deletions src/readfish/_config.py
Expand Up @@ -171,6 +171,7 @@ def load_module(self, override=False):
        """
        builtins = {
            "guppy": "guppy",
            "dorado": "dorado",
            "mappy": "mappy",
            "mappy_rs": "mappy_rs",
            "no_op": "_no_op",
1 change: 1 addition & 0 deletions src/readfish/_utils.py
Expand Up @@ -18,6 +18,7 @@
from minknow_api.manager import Manager, FlowCellPosition
from minknow_api import Connection


if sys.version_info < (3, 11):
    from exceptiongroup import BaseExceptionGroup

1 change: 1 addition & 0 deletions src/readfish/entry_points/stats.py
Expand Up @@ -73,6 +73,7 @@
readfish stats --toml tests/static/stats_test/yeast_summary_test.toml --fastq-directory tests/static/stats_test/ --html summary_adaptive
"""

from __future__ import annotations
import argparse
from pathlib import Path