Catch some "placeholder" channel positions being 0 on flongle #384

Merged
merged 5 commits into from
Aug 27, 2024
35 changes: 35 additions & 0 deletions docs/questions/split-flongle.md
@@ -0,0 +1,35 @@
---
title: "Splitting a flongle into regions"
alt_titles:
- "Flow cell splitting"
- "Flongle splitting"
- "ValueError: channel cannot be below 0 or above flowcell_size"
---

The flow cell can be split both vertically and horizontally, with the `split_axis` parameter in the TOML file deciding this.
The axis can be either 0 or 1, with 0 splitting horizontally and 1 splitting vertically.
The default value is 1, so this only needs setting if you wish to split horizontally.

```text
Vertical (axis=1) Horizontal (axis=0)
+------------+ +------+------+ | +------------+ +------------+
| 1 2 3 4| | 1 2| 3 4| | | 1 2 3 4| | 1 2 3 4|
| 5 6 7 8| --> | 5 6| 7 8| | | 5 6 7 8| --> | 5 6 7 8|
| 9 10 11 12| | 9 10| 11 12| | | 9 10 11 12| +------------+
| 13 14 15 16| | 13 14| 15 16| | | 13 14 15 16| | 9 10 11 12|
+------------+ +------+------+ | +------------+ | 13 14 15 16|
+------------+
```

This is set at the very top of the TOML file, like the `channels` parameter:
```toml
split_axis=0

[caller_settings.dorado]
config = "dna_r10.4.1_e8.2_400bps_hac"
address = "ipc:///tmp/.guppy/5555"
debug_log = "basecalled_chunks.fq" #optional
......... # and so on
```
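
As a rough illustration of what the two axis values mean, the sketch below splits the 4x4 example grid from the diagram using plain Python lists. `split_grid` is a hypothetical helper written for this example; it is not readfish's `generate_flowcell`, and it assumes a simple rectangular grid.

```python
def split_grid(grid, n, axis=1):
    """Split a rectangular grid (a list of rows) into n equal regions.

    axis=0 cuts between rows (top/bottom regions, "horizontal" split);
    axis=1 cuts between columns (left/right regions, "vertical" split).
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1")
    rows, cols = len(grid), len(grid[0])
    size = rows if axis == 0 else cols
    if size % n:
        raise ValueError("The grid cannot be split evenly")
    step = size // n
    if axis == 0:
        return [grid[i * step:(i + 1) * step] for i in range(n)]
    return [[row[i * step:(i + 1) * step] for row in grid] for i in range(n)]


# The 4x4 layout from the diagram, numbered 1 to 16 row by row.
grid = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]
left, right = split_grid(grid, 2, axis=1)
top, bottom = split_grid(grid, 2, axis=0)
print(left)    # [[1, 2], [5, 6], [9, 10], [13, 14]]
print(bottom)  # [[9, 10, 11, 12], [13, 14, 15, 16]]
```

With `axis=1` each region keeps every row but only some columns; with `axis=0` each region keeps whole rows.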

The flongle is a strange shape: 13 columns by 10 rows. Because 13 is prime, it cannot be split vertically (except into 13 regions), so flongles must be split horizontally into 2, 5 or 10 regions.
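
The arithmetic behind those numbers can be checked with a short sketch. `valid_splits` is a hypothetical helper, not part of readfish; it assumes a region count is usable only if it divides the number of rows (`axis=0`) or columns (`axis=1`) evenly.

```python
def valid_splits(rows, cols, axis):
    """Return the region counts (2 or more) that divide the chosen axis evenly."""
    size = rows if axis == 0 else cols
    return [n for n in range(2, size + 1) if size % n == 0]


# Flongle layout as described above: 10 rows by 13 columns.
print(valid_splits(10, 13, axis=0))  # [2, 5, 10]
print(valid_splits(10, 13, axis=1))  # [13]
```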
28 changes: 28 additions & 0 deletions docs/toml.md
@@ -193,6 +193,8 @@ The number of regions subtables determines how many times the flow cell is divid
The maximum number of regions for MinION flow cells is 32 and for PromethION flow cells is 120.
The number of conditions must be a factor of the number for the selected combination.

The flongle is a strange shape: 13 columns by 10 rows. Because 13 is prime, it cannot be split vertically (except into 13 regions), so flongles must be split horizontally into 2, 5 or 10 regions.

As an example applying two analysis regions to the layout below would split the flow cell into left/right regions.
```text
+------------+ +------+------+
@@ -202,6 +204,32 @@ As an example applying two analysis regions to the layout below would split the
+------------+ +------+------+
```

The flow cell can be split both vertically and horizontally, with the `split_axis` parameter in the TOML file deciding this.
The axis can be either 0 or 1, with 0 splitting horizontally and 1 splitting vertically.
The default value is 1, so this only needs setting if you wish to split horizontally.

```text
Vertical (axis=1) Horizontal (axis=0)
+------------+ +------+------+ | +------------+ +------------+
| 1 2 3 4| | 1 2| 3 4| | | 1 2 3 4| | 1 2 3 4|
| 5 6 7 8| --> | 5 6| 7 8| | | 5 6 7 8| --> | 5 6 7 8|
| 9 10 11 12| | 9 10| 11 12| | | 9 10 11 12| +------------+
| 13 14 15 16| | 13 14| 15 16| | | 13 14 15 16| | 9 10 11 12|
+------------+ +------+------+ | +------------+ | 13 14 15 16|
+------------+
```

This is set at the very top of the TOML file, like the `channels` parameter:
```toml
split_axis=0

[caller_settings.dorado]
config = "dna_r10.4.1_e8.2_400bps_hac"
address = "ipc:///tmp/.guppy/5555"
debug_log = "basecalled_chunks.fq" #optional
......... # and so on
```

### Experiments with one region

When wanting to apply a single targeting strategy over the flow cell a single region can be provided.
8 changes: 6 additions & 2 deletions src/readfish/_config.py
@@ -236,6 +236,7 @@ class Conf:
:param channels: The number of channels on the flow cell
:param caller_settings: The caller settings as listed in the TOML
:param mapper_settings: The mapper settings as listed in the TOML
:param split_axis: The axis on which to split a flowcell if there are multiple regions. 0 is horizontal, 1 is vertical.
:param regions: The regions as listed in the Toml file.
:param barcodes: A Dictionary of barcode names to Barcode Classes
:param _channel_map: A map of channels number (1 to flowcell size) to the index of the Region (in self.regions) they are part of.
@@ -244,6 +245,7 @@ class Conf:
channels: int
caller_settings: CallerSettings
mapper_settings: MapperSettings
split_axis: int = 1
regions: List[Region] = attrs.field(default=attrs.Factory(list))
barcodes: Dict[str, Barcode] = attrs.field(default=attrs.Factory(dict))
_channel_map: Dict[int, int] = attrs.field(
@@ -281,7 +283,9 @@ def __attrs_post_init__(self):
" with n_threads set to at least 4."
)

split_channels = generate_flowcell(self.channels, len(self.regions) or 1)
split_channels = generate_flowcell(
self.channels, len(self.regions) or 1, axis=self.split_axis
)
self._channel_map = {
channel: pos
for pos, (channels, region) in enumerate(zip(split_channels, self.regions))
@@ -465,7 +469,7 @@ def describe_experiment(self) -> str:
description.append(
f"""Region {region.name} (control={region.control}).
Region applies to section of flow cell (# = applied, . = not applied):
{draw_flowcell_split(self.channels, split, index=index)}"""
{draw_flowcell_split(self.channels, split, index=index, axis=self.split_axis)}"""
)
return "\n".join(description)
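
To illustrate what the channel map built in `__attrs_post_init__` ends up holding, here is a minimal stand-alone sketch. The region names and the hard-coded `split_channels` are hypothetical stand-ins, and the trailing `for channel in channels` clause is an assumption, since the hunk above is cut off before the comprehension ends.

```python
# Hypothetical stand-ins: a 4x4 flow cell split vertically into two regions.
split_channels = [
    [1, 2, 5, 6, 9, 10, 13, 14],   # left half
    [3, 4, 7, 8, 11, 12, 15, 16],  # right half
]
regions = ["experimental", "control"]

# Map every channel number to the index of the region that owns it.
channel_map = {
    channel: pos
    for pos, (channels, region) in enumerate(zip(split_channels, regions))
    for channel in channels  # assumed final clause, not shown in the diff
}

print(channel_map[6])   # 0
print(channel_map[11])  # 1
```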

51 changes: 46 additions & 5 deletions src/readfish/_utils.py
@@ -282,7 +282,11 @@ def stringify_grid(grid: list[list[str]]) -> str:


def draw_flowcell_split(
flowcell_size: int, split: int = 1, axis: int = 1, index: int = 0
flowcell_size: int,
split: int = 1,
axis: int = 1,
index: int = 0,
prefix: str = "\t",
) -> str:
"""
Draw unicode representation of the flowcell. If the flowcell is split more than once, and index is passed, the region of the
@@ -306,24 +310,53 @@ def draw_flowcell_split(
00XX
00XX

>>> print(draw_flowcell_split(126, 13, index=1, axis=1, prefix=""))
<BLANKLINE>
.#...........
.#...........
.#...........
.#...........
.#...........
<BLANKLINE>

>>> print(draw_flowcell_split(126, 5, index=1, axis=0, prefix=""))
<BLANKLINE>
.............
.............
.............
#############
.............
<BLANKLINE>

:param flowcell_size: Number of channels on the flow cell
:param split: The number of regions to split into, defaults to 1
:param index: The index of the region to highlight, defaults to 0
:param prefix: Any leading string character to put on the row. Defaults to \t
:return: String representation of the flowcell in ASCII art
"""
depth, width = get_flowcell_array(flowcell_size).shape
depth = round((depth / 2) + 0.5)
height, width = get_flowcell_array(flowcell_size).shape
height = round((height / 2) + 0.5)
# Flongle is a truly dumb shape - 10 rows of 13 columns, breaks maths above so just special case it
if flowcell_size == 126:
height -= 1
cells = []
for _h in range(depth):
for _h in range(height):
row = [
" ",
prefix,
]
for _w in range(width):
row.append(".")
cells.append(row)
cells = np.array(cells)
region = generate_flowcell(flowcell_size, split, axis)[index]
col, row = None, None
for pos in region:
# Flongle has four channels which are "0", but are just placeholders
# in the array where there is no channel due to its weird shape.
# Therefore we insert a whitespace for this character
if pos == 0 and flowcell_size == 126:
cells[(col // 2), row + 2] = " "
continue
row, col = get_coords(pos, flowcell_size)
cells[(col // 2), row + 1] = "#"
return f"\n{stringify_grid(cells)}\n"
@@ -363,6 +396,14 @@ def generate_flowcell(
Traceback (most recent call last):
...
ValueError: The flowcell cannot be split evenly

>>> for x in generate_flowcell(126, 5, axis=0):
... print(len(x))
26
26
26
26
26
"""
if odd_even:
return [
@@ -5,23 +5,23 @@ Barcode unclassified_reads (control=False), Barcode classified_reads (control=Fa
Region Experimental (control=False).
Region applies to section of flow cell (# = applied, . = not applied):

################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................

Region Control (control=True).
Region applies to section of flow cell (# = applied, . = not applied):

................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
32 changes: 16 additions & 16 deletions tests/static/describe_test/describe_experiment_regions_expected.txt
@@ -2,23 +2,23 @@ Configuration description:
Region select two contigs (control=False).
Region applies to section of flow cell (# = applied, . = not applied):

################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................
################................

Region control (control=True).
Region applies to section of flow cell (# = applied, . = not applied):

................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################
................################