Replies: 1 comment
This is expected. CheckM measures completeness and contamination using single-copy marker genes (from the inferred lineage). In the vast majority of cases these genes reside on long contigs that are already binned quite well (note that BinSPreader does not merge bins, though you can build a bin-distance matrix and use it to guide external bin merging). Contigs in the neighbourhood of conserved regions are inherently "multi-colored", and with single-bin assignment their assignment can flip (especially if you're running in correction mode rather than propagation mode). Bins enriched with short contigs might also accidentally pick up matches to marker genes from another family, which increases contamination. Long story short: there is no bullet-proof way to estimate completeness / contamination without a reference, and every tool / method has its own set of limitations.
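To make the marker-gene argument concrete, here is a toy sketch in Python. It is not CheckM's actual algorithm (CheckM works with collocated marker sets on an inferred lineage); it only assumes that completeness is the share of expected single-copy markers seen at least once and contamination is the share of extra copies. The marker IDs and the `marker_stats` helper are made up for illustration.

```python
from collections import Counter


def marker_stats(marker_hits, marker_set):
    """Return (completeness %, contamination %) for one bin.

    marker_hits: marker gene IDs found on the bin's contigs (with repeats)
    marker_set:  single-copy markers expected for the lineage (hypothetical)
    """
    # Count only hits that belong to the expected marker set.
    counts = Counter(m for m in marker_hits if m in marker_set)
    # A marker contributes to completeness if it is seen at least once.
    found = sum(1 for m in marker_set if counts[m] >= 1)
    # Every copy beyond the first is counted as contamination.
    extra = sum(counts[m] - 1 for m in marker_set if counts[m] > 1)
    total = len(marker_set)
    return 100.0 * found / total, 100.0 * extra / total


# Hypothetical lineage with five single-copy markers.
markers = {"m1", "m2", "m3", "m4", "m5"}

# Bin before refinement: four markers, each seen once.
before = ["m1", "m2", "m3", "m4"]

# After refinement two short contigs were added: one carries the missing
# marker m5, the other a second copy of m2 (e.g. from a related genome).
after = before + ["m5", "m2"]

print(marker_stats(before, markers))  # (80.0, 0.0)
print(marker_stats(after, markers))   # (100.0, 20.0)
```

In this toy case the same refinement step raises completeness and contamination together, which is the pattern described in the report.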
Description of bug
I ran BinSPreader after metaSPAdes assembly and binning with VAMB, and assessed bin quality with CheckM2.
My metagenome is a relatively simple infant gut sample.
In my comparison, BinSPreader increased completeness for some bins, but it increased contamination for most of the bins.
I played around with adding paired-end reads or not,
toggling sparse propagation ...
The results did not change much.
There might be the option to tune the la parameter, but I have no idea by how much.
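For anyone who wants to put numbers on such a before/after comparison, a minimal Python sketch follows. It assumes two CheckM2-style tab-separated reports with `Name`, `Completeness` and `Contamination` columns and bin names that stay the same after refinement. The `load_report` and `compare` helpers and the two inline tables are invented for illustration; they are not the results from this report.

```python
import csv
import io


def load_report(text):
    """Parse a tab-separated quality report into {bin: (completeness, contamination)}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        row["Name"]: (float(row["Completeness"]), float(row["Contamination"]))
        for row in reader
    }


def compare(before, after):
    """Return per-bin (delta completeness, delta contamination) for shared bins."""
    deltas = {}
    for name in sorted(before.keys() & after.keys()):
        d_comp = round(after[name][0] - before[name][0], 2)
        d_cont = round(after[name][1] - before[name][1], 2)
        deltas[name] = (d_comp, d_cont)
    return deltas


# Made-up example tables; in practice read the two report files instead.
BEFORE = "Name\tCompleteness\tContamination\nbin_1\t90.0\t1.0\nbin_2\t75.5\t2.0\n"
AFTER = "Name\tCompleteness\tContamination\nbin_1\t92.5\t3.5\nbin_2\t75.5\t2.0\n"

deltas = compare(load_report(BEFORE), load_report(AFTER))
for name, (d_comp, d_cont) in deltas.items():
    print(f"{name}\tcompleteness {d_comp:+.2f}\tcontamination {d_cont:+.2f}")
```

Listing the deltas per bin makes it easier to see whether a parameter change shifts contamination across the board or only in a few bins.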
spades.log
NA
params.txt
NA
SPAdes version
3.16
Operating System
linux
Python Version
No response
Method of SPAdes installation
conda
No errors reported in spades.log