slow5tools degrade (1.3.0) does not detect ULK kit? #134
Hello, we are parsing the ULK part properly, but degrade checks whether the kit matches one of the kits we have exhaustively tested. As this is lossy compression, we are being very pedantic to prevent users from inadvertently affecting their data. More kits will be added as we come across them and test them. I have not had access to GridION sqk-ulk114 data, but it is very likely that the suitable -b would be 3. Is this a publicly available dataset?
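To make concrete what removing bits with -b means for the raw signal, here is a minimal Python sketch. It assumes that "removing b bits" amounts to rounding each integer sample to the nearest multiple of 2^b; this is an illustration of the general idea only, not slow5tools' actual implementation, and the function name `degrade_signal` is hypothetical.

```python
def degrade_signal(samples, bits):
    """Round each integer sample to the nearest multiple of 2**bits.

    Assumption: this mimics the effect of a bit-removal option such as -b;
    the real tool may round or truncate differently.
    """
    if bits <= 0:
        # Nothing to remove: return an unchanged copy.
        return list(samples)
    half = 1 << (bits - 1)  # offset so the shift rounds instead of truncating
    # Python's >> floors on negative integers, so this rounds consistently.
    return [((s + half) >> bits) << bits for s in samples]


if __name__ == "__main__":
    raw = [0, 1, 3, 4, 5, 11, 12, -5]
    print(degrade_signal(raw, 3))
```

With 3 bits removed, every sample becomes a multiple of 8. That makes the signal compress better, and it is also why the original values cannot be recovered afterwards.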
As per the Twitter conversation (https://x.com/jpelbers/status/1842484817885073502), here is a Dropbox link to ONT ULK reads for HG002 chr22 at ~30x average coverage (based on alignment to hg38 without alts). The HG002 cells had DNA extracted following a BioNano DNA extraction protocol, underwent ONT ULK library preparation, and were then sequenced on an ONT PromethION P2 Solo device with an r10.4.1 flowcell, connected to an ONT GridION for data acquisition. The file is an ex-zd, zstd-compressed blow5 that you can download with:
wget 'https://www.dropbox.com/scl/fi/8s0p4ttpuy1amiuulzu3v/WGS_HG002_Bionano_recover_13022024.chr22.readids.blow5?rlkey=395acerl9ewgyqkafi7g15ipe&st=giubcawn' -O WGS_HG002_Bionano_recover_13022024.chr22.blow5
NOTE: the blow5 file on Dropbox does not match the header shown in this GitHub issue, as I realized those squiggles did not belong to HG002.
Thanks, we will have a look at this as soon as possible.
OK, @KavinduJayas did the tests, and 3 bits seems to be the suitable number of bits to remove.
[Figure: identity scores]
[Figure: methylation correlation]
@sashajenner could you please implement a profile for this data in the dev branch of degrade? The relevant header data is as follows:
Since this kit can be used on different device types other than the PromethION 2 Solo, should we be ignoring the device_type header field? Or does the device affect the ideal number of bits to remove?
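The question above can be illustrated with a small Python sketch of profile matching. Everything here is hypothetical: the profile table, the `suggest_bits` function, the `sequencing_kit` key, and the `"p2_solo"` and `"gridion"` device strings are assumptions made for illustration and are not taken from slow5tools. Only the `device_type` field name and the 3-bit suggestion for sqk-ulk114 come from this thread.

```python
# Hypothetical table of tested profiles. Only the 3-bit value for
# sqk-ulk114 comes from this thread; the device string is an assumption.
PROFILES = [
    {"device_type": "p2_solo", "kit": "sqk-ulk114", "bits": 3},
]


def suggest_bits(header, ignore_device=False):
    """Return the suggested number of bits for a header, or None if no
    tested profile matches.

    header: dict of header fields (key names are assumed, not verified).
    ignore_device: if True, match on the kit alone.
    """
    kit = header.get("sequencing_kit", "").lower()
    device = header.get("device_type", "").lower()
    for profile in PROFILES:
        if profile["kit"] != kit:
            continue
        if ignore_device or profile["device_type"] == device:
            return profile["bits"]
    return None


if __name__ == "__main__":
    hdr = {"device_type": "gridion", "sequencing_kit": "SQK-ULK114"}
    print(suggest_bits(hdr))                      # strict match on device
    print(suggest_bits(hdr, ignore_device=True))  # kit-only match
```

A strict match keeps the tool conservative, which suits a lossy operation, while a kit-only match would cover more devices at the risk of suggesting a value that was never tested on them.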
For the following s/blow5 header made with blue-crab (0.1.2), slow5tools degrade (1.3.0) does not seem to recognize the ULK kit.
~/bin/slow5tools-v1.3.0/slow5tools degrade -s ex-zd -c zstd PAU99561_d2c3e09e_ca829370_21.blow5 -o PAU99561_d2c3e09e_ca829370_21.3.blow5
[degrade_main::WARNING] This tool performs lossy compression which is an irreversible operation. Just making sure it is intended.
[slow5_hdr_get_dataset] Not detected: MinION DNA lsk114 5kHz
[slow5_hdr_get_dataset] Not detected: PromethION DNA lsk109 4kHz
[slow5_hdr_get_dataset] Not detected: PromethION DNA lsk114 4kHz
[slow5_hdr_get_dataset] Not detected: PromethION DNA lsk114 5kHz
[slow5_hdr_get_dataset] Not detected: PromethION RNA rna002 3kHz
[slow5_hdr_get_dataset] Not detected: PromethION RNA rna004 4kHz
[slow5_hdr_get_dataset::ERROR] No suitable bits suggestion
[degrade_main::ERROR] Use option -b to manually specify
Would it be possible to parse the ULK part of the kit name? Alternatively, could the tool show the user which bit values to use for different datasets?