Nextflow adopts the scatter-gather method to process huge FASTQ files: first, split one huge FASTQ file into multiple smaller FASTQ files; then, submit a job to the batch system for each smaller file; last, merge the individual results into a sample-level result.
What is the pypiperic way to do that?
Hi @zhangzhen, pypiper wasn't really designed to do partitioning and parallelism; it's meant to be applied to data that is already partitioned/chunked, either naturally (e.g., biological samples) or artificially (e.g., by splitting the FASTQ arbitrarily). pepkit/looper would be how you'd normally do this sort of thing (submitting a single pypiper pipeline across multiple pieces of data). @donaldcampbelljr or @nsheff may have more recent information, though, as I've not worked in depth on the project in a while.