Script to filter rows in tab-delimited text files, such as log files or sequencing summaries, using another tab-delimited text file as the reference.
Initially created to filter rows in an ONT Guppy-generated sequencing_summary.txt so that the resulting file contains the same reads as a reference summary. The goal was quality control of several fastq files basecalled in different modes from the same ONT raw reads. Because CPU basecalling is computationally expensive, a small amount of processed data seems to be enough to judge its quality score clearly. The only reasonable way to compare incomplete basecalling runs is to filter the read data and keep only the reads that were processed in every case (that is, in every basecalling mode).
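The filtering idea described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `filter_summary` is hypothetical, and it assumes both files start with a header line and share a `read_id` column.

```python
def filter_summary(target_path, reference_path, output_path, key="read_id"):
    """Keep only the rows of target_path whose key also occurs in reference_path.

    Assumption: both files are tab-delimited, start with a header line,
    and contain a column named `key`.
    """
    # Collect the set of IDs present in the reference file.
    with open(reference_path, encoding="utf-8") as ref:
        ref_col = ref.readline().rstrip("\r\n").split("\t").index(key)
        wanted = {
            line.rstrip("\r\n").split("\t")[ref_col]
            for line in ref
            if line.strip()
        }

    kept = 0
    with open(target_path, encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8", newline="\n") as dst:
        header_line = src.readline()
        col = header_line.rstrip("\r\n").split("\t").index(key)
        dst.write(header_line)  # copy the header unchanged
        for line in src:
            fields = line.rstrip("\r\n").split("\t")
            # Copy the row only if its ID is also in the reference.
            if len(fields) > col and fields[col] in wanted:
                dst.write(line)
                kept += 1
    return kept
```

A call such as `filter_summary("hac_summary.txt", "fast_summary.txt", "hac_filtered.txt")` (file names are placeholders) would return the number of rows written. Rows are copied as whole lines, so the columns of the target file are left unchanged.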
Known issues
- After filtering, the final file contained fewer lines than the reference. This may be due to duplicated reads in the summary files (the reference file was combined from two files generated by Guppy with the --resume option enabled).
- The filtered summary does not seem to be valid when MinIONQC.R is run on it. The workaround is to copy a Guppy-generated sequencing_summary.txt and replace every row except the first (the header) with the filtered rows. Manually adding the header line to the txt file does not work properly, even though the files look identical in the IDE.
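Both issues can be investigated with a couple of small diagnostics. The sketch below is only a suggestion: the helper names are hypothetical, the `read_id` column is an assumption, and the causes it checks for (duplicated reads, invisible byte differences such as line endings or a byte order mark in the header) have not been confirmed for these files.

```python
from collections import Counter


def duplicated_ids(path, key="read_id"):
    """Return {id: count} for every ID that occurs more than once in the file."""
    # utf-8-sig tolerates an optional byte order mark at the start of the file.
    with open(path, encoding="utf-8-sig") as fh:
        col = fh.readline().rstrip("\r\n").split("\t").index(key)
        counts = Counter(
            line.rstrip("\r\n").split("\t")[col] for line in fh if line.strip()
        )
    return {read_id: n for read_id, n in counts.items() if n > 1}


def raw_header(path):
    """Return the first line as raw bytes, so hidden characters become visible."""
    with open(path, "rb") as fh:
        return fh.readline()
```

Printing `len(duplicated_ids("reference_summary.txt"))` shows how many read IDs are repeated in the combined reference. Printing `raw_header(...)` for the Guppy-generated file and for the manually edited one shows differences that an IDE may hide, for example `\r\n` versus `\n` at the end of the line.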