Script to filter the rows of a tab-delimited text file, such as a log file or a sequencing summary, by the rows of another tab-delimited text file

asan-emirsaleh/tabbed_txt_rows_filter

tabbed txt rows filter

Script to filter the rows of a tab-delimited text file, such as a log file or a sequencing summary, by the rows of another tab-delimited text file

Initially created for filtering rows in an ONT Guppy-generated sequencing_summary.txt to obtain a file containing the same reads as another, reference summary. The goal was quality control of several fastq files basecalled in different modes from the same ONT raw reads. Because CPU basecalling is computationally expensive, a small amount of processed data seems to be enough to judge its quality score clearly. The only reasonable way to compare incomplete basecalling runs is to filter the read data and keep only the reads that were processed in every case (that is, in every basecalling mode).
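The filtering idea described above can be sketched in Python as follows. This is a minimal illustration, not the script from this repository: the function names `read_ids` and `filter_rows` are hypothetical, and the sketch assumes that both files are tab-delimited, start with a header line, and share a `read_id` column that identifies a read. The sample rows are made up for the demonstration.

```python
import csv
import io


def read_ids(handle, key="read_id"):
    """Collect the values of the key column from a tab-delimited file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[key] for row in reader}


def filter_rows(target, reference_ids, out, key="read_id"):
    """Copy the header and every row whose key is in reference_ids."""
    reader = csv.DictReader(target, delimiter="\t")
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        if row[key] in reference_ids:
            writer.writerow(row)
            kept += 1
    return kept


# Made-up sample data standing in for two summary files (assumption).
reference = io.StringIO("filename\tread_id\trun_id\nf1\tr1\tx\nf1\tr3\tx\n")
target = io.StringIO(
    "filename\tread_id\tmean_qscore_template\n"
    "f2\tr1\t9.5\n"
    "f2\tr2\t8.1\n"
    "f2\tr3\t10.2\n"
)
filtered = io.StringIO()

ids = read_ids(reference)
kept = filter_rows(target, ids, filtered)
result = filtered.getvalue()
```

With real files, the `io.StringIO` objects would be replaced by handles from `open(path, newline="")`. Holding the reference identifiers in a set keeps each lookup cheap, so the target file can be streamed row by row instead of being loaded into memory.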

Known issues

  1. After filtering, the final file contained fewer lines than the reference. This may be due to duplicated reads in the summary files (the reference file was combined from two files generated by Guppy with the --resume option enabled).
  2. The filtered summary does not seem to be valid when MinIONQC.R is run on it. The workaround is to copy the Guppy-generated sequencing_summary.txt and replace every row other than the first. Manually adding the header line to the txt file does not work properly, even though the files look identical in the IDE.
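Both issues could be investigated with a short diagnostic along the following lines. It is only a sketch under assumptions: the helper names `duplicated_ids` and `header_differences` are hypothetical, the key column is assumed to be `read_id`, and the invisible differences it checks for (a byte-order mark, a different line ending, a different number of tabs) are guesses at why two headers that look identical in an IDE might still differ, not confirmed causes. The sample lines and headers are made up.

```python
from collections import Counter

BOM = b"\xef\xbb\xbf"


def duplicated_ids(lines, key="read_id"):
    """Return {id: count} for ids that occur more than once."""
    header = lines[0].rstrip("\r\n").split("\t")
    col = header.index(key)
    counts = Counter(
        line.rstrip("\r\n").split("\t")[col]
        for line in lines[1:]
        if line.strip()
    )
    return {rid: n for rid, n in counts.items() if n > 1}


def header_differences(a: bytes, b: bytes):
    """List invisible differences between two raw header lines."""
    notes = []
    if a.startswith(BOM) != b.startswith(BOM):
        notes.append("byte-order mark")
    if a.endswith(b"\r\n") != b.endswith(b"\r\n"):
        notes.append("line ending")
    if a.count(b"\t") != b.count(b"\t"):
        notes.append("tab count")
    return notes


# Made-up reference lines with one repeated read (assumption).
sample = ["filename\tread_id\n", "f\tr1\n", "f\tr2\n", "f\tr1\n"]
dups = duplicated_ids(sample)

# Made-up headers: the second carries a BOM and a CRLF line ending.
original_header = b"filename\tread_id\trun_id\n"
manual_header = BOM + b"filename\tread_id\trun_id\r\n"
diffs = header_differences(original_header, manual_header)
```

For real files, the raw header can be read with `open(path, "rb").readline()` so that no decoding or newline translation hides the difference.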
