Assignment for a master's in bioinformatics: parsing a .fastq file, extracting biologically relevant information and results, and converting the data to .fasta files.
This directory contains the source code for the project Tedone_eDNAread_2023_Nature. This project arose from an assignment for the VIU master's in bioinformatics subject 01MBIF: Bash and Shell scripting.
This project works on a raw .fastq file: it converts the data to .fasta format, treats the DNA sequences from a biological point of view, and obtains biologically relevant information and results from the raw sequencing data.
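As an illustration of the .fastq to .fasta conversion described above, here is a minimal sketch. It is not the project's actual script: the function name fastq_to_fasta and the file names in the usage comment are hypothetical, and it assumes every record spans exactly four lines (header, sequence, "+" separator, quality string), with no wrapped sequences.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the project's scripts: convert a FASTQ
# file with exactly four lines per record into FASTA.
fastq_to_fasta() {
    # Line 1 of each record is the header ("@id ..."): replace the leading
    # "@" with ">". Line 2 is the sequence: print it unchanged.
    # Line 3 ("+") and line 4 (quality string) are dropped.
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }
         NR % 4 == 2 { print }' "$1"
}

# Example usage (hypothetical file names):
# fastq_to_fasta data/raw/reads.fastq > data/processed/reads.fasta
```

Selecting lines by their position in the record, rather than by a leading "@", avoids misreading quality strings, which may themselves start with "@".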
The raw data, including the .fastq file, can be found in /data/raw/. Data files created or modified from the raw data can be found under /data/processed. The scripts that parse, modify and obtain results from the raw and processed data can be found under /code.
Before running a script, read the header below its shebang to see which directory it must be executed from: script_p2_p3.sh must be executed from the root of the source code (.), while script_p7.sh and script_p8.sh must be executed from the /code subdirectory.