Recent developments with DIAMOND #348

LimesKey · 2023-01-07T12:30:41Z

LimesKey
Jan 7, 2023
Collaborator

I recently had a chance to try out DIAMOND on a GitHub Codespace VM with 32GB of RAM, and I'm amazed by the program. I aligned the sequenced human genome, a file of around 3.2GB, against the proteins of a gene (OCA2) that controls eye colour. DIAMOND reported identical matches for 25 of the 27 proteins. It wrote the alignment to a TSV file (the output format can be changed), and it looks like this:

NT_187660.1 XP_047288575.1  100 76  0   0   216174  215947  1   76  2.42e-38    156
NT_187660.1 XP_047288574.1  100 76  0   0   216174  215947  1   76  2.88e-38    156
NT_187660.1 XP_047288573.1  100 76  0   0   216174  215947  1   76  3.13e-38    156
NT_187660.1 XP_047288570.1  100 76  0   0   216174  215947  1   76  3.44e-38    156
NT_187660.1 XP_047288569.1  100 76  0   0   216174  215947  1   76  3.71e-38    156
NT_187660.1 XP_047288568.1  100 76  0   0   216174  215947  1   76  3.76e-38    156
NT_187660.1 XP_047288567.1  100 76  0   0   216174  215947  1   76  3.93e-38    156

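For instance, XP_047288567.1 (second column) is the name of one of the proteins of the OCA2 gene, and NT_187660.1 (first column) is the sequence from the genome file in which it was found. The third and fourth columns show that the match is 100% identical over 76 amino acids. Because blastx translates the genome before comparing, the protein matches a translated region of NT_187660.1 rather than another protein.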
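To make the columns easier to read, here is a minimal Python sketch that labels them and checks one hit. It assumes the file uses DIAMOND's default 12-column BLAST-style tabular layout, which should be the case since the command did not set an output format. The helper names `parse_hits` and `describe` are made up for illustration, and the two sample rows are copied from the output above.

```python
# Field names of the default BLAST-style tabular output (assumed layout).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def parse_hits(text):
    """Turn whitespace-separated hit lines into a list of dictionaries."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = dict(zip(COLUMNS, fields))
        for key in INT_FIELDS:
            row[key] = int(row[key])
        for key in FLOAT_FIELDS:
            row[key] = float(row[key])
        hits.append(row)
    return hits


def describe(hit):
    """Return the nucleotide span and strand of the hit on the query."""
    nt_span = abs(hit["qend"] - hit["qstart"]) + 1
    # In blastx output, qstart > qend means a reverse-strand reading frame.
    strand = "-" if hit["qstart"] > hit["qend"] else "+"
    return nt_span, strand


SAMPLE = """\
NT_187660.1 XP_047288575.1  100 76  0   0   216174  215947  1   76  2.42e-38    156
NT_187660.1 XP_047288574.1  100 76  0   0   216174  215947  1   76  2.88e-38    156
"""

hits = parse_hits(SAMPLE)
nt_span, strand = describe(hits[0])
print(hits[0]["sseqid"], hits[0]["pident"], nt_span, strand)
```

The query coordinates cover 228 nucleotides, which is exactly 3 × 76, so each of the 76 aligned amino acids corresponds to one codon. The start coordinate being larger than the end coordinate suggests the match lies on the reverse strand.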

The whole alignment took 6 minutes on a 16-core machine with 32GB of RAM. That's a bit much, but I think we can substantially reduce the resource usage by following the documentation and changing the input to the program. The tutorial for running the program is located here. The database file I used contained just the OCA2 gene, and the query file was the human genome. The command I used was ./diamond blastx -q GRCh38_latest_genomic.fna -d eyecolorprotein_Database.dmnd -o out.tsv --threads 16 --very-sensitive --masking 0, in case you want to try it yourself. The files I used are attached below, and the human genome can be downloaded from here.
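If anyone wants to experiment with lighter settings, a small Python sketch like the one below can rebuild the command with different parameters. The function `build_blastx_command` is a made-up helper, not part of DIAMOND. The `--block-size` option is described in the DIAMOND manual as the main control for memory use, but the value 1.0 used here is only an illustration and has not been tested on this data.

```python
import shlex


def build_blastx_command(query, database, output, threads=16, block_size=None):
    """Assemble the argument list for a DIAMOND blastx run (hypothetical helper)."""
    cmd = [
        "./diamond", "blastx",
        "-q", query,
        "-d", database,
        "-o", output,
        "--threads", str(threads),
        "--very-sensitive",
        "--masking", "0",
    ]
    if block_size is not None:
        # Assumption: a smaller block size lowers memory use at some cost in speed.
        cmd += ["--block-size", str(block_size)]
    return cmd


# Same settings as the run described above.
original = build_blastx_command(
    "GRCh38_latest_genomic.fna", "eyecolorprotein_Database.dmnd", "out.tsv"
)
# Untested variant: fewer threads and a smaller block size.
lighter = build_blastx_command(
    "GRCh38_latest_genomic.fna", "eyecolorprotein_Database.dmnd", "out.tsv",
    threads=8, block_size=1.0,
)

print(shlex.join(original))
print(shlex.join(lighter))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once DIAMOND and the input files are in the working directory.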

DIAMOND-Test-File.zip
