Transfer files from GridION
This page explains how you can transfer files from a GridION sequencing run to a Linux machine before running susCovONT. If anyone has a better/faster solution to this, please let us know.
The folder to copy from the GridION is the one that contains the subdirectories fast5_pass and fastq_pass and the sequencing_summary*txt file. This folder is located under /data/ on the GridION, two subdirectories down, for example: /data/210213_FAO88697_CoV_NB1-24/CoV_NB1-24/20210213_1359_X5_FAO88697_5cf6e6f0/. The folder to copy is 20210213_1359_X5_FAO88697_5cf6e6f0, and it is referred to as the RUN_NAME from now on.
First, set the RUN_NAME as a variable:
RUN_NAME=20210213_1359_X5_FAO88697_5cf6e6f0 ### Change to match your current run!
echo ${RUN_NAME} #This should show your RUN_NAME on the terminal
Then follow the instructions below. I have added "###" to the end of every line that needs to be changed in the commands below.
- Plug in an external hard drive to copy the files to. It must be plugged in at the back of the GridION for high-speed transfer.
- Open a new terminal and run:
cd /media/
The disk mounts at a different location each time it is plugged in. To see which folder it is mounted in (e.g. /media/usb/, /media/usb0/, /media/usb1/ or /media/usb2/), run:
ls *
When you find the correct folder (e.g. /media/usb0/), go into it:
cd /media/usb0 ###
ls ## Make sure that you are in the external hard drive now
- Make a directory on the external hard drive where you want to store the GridION files for transfer. This folder should have the same name as the run (the folder that on the GridION contains fast5_pass, fastq_pass, etc.):
mkdir ${RUN_NAME}/
cd ${RUN_NAME}/
- Store the location as a variable for use later
DEST=$(pwd)
Now go to the location of the data on the GridION, and compress and copy the files to DEST:
- Navigate to your run folder, i.e. the folder that contains fast5_pass and fast5_fail:
cd /data/*/*/${RUN_NAME}/
ls #See that the folders fast5_pass, fastq_pass and sequencing_summary*txt are present
- Compress and copy the files with the following commands for fastq_pass and fast5_pass:
mkdir backup ; cd backup
##Store this location as a variable for use later:
SOURCE=$(pwd)
##Fastq_pass: Compress and copy to destination folder
tar -cvzf - ../fastq_pass/ | split --bytes=500MB - fastq_pass.backup.tar.gz. ; cp fastq_pass.backup.tar.gz.* ${DEST}/
##Fast5_pass: Compress and copy to destination folder (see also tip below)
tar -cvzf - ../fast5_pass/ | split --bytes=500MB - fast5_pass.backup.tar.gz. ; cp fast5_pass.backup.tar.gz.* ${DEST}/
##Copy the other run-files as well (including sequencing_summary*txt) to destination:
cp ../*.txt ../*.md ../*.pdf ../*.csv ../*.tsv .
cp ../*.txt ../*.md ../*.pdf ../*.csv ../*.tsv ${DEST}/
Tip: If the fast5_pass directory is very large, you can open several terminals and compress the barcodes in chunks, one chunk per terminal, to speed up the process. In each new terminal window you have to cd into the backup folder again (the commands use the relative path ../fast5_pass/) and set ${DEST} manually (the path to the run_name folder on the external hard drive, e.g. DEST=/media/usb0/Components/20210124_1825_MN30489_FA088582_94c40971):
tar -cvzf - ../fast5_pass/barcode0[0-9]/ | split --bytes=500MB - fast5_pass_barcode0.backup.tar.gz. ; cp fast5_pass_barcode0.backup.tar.gz.* ${DEST}/
tar -cvzf - ../fast5_pass/barcode1[0-9]/ | split --bytes=500MB - fast5_pass_barcode1.backup.tar.gz. ; cp fast5_pass_barcode1.backup.tar.gz.* ${DEST}/
tar -cvzf - ../fast5_pass/barcode2[0-9]/ | split --bytes=500MB - fast5_pass_barcode2.backup.tar.gz. ; cp fast5_pass_barcode2.backup.tar.gz.* ${DEST}/
## This is for barcodes 1-24 (barcode0[0-9], barcode1[0-9] and barcode2[0-9]). Change the patterns if you have more or fewer barcodes.
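The compress-and-split step can be tried out on dummy data before running it on a real run. The sketch below is only an illustration and assumes GNU tar and GNU split (as on the GridION): the demo_dir folder, the reads.fastq file and the 100KB chunk size are made up for the example, and nothing in it touches real run data.

```shell
# Throwaway folders standing in for a run folder and a place to restore to
demo_dir=$(mktemp -d)
mkdir -p "${demo_dir}/run/fastq_pass/barcode01" "${demo_dir}/run/backup" "${demo_dir}/restore"
# 300 kB of random data stands in for a FASTQ file (random data does not compress)
head -c 300000 /dev/urandom > "${demo_dir}/run/fastq_pass/barcode01/reads.fastq"

# Compress and split into 100 kB chunks (the commands above use 500MB chunks)
cd "${demo_dir}/run/backup"
tar -czf - ../fastq_pass/ | split --bytes=100KB - fastq_pass.backup.tar.gz.
ls fastq_pass.backup.tar.gz.*    # chunks get the suffixes aa, ab, ac, ...

# Join the chunks again in name order and extract them in another folder
cd "${demo_dir}/restore"
cat "${demo_dir}"/run/backup/fastq_pass.backup.tar.gz.* | tar -xzf -

# cmp prints nothing and exits 0 when the two files are byte-identical
cmp "${demo_dir}/run/fastq_pass/barcode01/reads.fastq" fastq_pass/barcode01/reads.fastq && echo "round trip OK"
```

GNU tar drops the leading ../ from the stored names (it prints a note about this), which is why the restored folder is simply called fastq_pass.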
- Now the files should be copied! Make sure that the files did not change during transfer:
In steps 4 and 6 you stored the variables ${SOURCE} and ${DEST}. Make sure you are in the same terminal that you stored these in. You can check that by running echo ${SOURCE} ${DEST}; you should see the full paths to the source and destination folders.
Now run the commands below to check that the files are the same:
cd ..
find $SOURCE -type f -exec md5sum {} \; | tee source.md5
find $DEST -type f -exec md5sum {} \; | tee dest.md5
diff <(sort source.md5 | cut -d" " -f1) <(sort dest.md5 | cut -d" " -f1) #There should be no output
#When you are happy that the files are the same, run:
rm source.md5 dest.md5
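The check above compares only the checksums, so a mismatch does not tell you which file is damaged. A possible variation, shown here as a sketch on throwaway folders, computes the checksums with paths relative to each folder so that diff names the affected file. The helper checksum_tree and the folders src, dst and work are hypothetical names for this example and are not part of susCovONT.

```shell
# Hypothetical helper: print "checksum  ./relative/path" for every file
# under the folder given as $1, sorted by path
checksum_tree() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 )
}

# Throwaway folders standing in for ${SOURCE} and ${DEST}
src=$(mktemp -d); dst=$(mktemp -d); work=$(mktemp -d)
printf 'ACGT\n' > "${src}/a.txt"
printf 'TTTT\n' > "${src}/b.txt"
cp "${src}"/*.txt "${dst}/"

# Right after the copy, the two listings should be identical
checksum_tree "${src}" > "${work}/source.md5"
checksum_tree "${dst}" > "${work}/dest.md5"
if diff "${work}/source.md5" "${work}/dest.md5" > /dev/null; then
    first_check="identical"
else
    first_check="different"
fi
echo "after copy: ${first_check}"

# Simulate a file damaged during transfer; diff now lists the lines for ./b.txt
printf 'NNNN\n' > "${dst}/b.txt"
checksum_tree "${dst}" > "${work}/dest.md5"
changed=$(diff "${work}/source.md5" "${work}/dest.md5" | grep -c 'b.txt')
echo "diff lines mentioning b.txt: ${changed}"
```

This only works when both folders hold the same file names, which is the case for ${SOURCE} and ${DEST} in the steps above.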
- Close the terminal when the files have been successfully copied to the external hard drive. In "Files", click the unmount button and wait for the screen to say "You can now unplug ...". If it asks for a password, type the GridION login password. Unplug the drive and plug it into your Linux machine.
- Open a terminal on the Linux machine and navigate to the run folder on the external hard drive:
#Set the run name again as you did before
RUN_NAME=20210213_1359_X5_FAO88697_5cf6e6f0 ### Change to match your current run!
#Then go to the folder on the external hard drive
cd /media/susamr/Components/${RUN_NAME}/ ### Change to the path on your external hard drive
- Set as a variable for use later:
SOURCE=$(pwd)
- Now make a folder on the Linux machine to copy the files into. This folder becomes the new ${DEST}:
mkdir /media/susamr/maggie/ONT_covid/${RUN_NAME}/ ###Change the path to where you want it on your computer
cd /media/susamr/maggie/ONT_covid/${RUN_NAME}/ ###Change the path to where you want it on your computer
DEST=$(pwd) #Set variable for use later
- Copy the files from the external hard drive to the specified destination location on the Linux machine:
cp ${SOURCE}/* ${DEST}/
12.1 Again, check that the files are exactly the same, i.e. that they were not damaged during the transfer:
cd ..
find $SOURCE -type f -exec md5sum {} \; | tee source.md5
find $DEST -type f -exec md5sum {} \; | tee dest.md5
diff <(sort source.md5 | cut -d" " -f1) <(sort dest.md5 | cut -d" " -f1) #There should be no output
#When you are happy that the files are the same, run:
rm source.md5 dest.md5
12.2 Uncompress the copied files so you can use them:
cd ${DEST}/
cat fastq_pass.backup.tar.gz.* | tar xzvf -
cat fast5_pass.backup.tar.gz.* | tar xzvf -
Note: If you followed the tip in 6., make sure you do it for all the fast5_pass_barcode[0-9].backup.tar.gz.* files you created. You can extract these in separate terminal windows to make it faster:
cat fast5_pass_barcode0.backup.tar.gz.* | tar xzvf -
cat fast5_pass_barcode1.backup.tar.gz.* | tar xzvf -
cat fast5_pass_barcode2.backup.tar.gz.* | tar xzvf -
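If there are many chunk sets, the extraction can also be written as a loop instead of one command per set. The sketch below is an assumption-laden illustration on throwaway data: the work folder and the .fast5 dummy files are made up, and it relies on split naming the first chunk of every set with the suffix aa (the GNU split default).

```shell
# Throwaway folders standing in for the run folder and the copied destination
work=$(mktemp -d)
mkdir -p "${work}/run/fast5_pass/barcode01" "${work}/run/fast5_pass/barcode12" "${work}/dest"
echo "signal A" > "${work}/run/fast5_pass/barcode01/a.fast5"
echo "signal B" > "${work}/run/fast5_pass/barcode12/b.fast5"

# Create two chunk sets, named as in the tip in step 6
cd "${work}/run"
tar -czf - fast5_pass/barcode0[0-9]/ | split --bytes=500MB - "${work}/dest/fast5_pass_barcode0.backup.tar.gz."
tar -czf - fast5_pass/barcode1[0-9]/ | split --bytes=500MB - "${work}/dest/fast5_pass_barcode1.backup.tar.gz."

# Every chunk set has a first chunk ending in "aa"; loop over those
cd "${work}/dest"
for first_chunk in *.backup.tar.gz.aa; do
    prefix=${first_chunk%aa}          # e.g. fast5_pass_barcode0.backup.tar.gz.
    echo "Extracting ${prefix}*"
    cat "${prefix}"* | tar -xzf -     # join the chunks in name order, then extract
done
ls fast5_pass                         # lists the restored barcode folders
```

The loop extracts the sets one after another, so it is slower than separate terminal windows but needs no manual bookkeeping.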
You should now have a folder on the Linux machine called ${RUN_NAME}/ (e.g. 20210124_1825_MN30489_FA088582_94c40971) which contains the folders fast5_pass and fastq_pass and several text files, including sequencing_summary*.txt. Now you can run artic minion via the susCovONT pipeline.