First help on decryption dbGaP data
SRA Tools prior to release 2.10.0 would determine the encryption context from a user's current working directory. This is no longer the case.
Since 2.10.0, the user's access token is required as a command line parameter. Access tokens come either in the form of the traditional NGC file, or a newer JWT version. Each has its particular application. Neither one requires you to cd to a special directory any longer in order to decrypt.
Prior to 2.10.0, users were asked to download an NGC token from dbGaP. This token would then be imported into toolkit configuration via the vdb-config command. This operation is no longer supported.
Since 2.10.0, users are asked to provide a path to the NGC file as a parameter:
--ngc <path-to-ngc-file>
Since 2.10.0, a new type of permission token is provided for operation within dedicated regions of commercial clouds. The JWT conveys the same authorization information as an NGC, but does not involve encryption nor does it identify keys needed for decryption.
--perm <path-to-jwt-file>
As mentioned above, the JWT only conveys access permissions and cannot be used for decrypting files. In order to decrypt, you should use an NGC token.
Any SRA Tool is capable of decrypting an encrypted file. If you have downloaded an encrypted file, e.g. via prefetch, then you can access it locally with:
fasterq-dump --ngc <path-to-ngc-file> <path-to-encrypted-run>
The same applies to other tools such as vdb-decrypt.
If you are downloading on the fly, the token you need depends on the compute environment from which you run the command. In particular, if you are within one of the designated commercial cloud environments that have special access to dbGaP in the cloud (currently AWS us-east-1 and GCP gs-us multi-region), you cannot use an NGC and must instead use a JWT. This also applies if you prefetch from within one of those regions.
From within a designated commercial cloud environment:
# prefetch to current directory and access via relative path
prefetch --perm <path-to-jwt-file> SRRxxxxxxxxx
fasterq-dump ./SRRxxxxxxxxx
# or download on demand
fasterq-dump --perm <path-to-jwt-file> SRRxxxxxxxxx
From anywhere else:
# prefetch to current directory and access via relative path
prefetch --ngc <path-to-ngc-file> SRRxxxxxxxxx
fasterq-dump ./SRRxxxxxxxxx
# or download on demand
fasterq-dump --ngc <path-to-ngc-file> SRRxxxxxxxxx
Note that vdb-decrypt will only make sense with an NGC token.
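The choice between the two options above can be sketched as a small shell helper. This is only an illustration: DBGAP_CLOUD and TOKEN_FILE are hypothetical variables that you would set yourself, and SRA Tools does not read them.

```shell
#!/bin/sh
# Sketch only: DBGAP_CLOUD is a hypothetical variable set by you, not by SRA Tools.
# Set DBGAP_CLOUD=yes when running inside a designated cloud region.
token_flag() {
    case "${DBGAP_CLOUD:-no}" in
        yes) echo "--perm" ;;   # JWT: inside a designated cloud region
        *)   echo "--ngc"  ;;   # NGC file: anywhere else
    esac
}

# Example use (commented out; requires SRA Tools and a real token file):
#   fasterq-dump "$(token_flag)" "$TOKEN_FILE" SRRxxxxxxxxx
```

Keeping the decision in one place means the same script can run unchanged inside and outside the cloud, provided TOKEN_FILE points at the matching kind of token.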
Question: I am trying to decrypt some files in a dbGaP project. I did some research and found instructions on the site of NCBI and on the Web. I try to follow these instructions but still cannot decrypt the data. Please help.
Answer: Troubleshooting step number one: first of all let us make sure we do not follow wrong instructions found on the Web. NEVER SET VDB_PWFILE!!!
You can find non-NCBI guides on how to decrypt dbGaP data. They claim that a required step is omitted from the NCBI documents: namely, setting the VDB_PWFILE environment variable.
It is omitted because you should not set this variable! On the contrary, setting VDB_PWFILE will break NCBI decryption!!!
Here is how to ensure you are not following wrong directions:
- Check that VDB_PWFILE is not set: run "set | grep VDB_PWFILE". If it prints anything, then it is set.
- If VDB_PWFILE is set, run "unset VDB_PWFILE".
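The two checks above can be combined into one function, sketched here under the assumption of a POSIX shell. The function name clear_vdb_pwfile is made up for this example. It must be defined and called in your current shell (not run as a separate script), otherwise the unset has no effect on your session.

```shell
#!/bin/sh
# Report whether VDB_PWFILE is set in this shell and unset it if so.
# ${VDB_PWFILE+x} expands to "x" when the variable is set, even if it is empty.
clear_vdb_pwfile() {
    if [ -n "${VDB_PWFILE+x}" ]; then
        echo "VDB_PWFILE is set to '${VDB_PWFILE}'; unsetting it"
        unset VDB_PWFILE
    else
        echo "VDB_PWFILE is not set"
    fi
}

clear_vdb_pwfile
```

If the variable is exported from a startup file such as ~/.bashrc, it will reappear in new shells until that line is removed.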
The instructions on how to decrypt dbGaP data are at http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=dbgap_use
If you still cannot decrypt your data and VDB_PWFILE is not set then do the following:
- Change directory ("cd") to the project's workspace. If you do not understand what this means, read http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=dbgap_use
- Run "test-sra" and send its output to sra-tools@ncbi-nlm.nih.gov together with a brief description of your problem.
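To have the output ready to attach to an email, it can help to save everything the command prints, including error messages, to a file. The following is a minimal sketch; capture_output and the file name test-sra-output.txt are made up for this example.

```shell
#!/bin/sh
# Run a command and save both stdout and stderr to a file.
# Usage: capture_output <output-file> <command> [args...]
capture_output() {
    out_file=$1
    shift
    "$@" > "$out_file" 2>&1
}

# Example (commented out; requires test-sra on PATH):
#   capture_output test-sra-output.txt test-sra
```

Redirecting stderr as well as stdout ensures that any error messages end up in the same file as the regular output.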