♻️ DbGaP consent codes are prefixed with /programs
znatty22 committed Jun 22, 2023
1 parent 9f05e95 commit 0348627
Showing 3 changed files with 23 additions and 18 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -29,8 +29,8 @@ The `--match_aliquot` flag will match dbGaP `submitted_sample_id` to `external_a
## ACL Definitions

* study_phs: (e.g. "phs001138")
-* consent_acl: f"{study_phs}.c{consent_code}" (consent_code for the specimen)
-* default_acl: [{consent_acl} from visible biospecimens which contribute to the genomic file]
+* consent_acl: f"/programs/{study_phs}.c{consent_code}" (consent_code for the specimen)
+* default_acl: set([{consent_acl} from visible biospecimens which contribute to the genomic file])
* open_acl: ["/open"]

## ACL Rules
@@ -66,12 +66,12 @@ field set to **False** should get `{open_acl}`.
* All visible genomic files in the dataservice with their `controlled_access`
field set to **True** should get the `{default_acl}`.

-* The `default_acl` is a list of the `consent_acl` from the visible specimens
-in the study which contribute to the genomic_file.
+* The `default_acl` is the unique set of the `consent_acl` from the visible
+specimens in the study which contribute to the genomic_file.

* The `consent_acl` is composed of the study phs ID and the
-reported sample consent code of the sample. See ACL Definitions for
-details
+reported sample consent code of the sample, prepended with the dbgap
+prefix "/programs" (e.g. "/programs/phs001138.c1")

* All other genomic files in the dataservice should get `{empty_acl}`
indicating no access.
25 changes: 15 additions & 10 deletions kf_update_dbgap_consent/sample_status.py
@@ -2,8 +2,8 @@
## ACL Definitions
* study_phs: (e.g. "phs001138")
-* consent_acl: f"{study_phs}.c{consent_code}" (consent_code for the specimen)
-* default_acl: [{consent_acl} from visible biospecimens which contribute to the genomic file]
+* consent_acl: f"/programs/{study_phs}.c{consent_code}" (consent_code for the specimen)
+* default_acl: unique([{consent_acl} from visible biospecimens which contribute to the genomic file])
* open_acl: ["/open"]
## ACL Rules
@@ -39,15 +39,17 @@
* All visible genomic files in the dataservice with their `controlled_access`
field set to **True** should get the `{default_acl}`.
-* The `default_acl` is a list of the `consent_acl` from the visible specimens
-in the study which contribute to the genomic_file.
+* The `default_acl` is the unique set of the `consent_acl` from the visible
+specimens in the study which contribute to the genomic_file.
* The `consent_acl` is composed of the study phs ID and the
-reported sample consent code of the sample. See ACL Definitions for
-details
+reported sample consent code of the sample, prepended with the dbgap
+prefix "/programs" (e.g. "/programs/phs001138.c1")
-* All hidden genomic files in the dataservice should get `{empty_acl}`
+* All other genomic files in the dataservice should get `{empty_acl}`
indicating no access.
"""
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor, as_completed
@@ -266,19 +268,22 @@ def entities_dict(endpoint, filt):
their `controlled_access` field set to **True** should get
the `{default_acl}`.
-* The `default_acl` is a list of the `consent_acl`
+* The `default_acl` is the unique set of the `consent_acl`
from the visible specimens in the study which contribute to
the genomic_file.
* The `consent_acl` is composed of the study phs ID and the
-reported consent code of the sample (e.g. phs001138.c1)
+reported consent code of the sample, prepended with the
+dbgap prefix "/programs" (e.g. "/programs/phs001138.c1")
"""
biospecimen_codes = set(
patches["biospecimens"][k].get("dbgap_consent_code")
for k in bsids
)
patches["genomic-files"][gfid].update(
-{"authz": sorted(biospecimen_codes)}
+{"authz": sorted(
+[f"/programs/{code}" for code in biospecimen_codes]
+)}
)
# GenomicFile visible = False OR one of contributing Biospecimen
# visible=False
4 changes: 2 additions & 2 deletions tests/data/phs999999_patches.json
@@ -36,12 +36,12 @@
},
"GF_22222222": {
"authz": [
-"phs999999.c1"
+"/programs/phs999999.c1"
]
},
"GF_33333333": {
"authz": [
-"phs999999.c2"
+"/programs/phs999999.c2"
]
},
"GF_44444444": {
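
The ACL rules documented in this commit can be illustrated with a minimal sketch. It is not the code from `sample_status.py`: the function names `consent_acl`, `default_acl`, and `genomic_file_acl` are hypothetical, the value of `EMPTY_ACL` is an assumption (the diff names `{empty_acl}` but does not show its value), and dropping missing consent codes is also an assumption. As written in the diff, a biospecimen without a `dbgap_consent_code` would appear to yield `"/programs/None"`, since `.get()` returns `None` for a missing key.

```python
DBGAP_PREFIX = "/programs"  # prefix introduced by this commit
OPEN_ACL = ["/open"]        # open_acl from the ACL Definitions
EMPTY_ACL = []              # ASSUMPTION: empty_acl's value is not shown in the diff


def consent_acl(study_phs, consent_code):
    """Build one consent ACL, e.g. '/programs/phs001138.c1'."""
    return f"{DBGAP_PREFIX}/{study_phs}.c{consent_code}"


def default_acl(dbgap_consent_codes):
    """Unique, sorted, prefixed ACLs from codes such as 'phs999999.c1'.

    ASSUMPTION: missing (None or empty) codes are dropped here; the
    committed code does not filter them.
    """
    unique_codes = {code for code in dbgap_consent_codes if code}
    return sorted(f"{DBGAP_PREFIX}/{code}" for code in unique_codes)


def genomic_file_acl(visible, controlled_access, dbgap_consent_codes):
    """Apply the three ACL rules to a single genomic file (hypothetical helper)."""
    if not visible:
        return list(EMPTY_ACL)   # all other files: no access
    if not controlled_access:
        return list(OPEN_ACL)    # visible, open access
    return default_acl(dbgap_consent_codes)  # visible, controlled access
```

For example, `genomic_file_acl(True, True, ["phs999999.c1"])` gives the same `authz` value that the updated test fixture expects for `GF_22222222`.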
