Efficiency improvement targeting Collection and Upload #347

Closed
maxsibilla opened this issue Apr 25, 2024 · 1 comment

Comments

@maxsibilla
Contributor

The index procedure for Collection and Upload is very different from that of other entity types, though the two are similar to each other.

Collection.datasets and Upload.datasets are both generated by on_read_trigger. This can be time-consuming when a collection has many datasets. For instance, 3ae4ddfc175d768af5526a010bfe95aa has 211 datasets, and the GET request takes 8 seconds to generate a 3.6MB payload.
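To illustrate why the read gets expensive, here is a minimal sketch of an on-read expansion, not the actual entity-api trigger code. `fake_dataset` and `get_collection` are hypothetical names, and the record fields are placeholders; the point is only that every GET rebuilds one full record per linked dataset, so the payload grows with the dataset count.

```python
import json


def fake_dataset(i: int) -> dict:
    """Hypothetical stand-in for a full dataset record (real ones carry far more fields)."""
    return {
        "uuid": f"{i:032x}",
        "entity_type": "Dataset",
        "title": f"Dataset {i}",
        "description": "placeholder " * 50,
    }


def get_collection(stored: dict, linked_ids: list) -> dict:
    """Model of a GET: copy stored properties, then compute 'datasets' on read."""
    result = dict(stored)
    # Assumption: the on-read trigger expands every linked dataset on each request.
    result["datasets"] = [fake_dataset(i) for i in linked_ids]
    return result


small = get_collection({"uuid": "c1", "entity_type": "Collection"}, list(range(5)))
large = get_collection({"uuid": "c1", "entity_type": "Collection"}, list(range(211)))

small_bytes = len(json.dumps(small))
large_bytes = len(json.dumps(large))
print(f"5 datasets: {small_bytes} bytes, 211 datasets: {large_bytes} bytes")
```

The sizes printed here are for the placeholder records only and say nothing about the real 3.6MB payload; they just show the linear growth.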

As an additional efficiency improvement, we could:

  • Rename Collection.dataset_uuids (used by POST only) to Collection.dataset_uuids_to_link. This has no side effects, since no other service uses Collection creation. The trigger method also needs updating to use the new field. (Karl? Since he made Collection creation/update use the generic POST/PUT)
  • Rename Collection.datasets to Collection.dataset_uuids and return only a list of uuids (requires updating the neo4j query and corresponding search-api tweaks)
  • Rename Upload.datasets to Upload.dataset_uuids and return only a list of uuids (requires updating the neo4j query and corresponding search-api tweaks)
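The read-side renames above could look roughly like the sketch below. The Cypher string is an assumption (the `IN_COLLECTION` relationship name and direction are guesses, not taken from this issue), and `to_uuid_only` is a hypothetical helper that only shows the intended response shape.

```python
import json

# Assumed query shape: collect only the uuid property instead of whole nodes.
# The relationship name and direction are assumptions, not confirmed here.
DATASET_UUIDS_QUERY = (
    "MATCH (d:Dataset)-[:IN_COLLECTION]->(c:Collection {uuid: $uuid}) "
    "RETURN collect(d.uuid) AS dataset_uuids"
)


def to_uuid_only(entity: dict) -> dict:
    """Replace the expanded 'datasets' list with a flat 'dataset_uuids' list."""
    slim = {k: v for k, v in entity.items() if k != "datasets"}
    slim["dataset_uuids"] = [d["uuid"] for d in entity.get("datasets", [])]
    return slim


old_doc = {
    "uuid": "c1",
    "entity_type": "Collection",
    "datasets": [
        {"uuid": "d1", "title": "A", "description": "x" * 100},
        {"uuid": "d2", "title": "B", "description": "y" * 100},
    ],
}
new_doc = to_uuid_only(old_doc)
print(json.dumps(new_doc))
```

Any consumer that currently reads fields off `datasets` (search-api, for example) would need to fetch those records separately by uuid, which is the trade-off of the slimmer payload.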

HM related card: hubmapconsortium/entity-api#632

@maxsibilla
Contributor Author

No longer applicable after the indexing rework.

Projects
Archived in project
Status: Done