Harvester / Remove records by harvester UUID #8431
When a harvester contains a lot of records, removing them takes a while or can even fail with heap space errors.
This change tries to improve performance by using a delete by query instead of looping over each record. For example, with 1500 records:
- Select > Delete all = 2 min
- Harvester > Remove records = 700 ms
This will bypass events, but maybe that is fine for harvested records?
Maybe there is a better JPA alternative for this kind of query?
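To make the trade-off concrete, here is a minimal in-memory sketch of the two deletion paths. It is not the GeoNetwork repository code and does not use JPA: the `HarvestedRecordStore` class, its `Entry` type, and the `onDelete` listener hook are hypothetical stand-ins, and only the method name `deleteAllByHarvesterUuid` is taken from this pull request. The point it illustrates is that the bulk path removes every matching record in one pass without notifying any listener, while the per-record path fires one event per deletion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for a metadata store (not the real repository). */
class HarvestedRecordStore {

    /** Hypothetical record: an id plus the UUID of the harvester that created it. */
    static final class Entry {
        final int id;
        final String harvesterUuid;

        Entry(int id, String harvesterUuid) {
            this.id = id;
            this.harvesterUuid = harvesterUuid;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final List<Consumer<Entry>> deleteListeners = new ArrayList<>();

    void add(int id, String harvesterUuid) {
        entries.add(new Entry(id, harvesterUuid));
    }

    /** Registers a listener that stands in for the per-record delete events. */
    void onDelete(Consumer<Entry> listener) {
        deleteListeners.add(listener);
    }

    int size() {
        return entries.size();
    }

    /** Per-record path: each deletion notifies every listener. */
    int deleteOneByOne(String harvesterUuid) {
        List<Entry> matches = new ArrayList<>();
        for (Entry e : entries) {
            if (e.harvesterUuid.equals(harvesterUuid)) {
                matches.add(e);
            }
        }
        for (Entry e : matches) {
            entries.remove(e);
            for (Consumer<Entry> listener : deleteListeners) {
                listener.accept(e);
            }
        }
        return matches.size();
    }

    /** Bulk path: one pass over the store, and no listener is called. */
    int deleteAllByHarvesterUuid(String harvesterUuid) {
        int before = entries.size();
        entries.removeIf(e -> e.harvesterUuid.equals(harvesterUuid));
        return before - entries.size();
    }
}
```

Anything that relies on the delete events, such as backups or file cleanup, would have to be handled separately when the bulk path is used.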
d151b55 to 2bad0f4
Quality Gate failed
Looks good and it's much faster.
Some considerations:
- Metadata harvested with the GeoNetwork protocol in MEF format may store files in the data directory. In that case, the files should also be deleted. It remains to be checked whether other harvesters support the MEF format.
- If the setting "Allow editing on harvested records" is enabled, the same problem will occur. Also, with that setting enabled, it might make sense to use the original method to delete the metadata, since it backs up the deleted metadata.
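One way to address the first consideration is to collect the record ids before the bulk delete and then remove each record's files from disk. The sketch below assumes a hypothetical layout with one directory per record id directly under a data root. The real GeoNetwork data directory layout may differ, and the `HarvestedFileCleanup` class and its method are illustrative names, not existing API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: removes per-record directories after a bulk delete. */
class HarvestedFileCleanup {

    /**
     * Recursively deletes dataRoot/<recordId> for each given id.
     * Ids without a directory are skipped. Returns the number of directories removed.
     */
    static int deleteRecordDirectories(Path dataRoot, List<Integer> recordIds) throws IOException {
        int removed = 0;
        for (int id : recordIds) {
            Path dir = dataRoot.resolve(Integer.toString(id));
            if (!Files.isDirectory(dir)) {
                continue;
            }
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order puts children before parents, so directories are empty when deleted.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
            removed++;
        }
        return removed;
    }
}
```

The ids have to be read before the delete by query runs, because afterwards the records that map to those directories no longer exist.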
default void deleteAllByHarvesterUuid(String harvesterUuid) {
What is the reason for defining it as the default method?
Indeed, harvesting WMS also produces thumbnails in the data directory most of the time.
Checklist
- main branch, backports managed with label
- README.md files
- pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation
Funded by Ifremer