Aardvark Support #105

Open
3 of 6 tasks
ewlarson opened this issue Aug 31, 2021 · 7 comments

@ewlarson
Contributor

ewlarson commented Aug 31, 2021

Identify Aardvark implications for GeoCombine.

@dl-maura
Collaborator

Do we want two separate rake tasks (geocombine:index-aardvark and geocombine:index)? Will mixed ingesting work in a front end?

@srappel

srappel commented Jan 29, 2024

> Do we want two separate rake tasks (geocombine:index-aardvark and geocombine:index)? Will mixed ingesting work in a front end?

I would also like to know this. Being able to ingest both Aardvark and 1.0 records would be really useful for our case.

@thatbudakguy
Member

thatbudakguy commented Jan 29, 2024

> Do we want two separate rake tasks (geocombine:index-aardvark and geocombine:index)? Will mixed ingesting work in a front end?

Nope – you just need to set an environment variable. To index Aardvark records, you can set:

SCHEMA_VERSION='Aardvark' bundle exec rake geocombine:index

For 1.0 records, you don't need to do anything, since it is the default – although #163 proposes changing that, since Aardvark should really be the default now. You can force it to use 1.0 with:

SCHEMA_VERSION='1.0' bundle exec rake geocombine:index
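
Because the schema is chosen per indexing run, it can help to check which version the records in a directory declare before picking a `SCHEMA_VERSION`. The following is a minimal sketch, not part of GeoCombine: the helper names `schema_version` and `count_schema_versions` are hypothetical, and it assumes each record carries its schema's own version field (`gbl_mdVersion_s` in Aardvark, `geoblacklight_version` in 1.0).

```ruby
require 'json'

# Hypothetical helper: returns 'Aardvark', '1.0', or nil when the record
# declares neither version field.
def schema_version(record)
  record['gbl_mdVersion_s'] || record['geoblacklight_version']
end

# Hypothetical helper: counts the records under +dir+ by declared version.
# JSON files that are not a single record (e.g. arrays) are counted under nil.
def count_schema_versions(dir)
  Dir.glob(File.join(dir, '**', '*.json')).map do |path|
    record = JSON.parse(File.read(path))
    record.is_a?(Hash) ? schema_version(record) : nil
  end.tally
end
```

Calling `count_schema_versions('path/to/repo')` returns a hash keyed by version string, which shows at a glance whether a repository mixes 1.0 and Aardvark records.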

@srappel

srappel commented Jan 30, 2024

Thanks for the reply. Now that the Schema V1 to Aardvark migrator is working, I wonder if it would be possible to ingest 1.0 records and migrate them to Aardvark. For example, our instance uses Aardvark, but I would like to ingest 1.0 records from other portals and have them be migrated to Aardvark automatically. What would it take to add that functionality?

@thatbudakguy
Member

I think the way I'd do that is to write a small script or rake task (we could probably even make it part of GeoCombine later) that takes a path to a directory as its argument (most likely the cloned OpenGeoMetadata repo for one institution). You could add methods to or adapt the GeoCombine::Harvester class for this, or update the V1AardvarkMigrator to work on whole directories.

The task would need to make two passes:

  1. Visit all the v1 records in the repository to build a collection ID map. If a record has dct_isPartOf_sm set, add each of its values to the map's keys (collection names). If a record has a dc_type_s of Collection, add its layer_slug_s as the value for the key that matches its dc_title_s (collection layer IDs). The result looks like:

     id_map = {
       'My Collection 1' => 'institution:my-collection-1',
       'My Collection 2' => 'institution:my-collection-2'
     }

  2. Convert all the v1 records in the repository to Aardvark, using the V1AardvarkMigrator and passing in the collection ID map from step 1. You could save the resulting Aardvark JSON file next to the v1 file it was generated from, perhaps with a suffix like -aardvark.json, or put all of them into a new directory (either is fine for the indexer).

If you do it this way, you can just use rake geocombine:index and it will see all of the new, generated Aardvark files in each institution's repository and index those.
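
The two passes could be sketched roughly as follows. This is an illustration under assumptions, not GeoCombine code: the method names (`load_v1_records`, `build_collection_id_map`, `migrate_directory`) are hypothetical, v1 records are recognised by `geoblacklight_version` being `'1.0'`, collection membership is read from `dct_isPartOf_sm`, and the actual conversion is passed in as a callable so the sketch does not depend on the migrator's exact constructor.

```ruby
require 'json'

# Load every v1 record under +dir+ as [path, record] pairs.
# Skips previously generated files and anything that is not a single v1 record.
def load_v1_records(dir)
  Dir.glob(File.join(dir, '**', '*.json')).sort.filter_map do |path|
    next if path.end_with?('-aardvark.json')

    record = JSON.parse(File.read(path))
    next unless record.is_a?(Hash) && record['geoblacklight_version'] == '1.0'

    [path, record]
  end
end

# Pass 1: map collection titles to the IDs of their collection records.
def build_collection_id_map(records)
  id_map = {}
  records.each do |record|
    # Every collection name referenced by a member record becomes a key.
    Array(record['dct_isPartOf_sm']).each { |name| id_map[name] ||= nil }
    # A record that is itself a collection supplies the ID for its title.
    id_map[record['dc_title_s']] = record['layer_slug_s'] if record['dc_type_s'] == 'Collection'
  end
  # Design choice of this sketch: drop names with no collection record.
  id_map.compact
end

# Pass 2: convert each v1 record and write it next to the original file.
# +migrate+ is any callable taking (v1_record, id_map) and returning a Hash.
def migrate_directory(dir, migrate:)
  v1 = load_v1_records(dir)
  id_map = build_collection_id_map(v1.map(&:last))
  v1.map do |path, record|
    out_path = path.sub(/\.json\z/, '-aardvark.json')
    File.write(out_path, JSON.pretty_generate(migrate.call(record, id_map)))
    out_path
  end
end
```

In practice the `migrate` callable would wrap the V1AardvarkMigrator; check its constructor in your installed GeoCombine version for how the record and the collection ID map are passed in. Writing the output beside the source file with an -aardvark.json suffix keeps the original v1 record untouched.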

@srappel

srappel commented Jan 30, 2024

Per a conversation at a sprint standup meeting, we should consider updating the GeoCombine test fixture. It seems out of sync with the GeoBlacklight test fixture. Maybe that's okay, but it should probably be part of the process for adding Aardvark support generally.

@thatbudakguy
Member

I updated the full_geoblacklight.json and full_geoblacklight_aardvark.json fixtures as part of #143, so that I could test that the migrator turns the former into the latter. If those files don't represent accurate geoblacklight documents, though, we should definitely correct that...let me know if there are any issues you find!
