Some metadata in the BTAA Geoportal includes place names, but does not include coordinates. For these records, users cannot use the map search to discover the approximate geographical area of the target data. Automating a process to add coordinates has two benefits:
- It replaces manual entry of coordinates, which is extremely time-consuming and error-prone.
- Adding the geographical hierarchy to the metadata lets the search engine return related results when users search for place names.
We will be using Anaconda 3 to edit and run scripts. Information on Anaconda installation can be found here. All packages available in Anaconda for 64-bit Windows with Python 3.7 can be found here. Please note that all scripts run on Python 3 (3.7.6).
The following dependencies need to be installed:
GeoNames is a geographical database whose place names and related information are available for download. Every feature is categorized into one of nine feature classes.
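As a quick reference, the nine feature classes can be kept in a small lookup table. The sketch below uses the single-letter codes and short descriptions published by GeoNames; the helper name `describe_feature_class` is a hypothetical one chosen for illustration and is not part of any library.

```python
# GeoNames feature classes: single-letter code -> short description.
FEATURE_CLASSES = {
    "A": "country, state, region",
    "H": "stream, lake",
    "L": "parks, area",
    "P": "city, village",
    "R": "road, railroad",
    "S": "spot, building, farm",
    "T": "mountain, hill, rock",
    "U": "undersea",
    "V": "forest, heath",
}


def describe_feature_class(code):
    """Return the description for a feature class code, or None if unknown.

    Hypothetical helper for illustration only.
    """
    return FEATURE_CLASSES.get(code.strip().upper())
```

A table like this may help when deciding which classes are plausible matches for a place name, for example restricting candidates to class A or P.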
It provides a Python library named GeoNames, which supports several methods for retrieving the geographic information, hierarchy, and children of a given GeoName.
The search does not guarantee accuracy and always returns the first record. It also does not support exact-match searching, so there is a significant risk of ending up with a mismatch.
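One way to reduce the mismatch risk described above is to post-filter the candidates before accepting the first one. The following is only a minimal sketch, not the script's actual behavior: it assumes the search step can return a list of candidate records as dictionaries with a `name` key (modeled on the GeoNames JSON search response), and `pick_match` is a hypothetical helper name.

```python
def pick_match(place_name, candidates):
    """Prefer a candidate whose name equals place_name, ignoring case.

    Returns (record, is_exact). Falls back to the first candidate, flagged
    as inexact, and returns (None, False) when there are no candidates.
    """
    wanted = place_name.strip().casefold()
    for record in candidates:
        if str(record.get("name", "")).strip().casefold() == wanted:
            return record, True
    if candidates:
        # No exact match: keep the first record, but report it as inexact.
        return candidates[0], False
    return None, False


# Made-up records for illustration (not real GeoNames data).
candidates = [
    {"name": "Springfield Township", "geonameId": 1},
    {"name": "Springfield", "geonameId": 2},
]
record, is_exact = pick_match("springfield", candidates)
```

Rows where `is_exact` is false could then be flagged in the output for manual review rather than trusted as-is.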
As usual, the input csv file is formatted according to the GeoBlacklight Template. The required columns are “Title”, “Slug” and “Spatial Coverage”. Where applicable, the script can also carry fields such as “Download” and “Information” over to the final product.
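Before running the lookup, it can be useful to confirm that the input file has the required columns. The check below is a minimal sketch using only the standard library; `missing_columns` and `REQUIRED_COLUMNS` are hypothetical names, and the sample data is made up.

```python
import csv
import io

# Required columns as named in the GeoBlacklight Template input.
REQUIRED_COLUMNS = ("Title", "Slug", "Spatial Coverage")


def missing_columns(csv_file):
    """Return the required column names that are absent from the csv header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [col for col in REQUIRED_COLUMNS if col not in header]


# Made-up sample whose header lacks the "Spatial Coverage" column.
sample = io.StringIO("Title,Slug,Download\nRoads,roads-2019,\n")
missing = missing_columns(sample)
```

With a real file, the same function could be called as `missing_columns(open(path, newline="", encoding="utf-8"))`, where `path` is whatever input file is being processed.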
The script pulls the URI, coordinates, and hierarchy for each place name (“Spatial Coverage”) from the GeoNames database, and then exports the results to a new csv file.
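The overall flow can be sketched as follows. This is an illustration under stated assumptions rather than the actual script: the output column names in `NEW_COLUMNS` are hypothetical, and the GeoNames query is replaced by a `lookup` function passed in by the caller (here a stub returning made-up values), so the example runs without network access.

```python
import csv
import io

# Hypothetical output column names; the real script's headers may differ.
NEW_COLUMNS = ("GeoNames URI", "Latitude", "Longitude", "Hierarchy")


def enrich(in_file, out_file, lookup):
    """Copy rows from in_file to out_file, appending the lookup results.

    lookup(place_name) stands in for the GeoNames query. It must return a
    dict keyed by NEW_COLUMNS, or None when nothing is found.
    """
    reader = csv.DictReader(in_file)
    fieldnames = list(reader.fieldnames or []) + list(NEW_COLUMNS)
    writer = csv.DictWriter(
        out_file, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        found = lookup(row.get("Spatial Coverage") or "") or {}
        for col in NEW_COLUMNS:
            # Leave the new cells empty when the place name is not found.
            row[col] = found.get(col, "")
        writer.writerow(row)


def fake_lookup(place_name):
    """Stub with made-up values; a real version would query GeoNames."""
    if place_name == "Exampleville":
        return {
            "GeoNames URI": "https://example.org/place/1",
            "Latitude": "0.0",
            "Longitude": "0.0",
            "Hierarchy": "Example Country > Exampleville",
        }
    return None


source = io.StringIO(
    "Title,Slug,Spatial Coverage\n"
    "Roads,roads-2019,Exampleville\n"
    "Lakes,lakes-2019,Unknown Place\n"
)
out = io.StringIO()
enrich(source, out, fake_lookup)
```

Passing the lookup in as a parameter keeps the csv handling testable on its own, and rows with no match are written with empty cells instead of being dropped.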