4. Clean up CSV metadata
Michelle Janowiecki edited this page Feb 5, 2021 · 15 revisions
Set-up
To complete this step, you need:
- Access to the command line
- Python 3+ with:
- Scripts:
Purpose
This script cleans up CSV metadata in the following areas:
Cleans up punctuation
- Removes punctuation from end of title
- Removes punctuation from end of publisher field
- Removes punctuation from end of the scale field
- Expands common abbreviations in scale and description fields
  - ca. → approximately
  - in. → inches
  - col. → color
  - ill. → illustration(s)
- Creates expanded description by combining title, description, and scale information.
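The punctuation rules above can be illustrated with a minimal Python sketch. This is not the code in cleanUpGeoCSV.py: the function names, the set of trailing characters that are stripped, and the way the three fields are joined are all assumptions made for illustration.

```python
import re

# Abbreviation expansions listed above.
ABBREVIATIONS = {
    "ca.": "approximately",
    "in.": "inches",
    "col.": "color",
    "ill.": "illustration(s)",
}


def strip_trailing_punctuation(value):
    """Remove punctuation and spaces from the end of a field.

    The character set here is an assumption, not the script's actual list.
    """
    return value.rstrip(" .,;:/")


def expand_abbreviations(value):
    """Expand each abbreviation when it is not preceded by a word character."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        # The lookbehind keeps "will." or "begin." from being rewritten.
        pattern = r"(?<!\w)" + re.escape(abbreviation)
        value = re.sub(pattern, expansion, value)
    return value


def build_expanded_description(title, description, scale):
    """Combine title, description, and scale into one sentence-like string."""
    title = strip_trailing_punctuation(title)
    # Expand first, then strip, so a trailing "in." is expanded rather than cut.
    description = strip_trailing_punctuation(expand_abbreviations(description))
    scale = strip_trailing_punctuation(expand_abbreviations(scale))
    parts = [part for part in (title, description, scale) if part]
    return ". ".join(parts) + "." if parts else ""


print(build_expanded_description(
    "Map of Baltimore /",
    "1 map : col. ; 20 x 30 in.",
    "Scale ca. 1:24,000.",
))
```

Note that a simple pattern like this would also expand "in." when it ends an ordinary sentence, so the real script may apply stricter rules.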
Cleans up names
This script removes formatting punctuation from names by replacing them with authorized name labels from the Library of Congress Name Authority File (LCNAF). ✨ This is especially important because GeoBlacklight collates names as strings, so two forms of a name that differ only in punctuation are treated as different names ✨
- Uses URIs from Step 2 to get authorized name labels for creators and contributors
- Conducts a search for authorized publisher labels
- Splits creators column ⤵️
  - verified_authors = verified authorized author labels from LCNAF
  - nv_authors = unverified authors that may still have punctuation issues
- Splits contributors column ⤵️
  - verified_contributors = verified authorized contributor labels from LCNAF
  - nv_contributors = unverified contributors that may still have punctuation issues
- Splits publishers column ⤵️
  - verified_publishers = verified authorized publisher labels from LCNAF
  - nv_publishers = unverified publishers that may still have punctuation issues
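The column split can be pictured with the following sketch. It assumes each name arrives paired with an optional LCNAF URI and that authorized labels have already been collected into a dictionary keyed by URI. The function name, the data shapes, and the sample values (including the placeholder URI) are hypothetical and are not taken from cleanUpGeoCSV.py.

```python
def split_names(entries, labels_by_uri):
    """Split names into verified labels and unverified originals.

    entries: list of (name, uri) pairs, where uri may be None.
    labels_by_uri: dict mapping an LCNAF URI to its authorized label.
    """
    verified, unverified = [], []
    for name, uri in entries:
        label = labels_by_uri.get(uri) if uri else None
        if label:
            # An authorized label was found, so use it in place of the original.
            verified.append(label)
        else:
            # No label available: keep the original string for manual review.
            unverified.append(name)
    return verified, unverified


# Made-up sample data; the URI below is a placeholder, not a real record.
placeholder_uri = "http://id.loc.gov/authorities/names/n00000000"
labels = {placeholder_uri: "Smith, John, 1900-1980"}
creators = [
    ("Smith, John, 1900-1980,", placeholder_uri),
    ("Doe, Jane.", None),
]

verified_authors, nv_authors = split_names(creators, labels)
print(verified_authors)
print(nv_authors)
```

The same function could serve the contributors and publishers columns, with the results written to the corresponding verified_ and nv_ columns.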
Adds field information
- Adds collection title to isPartOf
- Creates spreadsheet columns with default values (assumes scanned/digitized map)
  - rights: Public
  - suppressed: False
  - type: Image
  - geom_type: Image
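As a rough illustration of this step, the sketch below adds the collection title and the default columns to each row using Python's standard csv module. The function name, the sample row, and the collection title are invented for the example; the actual script may structure this differently.

```python
import csv
import io

# Default values listed above (they assume a scanned/digitized map).
DEFAULT_FIELDS = {
    "rights": "Public",
    "suppressed": "False",
    "type": "Image",
    "geom_type": "Image",
}


def add_field_information(rows, collection_title):
    """Return copies of the rows with isPartOf and the default columns added."""
    updated = []
    for row in rows:
        new_row = dict(row)
        new_row["isPartOf"] = collection_title
        new_row.update(DEFAULT_FIELDS)
        updated.append(new_row)
    return updated


# In-memory CSV standing in for the spreadsheet from Step 3.
sample = io.StringIO("title\nMap of Baltimore\n")
rows = add_field_information(csv.DictReader(sample), "Example Map Collection")

output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(output.getvalue())
```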
The script creates a new spreadsheet called "02_marcRecords.csv" and a second spreadsheet, "fullNameResults.csv", containing the results of the name lookups.
Instructions
- Run cleanUpGeoCSV.py using the CSV created in Step 3.
Go to next step (5. Convert LCSH headings to GeoNames)→