README.md

Data

This folder contains data used in the research for the article "Missing Data, Speculative Reading".

Source data

Copies of published Shakespeare and Company Project dataset files are included for convenience.

Current versions should be obtained from the Project site, and should be cited as listed there:

https://shakespeareandco.princeton.edu/about/data/

Research data

Data files in this folder were either generated as part of the research for this article or contain data not published elsewhere.

Book acquisition catalog

beach_lendinglibrary_catalog.csv

This file is a spreadsheet of acquisitions compiled by Robert Chiossi for the Project from an inventory in the Sylvia Beach papers.

“Inventories, Order Records, Clients; Sylvia Beach Papers, C0108,” (n.d.), Manuscripts Division, Department of Special Collections, Princeton University Library, findingaids.princeton.edu/catalog/C0108_c02205.

Partial borrowers

Members with extant but incomplete borrowing records. CSV files list these members and their subscriptions without documented borrowing activity. The collapsed version consolidates sequential or near-sequential subscriptions.

The files were generated by identify_partial_borrowers.py.

partial_borrowers.csv
partial_borrowers_collapsed.csv
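
As a rough illustration of what consolidating sequential or near-sequential subscriptions could look like, here is a minimal sketch. It is not the logic of identify_partial_borrowers.py: the column names (`member_id`, `subscription_start`, `subscription_end`) and the 30-day gap threshold are assumptions made for this example only.

```python
import pandas as pd


def collapse_subscriptions(subs, max_gap_days=30):
    """Merge a member's subscriptions when the gap between them is small.

    Column names and the 30-day default are hypothetical, for illustration.
    """
    subs = subs.sort_values(["member_id", "subscription_start"])
    collapsed = []
    for member_id, group in subs.groupby("member_id", sort=False):
        current_start = current_end = None
        for row in group.itertuples():
            gap_ok = (
                current_end is not None
                and (row.subscription_start - current_end).days <= max_gap_days
            )
            if gap_ok:
                # close enough to the previous subscription: extend it
                current_end = max(current_end, row.subscription_end)
            else:
                # start a new run, saving the previous one if there was one
                if current_start is not None:
                    collapsed.append((member_id, current_start, current_end))
                current_start = row.subscription_start
                current_end = row.subscription_end
        collapsed.append((member_id, current_start, current_end))
    return pd.DataFrame(
        collapsed,
        columns=["member_id", "subscription_start", "subscription_end"],
    )


# made-up example data: member "a" has two near-sequential subscriptions
example_subs = pd.DataFrame(
    {
        "member_id": ["a", "a", "a", "b"],
        "subscription_start": pd.to_datetime(
            ["1925-01-01", "1925-02-05", "1926-01-01", "1930-06-01"]
        ),
        "subscription_end": pd.to_datetime(
            ["1925-02-01", "1925-03-05", "1926-02-01", "1930-07-01"]
        ),
    }
)
collapsed_subs = collapse_subscriptions(example_subs)
```

In this toy input, the first two subscriptions for member "a" are four days apart and collapse into one span, while the 1926 subscription stays separate.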

Long-borrow overrides

In the course of our research, we discovered long-duration borrow events (duration longer than a year) that had been entered incorrectly. These errors are present in the v1.2 datasets, but corrections have been submitted to the Shakespeare and Company Project. Since these errors affect our estimates, we include a list of overrides and a mechanism for applying them.

long_borrow_overrides.csv
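
For readers who want to see which events fall into this category, the following is a minimal sketch of filtering for borrows longer than a year. It assumes the events data has the `event_type` and `borrow_duration_days` columns used in the snippet in the next section. The helper name `find_long_borrows` and the 365-day cutoff are illustrative choices, not part of the project code.

```python
import pandas as pd


def find_long_borrows(events_df, min_days=365):
    """Return borrow events whose recorded duration exceeds min_days."""
    borrows = events_df[events_df.event_type == "Borrow"]
    return borrows[borrows.borrow_duration_days > min_days]


# made-up rows standing in for the events dataset
toy_event_rows = pd.DataFrame(
    {
        "event_type": ["Borrow", "Borrow", "Subscription"],
        "borrow_duration_days": [14, 400, None],
    }
)
long_borrows = find_long_borrows(toy_event_rows)
```

Rows with no recorded duration are excluded, because a comparison against a missing value is never true in pandas.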

Incorporating long borrow corrections

The long borrow corrections are meant to be used with the 1.2 version of the dataset. They can be incorporated like this:

import pandas as pd

events_df = pd.read_csv("SCoData_events_v1.2_2022-01.csv")
borrow_overrides = pd.read_csv("long_borrow_overrides.csv")

for borrow in borrow_overrides.itertuples():
    # all borrow events for this member and this item
    member_item_borrows = events_df[
        (events_df.event_type == "Borrow")
        & (events_df.member_uris == borrow.member_uris)
        & (events_df.item_uri == borrow.item_uri)
    ]
    # match_date names the date field that identifies the row to correct
    if borrow.match_date == "start_date":
        # get the *index* of the row to update
        update_index = member_item_borrows.index[
            member_item_borrows.start_date == borrow.start_date
        ]
    elif borrow.match_date == "end_date":
        update_index = member_item_borrows.index[
            member_item_borrows.end_date == borrow.end_date
        ]
    else:
        # unrecognized match_date: skip rather than reuse a stale index
        continue

    # update with correct dates & borrow duration
    # (.loc rather than .at, since update_index is an Index, not a scalar label)
    events_df.loc[update_index, "start_date"] = borrow.start_date
    events_df.loc[update_index, "end_date"] = borrow.end_date
    events_df.loc[update_index, "borrow_duration_days"] = borrow.borrow_duration_days
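
The same logic can be wrapped in a function and tried on a small in-memory example before running it against the full dataset. This is a sketch under assumptions, not the project's own implementation (the repository also contains longborrow_overrides.py, which is not reproduced here). The function name `apply_borrow_overrides` is hypothetical and the example URIs and dates are made up.

```python
import pandas as pd


def apply_borrow_overrides(events_df, borrow_overrides):
    """Return a copy of events_df with the override dates applied."""
    events_df = events_df.copy()
    for borrow in borrow_overrides.itertuples():
        if borrow.match_date not in ("start_date", "end_date"):
            continue  # unknown match field; leave the data untouched
        # rows for this member and item whose match field equals the override
        mask = (
            (events_df.event_type == "Borrow")
            & (events_df.member_uris == borrow.member_uris)
            & (events_df.item_uri == borrow.item_uri)
            & (events_df[borrow.match_date] == getattr(borrow, borrow.match_date))
        )
        events_df.loc[mask, "start_date"] = borrow.start_date
        events_df.loc[mask, "end_date"] = borrow.end_date
        events_df.loc[mask, "borrow_duration_days"] = borrow.borrow_duration_days
    return events_df


# made-up event with an end date entered one year too late
toy_events = pd.DataFrame(
    {
        "event_type": ["Borrow"],
        "member_uris": ["https://example.org/members/a/"],
        "item_uri": ["https://example.org/books/x/"],
        "start_date": ["1925-01-10"],
        "end_date": ["1926-03-01"],
        "borrow_duration_days": [415],
    }
)
# made-up override: the start date is trusted, the end date is corrected
toy_overrides = pd.DataFrame(
    {
        "member_uris": ["https://example.org/members/a/"],
        "item_uri": ["https://example.org/books/x/"],
        "match_date": ["start_date"],
        "start_date": ["1925-01-10"],
        "end_date": ["1925-03-01"],
        "borrow_duration_days": [50],
    }
)
corrected_events = apply_borrow_overrides(toy_events, toy_overrides)
```

Working on a copy leaves the original events untouched, which makes it easy to compare estimates with and without the corrections.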
