8a. Why add headers and filenames
The header script does two things: it creates new filenames that include metadata, and it adds a header containing that metadata to each file. This way, the metadata is present both in the filename and in the file itself. Before showing you how to run the script, we will define what we mean by metadata, headers, and filenames.
Metadata. Metadata is all the information you have about the texts in your corpus that you would like to keep, for instance, the college, the type of assignment, or the student’s proficiency. Here’s an example of some columns from the demo data metadata spreadsheet:
Headers. We use headers to store the metadata described above inside each text in our corpus. For Crow, we add the metadata for a text between <> at the top of the text. Here is an example of a file header:
Filename. The script standardizes the filenames to include codes that reflect information in your metadata. In Crow, our standard filename looks like this: 106_PS_1_SAU_1_M_10304_UA.txt, where:
- 106 is the course
- PS is the assignment code (Position Argument)
- 1 stands for First Draft
- SAU is the code for the student’s country of origin (Saudi Arabia)
- 1 is the year in school (freshman)
- M is for gender (male)
- 10304 is an internal Crow ID
- UA is for the University of Arizona
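The filename convention above can be sketched in a few lines of Python. This is a minimal illustration, not the CIABATTA header script itself. The field names (`course`, `assignment`, and so on) are hypothetical labels chosen for this sketch, and the header layout of one `<field: value>` line per field is an assumption: the text above only says that the metadata goes between <> at the top of the text.

```python
from typing import NamedTuple


class CrowFilename(NamedTuple):
    """Fields in the order used by 106_PS_1_SAU_1_M_10304_UA.txt.

    The field names are hypothetical labels for this sketch.
    """
    course: str          # e.g. "106"
    assignment: str      # e.g. "PS" (Position Argument)
    draft: str           # e.g. "1" (First Draft)
    country: str         # e.g. "SAU" (Saudi Arabia)
    year_in_school: str  # e.g. "1" (freshman)
    gender: str          # e.g. "M"
    crow_id: str         # internal Crow ID, e.g. "10304"
    institution: str     # e.g. "UA" (University of Arizona)


def parse_filename(filename: str) -> CrowFilename:
    """Split a standardized filename into its metadata codes."""
    stem = filename[:-4] if filename.endswith(".txt") else filename
    parts = stem.split("_")
    if len(parts) != len(CrowFilename._fields):
        raise ValueError(
            f"expected {len(CrowFilename._fields)} codes, got {len(parts)}: {filename!r}"
        )
    return CrowFilename(*parts)


def build_filename(meta: CrowFilename) -> str:
    """Join the metadata codes back into a standardized filename."""
    return "_".join(meta) + ".txt"


def build_header(meta: CrowFilename) -> str:
    """Render the metadata as header lines.

    Assumed layout: one "<field: value>" line per field. The real
    Crow header format may differ.
    """
    return "\n".join(f"<{name}: {value}>" for name, value in meta._asdict().items())
```

For example, `parse_filename("106_PS_1_SAU_1_M_10304_UA.txt").country` gives `"SAU"`, and passing the parsed result to `build_filename` reproduces the original filename. Keeping every code as a string avoids losing leading zeros in IDs.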
A video version of this content is available on the Crow YouTube channel.
Video: Why add headers and filenames?
CIABATTA: Corpus in a Box: Automated Tools, Tutorials, & Advising