8a. Why add headers and filenames

Getting started

The header script does two things: it creates new filenames that include metadata, and it adds a header containing metadata to each file. This way, the metadata is present both in the filename and in the file itself. Before showing you how to run the script, we define what we mean by metadata, headers, and filenames.

Metadata. Metadata is all the information about the texts in your corpus that you would like to keep, for instance, the college, the type of assignment, the student’s proficiency, etc. Here’s an example of some columns from the demo data metadata spreadsheet:

Headers. We use headers to store the metadata described above inside each text in our corpus. For Crow, we add the metadata for a text between <> at the top of the text. Here is an example of a file header:

Filename. The script standardizes the filenames to include codes that reflect information in your metadata. In Crow, our standard filename looks like this: 106_PS_1_SAU_1_M_10304_UA.txt, where:

106 is the course

PS is the assignment code (Position Argument)

1 stands for First Draft

SAU is the code for the student’s country of origin (Saudi Arabia)

1 is the year in school (freshman)

M is for gender (male)

10304 is an internal Crow ID

UA is for the University of Arizona
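
To make the filename convention above concrete, here is a minimal Python sketch that splits a Crow-style filename into its eight codes and joins them back together. It is not the header script itself: the names CrowFilename, parse_filename, and build_filename are hypothetical, and the sketch assumes that codes are always separated by underscores and that no code contains an underscore.

```python
from pathlib import Path
from typing import NamedTuple


class CrowFilename(NamedTuple):
    """The eight codes in a Crow-style filename (hypothetical helper type)."""
    course: str          # e.g. "106"
    assignment: str      # e.g. "PS" (Position Argument)
    draft: str           # e.g. "1" (First Draft)
    country: str         # e.g. "SAU" (Saudi Arabia)
    year_in_school: str  # e.g. "1" (freshman)
    gender: str          # e.g. "M"
    crow_id: str         # internal Crow ID, e.g. "10304"
    institution: str     # e.g. "UA" (University of Arizona)


def parse_filename(filename: str) -> CrowFilename:
    """Split a filename into its codes, assuming '_' separates every code."""
    parts = Path(filename).stem.split("_")
    if len(parts) != len(CrowFilename._fields):
        raise ValueError(
            f"expected {len(CrowFilename._fields)} codes, got {len(parts)}: {filename!r}"
        )
    return CrowFilename(*parts)


def build_filename(codes: CrowFilename) -> str:
    """Join the codes back into a standardized .txt filename."""
    return "_".join(codes) + ".txt"


if __name__ == "__main__":
    codes = parse_filename("106_PS_1_SAU_1_M_10304_UA.txt")
    print(codes.assignment, codes.country, codes.institution)
    print(build_filename(codes))
```

Keeping every code as a string preserves values exactly as they appear in the filename, so splitting a name and rebuilding it returns the original.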

Video presentation

A video version of this content is available on the Crow YouTube channel.

Video: Why add headers and filenames?

Navigating CIABATTA

Previous: 8b. Adding headers and changing filenames script

Next: 9. Deidentifying your data

CIABATTA: Corpus in a Box: Automated Tools, Tutorials, & Advising

See a problem in this wiki? Report an issue. Unsure how to report using GitHub? Get help reporting.
