Takes a simple CSV spreadsheet and a set of files and turns them into the DSpace Simple Archive Format.
- The first row should be your header, which defines the values you're going to provide.
- Only one column is mandatory: 'files'. Files can be organized in any way you want, just provide the proper path relative to the CSV file's location.
- This fork supports the assignment of collections to items. You need the handle of the respective collection. Have a look at the file template.csv for examples.
- Add one column for each metadata element (eg: dc.title)
- The order of the columns does not matter.
- Only Dublin Core metadata elements are supported (for now).
- Use the fully qualified Dublin Core name for each element (eg dc.contributor.author).
- Languages can be specified with the language label in brackets after the element, or by leaving a space after the element name and then listing the language (see template.csv for examples).
- Separate multiple values for an element by double-pipes (||).
- If your metadata value has a comma in it, put some quotes around it. Eg: "Roses are red, violets are blue".
files | collections | dc.title en | dc.contributor.author en | dc.subject | dc.type |
---|---|---|---|---|---|
something1.pdf||something_else1.pdf | 123456789/11||123456789/12 | title 1 | author 1 | subject 1 | Report |
directory/something2.pdf | 123456789/1 | "title 2, with comma" | author 2a||author 2b | subject 2 | Article |
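
The rules above can also be followed programmatically. The following is a minimal sketch, not part of this tool, that uses Python's standard csv module to write the two example rows: values for one element are joined with double pipes, and the csv module adds quotes around any value that contains a comma. The helper names (join_values, write_rows) are hypothetical, and the column names are simply copied from the example table.

```python
import csv
import io

# Column names copied from the example table; adjust them to your own metadata.
FIELDNAMES = [
    "files",
    "collections",
    "dc.title en",
    "dc.contributor.author en",
    "dc.subject",
    "dc.type",
]


def join_values(values):
    """Join multiple values for one element with double pipes (||)."""
    return "||".join(values)


def write_rows(stream, rows):
    """Write the header row, then one row per item.

    Each row is a dict mapping a column name to a list of values.
    csv.DictWriter quotes values that contain commas automatically.
    """
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        writer.writerow({name: join_values(row.get(name, [])) for name in FIELDNAMES})


rows = [
    {
        "files": ["something1.pdf", "something_else1.pdf"],
        "collections": ["123456789/11", "123456789/12"],
        "dc.title en": ["title 1"],
        "dc.contributor.author en": ["author 1"],
        "dc.subject": ["subject 1"],
        "dc.type": ["Report"],
    },
    {
        "files": ["directory/something2.pdf"],
        "collections": ["123456789/1"],
        "dc.title en": ["title 2, with comma"],
        "dc.contributor.author en": ["author 2a", "author 2b"],
        "dc.subject": ["subject 2"],
        "dc.type": ["Article"],
    },
]

buffer = io.StringIO()
write_rows(buffer, rows)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a real file, pass a file opened with open(path, "w", newline="") instead of the in-memory buffer.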
Usage:
./dspace-csv-archive /path/to/input/file.csv
If it is not already there, place the directory in a location where the dspace user can access it and write to it. I recommend putting the directory into /home/dspace/imported-data/ and leaving it there so the mapfile can be easily found if it is needed later, e.g. to remove or modify imported data. One way to do this is
sudo cp -r [directory-name] /home/dspace/imported-data/
sudo chown -R dspace:dspace /home/dspace/imported-data/[directory-name]
Now we are ready to use the import command that comes with DSpace. Be sure to run this command as the dspace user. Something like
[dspace]/bin/dspace import --add --eperson=[importer's email address] --collection=[collection handle] --source=[directory-name] --mapfile=[directory-name]/mapfile
will add the items in the directory to the requested collection. Please refer to the DSpace documentation for more information about the DSpace Simple Archive Format or the import/export commands.
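
If the import is scripted, the command can be assembled as an argument list so that no shell quoting is needed. This is only a sketch: the function name build_import_command is hypothetical, and the install directory, email address, collection handle and directory name below are placeholders standing in for the bracketed values in the command above. Only the flags shown in that command are used.

```python
import shlex


def build_import_command(dspace_dir, eperson, collection, source_dir):
    """Return the DSpace import command as a list of arguments.

    The mapfile is placed inside the source directory, as in the
    command shown above.
    """
    return [
        f"{dspace_dir}/bin/dspace",
        "import",
        "--add",
        f"--eperson={eperson}",
        f"--collection={collection}",
        f"--source={source_dir}",
        f"--mapfile={source_dir}/mapfile",
    ]


# Placeholder values; replace them with your own installation details.
command = build_import_command(
    dspace_dir="/dspace",
    eperson="importer@example.org",
    collection="123456789/1",
    source_dir="/home/dspace/imported-data/my-archive",
)

# Print the command for review before running it.
print(shlex.join(command))
```

Once the printed command looks right, it could be executed as the dspace user with subprocess.run(command, check=True).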
The --mapfile argument is particularly important, and the file that gets generated should be kept along with the rest of the source directory. This file is required for deleting or modifying the imported files using the command-line tools.
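
To check what a mapfile recorded, it can be read with a few lines of Python. The sketch below assumes that each line of the mapfile holds an item directory name and the handle assigned to it, separated by whitespace; verify this against your own mapfile and the DSpace documentation before relying on it. The function name parse_mapfile and the sample contents are made up for illustration.

```python
def parse_mapfile(text):
    """Return a dict mapping item directory names to handles.

    Assumes one "directory handle" pair per line; lines that do not
    have exactly two fields are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            directory, handle = parts
            mapping[directory] = handle
    return mapping


# Made-up sample contents; a real mapfile is written by the import command.
sample = "item_000 123456789/101\nitem_001 123456789/102\n"
handles = parse_mapfile(sample)

for directory, handle in handles.items():
    print(f"{directory} -> {handle}")
```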
The directory dist contains a Windows executable. This was created with PyInstaller and the following command:
pyinstaller -F dspace-csv-archive.py