Useful utilities for managing OpenCitations data
-
Download the citation data CSV from the OpenCitations website and unzip it
unzip data.csv.zip -d [path/to/folder]
-
Get the unique DOIs from the file, stripping the surrounding quotes
awk -F',' 'FNR > 1 && index($0, "10.") {gsub(/"/, "", $2); gsub(/"/, "", $3); print $2"\n"$3}' data.csv | sort -u -T [dir/for/tmp/files] > [unique/DOIs/file]
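To see what the extraction does, the step can be tried on a tiny sample. This is only an illustrative sketch: the three-column layout (OCI, citing DOI, cited DOI) and the `10.1000/...` DOIs are made-up assumptions. It skips the header and keeps rows containing `10.` in a single awk call.

```shell
# Hypothetical sample in the assumed layout: oci, citing DOI, cited DOI
sample=$(mktemp)
cat > "$sample" <<'EOF'
"oci","citing","cited"
"001","10.1000/a","10.1000/b"
"002","10.1000/b","10.1000/c"
"003","10.1000/a","10.1000/c"
EOF

# Skip the header (FNR > 1), keep rows containing the fixed string "10.",
# strip the quotes from columns 2 and 3, and print each DOI on its own line.
# sort -u then removes the duplicates.
unique_dois=$(awk -F',' 'FNR > 1 && index($0, "10.") {gsub(/"/, "", $2); gsub(/"/, "", $3); print $2"\n"$3}' "$sample" | sort -u)

printf '%s\n' "$unique_dois"
rm -f "$sample"
```

The six DOI occurrences in the sample collapse to three unique lines. On the full dump, `-T` matters because `sort` spills to temporary files when the input does not fit in memory.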
-
Run the script to fetch the paper details; the redirect collects the DOIs that errored in [errored_dois.csv]
node getCrossrefWorks.js [input/file/with/DOIs] [output/file] [email] > [errored_dois.csv]
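For orientation, a single Crossref lookup can be sketched with curl. This is not the script's actual implementation. It assumes the `[email]` argument is sent as the `mailto` parameter that the Crossref REST API accepts for its polite pool. The function names `crossref_url` and `fetch_work` are hypothetical.

```shell
# Build the Crossref REST API URL for one DOI (hypothetical helper).
crossref_url() {
  doi=$1
  email=$2
  printf 'https://api.crossref.org/works/%s?mailto=%s\n' "$doi" "$email"
}

# Fetch one DOI and append the JSON response to the output file ($3).
# If the request fails, print the DOI to stdout instead, on the assumption
# that errored DOIs are collected from stdout as in the command above.
fetch_work() {
  curl -sf "$(crossref_url "$1" "$2")" >> "$3" || printf '%s\n' "$1"
}

# Only the URL is built here; no network request is made.
crossref_url "10.1000/xyz" "me@example.org"
```

Calling `fetch_work 10.1000/xyz me@example.org works.json > errored_dois.csv` would perform the request, with `works.json` being a placeholder name.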
-
Before querying the Crossref API, some OpenCitations papers can be retrieved from other sources:
node src/findDataInCollections.js [unique/dois/file] [output/paper/details/file] [output/not/found/dois/file]
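The found / not-found split can be illustrated with plain files. This sketch does not reproduce what `findDataInCollections.js` does internally. It assumes, purely for illustration, that a local collection is a text file with one known DOI per line; all file names and DOIs are hypothetical.

```shell
work=$(mktemp -d)
printf '%s\n' 10.1000/a 10.1000/b 10.1000/c > "$work/unique_dois.txt"
printf '%s\n' 10.1000/b 10.1000/z > "$work/known_dois.txt"   # hypothetical local collection

# DOIs already available locally: exact whole-line, fixed-string matches
found=$(grep -Fxf "$work/known_dois.txt" "$work/unique_dois.txt")

# DOIs that still have to be requested from Crossref: the inverse match
not_found=$(grep -Fxvf "$work/known_dois.txt" "$work/unique_dois.txt")

printf 'found:\n%s\n' "$found"
printf 'not found:\n%s\n' "$not_found"
rm -rf "$work"
```

Passing only the not-found list to the Crossref step keeps the number of API requests down.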