This repository contains scripts to analyze, convert, and publish Dewey Decimal Classification (DDC) numbers found in the K10plus catalogue. The analysis is mainly based on coli-ana DDC number decomposition.
See the subdirectory publication for the script that generates the data publication https://doi.org/10.5281/zenodo.10569321
Install the dependencies with:
npm ci
Read a list of (numerically sorted) DDC numbers and generate an analysis with coli-ana.
The script bin/k10plus-patch.js:
- reads PICA+ records (or PPNs to retrieve records from K10plus)
- extracts DDC fields from the records
- retrieves the DDC analysis for each number (cached in a local database)
- emits PICA Patch files to modify the records
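The extraction and caching steps listed above can be sketched roughly as follows. This is not the code of bin/k10plus-patch.js, only a minimal illustration under assumptions: records are taken to be in PICA Plain serialization without escaped dollar signs, the DDC notation is assumed to be in subfield $a of field 045F, a Map stands in for the SQLite cache, and `lookup` is a hypothetical stand-in for a request to the coli-ana API.

```javascript
// Parse one PICA Plain line such as "045F $a330.9$ENASB".
// Assumption: values contain no escaped dollar signs ("$$").
function parseLine(line) {
  const match = /^(\d{3}[A-Z@])(?:\/\d{2,3})?\s+\$(.*)$/.exec(line)
  if (!match) return null
  const subfields = match[2]
    .split("$")
    .filter(s => s.length > 0)
    .map(s => ({ code: s[0], value: s.slice(1) }))
  return { tag: match[1], subfields }
}

// Collect DDC notations from a record.
// Assumption: field 045F, subfield $a (both are configurable here).
function extractDDC(record, tag = "045F", code = "a") {
  const notations = []
  for (const line of record.split("\n")) {
    const field = parseLine(line.trim())
    if (!field || field.tag !== tag) continue
    for (const sf of field.subfields) {
      if (sf.code === code) notations.push(sf.value)
    }
  }
  return notations
}

// Wrap a lookup function so each notation is analyzed only once.
// The Map is an in-memory stand-in for a persistent cache.
function makeCachedLookup(lookup) {
  const cache = new Map()
  return notation => {
    if (!cache.has(notation)) cache.set(notation, lookup(notation))
    return cache.get(notation)
  }
}

const sample = "003@ $0123456789\n045F $a004.6\n045F $a330.9$ENASB"
console.log(extractDDC(sample)) // [ '004.6', '330.9' ]
```

Passing the lookup function as an argument keeps the sketch independent of any particular API client, so the caching behaviour can be tried out without network access.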
Usage: k10plus-patch [options] < file

Check and extend DDC numbers in PICA records of K10plus catalogue

Options:
  -a, --api <URL>        coli-ana API endpoint
  -c, --continue <ppn>   continue after given PPN (expect sorted)
  -f, --format <name>    PICA+ serialization (default: plain)
  -i, --input <file>     input file (default: - for STDIN)
  -d, --database <file>  optional SQLite file for caching
  -p, --ppns             input is list of PPNs instead of PICA records
  -h, --help             display help for command
Given the full analysis from the coli-ana API in JSKOS format as published at https://doi.org/10.5281/zenodo.10569320, this jq script can be used to simplify the JSKOS records for the creation of PDF files for each DDC number:
zcat ddc-decomposition.ndjson.gz | jq -c -f simplify-for-pdf.jq > ddc-pdf-data.ndjson
Calculate the frequency of individual DDC elements in the analysis result and emit it as CSV or as a JSKOS concept list.
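The counting step could look something like the sketch below. It assumes that each analysis result is a JSKOS concept whose `memberList` array holds the DDC elements of the decomposition, each with its own `notation` array, and that unknown members may be `null`. The function names and the CSV column names are illustrative and not taken from the repository.

```javascript
// Count how often each DDC element occurs across analysis results.
// Assumption: elements are listed in `memberList`, identified by notation[0].
function countElements(records) {
  const counts = new Map()
  for (const record of records) {
    for (const member of record.memberList || []) {
      const notation = member?.notation?.[0]
      if (!notation) continue // skip null or incomplete members
      counts.set(notation, (counts.get(notation) || 0) + 1)
    }
  }
  return counts
}

// Serialize the counts as CSV, most frequent element first,
// ties broken by notation. Notations are always quoted.
function toCSV(counts) {
  const rows = [...counts].sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  )
  const lines = rows.map(([n, c]) => `"${n.replace(/"/g, '""')}",${c}`)
  return ["notation,count", ...lines].join("\n")
}
```

For newline-delimited JSON input, each line would be parsed with `JSON.parse` and the resulting objects passed to `countElements`.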
ddcs
Sorted DDC numbers found in K10plus, each with its number of occurrences. The data is generated from K10plus Subjects.
- coli-ana API to analyze DDC numbers
- K10plus Subjects to analyze, extract and publish subject indexing data (including DDC but also other systems) from K10plus