Home
Provides tools to read/write/publish metadata based on the Atom XML syndication format. This includes support of Dublin Core XML implementation, and a client to APIs implementing the AtomPub SWORD API specification.
If you wish to sponsor atom4R, do not hesitate to contact me.
Many thanks to the following organizations that have provided funding for strengthening the atom4R package:
Table of contents
1. Overview
2. Package status
3. Credits
4. User guide
4.1 Installation
4.2 Write and Read Atom XML records
4.2.1 Atom Feed and Entry objects
4.2.2 Dublin Core Entry objects
4.3 Publish Atom XML records
4.3.1 SWORD API for Dataverse
4.3.1.1 Create Dataverse record
4.3.1.2 Read Dataverse record
4.3.1.3 Update Dataverse record
4.3.1.4 Delete Dataverse record
4.3.1.5 Add/Remove files in a Dataverse record
4.3.1.6 Publish Dataverse record
5. Issue reporting
The atom4R package provides tools to read/write/publish metadata based on the Atom XML syndication format. This includes support of Dublin Core XML implementation, and a client to APIs implementing the AtomPub SWORD API specification.
An introduction to the Atom web standard(s), including the Atom XML Syndication format and the AtomPub protocol, can be found at https://en.wikipedia.org/wiki/Atom_(Web_standard).
atom4R is developed jointly with geoflow, which aims to facilitate and automate the production of geographic metadata documents and their associated data sources. There, atom4R is used to assign DOIs and cross-reference these DOIs in other metadata documents such as geographic metadata (ISO 19115/19139) hosted in metadata catalogues and open data portals.
- June 2020: Inception. Source code managed on GitHub.
- Coming soon: Publication on CRAN.
(c) 2020, Emmanuel Blondel
Package distributed under MIT license.
If you use atom4R, I would be very grateful if you could add a citation in your published work. By citing atom4R, beyond acknowledging the work, you help make it more visible and support its growth and sustainability. For citation, please use the DOI:
For now, the package can be installed from GitHub:
install.packages("devtools")
Once the devtools package is loaded, you can use install_github to install atom4R. By default, the package is installed from master, which is the current development version (likely to be unstable).
require("devtools")
install_github("eblondel/atom4R")
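The installation steps above can also be wrapped in a small check, so that a script only installs what is missing. This is a minimal sketch: `ensure_atom4R` is a hypothetical helper written for this page, not a function provided by atom4R or devtools.

```r
# Hypothetical helper (not part of atom4R): returns TRUE when atom4R is usable.
# With install = TRUE, it installs devtools and atom4R from GitHub if missing.
ensure_atom4R <- function(install = FALSE) {
  if (!requireNamespace("atom4R", quietly = TRUE) && install) {
    if (!requireNamespace("devtools", quietly = TRUE)) {
      install.packages("devtools")
    }
    devtools::install_github("eblondel/atom4R")
  }
  requireNamespace("atom4R", quietly = TRUE)
}

# Check only; pass install = TRUE to actually install the package
atom4R_available <- ensure_atom4R(install = FALSE)
```

Calling the helper with `install = FALSE` never changes the local library, which is convenient in shared or automated environments.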
4.2.1 Atom Feed and Entry objects
The example below shows how to create an AtomFeed object, add an AtomEntry to it, encode it as XML, and read an AtomFeed back from XML.
#encoding
atom <- AtomFeed$new()
atom$setId("my-atom-feed")
atom$setTitle("My Atom feed title")
atom$setSubtitle("MyAtom feed subtitle")
author1 <- AtomAuthor$new(
  name = "John Doe",
  uri = "http://www.atomxml.com/johndoe",
  email = "johndoe@atom4R.com"
)
atom$addAuthor(author1)
author2 <- AtomAuthor$new(
  name = "John Doe's sister",
  uri = "http://www.atomxml.com/johndoesister",
  email = "johndoesister@atom4R.com"
)
atom$addAuthor(author2)
contrib1 <- AtomContributor$new(
  name = "Contrib1",
  uri = "http://www.atomxml.com/contrib1",
  email = "contrib1@atom4R.com"
)
atom$addContributor(contrib1)
contrib2 <- AtomContributor$new(
  name = "Contrib2",
  uri = "http://www.atomxml.com/contrib2",
  email = "contrib2@atom4R.com"
)
atom$addContributor(contrib2)
atom$setIcon("https://via.placeholder.com/300x150.png/03f/fff?text=atom4R")
atom$setSelfLink("http://example.com/atom.feed")
atom$setAlternateLink("http://example.com/my-atom-feed")
atom$addCategory("dataset")
atom$addCategory("spatial")
atom$addCategory("fisheries")
#add entry
entry <- AtomEntry$new()
entry$setId("my-atom-entry")
entry$setTitle("My Atom feed entry")
entry$setSummary("My Atom feed entry very comprehensive abstract")
author1 <- AtomAuthor$new(
  name = "John Doe",
  uri = "http://www.atomxml.com/johndoe",
  email = "johndoe@atom4R.com"
)
entry$addAuthor(author1)
author2 <- AtomAuthor$new(
  name = "John Doe's sister",
  uri = "http://www.atomxml.com/johndoesister",
  email = "johndoesister@atom4R.com"
)
entry$addAuthor(author2)
contrib1 <- AtomContributor$new(
  name = "Contrib1",
  uri = "http://www.atomxml.com/contrib1",
  email = "contrib1@atom4R.com"
)
entry$addContributor(contrib1)
contrib2 <- AtomContributor$new(
  name = "Contrib2",
  uri = "http://www.atomxml.com/contrib2",
  email = "contrib2@atom4R.com"
)
entry$addContributor(contrib2)
entry$addCategory("dataset")
entry$addCategory("spatial")
entry$addCategory("fisheries")
atom$addEntry(entry)
xml <- atom$encode()
#decoding
atom2 <- AtomFeed$new(xml = xml)
xml2 <- atom2$encode()
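In practice, the encoded XML is often written to disk and read back later. The sketch below illustrates one way to do this with the XML package. It rests on two assumptions not stated on this page: that the object returned by `encode()` can be serialized with `XML::saveXML()`, and that `AtomFeed$new(xml = ...)` accepts the root node of a parsed document. The block is skipped when atom4R or XML is not installed.

```r
saved_file <- NULL

if (requireNamespace("atom4R", quietly = TRUE) &&
    requireNamespace("XML", quietly = TRUE)) {
  library(atom4R)

  # Build a minimal feed and encode it
  feed <- AtomFeed$new()
  feed$setId("my-atom-feed")
  feed$setTitle("My Atom feed title")
  xml <- feed$encode()

  # Assumption: the encoded object can be serialized with XML::saveXML()
  saved_file <- tempfile(fileext = ".xml")
  writeLines(XML::saveXML(xml), saved_file)

  # Parse the file again and rebuild a feed object from its root node
  # (assumption: the constructor accepts a parsed XML node)
  doc <- XML::xmlParse(saved_file)
  feed2 <- AtomFeed$new(xml = XML::xmlRoot(doc))
}
```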
4.2.2 Dublin Core Entry objects
The example below shows how to create a DCEntry object in R, encode it as XML, and read a DCEntry back from XML. A DCEntry can be used as an AtomEntry in an AtomFeed object.
#encoding
dcentry <- DCEntry$new()
dcentry$setId("my-dc-entry")
#fill dc entry
dcentry$addDCDate(Sys.time())
dcentry$addDCTitle("atom4R - Tools to read/write and publish metadata as Atom XML format")
dcentry$addDCType("Software")
creator <- DCCreator$new(value = "Blondel, Emmanuel")
creator$attrs[["affiliation"]] <- "Independent"
dcentry$addDCCreator(creator)
dcentry$addDCSubject("R")
dcentry$addDCSubject("FAIR")
dcentry$addDCSubject("Interoperability")
dcentry$addDCSubject("Open Science")
dcentry$addDCDescription("Atom4R offers tools to read/write and publish metadata as Atom XML syndication format, including Dublin Core entries. Publication can be done using the Sword API which implements AtomPub API specifications")
dcentry$addDCPublisher("GitHub")
funder <- DCContributor$new(value = "CNRS")
dcentry$addDCContributor(funder)
dcentry$addDCRelation("Github repository: https://github.com/eblondel/atom4R")
dcentry$addDCSource("Atom Syndication format - https://www.ietf.org/rfc/rfc4287")
dcentry$addDCSource("AtomPub, The Atom publishing protocol - https://tools.ietf.org/html/rfc5023")
dcentry$addDCSource("Sword API - http://swordapp.org/")
dcentry$addDCSource("Dublin Core Metadata Initiative - https://www.dublincore.org/")
dcentry$addDCSource("Guidelines for implementing Dublin Core in XML - https://www.dublincore.org/specifications/dublin-core/dc-xml-guidelines/")
dcentry$addDCLicense("NONE")
dcentry$addDCRights("MIT License")
xml <- dcentry$encode()
#decoding
dcentry2 <- DCEntry$new(xml = xml)
xml2 <- dcentry2$encode()
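Since a DCEntry can be used as an AtomEntry in an AtomFeed, the two examples can be combined. The following is a minimal sketch using only the methods shown on this page; the identifiers and titles are placeholders, and the block is skipped when atom4R is not installed.

```r
dc_feed_built <- FALSE

if (requireNamespace("atom4R", quietly = TRUE)) {
  library(atom4R)

  # A feed that will hold Dublin Core entries
  feed <- AtomFeed$new()
  feed$setId("my-dc-feed")
  feed$setTitle("Feed of Dublin Core entries")

  # A minimal Dublin Core entry
  dcentry <- DCEntry$new()
  dcentry$setId("my-dc-entry")
  dcentry$addDCTitle("My Dublin Core entry title")
  dcentry$addDCSubject("R")

  # Add the DCEntry to the feed like any AtomEntry, then encode the feed
  feed$addEntry(dcentry)
  xml <- feed$encode()
  dc_feed_built <- TRUE
}
```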
The Atom Publishing Protocol (AtomPub or APP) is a simple HTTP-based protocol for creating and updating web resources.
atom4R aims to offer a standard R interface to APIs implementing the AtomPub protocol. One of the key APIs the package targets is the SWORD API. For the time being, atom4R offers an R interface to SWORD API v2, using the open-source Dataverse as its main testing platform. Additional platforms implementing AtomPub / SWORD may be tested depending on user community needs.
4.3.1 SWORD API for Dataverse
An interface to the Dataverse SWORD API is provided in atom4R by the SwordDataverseClient class. To connect to the Dataverse SWORD API, run the following code (filling in your Dataverse hostname and user token):
SWORD <- SwordDataverseClient$new(
  hostname = "localhost:8085",
  token = "<token>",
  logger = "DEBUG"
)
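To avoid hard-coding the token in scripts, the connection parameters can be read from environment variables. This is a sketch only: the variable names `DATAVERSE_HOSTNAME` and `DATAVERSE_TOKEN` are assumptions chosen for this example, not names recognized by atom4R or Dataverse.

```r
# Hypothetical environment variable names, chosen for this example
hostname <- Sys.getenv("DATAVERSE_HOSTNAME", unset = "localhost:8085")
token <- Sys.getenv("DATAVERSE_TOKEN", unset = "")

if (nzchar(token) && requireNamespace("atom4R", quietly = TRUE)) {
  # Same constructor call as above, with values taken from the environment
  SWORD <- atom4R::SwordDataverseClient$new(
    hostname = hostname,
    token = token,
    logger = "DEBUG"
  )
} else {
  message("No token set or atom4R not installed: skipping SWORD connection")
}
```

The variables can be set in the shell or in a local `.Renviron` file that is kept out of version control.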
The following sections detail how to run the SWORD API operations with atom4R.
4.3.1.1. Create Dataverse Record
To create a Dataverse record, specify the ID of the dataverse (collection) in which you want to deposit the record. The record should be an object of class DCEntry.
#Create with SWORD
out <- SWORD$createDataverseRecord("<dataverse ID>", dcentry)
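A deposit is a remote call and can fail, for instance with a wrong dataverse ID or an invalid token. The sketch below wraps the call in `tryCatch`. `safe_create` is a hypothetical helper, and whether `createDataverseRecord` signals an R error on failure is an assumption; the exact content of the returned object is not described here.

```r
# Hypothetical helper: returns the result of the deposit, or NULL if the
# call signals an R error (assumption: failures are raised as errors)
safe_create <- function(client, dataverse_id, entry) {
  tryCatch(
    client$createDataverseRecord(dataverse_id, entry),
    error = function(e) {
      message("Deposit failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage with the objects defined earlier on this page:
# out <- safe_create(SWORD, "<dataverse ID>", dcentry)
```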
4.3.1.2. Read Dataverse Record
To read an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record. This is a string giving the DOI reserved by Dataverse when the record was created:
#Read with SWORD
out <- SWORD$getDataverseRecord("doi:10.XXX/10XXXX")
4.3.1.3. Update Dataverse Record
To update an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record. This is a string giving the DOI reserved by Dataverse when the record was created:
#Update with SWORD
out <- SWORD$updateDataverseRecord("<dataverse ID>", dcentry, "doi:10.XXX/10XXXX")
4.3.1.4. Delete Dataverse Record
To delete an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record. This is a string giving the DOI reserved by Dataverse when the record was created:
#Delete with SWORD
out <- SWORD$deleteDataverseRecord("doi:10.XXX/10XXXX")
4.3.1.5. Add/Remove files in a Dataverse record
One or more files can be added to a Dataverse record. As with the other methods, the global identifier (the DOI assigned by Dataverse) is required to locate the record to which the files should be added.
SWORD$addFilesToDataverseRecord("doi:10.XXX/10XXXX", files = c("file1", "file2", ...))
The files should be passed as a simple character vector giving the file name(s).
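The file vector does not have to be typed by hand; it can be built with base R. The sketch below creates two small CSV files in a temporary directory and collects them with `list.files`. It assumes that full file paths are accepted in the `files` argument, and it only calls the API if a `SWORD` client object already exists in the session.

```r
# Create two demo files in a temporary directory
data_dir <- file.path(tempdir(), "atom4R-upload-demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("a,b", "1,2"), file.path(data_dir, "table1.csv"))
writeLines(c("a,b", "3,4"), file.path(data_dir, "table2.csv"))

# Collect all CSV files of the directory as a character vector of paths
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Only attempt the upload when a SWORD client has been created beforehand
# (assumption: full paths are accepted in the files argument)
if (exists("SWORD") && inherits(SWORD, "SwordDataverseClient")) {
  SWORD$addFilesToDataverseRecord("doi:10.XXX/10XXXX", files = files)
}
```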
In a similar way, files can be removed from a Dataverse record. To delete all files:
SWORD$deleteFilesFromDataverseRecord("doi:10.XXX/10XXXX")
To delete specific files, use the files argument, similarly to the addFilesToDataverseRecord method.
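As an illustration of the `files` argument, the sketch below removes two named files from a record. It assumes that files are identified by the same names used when they were added, and it only runs if a `SWORD` client object already exists in the session. The DOI is a placeholder.

```r
# Names of the files to remove (placeholders)
files_to_remove <- c("file1", "file2")

# Only call the API when a SWORD client has been created beforehand
if (exists("SWORD") && inherits(SWORD, "SwordDataverseClient")) {
  SWORD$deleteFilesFromDataverseRecord(
    "doi:10.XXX/10XXXX",
    files = files_to_remove
  )
}
```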
4.3.1.6. Publish Dataverse record
To publish an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record. This is a string giving the DOI reserved by Dataverse when the record was created:
#Publish with SWORD
out <- SWORD$publishDataverseRecord("doi:10.XXX/10XXXX")
You cannot delete a published Dataverse record yourself; to have it deleted, contact your Dataverse administrator. It is, however, possible to edit a published record. Publishing it again creates a new version of the record in Dataverse.
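Editing a published record and publishing it again can be chained in a small function. This is a sketch built from the update and publish methods shown above; `republish_record` is a hypothetical helper, not part of atom4R, and it does not inspect what the two calls return.

```r
# Hypothetical helper: update an existing record, then publish it again,
# which creates a new version of the record in Dataverse
republish_record <- function(client, dataverse_id, entry, doi) {
  updated <- client$updateDataverseRecord(dataverse_id, entry, doi)
  published <- client$publishDataverseRecord(doi)
  list(updated = updated, published = published)
}

# Usage with the objects defined earlier on this page:
# res <- republish_record(SWORD, "<dataverse ID>", dcentry, "doi:10.XXX/10XXXX")
```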
Issues can be reported at https://github.com/eblondel/atom4R/issues
Related to Dataverse
- Any way to search datasets / get dataset by "other ID" instead of DOI: https://github.com/IQSS/dataverse/issues/6952
- Dataverse QA integration tests with Docker - need of default token: https://github.com/IQSS/dataverse-docker/issues/40