Using the Linked Data module to access authorities

E. Lynette Rayle edited this page Apr 25, 2022 · 10 revisions

Overview

The linked data module provides access to many authorities through their linked data APIs. Common code makes requests to each external authority's linked data API and processes the linked data results that are returned. A configuration file describes the access URLs and how to interpret the returned data.

Authorities that fit best with this module follow some basic rules...

  • search API URL allows a query string to be passed and returns results in some serialization of linked data
    • While not required, for best processing, the search results should include a rank predicate that indicates the order of the returned results
  • term fetch API URL allows an ID or URI to be passed and returns data about the term in some serialization of linked data

Prerequisites

Include required gem for processing linked data

You will need to add the ruby-rdf/linkeddata gem, which processes a large number of linked data serializations.

gem 'linkeddata'

NOTE: This gem is included in QA for development and testing of QA, but is not automatically included in the released gem.

Create access to a new authority that supports linked data

To create a new authority, you will need to...

  • identify search API URL that returns linked data
  • identify term fetch API URL that returns linked data
  • create configuration with the API URLs and how to interpret data results (See documentation on creating configurations.)

Place the new configuration in the config/authorities/linked_data directory.

Using existing authority configurations

There are existing configurations for many commonly used authorities. You can get the configuration file and find more documentation about their linked data APIs at...

To use one, copy the configuration file into the config/authorities/linked_data directory.

NOTE: The list of configurations created by the LD4 series of grants is available at ld4p/linked_data_authorities. Any config whose name ends in _DIRECT goes directly against an external authority and can be used with any application. The others can serve as a starting point if you are creating your own cache of the authority's data.

Accessing via QA

Authority: Use the name of the configuration file (e.g. oclc_fast.json has the authority name oclc_fast)

Subauthorities:

Subauthorities, if supported, are defined in the configuration file. They are defined separately for search and term fetch.

For oclc_fast, the config file defines the search subauthorities as...

"subauthorities": {
  "topic":          "oclc.topic",
  "geographic":     "oclc.geographic",
  "event_name":     "oclc.eventName",
  "personal_name":  "oclc.personalName",
  "corporate_name": "oclc.corporateName",
  "uniform_title":  "oclc.uniformTitle",
  "period":         "oclc.period",
  "form":           "oclc.form",
  "alt_lc":         "oclc.altlc"
}

NOTE: The key in the subauthorities hash is the subauthority name used in QA requests. The value in the subauthorities hash is the value required by the external authority's API. For more information on configuring subauthorities, see the configuration documentation.
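
The key-to-value mapping described in the note can be sketched in Ruby. This is an illustration only, not QA's implementation: the helper name external_subauthority is hypothetical, and the JSON is a trimmed excerpt of the oclc_fast subauthorities shown above.

```ruby
require 'json'

# Trimmed excerpt of the search subauthorities from the oclc_fast config above.
config_json = <<~JSON
  {
    "subauthorities": {
      "topic":         "oclc.topic",
      "personal_name": "oclc.personalName",
      "alt_lc":        "oclc.altlc"
    }
  }
JSON

subauthorities = JSON.parse(config_json).fetch('subauthorities')

# Hypothetical helper: translate the subauthority name used in a QA request
# (the hash key) into the value the external authority's API expects.
def external_subauthority(subauthorities, qa_name)
  subauthorities.fetch(qa_name) do
    raise ArgumentError, "unknown subauthority: #{qa_name}"
  end
end

puts external_subauthority(subauthorities, 'personal_name')
```

Raising on an unknown name, rather than passing it through, keeps a mistyped subauthority from silently reaching the external API.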


Mount engine in routes

The examples in this document include a starting path ENGINE_MOUNT. The value of this is typically qa or authorities. It is defined in /config/routes.rb.

  mount Qa::Engine => '/authorities'

List authorities

To list all currently loaded authorities:

/ENGINE_MOUNT/list/linked_data/authorities


Reload authorities

If you add an authority to the directory holding authorities, it won't be picked up until the server restarts. You can force a reload without restarting the server by requesting the reload URL with your reload token passed in the auth_token parameter.

/ENGINE_MOUNT/reload/linked_data/authorities?auth_token=YOUR_AUTH_TOKEN

NOTE: YOUR_AUTH_TOKEN is defined in config/initializers/qa.rb as config.authorized_reload_token.
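
If the reload is triggered from a script, the token should be URL-encoded in case it contains reserved characters. A minimal sketch, assuming a hypothetical helper named reload_authorities_path and an engine mounted at authorities:

```ruby
require 'uri'

# Hypothetical helper: build the reload path for a given engine mount and token.
# URI.encode_www_form escapes characters such as '&' or spaces in the token.
def reload_authorities_path(engine_mount, auth_token)
  query = URI.encode_www_form(auth_token: auth_token)
  "/#{engine_mount}/reload/linked_data/authorities?#{query}"
end

puts reload_authorities_path('authorities', 'YOUR_AUTH_TOKEN')
```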


Example search queries

The linked data module supports many different external linked data authorities. The OCLC FAST authority, which is included in QA, is being used in these examples to demonstrate how to search for terms using the linked data authority module in QA.

All authorities support the following parameters for search:

  • lang - if supported, return literals tagged with this language plus literals without language tags (optional). This can also be set through the 'HTTP_ACCEPT_LANGUAGE' header.
  • context - if true and if supported by the authority, additional context will be returned for each response in the query results
  • performance_data - if true, the response will include performance data along with the results
  • response_header - if true, metadata about the request and response will be included in the response along with the results
  • format - currently only supports json

Configurations generally support the following parameters using a consistent naming scheme even if the external authority uses a different name for these parameters. The actual name of the parameter is defined in the configuration. See qa_replacement_patterns in the example configuration.

  • q - the string query
  • maxRecords - limit the number of returned records (optional)

Example search OCLC_FAST without a subauthority:

/ENGINE_MOUNT/search/linked_data/oclc_fast?q=twain&maximumRecords=2

Result:

[
  {"uri":"http://id.worldcat.org/fast/1914919","id":"1914919","label":"Life on the Mississippi (Twain, Mark)"},
  {"uri":"http://id.worldcat.org/fast/1796341","id":"1796341","label":"Works (Twain, Mark)"}
]
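
A client might build this request path and read the result along the following lines. This is a sketch under stated assumptions: search_path is a hypothetical helper, the engine is assumed to be mounted at authorities, and the response body is the sample result copied from above rather than a live response.

```ruby
require 'json'
require 'uri'

# Hypothetical helper: build a QA linked data search path. Extra keyword
# arguments (e.g. maximumRecords) are passed through as query parameters.
def search_path(engine_mount, authority, query, subauthority: nil, **params)
  segments = [engine_mount, 'search', 'linked_data', authority, subauthority].compact
  "/#{segments.join('/')}?#{URI.encode_www_form({ q: query }.merge(params))}"
end

puts search_path('authorities', 'oclc_fast', 'twain', maximumRecords: 2)

# Sample response body copied from the result above.
body = <<~JSON
  [
    {"uri":"http://id.worldcat.org/fast/1914919","id":"1914919","label":"Life on the Mississippi (Twain, Mark)"},
    {"uri":"http://id.worldcat.org/fast/1796341","id":"1796341","label":"Works (Twain, Mark)"}
  ]
JSON

# Each hit is a hash with "uri", "id", and "label" keys.
hits = JSON.parse(body)
hits.each { |hit| puts "#{hit['label']} <#{hit['uri']}>" }
```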

NOTE: The QA request is converted to the following OCLC FAST request. This is for information only. You do not need to know this to use QA.

http://experimental.worldcat.org/fast/search?query=cql.any+all+%22twain%22&sortKeys=usage&maximumRecords=2

Example search OCLC_FAST with subauthority personal_name:

/ENGINE_MOUNT/search/linked_data/oclc_fast/personal_name?q=twain&maximumRecords=2

Result:

[
  {"uri":"http://id.worldcat.org/fast/1580187","id":"1580187","label":"Braden, Twain"},
  {"uri":"http://id.worldcat.org/fast/365563","id":"365563","label":"Twain, Shania"}
]

NOTE: The QA request is converted to the following OCLC FAST request. This is for information only. You do not need to know this to use QA.

http://experimental.worldcat.org/fast/search?query=oclc.personalName+all+%22twain%22&sortKeys=usage&maximumRecords=2
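
To make the conversion concrete, the two OCLC FAST requests shown in these examples can be reproduced with a small Ruby sketch. It is not how QA builds the URL internally (QA drives this from the configuration file). The helper name fast_search_url is hypothetical, and the default index cql.any is inferred from the example request without a subauthority.

```ruby
require 'uri'

# Hypothetical helper that mirrors the shape of the example OCLC FAST requests.
# external_subauthority is the value from the config's subauthorities hash;
# 'cql.any' is assumed as the default because the first example uses it.
def fast_search_url(query, external_subauthority: 'cql.any', maximum_records: nil)
  base = 'http://experimental.worldcat.org/fast/search'
  cql = "#{external_subauthority}+all+%22#{URI.encode_www_form_component(query)}%22"
  url = "#{base}?query=#{cql}&sortKeys=usage"
  url += "&maximumRecords=#{Integer(maximum_records)}" if maximum_records
  url
end

puts fast_search_url('twain', maximum_records: 2)
puts fast_search_url('twain', external_subauthority: 'oclc.personalName', maximum_records: 2)
```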

Example term fetch request

The linked data module supports many different external linked data authorities. The LOC authority, which is included in QA, is being used in these examples to demonstrate how to fetch a term using the linked data authority module in QA.

Some authorities only support fetching by ID, while others only support fetching by URI. QA provides two APIs: fetch by ID (i.e. show/{id}) and fetch by URI (i.e. fetch?uri={uri}).

The term fetch supports the following standard parameter for all requests:

  • format = json | jsonld - An example of each follows. The notes under the example results explain how the results differ based on the value of this parameter.
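
The choice between the two APIs can be sketched as a small Ruby helper. The name term_path is hypothetical, and treating any value that starts with http:// or https:// as a URI is an assumption made for illustration. The sketch applies the subauthority only to the show path and URL-encodes the uri parameter.

```ruby
require 'uri'

# Hypothetical helper: pick show/{id} or fetch?uri={uri} based on the identifier.
def term_path(engine_mount, authority, id_or_uri, subauthority: nil, format: 'json')
  base = "/#{engine_mount}"
  if id_or_uri.match?(%r{\Ahttps?://})
    # Fetch by URI; the URI is escaped so it is safe inside a query string.
    query = URI.encode_www_form(uri: id_or_uri, format: format)
    "#{base}/fetch/linked_data/#{authority}?#{query}"
  else
    # Fetch by ID; the subauthority, if given, precedes the ID in the path.
    segments = ['show', 'linked_data', authority, subauthority, id_or_uri].compact
    "#{base}/#{segments.join('/')}?#{URI.encode_www_form(format: format)}"
  end
end

puts term_path('authorities', 'loc', 'sh85076841', subauthority: 'subjects')
puts term_path('authorities', 'dbpedia_direct',
               'http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art',
               format: 'jsonld')
```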

Example fetch by ID from LOC with subauthority subjects as json:

/ENGINE_MOUNT/show/linked_data/loc/subjects/sh85076841

Result:

{
  "uri":"http://id.loc.gov/authorities/subjects/sh85076841",
  "id":"sh 85076841",
  "label":["Life sciences"],
  "altlabel":["Biosciences","Sciences, Life"],
  "narrower":["http://id.loc.gov/authorities/subjects/sh85083022","http://id.loc.gov/authorities/subjects/sh85002415",etc.],
  "broader":["http://id.loc.gov/authorities/subjects/sh00007934"],
  "sameas":[""],
  "predicates":{
    "http://www.loc.gov/mads/rdf/v1#hasCloseExternalAuthority":["http://id.worldcat.org/fast/998323","http://data.bnf.fr/ark:/12148/cb119716335",etc.],
    "http://www.loc.gov/mads/rdf/v1#isMemberOfMADSCollection":["http://id.loc.gov/authorities/subjects/collection_SubdivideGeographically","http://id.loc.gov/authorities/subjects/collection_LCSH_General",etc.],
    "http://www.loc.gov/mads/rdf/v1#isMemberOfMADSScheme":["http://id.loc.gov/authorities/subjects"],
    "http://www.w3.org/2008/05/skos-xl#altLabel":["Biosciences","Sciences, Life"],
    etc.}
}

NOTE: The results when requesting json are normalized based on definitions in the configuration file for the LOC authority. Using format=json gives the consuming app a normalized set of results that is easier to process. The results include all parts of the graph returned by the external authority that have the result URI as the subject URI of the triples. Extended data with different subject URIs is not part of the results returned by QA in the json format.
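
A consuming app might read the normalized json along these lines. The body below is a trimmed copy of the example result, and the values helper is hypothetical. It filters out the empty strings that appear in fields such as sameas when the authority returns no value.

```ruby
require 'json'

# Trimmed copy of the normalized result shown above.
body = <<~JSON
  {
    "uri": "http://id.loc.gov/authorities/subjects/sh85076841",
    "id": "sh 85076841",
    "label": ["Life sciences"],
    "altlabel": ["Biosciences", "Sciences, Life"],
    "broader": ["http://id.loc.gov/authorities/subjects/sh00007934"],
    "sameas": [""]
  }
JSON

term = JSON.parse(body)

# Hypothetical helper: return the values for a key, dropping empty strings
# and tolerating keys that are absent from the result.
def values(term, key)
  Array(term[key]).reject { |value| value.to_s.empty? }
end

puts values(term, 'label').first
puts values(term, 'altlabel').join('; ')
puts values(term, 'sameas').inspect
```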

NOTE: The QA request is converted to the following LOC request. This is for information only. You do not need to know this to use QA.

http://id.loc.gov/authorities/subjects/sh85076841

Example fetch by ID from LOC with subauthority subjects as json-ld:

/ENGINE_MOUNT/show/linked_data/loc/subjects/sh85076841?format=jsonld

Result:

{
  "@context": {
    "mads": "http://www.loc.gov/mads/rdf/v1#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "identifiers": "http://id.loc.gov/vocabulary/identifiers/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "@graph": [
    {
      "@id": "http://id.worldcat.org/fast/998327",
      "skos:prefLabel": "Life sciences--Computer programs",
      "@type": [
        "mads:Authority",
        "skos:Concept"
      ],
      "mads:authoritativeLabel": "Life sciences--Computer programs"
    },
    {
      "@id": "http://id.worldcat.org/fast/998329",
      "skos:prefLabel": "Life sciences--Data processing",
      "@type": [
        "mads:Authority",
        "skos:Concept"
      ],
      "mads:authoritativeLabel": "Life sciences--Data processing"
    },
    {
      "@id": "http://id.worldcat.org/fast/998325",
      "skos:prefLabel": "Life sciences--Authorship",
      "@type": [
        "mads:Authority",
        "skos:Concept"
      ],
      "mads:authoritativeLabel": "Life sciences--Authorship"
    },
    etc.
  ]
}

NOTE: The results when requesting json-ld are the full graph as it is returned from the external authority request. This will include extended triples that have a subject URI different from the URI of the fetched term.
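
Because the json-ld response contains the whole graph, a consuming app may want to separate nodes about the fetched term from the extended nodes. A minimal sketch follows, using two of the nodes from the example result. The sample omits the fetched term's own node, which is elided in the result above, so the first group is empty here.

```ruby
require 'json'

term_uri = 'http://id.loc.gov/authorities/subjects/sh85076841'

# Two nodes copied (and trimmed) from the example @graph above.
body = <<~JSON
  {
    "@graph": [
      {"@id": "http://id.worldcat.org/fast/998327",
       "skos:prefLabel": "Life sciences--Computer programs"},
      {"@id": "http://id.worldcat.org/fast/998329",
       "skos:prefLabel": "Life sciences--Data processing"}
    ]
  }
JSON

graph = JSON.parse(body).fetch('@graph')

# Split nodes by whether their subject (@id) is the fetched term's URI.
own, extended = graph.partition { |node| node['@id'] == term_uri }

puts "nodes about the fetched term: #{own.size}"
extended.each { |node| puts "#{node['@id']}: #{node['skos:prefLabel']}" }
```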

NOTE: The QA request is converted to the following LOC request. This is for information only. You do not need to know this to use QA.

http://id.loc.gov/authorities/subjects/sh85076841

Example fetch by URI from DBpedia as json:

To run this example, copy the dbpedia_direct.json configuration to /config/authorities/linked_data/ and restart the Rails server.

/ENGINE_MOUNT/fetch/linked_data/dbpedia_direct?uri=http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art

Result:

{
  "uri":"http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art",
  "id":"http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art",
  "label":["Herbert F. Johnson Museum of Art"],
  "altlabel":[],
  "narrower":[""],
  "broader":[""],
  "sameas":[""],
  "predicates":{
    "http://purl.org/dc/terms/subject":["http://dbpedia.org/resource/Category:Art_museums_in_New_York","http://dbpedia.org/resource/Category:University_art_museums_and_galleries_in_New_York",etc.],
    "http://www.w3.org/2003/01/geo/wgs84_pos#geometry":["POINT(-76.486465454102 42.450839996338)"],
    "http://xmlns.com/foaf/0.1/homepage":["http://museum.cornell.edu/"],
    "http://dbpedia.org/ontology/thumbnail":["http://commons.wikimedia.org/wiki/Special:FilePath/Johnson-museum-of-art-cornell.JPG?width=300"],
    etc.
  }
}

NOTE: The QA request is converted to the following DBpedia request. This is for information only. You do not need to know this to use QA.

http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art?locale=en

Example list all terms

Not supported


Documentation

Each authority has its own documentation. The LD4 series of grants has created a resource that has links to many authorities that support access via linked data APIs. See ld4p/linked_data_authorities for more information.
