updated references.adoc
andreea-pasare committed Aug 16, 2024
1 parent 8f1ad4d commit c70caa6
Showing 2 changed files with 21 additions and 10 deletions.
18 changes: 9 additions & 9 deletions docs/modules/ROOT/pages/how-to-map-existing-data-models.adoc
@@ -167,22 +167,22 @@ Before initiating the mapping development process, it is crucial to construct a

Conceptual Mapping in semantic data integration can be established at two distinct levels: the vocabulary level and the application profile level. These levels differ primarily in their complexity and specificity regarding the data context they address.

*Vocabulary Level* mapping is established using basic XML elements. This form of mapping aims for a terminological alignment, meaning that an XML element or attribute is directly mapped to an ontology class or property. For example, an XML element +<PostalAddress>+ could be mapped to +locn:Address+ class, or an element +<surname>+ could be mapped to a property +foaf:familyName+ in the FOAF ontology. Such mapping can be established as a simple spreadsheet. This approach results in a simplistic and direct alignment, which lacks contextual depth and specificity. For this reason the next steps of this methodology cannot be continued.
*Vocabulary Level* mapping is established using basic XML elements. This form of mapping aims for terminological alignment: an XML element or attribute is mapped directly to an ontology class or property. For example, the XML element `<PostalAddress>` could be mapped to the `locn:Address` class, or the element `<surname>` to the `foaf:familyName` property in the FOAF ontology. Such a mapping can be recorded in a simple spreadsheet. The result is a simple, direct alignment that lacks contextual depth and specificity, so the subsequent steps of this methodology cannot build on it.
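
A vocabulary-level alignment can be sketched as a plain lookup table. The snippet below is a minimal illustration, not part of the methodology: the `TERM_MAPPING` table reuses the two examples from the paragraph above, while the sample document and the helper names (`local_name`, `align_terms`) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Spreadsheet-like lookup: XML element name -> ontology term.
# The two rows come from the text; a real table would be much longer.
TERM_MAPPING = {
    "PostalAddress": "locn:Address",
    "surname": "foaf:familyName",
}

# Hypothetical sample document, for illustration only.
SAMPLE = (
    "<person>"
    "<surname>Doe</surname>"
    "<PostalAddress><city>Gent</city></PostalAddress>"
    "</person>"
)


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def align_terms(xml_text):
    """Return the ontology term for every mapped element found in the document."""
    root = ET.fromstring(xml_text)
    found = {}
    for element in root.iter():
        name = local_name(element.tag)
        if name in TERM_MAPPING:
            found[name] = TERM_MAPPING[name]
    return found


print(align_terms(SAMPLE))
```

The lookup returns the same term for `<PostalAddress>` whether the element belongs to a person or to an organisation, which is the lack of context described above.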

A more advanced approach would be to embed semantic annotations into XSD schemas using standards such as SAWSDL [https://www.w3.org/TR/sawsdl/#annotateXSD[ref]]. Such an approach is appropriate in the context of WSDL services.
A more advanced approach would be to embed semantic annotations into XSD schemas using standards such as xref:references.adoc#ref:64[SAWSDL]. Such an approach is appropriate in the context of WSDL services.

*Application Profile Level* of conceptual mapping utilises XPath to guide access to data in XML structures, enabling precise extraction and contextualization of data before mapping it to specific ontology fragments. An ontology fragment is usually expressed as a SPARQL Property Path (or simply Property Path). This Property Path facilitates the description of instantiation patterns specific to the Application Profile. This advanced approach allows for context-sensitive semantic representations, crucial for accurately reflecting the nuances in interpreting the meaning of data structures.

The tables below show two examples of mapping the organisation's address, city and postal code. They show where the data can be extracted from, and how it can be mapped to targeted ontology properties such as +locn:postName+, and +locn:postCode+. To ensure that this address is not mapped in a vacuum but it is linked to an organisation instance, and not a person for example, the mapping is anchored in an instance +?this+ of an +owl:Organisation+. Optionally a class path can be provided to complement the property path and explicitly state the class sequence, which otherwise can be deduced from the Application Profile definition.
The tables below show two examples that map the postal code and the city of an organisation's address. They show where the data can be extracted from and how it can be mapped to target ontology properties such as `locn:postName` and `locn:postCode`. To ensure that the address is not mapped in a vacuum but is linked to an organisation instance (rather than, for example, a person), the mapping is anchored in an instance `?this` of `org:Organization`. Optionally, a class path can be provided to complement the property path and state the class sequence explicitly; otherwise the sequence can be deduced from the Application Profile definition.
|===
|*Source XPath*|*/efac:Company/cac:PostalAddress/*cbc:PostalZone*
|*Source XPath*|/efac:Company/cac:PostalAddress/cbc:PostalZone

|*Target Property Path*|?this cv:registeredAddress /** ***locn**:postCode* ?value .
|*Target Property Path*|?this cv:registeredAddress /*locn:postCode* ?value .
|*Target Class Path*|org:Organization / locn:Address / rdf:PlainLiteral
|===
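
The postal code and city rows of this section can be read as executable rules, which the following sketch illustrates. It is a hedged example rather than a reference implementation: the namespace URIs, the sample XML and the helper name `apply_rules` are assumptions. Because Python's `xml.etree.ElementTree` supports only a subset of XPath, the root step `/efac:Company` is checked separately and the remaining path is evaluated relative to that element.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions for illustration; use the ones declared in your data.
NS = {
    "efac": "http://data.europa.eu/p27/eforms-ubl-extension-aggregate-components/1",
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
}

# Conceptual-mapping rows: (XPath relative to efac:Company, property path, class path).
RULES = [
    ("cac:PostalAddress/cbc:PostalZone",
     "cv:registeredAddress / locn:postCode",
     "org:Organization / locn:Address / rdf:PlainLiteral"),
    ("cac:PostalAddress/cbc:CityName",
     "cv:registeredAddress / locn:postName",
     "org:Organization / locn:Address / rdf:PlainLiteral"),
]

# Hypothetical sample data.
SAMPLE = f"""<efac:Company xmlns:efac="{NS['efac']}" xmlns:cac="{NS['cac']}" xmlns:cbc="{NS['cbc']}">
  <cac:PostalAddress>
    <cbc:CityName>Luxembourg</cbc:CityName>
    <cbc:PostalZone>2985</cbc:PostalZone>
  </cac:PostalAddress>
</efac:Company>"""


def apply_rules(xml_text):
    """Pair every extracted value with the property path anchored in ?this."""
    company = ET.fromstring(xml_text)
    if company.tag != f"{{{NS['efac']}}}Company":
        raise ValueError("expected efac:Company as the root element")
    rows = []
    for xpath, property_path, _class_path in RULES:
        for node in company.findall(xpath, NS):
            rows.append(("?this", property_path, node.text))
    return rows


for row in apply_rules(SAMPLE):
    print(row)
```

Each returned tuple keeps the anchor `?this`, so the extracted value stays tied to the organisation instance rather than to an address taken out of context.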

|===
|*Source XPath*|*/efac:Company/cac:PostalAddress/*cbc:CityName*
|*Source XPath*|/efac:Company/cac:PostalAddress/cbc:CityName

|*Target Property Path*|?this cv:registeredAddress / *locn:postName* ?value .
|*Target Class Path*|org:Organization / locn:Address / rdf:PlainLiteral
@@ -196,15 +196,15 @@ The tables below show two examples of mapping the organisation's address, city a

The technical mapping step is a critical phase in the mapping process, serving as the bridge between conceptual design and practical, machine-executable implementation. This step takes as input the conceptual mapping, which has been crafted and validated by domain experts or data-savvy business stakeholders and establishes correspondences between XPath expressions and ontology fragments.

When it comes to representing these mappings technically, several technology options are available[https://ceur-ws.org/Vol-2489/paper4.pdf[ref]]: such as XSLT[ref], RML[ref], SPARQLAnything[ref], etc. But RDF Mapping Language (RML) [https://rml.io/specs/rml/[ref]] stands out for its effectiveness and straightforward approach. RML allows for the representation of mappings from heterogeneous data formats like XML, JSON, relational databases and CSV into RDF, supporting the creation of semantically enriched data models. This code can be expressed in Turtle RML or the YARRRML dialect [https://rml.io/yarrrml/spec/[ref]], a user-friendly text-based format based on YAML, making the mappings accessible to both machines and humans. RML is well-supported by robust implementations such as RMLMapper [https://github.com/RMLio/rmlmapper-java[ref]] and RMLStreamer [https://github.com/RMLio/RMLStreamer[ref]], which provide robust platforms for executing these mappings. RMLMapper is adept at handling batch processing of data, transforming large datasets efficiently. On the other hand, RMLStreamer excels in streaming data scenarios, where data needs to be processed in real-time, providing flexibility and scalability in dynamic environments.
Several technologies are available for representing these mappings technically (xref:references.adoc#ref:65[paper]), including xref:references.adoc#ref:66[XSLT], xref:references.adoc#ref:67[RML] and xref:references.adoc#ref:68[SPARQLAnything]. Among them, the xref:references.adoc#ref:67[RDF Mapping Language (RML)] stands out for its effectiveness and straightforward approach. RML can express mappings from heterogeneous data formats such as XML, JSON, relational databases and CSV into RDF, supporting the creation of semantically enriched data models. Mappings can be written in Turtle RML or in the xref:references.adoc#ref:69[YARRRML] dialect, a user-friendly text-based format based on YAML, which makes them accessible to both machines and humans. RML is supported by implementations such as xref:references.adoc#ref:70[RMLMapper] and xref:references.adoc#ref:71[RMLStreamer]. RMLMapper is suited to batch processing and transforms large datasets efficiently, whereas RMLStreamer targets streaming scenarios in which data must be processed in real time, providing flexibility and scalability in dynamic environments.

The development of the mapping rules is straightforward due to the preliminary conceptual mapping that is already available. The Conceptual Mapping (CM) aided the understanding to which class and property each XML element be mapped and how. Then, RML mapping statements are created for each class of the target ontology coupled with the property-object mapping statements specific to that class. Furthermore, it is essential to master RML along with XML technologies like XSD, XPath, and XQuery to implement the mappings effectively [https://rml.io/docs/rml/tutorials/xml/[ref]].
Developing the mapping rules is straightforward because the conceptual mapping is already available. The Conceptual Mapping (CM) clarifies which class and property each XML element should be mapped to, and how. RML mapping statements are then created for each class of the target ontology, together with the property-object mapping statements specific to that class. Implementing the mappings effectively also requires a good command of RML and of XML technologies such as XSD, XPath and XQuery (xref:references.adoc#ref:72[rml-gen]).
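
The "one mapping per target class, plus its property-object statements" structure can be pictured with a small Python stand-in. This sketch is not RML and does not use any RML engine; it only mimics the shape of such rules using the standard library. The namespace URIs, the `cbc:ID` key, the `http://example.org/id/` base and all helper names are assumptions, and the link between the organisation and its address is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Source namespaces (assumed) and target vocabulary prefixes.
NS = {
    "efac": "http://data.europa.eu/p27/eforms-ubl-extension-aggregate-components/1",
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
}
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "org": "http://www.w3.org/ns/org#",
    "locn": "http://www.w3.org/ns/locn#",
}

# One entry per target class: iterator, subject template, class, property-object pairs.
CLASS_MAPS = [
    {"iterator": ".//efac:Company", "key": "cbc:ID",
     "subject": "http://example.org/id/organisation/{key}",
     "class": "org:Organization", "properties": []},
    {"iterator": ".//efac:Company", "key": "cbc:ID",
     "subject": "http://example.org/id/address/{key}",
     "class": "locn:Address",
     "properties": [("locn:postCode", "cac:PostalAddress/cbc:PostalZone"),
                    ("locn:postName", "cac:PostalAddress/cbc:CityName")]},
]

# Hypothetical sample data.
SAMPLE = f"""<notice xmlns:efac="{NS['efac']}" xmlns:cac="{NS['cac']}" xmlns:cbc="{NS['cbc']}">
  <efac:Company>
    <cbc:ID>ORG-0001</cbc:ID>
    <cac:PostalAddress>
      <cbc:CityName>Luxembourg</cbc:CityName>
      <cbc:PostalZone>2985</cbc:PostalZone>
    </cac:PostalAddress>
  </efac:Company>
</notice>"""


def expand(curie):
    """Expand a prefixed name such as 'locn:postCode' to a full IRI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def run_mappings(xml_text):
    """Apply every class map and return the result as N-Triples lines."""
    root = ET.fromstring(xml_text)
    triples = []
    for class_map in CLASS_MAPS:
        for node in root.findall(class_map["iterator"], NS):
            key = node.findtext(class_map["key"], namespaces=NS)
            subject = class_map["subject"].format(key=key)
            triples.append(f"<{subject}> <{expand('rdf:type')}> <{expand(class_map['class'])}> .")
            for predicate, reference in class_map["properties"]:
                value = node.findtext(reference, namespaces=NS)
                if value is not None:
                    literal = value.replace("\\", "\\\\").replace('"', '\\"')
                    triples.append(f'<{subject}> <{expand(predicate)}> "{literal}" .')
    return triples


print("\n".join(run_mappings(SAMPLE)))
```

In an actual project the same information would be written as RML or YARRRML rules and executed by an RML processor; the dictionary layout here only shows how the conceptual mapping rows regroup by target class.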

An additional step involves deciding on a URI creation policy and designing a uniform scheme for use in the generated data, ensuring consistency and coherence in the data output.
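
The text does not prescribe a particular URI policy, so the following is only one possible scheme, sketched under stated assumptions: a hypothetical base `http://example.org/id`, a path segment derived from the class name, and a truncated SHA-256 digest of the identifying values. The point it illustrates is determinism: the same class and key always yield the same URI, so repeated transformation runs produce consistent data.

```python
import hashlib
import re

BASE = "http://example.org/id"  # hypothetical base URI


def mint_uri(class_name, *key_parts):
    """Build a deterministic URI from a class name and its identifying values."""
    if not key_parts or any(not part for part in key_parts):
        raise ValueError("at least one non-empty key part is required")
    # Lower-case, hyphen-separated path segment for the class.
    slug = re.sub(r"[^a-z0-9]+", "-", class_name.lower()).strip("-")
    # Hash the key so the URI stays stable and free of unsafe characters.
    digest = hashlib.sha256("|".join(key_parts).encode("utf-8")).hexdigest()[:16]
    return f"{BASE}/{slug}/{digest}"


print(mint_uri("Organization", "ORG-0001"))
print(mint_uri("Address", "ORG-0001", "2985"))
```

Hashing hides the source identifier, which may or may not be desirable; a policy that embeds readable identifiers is an equally valid choice as long as it is applied uniformly.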

A viable alternative to RML is XSLT technology, which offers a powerful, but low-level method for defining technical mappings. While this method allows for high expressiveness and complex transformations, it also increases the potential for errors due to its intricate syntax and operational complexity. This technology excels in scenarios requiring detailed manipulation and parameterization of XML documents, surpassing the capabilities of RML in terms of flexibility and depth of transformation rules that can be implemented. However, the detailed control it affords means that developers must have a high level of expertise in semantic technologies and exercise caution and precision to avoid common pitfalls associated with its use.

A pertinent example of XSLT's application is the tool [https://github.com/SEMICeu/iso-19139-to-dcat-ap[ref]] for transforming ISO-19139 metadata to the DCAT-AP geospatial profile (GeoDCAT-AP) [https://joinup.ec.europa.eu/solution/geodcat-ap[ref]] in the framework of INSPIRE and the EU ISA Programme. This XSLT script is configurable to accommodate transformation [https://github.com/SEMICeu/iso-19139-to-dcat-ap/blob/master/documentation/Mappings.md[ref]] with various operational parameters such as the selection between core or extended GeoDCAT-AP profiles and specific spatial reference systems for geometry encoding, showcasing its utility in precise and tailored data manipulation tasks.
A pertinent example of XSLT's application is the xref:references.adoc#ref:73[tool] for transforming ISO-19139 metadata to the DCAT-AP geospatial profile (xref:references.adoc#ref:74[GeoDCAT-AP]) in the framework of INSPIRE and the EU ISA Programme. This XSLT script is configurable to accommodate xref:references.adoc#ref:75[transformation] with various operational parameters such as the selection between core or extended GeoDCAT-AP profiles and specific spatial reference systems for geometry encoding, showcasing its utility in precise and tailored data manipulation tasks.

*Inputs:* Conceptual Mapping spreadsheet, sample XML data

13 changes: 12 additions & 1 deletion docs/modules/ROOT/pages/references.adoc
@@ -63,4 +63,15 @@
- *[[ref:61]][edoal]* EDOAL: Expressive and Declarative Ontology Alignment Language. Available at: https://moex.gitlabpages.inria.fr/alignapi/edoal.html
- *[[ref:62]][silk]* System for Internet-Level Knowledge (SiLK). Available at: https://tools.netsa.cert.org/silk/
- *[[ref:63]][vocbench]* VocBench. Available at: https://vocbench.op.europa.eu/
- *[[ref:63]][vocbench]*
- *[[ref:64]][sawsdl]* SAWSDL. Available at: https://www.w3.org/TR/sawsdl/#annotateXSD
- *[[ref:65]][paper5]* Ben De Meester, Pieter Heyvaert, Ruben Verborgh and Anastasia Dimou. Mapping Languages: Analysis of Comparative Characteristics. Available at: https://ceur-ws.org/Vol-2489/paper4.pdf
- *[[ref:66]][xslt]* XSL Transformations (XSLT) Version 3.0. Available at: https://www.w3.org/TR/xslt-30/
- *[[ref:67]][rml]* RDF Mapping Language (RML). Available at: https://rml.io/specs/rml/
- *[[ref:68]][sparql-anything]* SPARQL-Anything. Available at: https://github.com/SPARQL-Anything/sparql.anything
- *[[ref:69]][yarrrml]* YARRRML. Available at: https://rml.io/yarrrml/spec/
- *[[ref:70]][rml-mapper]* RML Mapper. Available at: https://github.com/RMLio/rmlmapper-java
- *[[ref:71]][rml-streamer]* RML Streamer. Available at: https://github.com/RMLio/RMLStreamer
- *[[ref:72]][rml-gen]* Generate RDF from an XML file. Available at: https://rml.io/docs/rml/tutorials/xml/
- *[[ref:73]][iso2dcat-ap]* Reference XSLT-based implementation of GeoDCAT-AP. Available at: https://github.com/SEMICeu/iso-19139-to-dcat-ap
- *[[ref:74]][geo-dcat-ap]* GeoDCAT Application Profile for data portals in Europe. Available at: https://joinup.ec.europa.eu/collection/semic-support-centre/solution/geodcat-application-profile-data-portals-europe
- *[[ref:75]][iso2dcat-ap-mappings]* Mappings defined in GeoDCAT-AP. Available at: https://github.com/SEMICeu/iso-19139-to-dcat-ap/blob/master/documentation/Mappings.md
