diff --git a/print_page/index.html b/print_page/index.html index 4be4014..c4d24fb 100755 --- a/print_page/index.html +++ b/print_page/index.html @@ -858,7 +858,7 @@

Additional key points

Quick tutorial

This is a quick tutorial on how to use the Maggot tool in practice, primarily aimed at end users.

-

See a short Presentation and Poster if you want to have a more general overview of the tool..

+

See a short Presentation and Poster if you want to have a more general overview of the tool.


Overview

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below:

diff --git a/search/search_index.json b/search/search_index.json index ce77157..825b180 100755 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\\\s\\\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"

An ecosystem for sharing metadata

"},{"location":"#foster-good-data-management-with-data-sharing-in-mind","title":"Foster good data management, with data sharing in mind","text":"

Sharing descriptive Metadata is the first essential step towards Open Scientific Data. With this in mind, Maggot was specifically designed to annotate datasets by creating a metadata file to attach to the storage space. Indeed, it allows users to easily add descriptive metadata to datasets produced within a collective of people (research unit, platform, multi-partner project, etc.). This approach fits perfectly into a data management plan as it addresses the issues of data organization and documentation, data storage and frictionless metadata sharing within this same collective and beyond.

"},{"location":"#main-features-of-maggot","title":"Main features of Maggot","text":"

The main functionalities of Maggot were established according to a well-defined need (See Background).

  1. Document the datasets produced within a collective of people with metadata
  2. Search datasets by their metadata
  3. Publish the metadata of datasets along with their data files into a Europe-approved repository

See a short Presentation and Poster for a quick overview.

"},{"location":"#overview-of-the-different-stages-of-metadata-management","title":"Overview of the different stages of metadata management","text":"

Note: The step numbers indicated in the figure correspond to the different points developed below.

1 - First you must define all the metadata that will be used to describe your datasets. All metadata can be defined in a single file (in TSV format, i.e. editable with a spreadsheet). This step is unavoidable because both the input and search interfaces are completely generated from these definition files, which specify each field along with its input type and the associated controlled vocabulary (ontology, thesaurus, dictionary, list of fixed terms). The metadata proposed by default was mainly established according to the DDI (Data Documentation Initiative) metadata schema, which also largely corresponds to the schema adopted by the Dataverse software. See the Terminology Definition section.
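For illustration, the first rows of such a TSV definition file could look like this (columns and rows taken from the Terminology Definition example later in this documentation; tab-separated):

```tsv
Field	Section	Required	Search	ShortView	Type	features	Label	Predefined terms
title	definition	Y	N	1	textbox	width=350px	Short name	
status	status	N	Y	3	dropbox	width=350px	Status of the dataset	Processed,In progress,Unprocessed
```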

2 - Entering metadata is greatly facilitated by the use of dictionaries. The dictionaries offered by default are: people, funders and data producers, plus a vocabulary dictionary that lets you mix ontologies and thesauri from several sources. By picking a name through autocompletion, users attach information that will automatically be included when the metadata are exported to a remote repository or harvested. Thus, once entered into a dictionary, this information never needs to be re-entered.

3 - The web interface for entering metadata is entirely built from the definition files. The metadata are distributed across the chosen sections, each constituting a tab (see screenshot). Mandatory fields are marked with a red star and must be filled in before the metadata file can be generated. Metadata governed by a controlled vocabulary are entered by autocompletion from term lists (dictionary, thesaurus or ontology). You can also declare external resources (URL links) pointing to documents, publications or other related data; Maggot thus becomes a hub connecting your datasets to different resources, local and external. Once the mandatory fields (at least) and the other recommended fields (at best) have been entered, the metadata file can be generated in JSON format.

4 - The generated JSON file must be placed in the storage space reserved for this purpose. This metadata file plays the role of a README adapted for machines, yet still readable by humans. Thanks to its internal structure, it offers a coherence and consistency of information that a plain README with completely free, unstructured text cannot provide. Furthermore, the central idea is to use the storage space as a local data repository: the metadata should go to the data, not the other way around.
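As an illustration, the sketch below writes such a machine-readable "README" next to the data. The field names and the file name `META_datainfos.json` are assumptions for the example; the real file is generated by the web interface from the terminology definition.

```python
import json
from pathlib import Path

# Hypothetical field names for illustration; a real Maggot metadata
# file is produced by the entry form, not written by hand.
metadata = {
    "title": "untwist1",
    "fulltitle": "Example dataset described with Maggot",
    "status": "Processed",
    "contacts": ["Jane Doe"],
}

def write_metadata(dataset_dir: Path, meta: dict, name: str = "META_datainfos.json") -> Path:
    """Drop the metadata file into the dataset's storage directory,
    so that the metadata travels with the data."""
    path = dataset_dir / name
    path.write_text(json.dumps(meta, indent=2, ensure_ascii=False))
    return path
```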

5 - Datasets can then be searched on the basis of their metadata. All the JSON metadata files are scanned and parsed at a fixed time interval (30 min), then loaded into a database. This allows you to perform searches based on the predefined metadata. The search form, in a compact layout, is almost the same as the entry form (see a screenshot). Depending on the search criteria, a list of datasets is returned, each with a link pointing to its detailed sheet.
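The scan step can be sketched as follows. The metadata file name and the MongoDB loading shown in the comment are illustrative assumptions, not the actual scan container code.

```python
import json
from pathlib import Path

def scan_metadata(storage_root: str, name: str = "META_datainfos.json") -> list[dict]:
    """Collect every metadata file found under the storage root,
    tagging each document with the directory it came from."""
    docs = []
    for path in Path(storage_root).rglob(name):
        doc = json.loads(path.read_text())
        doc["_dataset_dir"] = str(path.parent)
        docs.append(doc)
    return docs

# In the real scan container this kind of crawl runs under cron every
# 30 min and the documents are (re)loaded into MongoDB, e.g. with pymongo:
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://mmdt-db:27017")["pgd-db"]["metadata"]
#   coll.delete_many({}); coll.insert_many(docs)
```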

6 - The detailed metadata sheet presents all the metadata, divided into sections. Unfilled metadata does not appear by default. When a URL can be associated with a piece of information (ORCID, ontology, web site, etc.), you can click on it to follow the corresponding link. Likewise, the link associated with each resource can be followed. From this sheet, you can also export the metadata according to different schemata (Dataverse, Zenodo, JSON-LD). See screenshot 1 & screenshot 2.

7 - Finally, once you have decided to publish your metadata with your data, you can choose the repository that suits you (currently repositories based on Dataverse and Zenodo are supported).

"},{"location":"#additional-key-points","title":"Additional key points","text":"

"},{"location":"about/","title":"About","text":""},{"location":"about/#background","title":"Background","text":""},{"location":"about/#motives","title":"Motives","text":""},{"location":"about/#state-of-need","title":"State of need","text":""},{"location":"about/#proposed-approach","title":"Proposed approach","text":""},{"location":"about/#links","title":"Links","text":""},{"location":"about/#contacts","title":"Contacts","text":""},{"location":"about/#designers-developers","title":"Designers / Developers","text":""},{"location":"about/#contributors","title":"Contributors","text":"

"},{"location":"bloxberg/","title":"Bloxberg Blockchain","text":""},{"location":"bloxberg/#experimental-certification-of-metadata-file-on-the-bloxberg-blockchain","title":"EXPERIMENTAL - Certification of metadata file on the bloxberg blockchain","text":""},{"location":"bloxberg/#motivation","title":"Motivation","text":"

To guarantee the authenticity and integrity of a metadata file by recording it permanently and immutably on the bloxberg blockchain.

Indeed, a blockchain is a technology that keeps track of a set of transactions (writes to the chain) in a decentralized, secure and transparent manner, in the form of a chain of blocks. A blockchain can therefore be compared to a large, tamper-proof (public or private) register. It is used today in many fields because it provides solutions to many problems. In Higher Education and Research, for example, registering dataset metadata in a blockchain makes it possible to certify, in an inalienable, irrefutable and completely transparent manner, the ownership and authenticity of the data, as well as, for example, the license of use and the date of production of the data. Research stakeholders are then more open to the dissemination of their data (files, results, protocols, publications, etc.) since they know that, in particular, the ownership, content and conditions of use of the data cannot be altered.

The Maggot tool could thus serve as a gateway to certify its data with the associated metadata. The complete process is schematized by the following figure:
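Concretely, what gets anchored on a blockchain is not the metadata file itself but a cryptographic fingerprint of it. A minimal sketch of such a fingerprint is shown below; this is only an illustration of the principle, not the bloxberg API.

```python
import hashlib
import json

def metadata_fingerprint(meta: dict) -> str:
    """Canonical SHA-256 fingerprint of a metadata document. It is this
    kind of hash, not the file itself, that a blockchain records: any
    later change to the metadata produces a different fingerprint."""
    canonical = json.dumps(meta, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```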

"},{"location":"bloxberg/#about-bloxberg","title":"About bloxberg","text":"

bloxberg is one of the most important blockchain projects in science. It was founded in 2019 by MPDL, looking for a way to store research results and make them available to other researchers. In this sense, bloxberg is a decentralized register in which results can be stored in a tamper-proof way, with a time stamp and an identifier.

bloxberg is based on the Ethereum Blockchain. However, it makes use of a different consensus mechanism: instead of \u201cProof of Stake\u201d used by Ethereum since 2022, bloxberg validates blocks through \u201cProof of Authority\u201d. Each node is operated by one member. All members of the association are research institutions and are known in the network. Currently, bloxberg has 49 nodes. It is an international project with participating institutions from all over the world.

"},{"location":"bloxberg/#how-to-process","title":"How to process ?","text":"

You will need an Ethereum address and an API key (to be requested via bloxberg-services (at) mpdl.mpg.de). See an example of pushing a metadata file to the bloxberg blockchain using Maggot.

"},{"location":"bloxberg/#useful-links","title":"Useful links","text":""},{"location":"configuration/","title":"Configuration","text":""},{"location":"configuration/#terminology-configuration","title":"Terminology configuration","text":"

A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces are completely generated from this definition file, which defines each field, its input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, or a drop-down list of fixed terms). This is why a configuration and conversion step into JSON format is essential to configure all the other modules (for example, creating the MongoDB database schema when the application starts, before filling it).
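A minimal sketch of the TSV-to-JSON conversion step (simplified; the real converter handles the full set of columns and features):

```python
import csv
import io
import json

def tsv_to_json(tsv_text: str) -> list[dict]:
    """Parse a config_terms.txt-style TSV into a list of field
    definitions, dropping empty cells (simplified sketch)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [{k: v for k, v in row.items() if v} for row in reader]

# Two illustrative columns; the real file has more (Search, features, ...).
sample = "Field\tSection\tRequired\tType\ntitle\tdefinition\tY\ttextbox\n"
fields = tsv_to_json(sample)
print(json.dumps(fields, indent=2))
```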

"},{"location":"configuration/#tsv-to-json","title":"TSV to JSON","text":""},{"location":"configuration/#tsv-to-doc","title":"TSV to DOC","text":""},{"location":"configuration/#json-to-tsv","title":"JSON to TSV","text":""},{"location":"dictionaries/","title":"Dictionaries","text":""},{"location":"dictionaries/#presentation","title":"Presentation","text":""},{"location":"dictionaries/#the-people-dictionary","title":"The people dictionary","text":"

"},{"location":"dictionaries/#other-dictionaries","title":"Other dictionaries","text":"

"},{"location":"gant/","title":"Gant","text":""},{"location":"gant/#gantt-diagrams-of-the-developments","title":"Gantt diagrams of the developments","text":"gantt dateFormat YYYY-MM-DD axisFormat %Y-%m title Diagrammes de Gantt pr\u00e9visionnel des d\u00e9veloppements section MongoDB 1: des1, 2023-11-01,60d 2: des2, 2023-12-01,90d 3: des3, 2023-12-01,90d section Couche API 4: des4, 2024-01-01,120d 5: des5, 2024-04-01,60d section Interface Web 6a: des6, 2024-06-01,60d 6b: des7, 2024-07-01,60d 6c: des8, 2024-09-01,60d"},{"location":"infrastructure/","title":"Infrastructure","text":""},{"location":"infrastructure/#infrastructure-local-remote-or-mixed","title":"Infrastructure : Local, Remote or Mixed","text":"

The necessary infrastructure involves 1) a machine running a Linux OS and 2) a dedicated storage space.

1 - The machine will most often be a virtual machine, because it is simpler to deploy, either locally (with VM providers such as VirtualBox, VMware Workstation or MS Hyper-V) or remotely (e.g. VMware ESXi, OpenStack: example of deployment). Moreover, the OS of your machine must support the deployment of Docker containers; for more details, see “What is Docker”. The minimum characteristics of the VM are: 2 CPUs, 2 GB RAM, 8 GB disk.

2 - The dedicated storage space can be either in the local space of the VM or in a remote place on the network.

"},{"location":"installation/","title":"Installation","text":""},{"location":"installation/#install-on-your-linux-computer-or-linux-unix-server","title":"Install on your linux computer or linux / unix server","text":"

Requirements: the installation must be carried out on a (virtual) machine with a recent Linux OS that supports Docker (see Infrastructure).

"},{"location":"installation/#retrieving-the-code","title":"Retrieving the code","text":"

Go to the destination directory of your choice, then clone the repository and cd into it:

```bash
git clone https://github.com/inrae/pgd-mmdt.git pgd-mmdt
cd pgd-mmdt
```

"},{"location":"installation/#installation-of-docker-containers","title":"Installation of Docker containers","text":"

Maggot uses 3 Docker images for 3 distinct services:

"},{"location":"installation/#configuration","title":"Configuration","text":"

See Configuration settings

Warning: be sure to use the same MongoDB settings in all of the configuration files above. It is best not to change anything. A single configuration file would have been preferable, but this has not yet been done given the different languages involved (bash, javascript, python, PHP). To be done!

Note: if you want to run multiple instances, you will need to change, in the run file, i) the container names, ii) the data path, and iii) the MongoDB volume name.

The following two JSON files are defined by default but can be easily configured from the web interface. See the Terminology Definition section.

"},{"location":"installation/#commands","title":"Commands","text":"

The run shell script allows you to perform multiple actions by specifying an option:

```bash
cd pgd-mmdt
sh ./run <option>
```

Options:

"},{"location":"installation/#starting-the-application","title":"Starting the application","text":"

"},{"location":"installation/#launching-the-web-application-in-the-web-browser","title":"Launching the web application in the web browser","text":"
```
CONTAINER ID  IMAGE          COMMAND                 CREATED          STATUS         PORTS                                  NAMES
5914504f456d  pgd-mmdt-web   "docker-php-entrypoi."  12 seconds ago   Up 10 seconds  0.0.0.0:8087->80/tcp, :::8087->80/tcp  mmdt-web
226b13ed9467  pgd-mmdt-scan  "cron -f"               12 seconds ago   Up 11 seconds                                         mmdt-scan
81fecbb56d23  pgd-mmdt-db    "docker-entrypoint.s."  13 seconds ago   Up 12 seconds  27017/tcp                              mmdt-db
```

"},{"location":"installation/#stoping-the-application","title":"Stoping the application","text":""},{"location":"installation/#updating-the-application","title":"Updating the application","text":"

When updating the application, it is imperative to preserve a whole set of configuration files as well as the content of certain directories (dictionaries, javascripts dedicated to vocabularies, etc.). An update script is available (./etc/update-maggot.sh), preferably placed under /usr/local/bin. To preserve your configuration, it is recommended to create local configuration files.

"},{"location":"installation/#architecture-diagram","title":"Architecture diagram","text":"

Note: See how to proceed with the configuration steps.

"},{"location":"installation/#file-browser","title":"File Browser","text":"

You can provide access to your data via a file browser. This application must be installed separately but can be connected to Maggot by specifying the corresponding URL in the configuration file. Users and their rights are managed in the filebrowser application. You can also create password-free links to the data; these links can usefully be declared as external resources in the metadata managed by Maggot.

See how to install it on GitHub.

"},{"location":"private-access/","title":"Private access","text":""},{"location":"private-access/#private-access-key-management","title":"Private access key management","text":""},{"location":"private-access/#motivation","title":"Motivation","text":"

Although the Maggot tool is designed to foster the sharing of metadata within a collective, it may be necessary to temporarily privatize access to the metadata of an ongoing project with confidentiality constraints. In that case, even within the collective, access to the metadata must be restricted to authorized users only.

"},{"location":"private-access/#implementation","title":"Implementation","text":"

The choice not to manage users within the Maggot tool was made so that metadata are completely open by default within a collective. Furthermore, access rights to the storage space are managed independently of Maggot by the administrator of that space. It is therefore through the storage space that access to the metadata via the web interface is granted or denied.

The chosen mechanism for privatizing access is described below. It has the dual advantage of being simple to implement and simple to use.

  1. First, generate a file containing the encrypted key for private access. This file must be generated from the web interface, then downloaded as shown in the figure below, and finally deposited manually in the data directory of the dataset whose access you wish to privatize. The presence of this file within a directory is enough to block access to the metadata and data by default. Note that the same file containing the encrypted private key can be placed in several data directories (e.g. within the same project). The deposit must be done by hand because the Maggot tool only has read access to the storage space; this also guarantees that the user has write access to this space, without having to manage user accounts on the Maggot side.

    By default, the 'untwist1' metadata are not accessible to anyone

  2. When you want to access the metadata of this dataset, simply enter the private key in the current session. This unlocks access to the metadata via the web interface only for the current session of your web browser, which means the private key must be re-entered for each session (by default, a session lasts a maximum of 1 hour).

    Now the 'untwist1' metadata are accessible only to us

  3. When you want to give the entire collective access to the metadata, simply delete the private access file (named META_auth.txt by default) from the data directory concerned.
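The mechanism above can be sketched in a few lines. The actual encryption scheme used by Maggot is not documented here, so the SHA-256 hashing below is an assumption for illustration only; the file name META_auth.txt is the documented default.

```python
import hashlib
from pathlib import Path

AUTH_FILE = "META_auth.txt"  # default name of the private access file

def make_auth_file(dataset_dir: Path, private_key: str) -> None:
    """Write the hashed key; the file's mere presence privatizes the
    dataset. (SHA-256 is an assumption, not Maggot's actual scheme.)"""
    digest = hashlib.sha256(private_key.encode()).hexdigest()
    (dataset_dir / AUTH_FILE).write_text(digest)

def is_private(dataset_dir: Path) -> bool:
    """A dataset is private as soon as the auth file exists."""
    return (dataset_dir / AUTH_FILE).exists()

def unlock(dataset_dir: Path, entered_key: str) -> bool:
    """Compare the key entered in the session with the stored hash."""
    stored = (dataset_dir / AUTH_FILE).read_text().strip()
    return hashlib.sha256(entered_key.encode()).hexdigest() == stored
```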

"},{"location":"settings/","title":"Configuration settings","text":""},{"location":"settings/#configuration-settings_1","title":"Configuration settings","text":"

Here is the list of all the files whose parameters may need to be adjusted for the needs of an instance.

"},{"location":"settings/#dockerscanpartscriptsconfigpy","title":"dockerscanpart/scripts/config.py","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters normally do not need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

| Parameter | Description | Default value |
|---|---|---|
| dbserver | Name of the MongoDB server | mmdt-db |
| database | Name of the MongoDB database | pgd-db |
| dbport | Port of the MongoDB server | 27017 |
| username | Username of the Mongo database pgd-db with Read/Write access | userw-pgd |
| password | Password corresponding to the username of the Mongo DB pgd-db | wwwww |

"},{"location":"settings/#incconfigmongodbinc","title":"inc/config/mongodb.inc","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters normally do not need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

| Parameter | Description | Default value |
|---|---|---|
| docker_mode | Indicates whether the installation uses docker containers; in that case, the Mongo DB IP address differs from 127.0.0.1 | 1 |
| uritarget | The Mongo DB IP address | mmdt-db (docker_mode=1) or 127.0.0.1 (docker_mode=0) |
| database | Name of the MongoDB database | pgd-db |
| collection | Name of the MongoDB collection | metadata |
| port | Port of the MongoDB server | 27017 |
| username | Username of the Mongo database pgd-db with Read access only | userr-pgd |
| password | Password corresponding to the username of the Mongo DB pgd-db | rrrrr |
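For illustration, these settings assemble into a standard MongoDB connection string. The PHP code builds its own connection from the parameters, so the sketch below is only illustrative of how they fit together (read-only defaults in docker mode):

```python
def mongo_uri(username: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a standard MongoDB connection string from the
    parameters listed in the table above."""
    return f"mongodb://{username}:{password}@{host}:{port}/{database}"

# Defaults from inc/config/mongodb.inc in docker mode (read-only account):
uri = mongo_uri("userr-pgd", "rrrrr", "mmdt-db", 27017, "pgd-db")
```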

"},{"location":"settings/#incconfigconfiginc","title":"inc/config/config.inc","text":"

This file defines parameters related to i) the web interface and ii) the functionalities allowed for users. Only the parameters that may usefully be changed for the needs of an instance are described here.

| Parameter | Description | Default value |
|---|---|---|
| EXTERN | Indicates if the use of the tool is only for external use, i.e. without using a storage space. | 0 |
| PRIVATE_ACCESS | Gives the possibility of managing private access to metadata | 0 |
| ZOOMWP | Zoom level of the web interface. By reducing the size slightly, you get a better layout. | 90% |
| RESMEDIA | Gives the possibility of putting a MIME type on each resource in the metadata. | 1 |
| TITLE | Title to display in the main banner | Metadata management |
| FILEBROWSER | Indicates whether the file browser is used. This assumes it is installed. | 0 |
| URL_FILEBROWSER | File browser URL. It can be absolute or relative. | /fb/ |
| APPNAME | Name given in the URL to access the web interface. | maggot |
| dataverse_urls | Array of Dataverse repository URLs where you can upload metadata and data | - |
| zenodo_urls | Array of Zenodo repository URLs where you can upload metadata and data | - |
| SERVER_URL | Default Dataverse repository URL | https://entrepot.recherche.data.gouv.fr |
| ZENODO_SERVER_URL | Default Zenodo repository URL | https://zenodo.org |
| export_dataverse | Indicates whether the Dataverse feature is enabled | 1 |
| export_zenodo | Indicates whether the Zenodo feature is enabled | 1 |
| export_jsonld | Indicates whether the JSON-LD feature is enabled | 1 |
| export_oai | Indicates whether the OAI-PMH feature is enabled | 0 |
| export_bloxberg | Indicates whether the Bloxberg Blockchain feature is enabled (experimental) | 0 |
| cvdir | Relative path of the controlled vocabulary lists (cvlist) | cvlist/ |
| maggot_fulltitle | Maggot name of the field corresponding to the title in Dataverse/Zenodo | fulltitle |
| auth_senddata_file | Name of the file that must be present in the data directory to authorize the transfer of the data file | META_datafile_ok.txt |
| private_auth_file | Name of the private access file | META_auth.txt |
| sendMail | Configuration of the messaging for sending metadata to data managers (see below) | NULL |

The messaging configuration is done with the following array in the inc/config/config.inc file (or, more judiciously, in inc/config/local.inc so that it is preserved during an update). To understand how it works, see Send Emails using PHPmailer.

```php
$sendMail['smtpHost'] = 'smtp.example.org';        //  Set the SMTP server to send through
$sendMail['smtpSecure'] = 'tls';                   //  Enable TLS encryption
$sendMail['smtpPort'] = 587;                       //  Set the TCP port to connect to
$sendMail['CheckEmail'] = 'maggot@exemple.org';    //  Email address authorized to send emails
$sendMail['CheckPass'] = 'password';               //  The corresponding password
$sendMail['CheckName'] = 'Maggot';                 //  Alias name
$sendMail['UserEmail'] = 'admin@exemple.org';      //  Email of data managers, separated by a comma
```

"},{"location":"settings/#run","title":"run","text":"

This file contains the essential parameters to be set before any use.

| Parameter | Description | Default value |
|---|---|---|
| WEB_PORT | Local HTTP port for the web application | 8087 |
| DATADIR | Path to the data | /opt/data/ |
| DB_IMAGE | Docker image name of the MongoDB | pgd-mmdt-db |
| SCAN_IMAGE | Docker image name of the scan process | pgd-mmdt-scan |
| WEB_IMAGE | Docker image name of the web interface | pgd-mmdt-web |
| DB_CONTAINER | Docker container name of the MongoDB | mmdt-db |
| SCAN_CONTAINER | Docker container name of the scan process | mmdt-scan |
| WEB_CONTAINER | Docker container name of the web interface | mmdt-web |
| MONGO_VOL | Volume name for MongoDB | mmdt-mongodb |
| USER | Admin user in the htpasswd file | admin |

"},{"location":"definitions/","title":"Definition Files","text":""},{"location":"definitions/#metadata-definition-files","title":"Metadata definition files","text":"

The Maggot tool offers great flexibility in configuration. It allows you to choose all the metadata with which you want to describe your data. You can start from an existing metadata schema, invent your own, or, more pragmatically, mix one or more schemas while introducing some metadata specific to your field of application. However, keep in mind that if you want to add descriptive metadata to your data, a certain amount of information is expected. A completely different use of the tool is nevertheless possible; it's up to you.

There are two levels of definition files, as shown in the figure below:

1 - The first level concerns the definition of the terminology (metadata), similar to a descriptive metadata plan. This category is closer to configuration files: they represent the heart of the application, around which everything else is built. The input and search interfaces are completely generated from these definition files (especially the web/conf/config_terms.txt file), which define each field, its input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, or a drop-down list of fixed terms). This is why a configuration step is essential to configure all the other modules.

2 - The second level concerns the definitions of the mapping to a differently structured metadata schema (a metadata crosswalk, i.e. a specification for mapping one metadata standard to another), used either i) for metadata export to a remote repository (e.g. Dataverse, Zenodo), or ii) for metadata harvesting (e.g. JSON-LD, OAI-PMH). Simply place the definition files in the configuration directory (web/conf) for them to be taken into account, provided you have adjusted the configuration (web/inc/config/config.inc).

All definition files are made using a simple spreadsheet then exported in TSV format.

The list of definition files in Maggot is given below. All of them must be placed under the directory web/conf.

See an example online: https://pmb-bordeaux.fr/maggot/config/view and the corresponding form based on these definition files.

"},{"location":"definitions/config_terms/","title":"Terminlogy Definition","text":""},{"location":"definitions/config_terms/#example-of-a-terminlogy-definition-file","title":"Example of a Terminlogy Definition file","text":"Field Section Required Search ShortView Type features Label Predefined terms title definition Y N 1 textbox width=350px Short name fulltitle definition Y Y 2 textbox Full title subject definition Y Y checkbox open=0 Subject Agricultural Sciences,Arts and Humanities,Astronomy and Astrophysics,Business and Management,Chemistry,Computer and Information Science,Earth and Environmental Sciences,Engineering,Law,Mathematical Sciences,Medicine Health and Life Sciences,Physics,Social Sciences,Other description definition Y Y areabox rows=6,cols=30 Description of the dataset note definition N Y areabox rows=4,cols=30 Notes status status N Y 3 dropbox width=350px Status of the dataset Processed,In progress,Unprocessed access_rights status N Y 4 dropbox width=350px Access rights to data Public,Mixte,Private language status N Y checkbox open=0 Language Czech,Danish,Dutch,English,Finnish,French,German,Greek,Hungarian,Icelandic,Italian,Lithuanian,Norwegian,Romanian,Slovenian,Spanish,Swedish lifeCycleStep status N Y multiselect autocomplete=lifecycle,min=1 Life cycle step license status N Y textbox autocomplete=license,min=1 License datestart status N Y datebox width=350px Start of collection dateend status N Y datebox width=350px End of collection dmpid status N Y textbox DMP identifier contacts management Y Y multiselect autocomplete=people,min=1 Contacts authors management Y Y multiselect autocomplete=people,min=1 Authors collectors management N Y multiselect autocomplete=people,min=1 Data collectors curators management N Y multiselect autocomplete=people,min=1 Data curators members management N Y multiselect autocomplete=people,min=1 Project members leader management N Y multiselect autocomplete=people,min=1 Project leader wpleader management N Y 
multiselect autocomplete=people,min=1 WP leader depositor management N Y textbox Depositor producer management N Y multiselect autocomplete=producer,min=1 Producer grantNumbers management N Y multiselect autocomplete=grant,min=1 Grant Information kindOfData descriptors Y Y checkbox open=0 Kind of Data Audiovisual,Collection,Dataset,Event,Image,Interactive Resource,Model,Physical Object,Service,Software,Sound,Text,Workflow,Other keywords descriptors N Y multiselect autocomplete=bioportal,onto=EFO:JERM:EDAM:MS:NMR:NCIT:OBI:PO:PTO:AGRO:ECOCORE:IOBC:NCBITAXON Keywords topics descriptors N Y multiselect autocomplete=VOvocab Topic Classification dataOrigin descriptors N Y checkbox open=0 Data origin observational data,experimental data,survey data,analysis data,text corpus,simulation data,aggregate data,audiovisual corpus,computer code,Other experimentfactor descriptors N Y multiselect autocomplete=vocabulary,min=1 Experimental Factor measurement descriptors N Y multiselect autocomplete=vocabulary,min=1 Measurement type technology descriptors N Y multiselect autocomplete=vocabulary,min=1 Technology type publication_citation descriptors N Y areabox rows=5,cols=30 Publication - Citation publication_idtype descriptors N Y dropbox width=200px Publication - ID Type -,ark,arXiv,bibcode,doi,ean13,eissn,handle,isbn,issn,istc,lissn,lsid,pmid,purl,upc,url,urn publication_idnumber descriptors N Y textbox width=400px Publication - ID Number publication_url descriptors N Y textbox Publication - URL comment other N Y areabox rows=15, cols=30 Additional information"},{"location":"definitions/dataverse/","title":"Dataverse Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/dataverse/#dataverse-definition-file_1","title":"Dataverse definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Dataverse. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Dataverse, knowing that this metadata crosswalk was made possible by choosing the right metadata schema upstream.

Since the structure of the Dataverse JSON output file is known internally, only a minimal amount of information is needed to carry out the mapping.

The file must have 4 columns with headers defined as follows:

Below is an example of a Dataverse definition file (TSV).

Example of a Dataverse JSON file generated from the definition file given as an example above.
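In spirit, the crosswalk boils down to renaming and restructuring fields. A minimal sketch in Python follows; the field names and the flat output are illustrative only, since the real export produces Dataverse's nested blocks rather than a flat dictionary:

```python
def crosswalk(meta: dict, mapping: dict) -> dict:
    """Rename source (Maggot) fields to target-schema fields; fields
    without a mapping entry are simply dropped. The real Dataverse
    export additionally restructures values into nested blocks."""
    return {target: meta[src] for src, target in mapping.items() if src in meta}

# Illustrative field names on both sides of the mapping:
maggot_meta = {"fulltitle": "Example dataset", "description": "Demo"}
dataverse_like = crosswalk(maggot_meta, {"fulltitle": "title", "description": "dsDescription"})
```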

"},{"location":"definitions/json-ld/","title":"JSON-LD Definition File","text":""},{"location":"definitions/json-ld/#json-ld-definition-file_1","title":"JSON-LD definition File","text":"

This definition file will allow harvesters to collect structured metadata based on a semantic schema, i.e. the fields themselves, and not just their content, can be associated with a semantic definition (an ontology, for example), which then facilitates the link between the metadata and therefore the data (JSON-LD). The chosen semantic schema is based on several metadata schemas.

The full workflow to \"climb the Linked Open Data mountain\" is summarized in the figure below:

Metadata schemas used to build the model proposed by default:

Definition of the JSON-LD context using the metadata schemas proposed by default

Since the structure of the JSON-LD output is not known internally, information about its structure is needed to carry out the mapping.

Example of JSON-LD definition file (partial) using the metadata schemas proposed by default (TSV)

Example of a JSON-LD file generated from the definition file given above.
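The idea of a JSON-LD context can be sketched as follows (illustrative Python; the field-to-term mapping here is hypothetical, whereas Maggot builds it from the JSON-LD definition file):

```python
import json

# Illustrative context mapping local field names to schema.org terms;
# the real context comes from the JSON-LD definition file.
CONTEXT = {'@vocab': 'http://schema.org/', 'title': 'name', 'summary': 'description'}

def to_jsonld(meta):
    # Attach the semantic context and type to the plain metadata dict.
    doc = {'@context': CONTEXT, '@type': 'Dataset'}
    doc.update(meta)
    return doc

print(json.dumps(to_jsonld({'title': 'My dataset', 'summary': 'A test dataset'}), indent=2))
```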

"},{"location":"definitions/mapping/","title":"Mapping Definition File","text":""},{"location":"definitions/mapping/#mapping-definition-file_1","title":"Mapping definition File","text":"

As its name indicates, the mapping file is used to match a term chosen by the user during entry with another term from an ontology or a thesaurus, thereby providing a URL that will be used for referencing. It can be used for any metadata crosswalk requiring such a mapping (e.g. to the Dataverse, Zenodo or JSON-LD format).

The role of this definition file is illustrated in the figure above.

The file must have 5 columns with headers defined as follows:

Below is an example of a Mapping definition file (TSV)

"},{"location":"definitions/oai-pmh/","title":"OAI-PMH Definition File","text":"

OAI-PMH is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives.

"},{"location":"definitions/oai-pmh/#oai-pmh-definition-file_1","title":"OAI-PMH definition File","text":"

This definition file will allow harvesters to collect metadata structured according to a standard schema (OAI-DC).

Since the structure of the OAI-PMH output file is known internally, only a minimum of information is needed to carry out the mapping.

Example of OAI-PMH definition file (TSV)

Another example of an OAI-PMH definition file (TSV) with identifiers & vocabulary mapping
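For illustration, a harvester consuming the OAI-DC output could extract fields like this (a minimal Python sketch on a hand-written sample record, not actual Maggot output):

```python
import xml.etree.ElementTree as ET

# A tiny OAI-DC fragment, as a harvester might receive it (sample values only).
SAMPLE = '''<record xmlns:oai_dc='http://www.openarchives.org/OAI/2.0/oai_dc/'
                    xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <oai_dc:dc>
    <dc:title>My dataset</dc:title>
    <dc:creator>Doe, Jane</dc:creator>
  </oai_dc:dc>
</record>'''

# Dublin Core elements live in the standard dc namespace.
NS = {'dc': 'http://purl.org/dc/elements/1.1/'}
root = ET.fromstring(SAMPLE)
print(root.find('.//dc:title', NS).text)  # My dataset
```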

"},{"location":"definitions/terminology/","title":"Terminology","text":""},{"location":"definitions/terminology/#definition-of-terminology","title":"Definition of terminology","text":"

There are two definition files to set up.

Each time there is a change in these two definition files, it is necessary to convert them so that they are taken into account by the application.

Terminology is the set of terms used to define the metadata of a dataset. A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces (e.g. screenshot) are completely generated from this definition file, thus defining i) each of the fields and their input type (checkbox, dropbox, textbox, ...) and ii) the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list according to a list of fixed terms).

The metadata schema proposed by default is mainly established according to the DDI (Data Documentation Initiative) schema, which also corresponds to that adopted by the Dataverse software.

Terminology is organised in several sections. By default 6 sections are proposed, but you can redefine them as you wish:

For each section, fields are then defined. These fields are defined according to the way they will be entered via the web interface. There are 6 different input types: check boxes (checkbox), drop-down lists (dropbox), single-line text boxes (textbox), single-line text boxes with an additional box for multiple selection from a catalog of terms (multiselect), date pickers (datebox) and multi-line text boxes (areabox).

For two types (checkbox and dropbox), it is possible to define the values to be selected (predefined terms).

"},{"location":"definitions/terminology/#structure-of-the-terminology-definition-file-tsv","title":"Structure of the Terminology definition file (TSV)","text":"

The file must have 9 columns with headers defined as follows:

Below is an example of a Terminology definition file (TSV)

Example of Maggot JSON file generated based on the same definition file

"},{"location":"definitions/terminology/#structure-of-the-terminology-documentation-file-tsv","title":"Structure of the Terminology documentation file (TSV)","text":"

The documentation definition file is used to provide online help for each field (a small icon placed next to each label on the form). It should therefore only be modified when a field is added, deleted, or moved to another section. This file is then used to generate the online metadata documentation as shown in the figure below (See Configuration to find out how to carry out this transformation).

The file must have 3 columns with headers defined as follows:

Below is an example of a Terminology documentation file (TSV)

The same example as above converted to HTML using Markdown

"},{"location":"definitions/vocabulary/","title":"Vocabulary","text":""},{"location":"definitions/vocabulary/#vocabulary_1","title":"Vocabulary","text":"

1 - Vocabulary based on a list of terms fixed in advance (checkbox with feature open=0)

2 - Vocabulary open for addition (checkbox with feature open=1)

3 - Vocabulary based on a web API in a text field (textbox)

4 - Vocabulary based on a dictionary with multiple selection (multiselect)

5 - Vocabulary based on a SKOSMOS Thesaurus with multiple selection (multiselect)

6 - Vocabulary based on an OntoPortal with multiple selection (multiselect)

"},{"location":"definitions/zenodo/","title":"Zenodo Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/zenodo/#zenodo-definition-file_1","title":"Zenodo definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Zenodo. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Zenodo.

Since the structure of the Zenodo JSON output file is not known internally, information about its structure is needed to carry out the mapping.

Below is an example of a Zenodo definition file (TSV)

Example of a Zenodo JSON file generated from the definition file given above.

"},{"location":"publish/","title":"Publish Metadata","text":""},{"location":"publish/#publish-metadata_1","title":"Publish Metadata","text":"

"},{"location":"publish/#httpswwwgooglecomsearchqmetadatacrosswalkdefinitionoqmetadatacrosswalk","title":"https://www.google.com/search?q=metadata+crosswalk+definition&oq=metadata+crosswalk","text":""},{"location":"publish/dataverse/","title":"Publish into Dataverse","text":""},{"location":"publish/dataverse/#publish-into-dataverse_1","title":"Publish into Dataverse","text":"

1 - To submit metadata to a Dataverse repository, you must first select either a dataset from the drop-down list of datasets listed on the data storage space, or a metadata file from your local disk.

2 - You then need to connect to the repository in order to retrieve the key (the API token) authorizing you to submit the dataset. This obviously assumes that you have the privileges (creation/modification rights) to do so.

3 - After choosing the repository URL, you must also specify the dataverse collection into which you want to deposit the dataset. As before, you must have write access to this dataverse collection.
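Under the hood, a deposit of this kind goes through the Dataverse native API; here is a minimal sketch of the request pieces involved (the server, collection and token values are placeholders, and the helper function is hypothetical):

```python
# Sketch of the request pieces for creating a dataset in a Dataverse
# collection via the native API ('/api/dataverses/{alias}/datasets').
def build_deposit_request(server_url, collection, api_token):
    url = f'{server_url}/api/dataverses/{collection}/datasets'
    # The API token retrieved from the repository goes in this header.
    headers = {'X-Dataverse-key': api_token, 'Content-Type': 'application/json'}
    return url, headers

url, headers = build_deposit_request('https://demo.dataverse.org', 'mycoll', 'xxxx-token')
# An actual deposit would then POST the generated Dataverse JSON file to `url`.
print(url)
```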

"},{"location":"publish/dataverse/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"publish/zenodo/","title":"Publish into Zenodo","text":""},{"location":"publish/zenodo/#publish-into-zenodo_1","title":"Publish into Zenodo","text":"

1 - To submit metadata to a Zenodo repository, you must first select either a dataset from the drop-down list of datasets listed on the data storage space, or a metadata file from your local disk.

2 - Unless you have previously saved your API token, you must create a new one and copy and paste it before validating it. Before validating, you must check the deposit:access and deposit:write boxes in order to obtain creation and modification rights with this token.

3 - After choosing the repository URL, you can optionally choose a community to which the dataset will be linked. By default, you can leave this field empty.
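Likewise, a Zenodo deposit goes through the Zenodo REST API; here is a minimal sketch of the request pieces (the token and community values are placeholders, and the helper function is hypothetical):

```python
# Sketch of the request pieces for creating a deposition via the Zenodo
# REST API ('/api/deposit/depositions'); the community is optional.
def build_zenodo_request(server_url, api_token, community=None):
    url = f'{server_url}/api/deposit/depositions'
    headers = {'Authorization': f'Bearer {api_token}'}
    metadata = {}
    if community:
        # Link the deposition to a community if one was chosen.
        metadata['communities'] = [{'identifier': community}]
    return url, headers, {'metadata': metadata}

url, headers, payload = build_zenodo_request('https://zenodo.org', 'xxxx-token')
# An actual deposit would then POST `payload` as JSON to `url` with `headers`.
print(url)
```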

"},{"location":"publish/zenodo/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"tutorial/","title":"Quick tutorial","text":""},{"location":"tutorial/#quick-tutorial_1","title":"Quick tutorial","text":"

This is a quick tutorial on how to use the Maggot tool in practice, primarily targeting the end user.

See a short Presentation and Poster if you want to have a more general overview of the tool.

"},{"location":"tutorial/#overview","title":"Overview","text":"

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below:

Configuration

This module mainly concerns the data manager and makes it possible to construct all the terminology definition files, i.e. the metadata and sources of associated vocabularies. See Definition files then Configuration.

Private Access

This module allows a data producer to temporarily protect access to metadata for as long as necessary before sharing it within their collective. See Private access key management.

Dictionaries

This module allows data producers to view the content of all dictionaries. It also allows the data steward to edit their content. See Dictionaries for technical details only.

Metadata Entry

This is the main module allowing the data producer to enter their metadata relating to a dataset. See the corresponding tutorial for Metadata Entry.

Search datasets

This module allows users to search datasets based on the associated metadata, to see all the metadata and possibly to have access to the data itself. This obviously assumes that the metadata files have been deposited in the correct directory in the storage space dedicated to data management within your collective. See Infrastructure.

File Browser

This module gives users access to a file browser provided that the data manager has installed it. See File Browser

Publication

This module allows either the data producer or the data steward to publish the metadata with possibly the corresponding data within the suitable data repository. See Publication

"},{"location":"tutorial/describe/","title":"Quick tutorial","text":""},{"location":"tutorial/describe/#metadata-entry","title":"Metadata Entry","text":"

The figures are given here for illustration purposes, but certain elements may differ for you since this depends on the configuration of your instance, in particular the choice of metadata and the associated vocabulary sources.

Indeed, the choice of vocabulary sources (ontologies, thesauri, dictionaries) as well as the choice of metadata fields to enter should in principle have been discussed between the data producers and the data manager when setting up the Maggot tool, in order to find the best compromise between the vocabulary sources and the scientific fields targeted (see Definition files). However, a later addition is always possible.

"},{"location":"tutorial/describe/#overview","title":"Overview","text":"

When you enter the metadata entry module you should see a page that looks like the figure below:

"},{"location":"tutorial/describe/#dictionaries","title":"Dictionaries","text":"

Dictionary-based metadata (e.g. people's names) can easily be entered by autocomplete in the 'Search value' box provided the name appears in the corresponding dictionary.

However, if the name does not yet appear in the dictionary, simply enter the full name (first name & last name) in the main box, making sure to separate each name with a comma and then a space as shown in the figure below.

Then you can request to add the additional person name(s) to the dictionary later as described below:

Please proceed in the same way for all dictionaries (people, funders, producer, vocabulary)

"},{"location":"tutorial/describe/#controlled-vocabulary","title":"Controlled Vocabulary","text":"

Depending on the configuration of your instance, it is very likely that certain fields (e.g. keywords) are connected to a controlled vocabulary source (e.g. ontology, thesaurus). Vocabulary based on ontologies, thesauri or even dictionaries can easily be entered by autocompletion in the \"search for a value\" box, provided that the term exists in the corresponding vocabulary source.

If a term cannot be found by autocomplete, you can enter the term directly in the main box, making sure to separate each term with a comma and a space as shown in the figure below.

The data steward will later try to link it to a vocabulary source that may be suitable for the domain in question. Furthermore, even if the choice of vocabulary sources was made before the tool was put into service, a later addition is always possible. You should make the request to your data manager.

"},{"location":"tutorial/describe/#resources","title":"Resources","text":"

Data is often scattered across various platforms, databases, and file formats, making it challenging to locate and access; this is called data fragmentation. The Maggot tool therefore allows you to specify resources, i.e. data in the broader sense, whether external or internal, centralizing all links towards the data.

Four fields must be filled in:

"},{"location":"tutorial/metadata/","title":"Quick tutorial","text":""},{"location":"tutorial/metadata/#metadata-file","title":"Metadata File","text":"

Once the form has been completed, even partially (at least the mandatory fields, marked with a red star), you can export your metadata as a file. The file is in JSON format and must have the prefix 'META_'.

By clicking on the \"Generate the metadata file\" button, you can save it on your disk space.

Furthermore, if email sending has been configured (see settings), you have the possibility of sending the metadata file to the data managers for safekeeping, and possibly also to support its storage on the data disk space if specific rights are required.

In the (most common) case where you want to save the metadata file to your disk space, two uses of this file are possible:

1. The first use is the recommended one because it allows metadata management within your collective.

You drop the metadata file directly under the data directory corresponding to the metadata. Indeed, when installing the tool, a storage space dedicated to the tool had to be provided for this purpose. See infrastructure. Once deposited, you just have to wait around 30 minutes at most so that the tool has had time to scan the root of the data directories for new files and update the database. After this period, the description of your dataset will be visible from the interface, and search criteria can be selected to narrow the search.

You will then have the possibility to publish the metadata later with possibly the corresponding data in a data repository such as Dataverse or Zenodo.
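The periodic scan mentioned above can be pictured with a minimal Python sketch (the real scanner runs as a cron job inside a container; the function name and return shape here are hypothetical):

```python
from pathlib import Path
import json

def scan(root):
    '''Collect every META_*.json file found under the storage root.'''
    records = []
    # Walk the data directories recursively, as the periodic scan does.
    for path in sorted(Path(root).rglob('META_*.json')):
        with open(path) as fh:
            records.append((str(path), json.load(fh)))
    return records
```

In the real tool, the parsed records are then loaded into the MongoDB database that backs the search interface.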

2. The second use is only to deposit the metadata into a data repository

Whether with Dataverse or Zenodo, you have the possibility of publishing metadata directly in one or other of these repositories without using the storage space.

Please note that you cannot also deposit the data files in this way. You will have to do this manually for each of them directly online in the repository.

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\\\s\\\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"

An ecosystem for sharing metadata

"},{"location":"#foster-good-data-management-with-data-sharing-in-mind","title":"Foster good data management, with data sharing in mind","text":"

Sharing descriptive Metadata is the first essential step towards Open Scientific Data. With this in mind, Maggot was specifically designed to annotate datasets by creating a metadata file to attach to the storage space. Indeed, it allows users to easily add descriptive metadata to datasets produced within a collective of people (research unit, platform, multi-partner project, etc.). This approach fits perfectly into a data management plan as it addresses the issues of data organization and documentation, data storage and frictionless metadata sharing within this same collective and beyond.

"},{"location":"#main-features-of-maggot","title":"Main features of Maggot","text":"

The main functionalities of Maggot were established according to a well-defined need (See Background).

  1. Document with Metadata your datasets produced within a collective of people, thus making it possible to:
  2. Search datasets by their metadata
  3. Publish the metadata of datasets along with their data files into a Europe-approved repository

See a short Presentation and Poster for a quick overview.

"},{"location":"#overview-of-the-different-stages-of-metadata-management","title":"Overview of the different stages of metadata management","text":"

Note: The step numbers indicated in the figure correspond to the different points developed below

1 - First you must define all the metadata that will be used to describe your datasets. All metadata can be defined using a single file (in TSV format, therefore editable in a spreadsheet). This is an unavoidable step because both the input and search interfaces are completely generated from these definition files, which thus define each field along with its input type as well as the associated controlled vocabulary (ontology, thesaurus, dictionary, list of fixed terms). The metadata proposed by default was mainly established according to the DDI (Data Documentation Initiative) metadata schema. This schema also largely corresponds to that adopted by the Dataverse software. See the Terminology Definition section.

2 - Entering metadata is greatly facilitated by the use of dictionaries. The dictionaries offered by default are: people, funders, data producers, as well as a vocabulary dictionary allowing you to mix ontologies and thesauri from several sources. Each of these dictionaries allows users, by entering a name by autocompletion, to associate information which will then be added when exporting the metadata, either to a remote repository or for metadata harvesting. Thus this information, once entered into a dictionary, will not need to be re-entered.

3 - The web interface for entering metadata is entirely built on the basis of definition files. The metadata are distributed according to the different sections chosen, each constituting a tab (see screenshot). Mandatory fields are marked with a red star and must be documented in order to be able to generate the metadata file. The entry of metadata governed by a controlled vocabulary is done by autocompletion from term lists (dictionary, thesaurus or ontology). We can also define external resources (URL links) relating to documents, publications or other related data. Maggot thus becomes a hub for your datasets connecting different resources, local and external. Once the mandatory fields (at least) and other recommended fields (at best) have been entered, the metadata file can be generated in JSON format.

4 - The file generated in JSON format must be placed in the storage space reserved for this purpose. The role played by this metadata file can be seen as a README file adapted for machines, but also readable by humans. With an internal structure, it offers coherence and consistency of information that a simple README file with a completely free and therefore unstructured text format does not allow. Furthermore, the central idea is to use the storage space as a local data repository, so that the metadata should go to the data and not the other way around.

5 - A search of the datasets can thus be carried out on the basis of the metadata. Indeed, all the JSON metadata files are scanned and parsed at a fixed time interval (30 min) and then loaded into a database. This allows you to perform searches based on the predefined metadata. The search form, in a compact layout, is almost the same as the entry form (see a screenshot). Depending on the search criteria, a list of datasets is provided, each with a link pointing to its detailed sheet.

6 - The detailed metadata sheet provides all the metadata divided by section. Unfilled metadata does not appear by default. When a URL can be associated with information (ORCID, Ontology, web site, etc.), you can click on it to go to the corresponding link. Likewise, it is possible to follow the associated link on each of the resources. From this sheet, you can also export the metadata according to different schemata (Dataverse, Zenodo, JSON-LD). See screenshot 1 & screenshot 2.

7 - Finally, once you have decided to publish your metadata with your data, you can choose the repository that suits you (currently repositories based on Dataverse and Zenodo are supported).

"},{"location":"#additional-key-points","title":"Additional key points","text":"

"},{"location":"about/","title":"About","text":""},{"location":"about/#background","title":"Background","text":""},{"location":"about/#motives","title":"Motives","text":""},{"location":"about/#state-of-need","title":"State of need","text":""},{"location":"about/#proposed-approach","title":"Proposed approach","text":""},{"location":"about/#links","title":"Links","text":""},{"location":"about/#contacts","title":"Contacts","text":""},{"location":"about/#designers-developers","title":"Designers / Developers","text":""},{"location":"about/#contributors","title":"Contributors","text":"

"},{"location":"bloxberg/","title":"Bloxberg Blockchain","text":""},{"location":"bloxberg/#experimental-certification-of-metadata-file-on-the-bloxberg-blockchain","title":"EXPERIMENTAL - Certification of metadata file on the bloxberg blockchain","text":""},{"location":"bloxberg/#motivation","title":"Motivation","text":"

To guarantee the authenticity and integrity of a metadata file by recording it permanently and immutably on the bloxberg blockchain.

Indeed, a blockchain is a technology that keeps track of a set of transactions (writes to the chain) in a decentralized, secure and transparent manner. A blockchain can therefore be compared to a large (public or private) unfalsifiable register. Blockchain is used today in many fields because it provides solutions to many problems. In Higher Education and Research, for example, registering dataset metadata in the blockchain makes it possible to certify, in an inalienable, irrefutable and completely transparent manner, the ownership and authenticity of the data as well as, for example, the license of use and the date of production of the data. Research stakeholders are then more open to the dissemination of their data (files, results, protocols, publications, etc.) since they know that, in particular, the ownership, content and conditions of use of the data cannot be altered.

The Maggot tool can thus serve as a gateway for certifying data together with the associated metadata. The complete process is shown schematically in the following figure:

"},{"location":"bloxberg/#about-bloxberg","title":"About bloxberg","text":"

bloxberg is the most important blockchain project in science. It was founded in 2019 by MPDL, which was looking for a way to store research results and make them available to other researchers. In this sense, bloxberg is a decentralized register in which results can be stored in a tamper-proof way with a time stamp and an identifier.

bloxberg is based on the Ethereum Blockchain. However, it makes use of a different consensus mechanism: instead of \u201cProof of Stake\u201d used by Ethereum since 2022, bloxberg validates blocks through \u201cProof of Authority\u201d. Each node is operated by one member. All members of the association are research institutions and are known in the network. Currently, bloxberg has 49 nodes. It is an international project with participating institutions from all over the world.

"},{"location":"bloxberg/#how-to-process","title":"How to process ?","text":"

You will need an Ethereum address and an API key (which must be requested via bloxberg-services (at) mpdl.mpg.de). See an example of pushing a metadata file to the bloxberg blockchain using Maggot.

"},{"location":"bloxberg/#useful-links","title":"Useful links","text":""},{"location":"configuration/","title":"Configuration","text":""},{"location":"configuration/#terminology-configuration","title":"Terminology configuration","text":"

A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces are completely generated from this definition file, thus defining each of the fields, their input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list according to a list of fixed terms). This is why a configuration and conversion step into JSON format is essential in order to be able to configure all the other modules (example: creation of the MongoDB database schema when starting the application before filling it).
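The TSV-to-JSON conversion step can be sketched as follows (the column names here are hypothetical; the real config_terms.txt has 9 columns):

```python
import csv, io, json

# Hypothetical 3-column excerpt; the real config_terms.txt has 9 columns.
TSV = 'field\tsection\tlabel\ntitle\tdescription\tDataset title\n'

# Each TSV row becomes a JSON object keyed by the header line.
rows = list(csv.DictReader(io.StringIO(TSV), delimiter='\t'))
print(json.dumps(rows, indent=2))
```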

"},{"location":"configuration/#tsv-to-json","title":"TSV to JSON","text":""},{"location":"configuration/#tsv-to-doc","title":"TSV to DOC","text":""},{"location":"configuration/#json-to-tsv","title":"JSON to TSV","text":""},{"location":"dictionaries/","title":"Dictionaries","text":""},{"location":"dictionaries/#presentation","title":"Presentation","text":""},{"location":"dictionaries/#the-people-dictionary","title":"The people dictionary","text":"

"},{"location":"dictionaries/#other-dictionaries","title":"Other dictionaries","text":"

"},{"location":"gant/","title":"Gant","text":""},{"location":"gant/#gantt-diagrams-of-the-developments","title":"Gantt diagrams of the developments","text":"gantt dateFormat YYYY-MM-DD axisFormat %Y-%m title Diagrammes de Gantt pr\u00e9visionnel des d\u00e9veloppements section MongoDB 1: des1, 2023-11-01,60d 2: des2, 2023-12-01,90d 3: des3, 2023-12-01,90d section Couche API 4: des4, 2024-01-01,120d 5: des5, 2024-04-01,60d section Interface Web 6a: des6, 2024-06-01,60d 6b: des7, 2024-07-01,60d 6c: des8, 2024-09-01,60d"},{"location":"infrastructure/","title":"Infrastructure","text":""},{"location":"infrastructure/#infrastructure-local-remote-or-mixed","title":"Infrastructure : Local, Remote or Mixed","text":"

The required infrastructure involves 1) a machine running a Linux OS and 2) a dedicated storage space.

1 - The machine will most often be of the \"virtual\" type because it is simpler to deploy, either locally (with VM providers such as VirtualBox, VMware Workstation or MS Hyper-V) or remotely (e.g. VMware ESXi, OpenStack: example of deployment). Moreover, the OS of your machine must allow the deployment of Docker containers. See \u201cWhat is Docker\u201d for more details. The minimum characteristics of the VM are: 2 CPUs, 2 GB RAM, 8 GB of disk.

2 - The dedicated storage space can be either in the local space of the VM or in a remote location on the network.

"},{"location":"installation/","title":"Installation","text":""},{"location":"installation/#install-on-your-linux-computer-or-linux-unix-server","title":"Install on your linux computer or linux / unix server","text":"

Requirements: The installation must be carried out on a (virtual) machine with a recent Linux OS that supports Docker (see Infrastructure).

"},{"location":"installation/#retrieving-the-code","title":"Retrieving the code","text":"

Go to the destination directory of your choice then clone the repository and cd to your clone path:

git clone https://github.com/inrae/pgd-mmdt.git pgd-mmdt\ncd pgd-mmdt\n

"},{"location":"installation/#installation-of-docker-containers","title":"Installation of Docker containers","text":"

MAGGOT uses 3 Docker images for 3 distinct services:

"},{"location":"installation/#configuration","title":"Configuration","text":"

See Configuration settings

Warning: make sure to use the same MongoDB settings in all the above configuration files. It is best not to change anything. A single configuration file would have been preferable, but this has not yet been done given the different languages involved (bash, javascript, python, PHP). To be done!

Note: If you want to run multiple instances, you will need to change, in the run file, i) the container names, ii) the data path, and iii) the MongoDB volume name.

The following two JSON files are defined by default but can be easily configured from the web interface. See the Terminology Definition section.

"},{"location":"installation/#commands","title":"Commands","text":"

The run shell script allows you to perform multiple actions by specifying an option:

cd pgd-mmdt\nsh ./run <option>\n

Options:

"},{"location":"installation/#starting-the-application","title":"Starting the application","text":"

"},{"location":"installation/#launching-the-web-application-in-the-web-browser","title":"Launching the web application in the web browser","text":"
\n   CONTAINER ID  IMAGE          COMMAND                 CREATED          STATUS         PORTS                                  NAMES\n   5914504f456d  pgd-mmdt-web   \"docker-php-entrypoi.\"  12 seconds ago   Up 10 seconds  0.0.0.0:8087->80/tcp, :::8087->80/tcp  mmdt-web\n   226b13ed9467  pgd-mmdt-scan  \"cron -f\"               12 seconds ago   Up 11 seconds                                         mmdt-scan\n   81fecbb56d23  pgd-mmdt-db    \"docker-entrypoint.s.\"  13 seconds ago   Up 12 seconds  27017/tcp                              mmdt-db\n

"},{"location":"installation/#stoping-the-application","title":"Stoping the application","text":""},{"location":"installation/#updating-the-application","title":"Updating the application","text":"

When updating the application, it is imperative to preserve a whole set of configuration files as well as the content of certain directories (dictionaries, javascripts dedicated to vocabularies, etc.). An update script is available (./etc/update-maggot.sh), preferably placed under '/usr/local/bin'. To preserve your configuration, it is recommended to create local configuration files.

"},{"location":"installation/#architecture-diagram","title":"Architecture diagram","text":"

Note: See how to proceed for the configuration steps.

"},{"location":"installation/#file-browser","title":"File Browser","text":"

You can provide access to your data via a file browser. This application must be installed separately but can be connected to Maggot by specifying the corresponding URL in the configuration file. Users and their rights are managed in the filebrowser application. Likewise, password-free links to the data can also be created; such links can usefully be specified as external resources in the metadata managed by Maggot.

See how to install it on GitHub.

"},{"location":"private-access/","title":"Private access","text":""},{"location":"private-access/#private-access-key-management","title":"Private access key management","text":""},{"location":"private-access/#motivation","title":"Motivation","text":"

Although the Maggot tool is designed to foster the sharing of metadata within a collective, it may be necessary to temporarily privatize access to the metadata of an ongoing project with confidentiality constraints. So even within our own collective, access to metadata must be restricted to authorized users only.

"},{"location":"private-access/#implementation","title":"Implementation","text":"

The choice not to manage users in the Maggot tool was made in order to keep the metadata completely open by default within a collective. Furthermore, access rights to the storage space are managed independently of the Maggot tool by the administrator of this space. It is therefore through the storage space that access to the metadata via the web interface is granted or denied.

The chosen mechanism for privatizing access is described below. It has the dual advantage of being simple to implement and simple to use.

  1. First we have to generate a file containing the encrypted key for private access. This file must be generated from the web interface and then downloaded, as shown in the figure below. It must then be manually deposited in the data directory corresponding to the dataset whose access we wish to privatize. The presence of this file within a directory is enough to block access to the metadata and data by default. Note that this same file containing the encrypted private key can be placed in several data directories (for example, within the same project). The deposit must be done by hand because the Maggot tool only has read access to the storage space. This also guarantees that the user has write rights to this space, without user accounts having to be managed on the Maggot side.

    By default, \u2018untwist1\u2019 metadata are not accessible to anyone

  2. When we want to access the metadata of this dataset, we simply enter the private key in the current session. This unlocks access to the metadata via the web interface only for the current session of our web browser. This means that the private key must be entered again for each session (by default, a session lasts a maximum of 1 hour).

    Now the \u2018untwist1\u2019 metadata are accessible only to us

  3. When we want to give the entire collective access to the metadata, we simply delete the private access file (named 'META_auth.txt' by default) from the data directory concerned.
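The mechanism above can be sketched in a few lines; this is a minimal illustration of the described behavior (the presence of the file blocks access, a key entered in the session unlocks it), not Maggot's actual PHP implementation, and the function and parameter names are invented:

```python
from pathlib import Path

AUTH_FILE = "META_auth.txt"  # default name of the private access file

def dataset_is_accessible(data_dir, session_keys):
    """Return True if the dataset's metadata may be shown in the current session.

    Illustrative sketch only: the mere presence of the private access file
    in a data directory blocks access by default, unless the session holds
    the matching key.
    """
    auth_file = Path(data_dir) / AUTH_FILE
    if not auth_file.exists():
        return True  # no private access file: metadata open to the collective
    # unlocked only if the key stored in the file was entered in this session
    return auth_file.read_text().strip() in session_keys
```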

"},{"location":"settings/","title":"Configuration settings","text":""},{"location":"settings/#configuration-settings_1","title":"Configuration settings","text":"

Here is the list of all files that may be subject to adjustment of certain parameters according to the needs of the instance site.

"},{"location":"settings/#dockerscanpartscriptsconfigpy","title":"dockerscanpart/scripts/config.py","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters should not normally need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

The parameters are as follows (default value in parentheses):

- dbserver: name of the MongoDB server (mmdt-db)
- database: name of the MongoDB database (pgd-db)
- dbport: port of the MongoDB server (27017)
- username: username of the Mongo database pgd-db with read/write access (userw-pgd)
- password: password corresponding to that username (wwwww)
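As a sketch, dockerscanpart/scripts/config.py with the documented default values might look like the following; the exact variable names used in the real file are an assumption here:

```python
# Illustrative sketch of dockerscanpart/scripts/config.py
# (documented defaults; variable names are assumptions)
dbserver = "mmdt-db"    # name of the MongoDB server
database = "pgd-db"     # name of the MongoDB database
dbport = 27017          # port of the MongoDB server
username = "userw-pgd"  # user of the pgd-db database with read/write access
password = "wwwww"      # password corresponding to that username
```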

"},{"location":"settings/#incconfigmongodbinc","title":"inc/config/mongodb.inc","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters should not normally need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

The parameters are as follows (default value in parentheses):

- docker_mode: indicates whether the installation uses docker containers; in that case the Mongo DB IP address will differ from 127.0.0.1 (1)
- uritarget: the Mongo DB IP address (mmdt-db with docker_mode=1, 127.0.0.1 with docker_mode=0)
- database: name of the MongoDB database (pgd-db)
- collection: name of the MongoDB collection (metadata)
- port: port of the MongoDB server (27017)
- username: username of the Mongo database pgd-db with read access only (userr-pgd)
- password: password corresponding to that username (rrrrr)

"},{"location":"settings/#incconfigconfiginc","title":"inc/config/config.inc","text":"

This file defines parameters related to i) the web interface and ii) the functionalities allowed for users. Only the parameters that may usefully be changed for the needs of an instance are described here.

The parameters are as follows (default value in parentheses):

- EXTERN: indicates that the tool is for external use only, i.e. without a storage space (0)
- PRIVATE_ACCESS: enables management of private access to metadata (0)
- ZOOMWP: zoom level of the web interface; reducing the size slightly gives a better layout (90%)
- RESMEDIA: allows a MIME type to be set on each resource in the metadata (1)
- TITLE: title displayed in the main banner (Metadata management)
- FILEBROWSER: indicates whether the file browser is used; this assumes it is installed (0)
- URL_FILEBROWSER: file browser URL, absolute or relative (/fb/)
- APPNAME: name given in the URL to access the web interface (maggot)
- dataverse_urls: array of Dataverse repository URLs where you can upload metadata and data (-)
- zenodo_urls: array of Zenodo repository URLs where you can upload metadata and data (-)
- SERVER_URL: default Dataverse repository URL (https://entrepot.recherche.data.gouv.fr)
- ZENODO_SERVER_URL: default Zenodo repository URL (https://zenodo.org)
- export_dataverse: indicates whether the Dataverse feature is enabled (1)
- export_zenodo: indicates whether the Zenodo feature is enabled (1)
- export_jsonld: indicates whether the JSON-LD feature is enabled (1)
- export_oai: indicates whether the OAI-PMH feature is enabled (0)
- export_bloxberg: indicates whether the Bloxberg blockchain feature is enabled, experimental (0)
- cvdir: relative path of the controlled vocabulary lists (cvlist/)
- maggot_fulltitle: Maggot name of the field corresponding to the title in Dataverse/Zenodo (fulltitle)
- auth_senddata_file: name of the file that must be present in the data directory to authorize the transfer of the data files (META_datafile_ok.txt)
- private_auth_file: name of the private access file (META_auth.txt)
- sendMail: messaging configuration for sending metadata to data managers, see below (NULL)

The messaging configuration is done using the following array in the inc/config/config.inc file (or, more judiciously, in inc/config/local.inc so that it is preserved during an update). To understand how it works, see Send Emails using PHPmailer.

$sendMail['smtpHost'] = 'smtp.example.org';        //  Set the SMTP server to send through
$sendMail['smtpSecure'] = 'tls';                   //  Enable TLS encryption
$sendMail['smtpPort'] = 587;                       //  Set the TCP port to connect to
$sendMail['CheckEmail'] = 'maggot@example.org';    //  Email address authorized to send emails
$sendMail['CheckPass'] = 'password';               //  The corresponding password
$sendMail['CheckName'] = 'Maggot';                 //  Alias name
$sendMail['UserEmail'] = 'admin@example.org';      //  Emails of data managers, separated by a comma

"},{"location":"settings/#run","title":"run","text":"

This file contains the essential parameters to be set before any use.

The parameters are as follows (default value in parentheses):

- WEB_PORT: local HTTP port for the web application (8087)
- DATADIR: path to the data (/opt/data/)
- DB_IMAGE: Docker image name of the MongoDB (pgd-mmdt-db)
- SCAN_IMAGE: Docker image name of the scan process (pgd-mmdt-scan)
- WEB_IMAGE: Docker image name of the web interface (pgd-mmdt-web)
- DB_CONTAINER: Docker container name of the MongoDB (mmdt-db)
- SCAN_CONTAINER: Docker container name of the scan process (mmdt-scan)
- WEB_CONTAINER: Docker container name of the web interface (mmdt-web)
- MONGO_VOL: volume name for MongoDB (mmdt-mongodb)
- USER: admin user in the htpasswd file (admin)

"},{"location":"definitions/","title":"Definition Files","text":""},{"location":"definitions/#metadata-definition-files","title":"Metadata definition files","text":"

The Maggot tool offers great flexibility in configuration. It allows you to choose exactly which metadata you want to describe your data with. You can rely on an existing metadata schema, invent your own schema or, more pragmatically, mix one or more schemas while introducing some metadata specific to your field of application. Keep in mind, however, that if you want to add descriptive metadata to your data, a certain amount of information is expected. A completely different use of the tool is nevertheless possible; it is up to you.

There are two levels of definition files, as shown in the figure below:

1 - The first level concerns the definition of terminology (metadata), similar to a descriptive metadata plan. This category is closer to configuration files. They represent the heart of the application around which everything else is built. The input and search interfaces are completely generated from these definition files (especially the web/conf/config_terms.txt file), which define each of the fields, their input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list based on a list of fixed terms). This is why a configuration step is essential before all the other modules can be configured.

2 - The second level concerns the definitions of the mapping to a differently structured metadata schema (metadata crosswalk, i.e. a specification for mapping one metadata standard to another), used either i) for metadata export to a remote repository (e.g. Dataverse, Zenodo) or ii) for metadata harvesting (e.g. JSON-LD, OAI-PMH). Simply place the definition files in the configuration directory (web/conf) for them to be taken into account, provided you have adjusted the configuration (web/inc/config/config.inc).

All definition files are made using a simple spreadsheet then exported in TSV format.
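Since every definition file is a plain TSV with a header row, it can be read with any spreadsheet or a few lines of code. A minimal sketch (the file path and the column names shown in the comment follow the terminology file described in this documentation, but treat the details as assumptions):

```python
import csv

def load_definition(path):
    """Read a Maggot definition file (tab-separated, first row = headers)
    into a list of dicts, one per defined field."""
    with open(path, newline="", encoding="utf-8") as fh:
        return [row for row in csv.DictReader(fh, delimiter="\t")]

# e.g. fields = load_definition("web/conf/config_terms.txt")
# -> [{'Field': 'title', 'Section': 'definition', 'Required': 'Y', ...}, ...]
```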

The list of definition files in Maggot is given below. All must be placed under the directory web/conf.

See an example online: https://pmb-bordeaux.fr/maggot/config/view and the corresponding form based on these definition files.

"},{"location":"definitions/config_terms/","title":"Terminlogy Definition","text":""},{"location":"definitions/config_terms/#example-of-a-terminlogy-definition-file","title":"Example of a Terminlogy Definition file","text":"Field Section Required Search ShortView Type features Label Predefined terms title definition Y N 1 textbox width=350px Short name fulltitle definition Y Y 2 textbox Full title subject definition Y Y checkbox open=0 Subject Agricultural Sciences,Arts and Humanities,Astronomy and Astrophysics,Business and Management,Chemistry,Computer and Information Science,Earth and Environmental Sciences,Engineering,Law,Mathematical Sciences,Medicine Health and Life Sciences,Physics,Social Sciences,Other description definition Y Y areabox rows=6,cols=30 Description of the dataset note definition N Y areabox rows=4,cols=30 Notes status status N Y 3 dropbox width=350px Status of the dataset Processed,In progress,Unprocessed access_rights status N Y 4 dropbox width=350px Access rights to data Public,Mixte,Private language status N Y checkbox open=0 Language Czech,Danish,Dutch,English,Finnish,French,German,Greek,Hungarian,Icelandic,Italian,Lithuanian,Norwegian,Romanian,Slovenian,Spanish,Swedish lifeCycleStep status N Y multiselect autocomplete=lifecycle,min=1 Life cycle step license status N Y textbox autocomplete=license,min=1 License datestart status N Y datebox width=350px Start of collection dateend status N Y datebox width=350px End of collection dmpid status N Y textbox DMP identifier contacts management Y Y multiselect autocomplete=people,min=1 Contacts authors management Y Y multiselect autocomplete=people,min=1 Authors collectors management N Y multiselect autocomplete=people,min=1 Data collectors curators management N Y multiselect autocomplete=people,min=1 Data curators members management N Y multiselect autocomplete=people,min=1 Project members leader management N Y multiselect autocomplete=people,min=1 Project leader wpleader management N Y 
multiselect autocomplete=people,min=1 WP leader depositor management N Y textbox Depositor producer management N Y multiselect autocomplete=producer,min=1 Producer grantNumbers management N Y multiselect autocomplete=grant,min=1 Grant Information kindOfData descriptors Y Y checkbox open=0 Kind of Data Audiovisual,Collection,Dataset,Event,Image,Interactive Resource,Model,Physical Object,Service,Software,Sound,Text,Workflow,Other keywords descriptors N Y multiselect autocomplete=bioportal,onto=EFO:JERM:EDAM:MS:NMR:NCIT:OBI:PO:PTO:AGRO:ECOCORE:IOBC:NCBITAXON Keywords topics descriptors N Y multiselect autocomplete=VOvocab Topic Classification dataOrigin descriptors N Y checkbox open=0 Data origin observational data,experimental data,survey data,analysis data,text corpus,simulation data,aggregate data,audiovisual corpus,computer code,Other experimentfactor descriptors N Y multiselect autocomplete=vocabulary,min=1 Experimental Factor measurement descriptors N Y multiselect autocomplete=vocabulary,min=1 Measurement type technology descriptors N Y multiselect autocomplete=vocabulary,min=1 Technology type publication_citation descriptors N Y areabox rows=5,cols=30 Publication - Citation publication_idtype descriptors N Y dropbox width=200px Publication - ID Type -,ark,arXiv,bibcode,doi,ean13,eissn,handle,isbn,issn,istc,lissn,lsid,pmid,purl,upc,url,urn publication_idnumber descriptors N Y textbox width=400px Publication - ID Number publication_url descriptors N Y textbox Publication - URL comment other N Y areabox rows=15, cols=30 Additional information"},{"location":"definitions/dataverse/","title":"Dataverse Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/dataverse/#dataverse-definition-file_1","title":"Dataverse definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Dataverse. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Dataverse, this metadata crosswalk having been made possible by choosing the right metadata schema upstream.

Since the structure of the Dataverse JSON output file is known internally, only a minimum of information is necessary to carry out the correspondence.
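At its core, such a crosswalk renames the fields of one record according to a correspondence table. The sketch below shows only that principle, under stated assumptions: real Dataverse JSON is nested, the actual correspondence is driven by the TSV definition file, and the two mapping entries shown are hypothetical field names:

```python
def crosswalk(record, mapping):
    """Rename the fields of a flat metadata record according to a
    source-to-target mapping: the core idea of a metadata crosswalk.
    Deliberately simplified; fields with no mapping are dropped."""
    return {target: value
            for field, value in record.items()
            if (target := mapping.get(field)) is not None}

# hypothetical correspondence between Maggot and Dataverse field names
mapping = {"fulltitle": "title", "description": "dsDescription"}
```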

The file must have 4 columns with headers defined as follows:

Below is an example of a Dataverse definition file (TSV)

Example of Dataverse JSON file generated based on the definition file itself given as an example above.

"},{"location":"definitions/json-ld/","title":"JSON-LD Definition File","text":""},{"location":"definitions/json-ld/#json-ld-definition-file_1","title":"JSON-LD definition File","text":"

This definition file will allow harvesters to collect structured metadata based on a semantic schema, i.e. the fields themselves, and not just their content, can be associated with a semantic definition (an ontology for example), which will then facilitate the link between the metadata and therefore the data (JSON-LD). The chosen semantic schema is based on several metadata schemas.

The full workflow to \"climb the Linked Open Data mountain\" is summarized in the figure below:

Metadata schemas used to build the model proposed by default:

Definition of the JSON-LD context using the metadata schemas proposed by default

Since the structure of the JSON-LD is not known internally, information on the structure is necessary to carry out the correspondence.

Example of JSON-LD definition file (partial) using the metadata schemas proposed by default (TSV)

Example of JSON-LD file generated based on the definition file itself given as an example above.

"},{"location":"definitions/mapping/","title":"Mapping Definition File","text":""},{"location":"definitions/mapping/#mapping-definition-file_1","title":"Mapping definition File","text":"

The mapping file is used as indicated by its name to match a term chosen by the user during entry with another term from an ontology or a thesaurus and therefore to obtain a URL which will be used for referencing. It can be used for each metadata crosswalk requiring such a mapping (e.g. to the Dataverse, Zenodo or JSON-LD format).

The role of this definition file is illustrated in the figure above.

The file must have 5 columns with headers defined as follows:

Below is an example of a Mapping definition file (TSV)

"},{"location":"definitions/oai-pmh/","title":"OAI-PMH Definition File","text":"

OAI-PMH is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives.

"},{"location":"definitions/oai-pmh/#oai-pmh-definition-file_1","title":"OAI-PMH definition File","text":"

This definition file will allow harvesters to collect metadata structured according to a standard schema (OAI-DC).

Since the structure of the OAI-PMH output file is known internally, only a minimum of information is necessary to carry out the correspondence.

Example of OAI-PMH definition file (TSV)

Another example of an OAI-PMH definition file (TSV) with identifiers & vocabulary mapping

"},{"location":"definitions/terminology/","title":"Terminology","text":""},{"location":"definitions/terminology/#definition-of-terminology","title":"Definition of terminology","text":"

There are two definition files to set up.

Each time there is a change in these two definition files, it is necessary to convert them so that they are taken into account by the application.

Terminology is the set of terms used to define the metadata of a dataset. A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces (e.g. screenshot) are completely generated from this definition file, thus defining i) each of the fields, their input type (checkbox, dropbox, textbox, ...) and ii) the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list based on a list of fixed terms).

The metadata schema proposed by default is mainly based on the DDI (Data Documentation Initiative) schema, which also corresponds to the one adopted by the Dataverse software.

Terminology is organised in several sections. By default 6 sections are proposed, but you can redefine them as you wish:

For each section, fields are then defined. These fields are defined according to the way they will be entered via the web interface. There are 6 different input types: check boxes (checkbox), drop-down lists (dropbox), single-line text boxes (textbox), single-line text boxes with an additional box for multiple selection from a catalog of terms (multiselect), date pickers (datebox) and multi-line text boxes (areabox).

For two types (checkbox and dropbox), it is possible to define the values to be selected (predefined terms).

"},{"location":"definitions/terminology/#structure-of-the-terminology-definition-file-tsv","title":"Structure of the Terminology definition file (TSV)","text":"

The file must have 9 columns with headers defined as follows:

Below is an example of a Terminology definition file (TSV)

Example of Maggot JSON file generated based on the same definition file

"},{"location":"definitions/terminology/#structure-of-the-terminology-documentation-file-tsv","title":"Structure of the Terminology documentation file (TSV)","text":"

The documentation definition file is used to provide online help for each field (a small icon placed next to each label on the form). It should therefore only be modified when a field is added, deleted, or moved to another section. This file is then used to generate the online metadata documentation according to the figure below (see Configuration to find out how to carry out this transformation).

The file must have 3 columns with headers defined as follows:

Below is an example of a Terminology documentation file (TSV)

Same example as above converted to HTML format using Markdown format

"},{"location":"definitions/vocabulary/","title":"Vocabulary","text":""},{"location":"definitions/vocabulary/#vocabulary_1","title":"Vocabulary","text":"

1 - Vocabulary based on a list of terms fixed in advance (checkbox with feature open=0)

2 - Vocabulary open for addition (checkbox with feature open=1)

3 - Vocabulary based on a web API in a text field (textbox)

4 - Vocabulary based on a dictionary with multiple selection (multiselect)

5 - Vocabulary based on a SKOSMOS Thesaurus with multiple selection (multiselect)

6 - Vocabulary based on an OntoPortal with multiple selection (multiselect)

"},{"location":"definitions/zenodo/","title":"Zenodo Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/zenodo/#zenodo-definition-file_1","title":"Zenodo definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Zenodo. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Zenodo.

Since the structure of the Zenodo JSON output file is not known internally, information on the structure is necessary to carry out the correspondence.

Below is an example of a Zenodo definition file (TSV)

Example of Zenodo JSON file generated based on the definition file itself given as an example above.

"},{"location":"publish/","title":"Publish Metadata","text":""},{"location":"publish/#publish-metadata_1","title":"Publish Metadata","text":"

"},{"location":"publish/#httpswwwgooglecomsearchqmetadatacrosswalkdefinitionoqmetadatacrosswalk","title":"https://www.google.com/search?q=metadata+crosswalk+definition&oq=metadata+crosswalk","text":""},{"location":"publish/dataverse/","title":"Publish into Dataverse","text":""},{"location":"publish/dataverse/#publish-into-dataverse_1","title":"Publish into Dataverse","text":"

1 - To submit metadata to a Dataverse repository, you must first select a dataset, either from the drop-down list corresponding to the datasets listed on the data storage space, or as a metadata file from your local disk.

2 - You then need to connect to the repository in order to retrieve the key (the API token) authorizing you to submit the dataset. This obviously assumes that you have the privileges (creation/modification rights) to do so.

3 - After choosing the repository URL, you must also specify in which dataverse collection you want to deposit the datasets. As previously, you must have write rights to this dataverse collection.

"},{"location":"publish/dataverse/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"publish/zenodo/","title":"Publish into Zenodo","text":""},{"location":"publish/zenodo/#publish-into-zenodo_1","title":"Publish into Zenodo","text":"

1 - To submit metadata to a Zenodo repository, you must first select a dataset, either from the drop-down list corresponding to the datasets listed on the data storage space, or as a metadata file from your local disk.

2 - Unless you have previously saved your API token, you must create a new one and copy-paste it. Before validating it, you must check the deposit:access and deposit:write boxes in order to obtain creation and modification rights with this token.

3 - After choosing the repository URL, you can optionally choose a community to which the dataset will be linked. By default, you can leave this field empty.

"},{"location":"publish/zenodo/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"tutorial/","title":"Quick tutorial","text":""},{"location":"tutorial/#quick-tutorial_1","title":"Quick tutorial","text":"

This is a quick tutorial of how to use the Maggot tool in practice and therefore preferably targeting the end user.

See a short Presentation and Poster if you want to have a more general overview of the tool.

"},{"location":"tutorial/#overview","title":"Overview","text":"

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below:

Configuration

This module mainly concerns the data manager and makes it possible to construct all the terminology definition files, i.e. the metadata and sources of associated vocabularies. See Definition files then Configuration.

Private Access

This module allows a data producer to temporarily protect access to metadata for as long as necessary before sharing it within their collective. See Private access key management.

Dictionaries

This module allows data producers to view the content of all dictionaries. It also allows the data steward to edit their content. See Dictionaries for technical details.

Metadata Entry

This is the main module allowing the data producer to enter their metadata relating to a dataset. See the corresponding tutorial for Metadata Entry.

Search datasets

This module allows users to search for datasets based on the associated metadata, to view all the metadata and possibly to access the data itself. This obviously assumes that the metadata files have been deposited in the correct directory of the storage space dedicated to data management within your collective. See Infrastructure.

File Browser

This module gives users access to a file browser, provided that the data manager has installed it. See File Browser.

Publication

This module allows either the data producer or the data steward to publish the metadata, possibly with the corresponding data, into a suitable data repository. See Publication.

"},{"location":"tutorial/describe/","title":"Quick tutorial","text":""},{"location":"tutorial/describe/#metadata-entry","title":"Metadata Entry","text":"

The figures are given here for illustration purposes, but certain elements may differ for you, since they depend on the configuration of your instance, in particular the choice of metadata and the associated vocabulary sources.

Indeed, the choice of vocabulary sources (ontologies, thesauri, dictionaries) as well as the choice of metadata fields to enter should in principle have been discussed between the data producers and the data manager during the implementation of the Maggot tool, in order to find the best compromise between the choice of sources and all the scientific fields targeted (see Definition files). However, a later addition is always possible.

"},{"location":"tutorial/describe/#overview","title":"Overview","text":"

When you enter the metadata entry module you should see a page that looks like the figure below:

"},{"location":"tutorial/describe/#dictionaries","title":"Dictionaries","text":"

Dictionary-based metadata (e.g. people's names) can easily be entered by autocomplete in the 'Search value' box provided the name appears in the corresponding dictionary.

However, if the name does not yet appear in the dictionary, simply enter the full name (first name & last name) in the main box, making sure to separate each name with a comma and then a space as shown in the figure below.

Then you can request to add the additional person name(s) to the dictionary later as described below:

Please proceed in the same way for all dictionaries (people, funders, producer, vocabulary).
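The "comma then a space" convention described above makes a multi-value entry unambiguous to split, since a full name itself contains no such separator. A small sketch (the helper name and the example names are invented for illustration):

```python
def split_multivalue(entry):
    """Split a multi-value field on the 'comma then a space' separator
    used in Maggot entry boxes, keeping each full name whole."""
    return [value.strip() for value in entry.split(", ") if value.strip()]
```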

"},{"location":"tutorial/describe/#controlled-vocabulary","title":"Controlled Vocabulary","text":"

Depending on the configuration of your instance, it is very likely that certain fields (e.g. keywords) are connected to a controlled vocabulary source (e.g. ontology, thesaurus). Vocabulary based on ontologies, thesauri or even dictionaries can easily be entered by autocompletion in the \"search for a value\" box, provided that the term exists in the corresponding vocabulary source.

If a term cannot be found by autocomplete, you can enter the term directly in the main box, making sure to separate each term with a comma and a space as shown in the figure below.

The data steward will later try to link it to a vocabulary source suitable for the domain in question. Furthermore, even though the choice of vocabulary sources was made before the tool was put into service, a later addition is always possible; you should make the request to your data manager.

"},{"location":"tutorial/describe/#resources","title":"Resources","text":"

Data is often scattered across various platforms, databases, and file formats, making it challenging to locate and access; this is called data fragmentation. The Maggot tool therefore allows you to specify resources, i.e. data in the broad sense, whether external or internal, making it possible to centralize all links to the data.

Four fields must be filled in:

"},{"location":"tutorial/metadata/","title":"Quick tutorial","text":""},{"location":"tutorial/metadata/#metadata-file","title":"Metadata File","text":"

Once the form has been completed, even partially (at least the fields which are mandatory, marked with a red star), you can export your metadata as a file. The file is in JSON format and must have the prefix 'META_'.

By clicking on the \"Generate the metadata file\" button, you can save it on your disk space.

Furthermore, if email sending has been configured (see settings), you have the possibility of sending the metadata file to the data managers for safekeeping, and possibly also to support its storage on the data disk space if specific rights are required.

In the (most common) case where you want to save the metadata file to your disk space, two uses of this file are possible:

1. The first use is the recommended one because it allows metadata management within your collective.

You drop the metadata file directly into the data directory to which the metadata corresponds. Indeed, when installing the tool, a storage space dedicated to the tool had to be provided for this purpose (see Infrastructure). Once the file is deposited, you just have to wait up to about 30 minutes, so that the tool has had time to scan the root of the data directories for new files and update the database. After this period, the description of your dataset will be visible from the interface, and search criteria can be selected to narrow the search.
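The periodic scan described above essentially looks for metadata files (JSON, prefixed with 'META_') in the dataset directories under the data root. A minimal sketch of that lookup, assuming datasets sit directly under the root; the real scanner also updates the MongoDB database:

```python
from pathlib import Path

def find_metadata_files(data_root):
    """List metadata files in the dataset directories under data_root;
    per this tutorial, metadata files are JSON files prefixed 'META_'.
    Sketch only: Maggot's actual scanner also tracks the database state."""
    return sorted(Path(data_root).glob("*/META_*.json"))
```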

You will then be able to publish the metadata later, possibly with the corresponding data, in a data repository such as Dataverse or Zenodo.

2. The second use is only to deposit the metadata into a data repository

Whether with Dataverse or Zenodo, you can publish metadata directly in one or the other of these repositories without using the storage space.

Please note that the data files themselves cannot be deposited in this way: you will have to upload each of them manually, directly online in the repository.

"}]} \ No newline at end of file diff --git a/sitemap.xml.gz b/sitemap.xml.gz index 63dc2ef..caf5dbd 100755 Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ diff --git a/tutorial/index.html b/tutorial/index.html index c4594fa..2a07a08 100755 --- a/tutorial/index.html +++ b/tutorial/index.html @@ -763,7 +763,7 @@

Quick tutorialQuick tutorial

This is a quick tutorial of how to use the Maggot tool in practice and therefore preferably targeting the end user.

-

See a short Presentation and Poster if you want to have a more general overview of the tool..

+

See a short Presentation and Poster if you want to have a more general overview of the tool.


Overview

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below: