diff --git a/print_page/index.html b/print_page/index.html index 4be4014..c4d24fb 100755 --- a/print_page/index.html +++ b/print_page/index.html @@ -858,7 +858,7 @@

Additional key points

Quick tutorial

This is a quick tutorial on how to use the Maggot tool in practice, primarily aimed at end users.

-

See a short Presentation and Poster if you want to have a more general overview of the tool..

+

See a short Presentation and Poster if you want to have a more general overview of the tool.


Overview

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below:

diff --git a/search/search_index.json b/search/search_index.json index ce77157..825b180 100755 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\\\s\\\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"

An ecosystem for sharing metadata

"},{"location":"#foster-good-data-management-with-data-sharing-in-mind","title":"Foster good data management, with data sharing in mind","text":"

Sharing descriptive Metadata is the first essential step towards Open Scientific Data. With this in mind, Maggot was specifically designed to annotate datasets by creating a metadata file to attach to the storage space. Indeed, it allows users to easily add descriptive metadata to datasets produced within a collective of people (research unit, platform, multi-partner project, etc.). This approach fits perfectly into a data management plan as it addresses the issues of data organization and documentation, data storage and frictionless metadata sharing within this same collective and beyond.

"},{"location":"#main-features-of-maggot","title":"Main features of Maggot","text":"

The main functionalities of Maggot were established according to a well-defined need (See Background).

  1. Document the datasets produced within a collective of people with metadata
  2. Search datasets by their metadata
  3. Publish the metadata of datasets along with their data files into a Europe-approved repository

See a short Presentation and Poster for a quick overview.

"},{"location":"#overview-of-the-different-stages-of-metadata-management","title":"Overview of the different stages of metadata management","text":"

Note: The step numbers indicated in the figure correspond to the different points developed below.

1 - First you must define all the metadata that will be used to describe your datasets. All metadata can be defined in a single file (in TSV format, i.e. editable with a spreadsheet). This step is unavoidable because both the input and search interfaces are completely generated from these definition files, which specify each field along with its input type and the associated controlled vocabulary (ontology, thesaurus, dictionary, list of fixed terms). The metadata proposed by default was mainly established according to the DDI (Data Documentation Initiative) metadata schema, which also largely corresponds to the schema adopted by the Dataverse software. See the Terminology Definition section.
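For illustration, the first rows of such a TSV definition file could look like this (columns and rows taken from the Terminology Definition example later in this documentation; tab-separated):

```tsv
Field	Section	Required	Search	ShortView	Type	features	Label	Predefined terms
title	definition	Y	N	1	textbox	width=350px	Short name	
status	status	N	Y	3	dropbox	width=350px	Status of the dataset	Processed,In progress,Unprocessed
```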

2 - Entering metadata is greatly facilitated by the use of dictionaries. The dictionaries offered by default are: people, funders and data producers, plus a vocabulary dictionary that lets you mix ontologies and thesauri from several sources. By picking a name through autocompletion, users attach information that will automatically be included when the metadata are exported to a remote repository or harvested. Thus, once entered into a dictionary, this information never needs to be re-entered.

3 - The web interface for entering metadata is entirely built from the definition files. The metadata are distributed across the chosen sections, each constituting a tab (see screenshot). Mandatory fields are marked with a red star and must be filled in before the metadata file can be generated. Metadata governed by a controlled vocabulary are entered by autocompletion from term lists (dictionary, thesaurus or ontology). You can also declare external resources (URL links) pointing to documents, publications or other related data; Maggot thus becomes a hub connecting your datasets to different resources, local and external. Once the mandatory fields (at least) and the other recommended fields (at best) have been entered, the metadata file can be generated in JSON format.

4 - The generated JSON file must be placed in the storage space reserved for this purpose. This metadata file plays the role of a README adapted for machines, yet still readable by humans. Thanks to its internal structure, it offers a coherence and consistency of information that a plain README with completely free, unstructured text cannot provide. Furthermore, the central idea is to use the storage space as a local data repository: the metadata should go to the data, not the other way around.
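As an illustration, the sketch below writes such a machine-readable "README" next to the data. The field names and the file name `META_datainfos.json` are assumptions for the example; the real file is generated by the web interface from the terminology definition.

```python
import json
from pathlib import Path

# Hypothetical field names for illustration; a real Maggot metadata
# file is produced by the entry form, not written by hand.
metadata = {
    "title": "untwist1",
    "fulltitle": "Example dataset described with Maggot",
    "status": "Processed",
    "contacts": ["Jane Doe"],
}

def write_metadata(dataset_dir: Path, meta: dict, name: str = "META_datainfos.json") -> Path:
    """Drop the metadata file into the dataset's storage directory,
    so that the metadata travels with the data."""
    path = dataset_dir / name
    path.write_text(json.dumps(meta, indent=2, ensure_ascii=False))
    return path
```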

5 - Datasets can then be searched on the basis of their metadata. All the JSON metadata files are scanned and parsed at a fixed time interval (30 min), then loaded into a database. This allows you to perform searches based on the predefined metadata. The search form, in a compact layout, is almost the same as the entry form (see a screenshot). Depending on the search criteria, a list of datasets is returned, each with a link pointing to its detailed sheet.
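The scan step can be sketched as follows. The metadata file name and the MongoDB loading shown in the comment are illustrative assumptions, not the actual scan container code.

```python
import json
from pathlib import Path

def scan_metadata(storage_root: str, name: str = "META_datainfos.json") -> list[dict]:
    """Collect every metadata file found under the storage root,
    tagging each document with the directory it came from."""
    docs = []
    for path in Path(storage_root).rglob(name):
        doc = json.loads(path.read_text())
        doc["_dataset_dir"] = str(path.parent)
        docs.append(doc)
    return docs

# In the real scan container this kind of crawl runs under cron every
# 30 min and the documents are (re)loaded into MongoDB, e.g. with pymongo:
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://mmdt-db:27017")["pgd-db"]["metadata"]
#   coll.delete_many({}); coll.insert_many(docs)
```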

6 - The detailed metadata sheet presents all the metadata, divided into sections. Unfilled metadata does not appear by default. When a URL can be associated with a piece of information (ORCID, ontology, web site, etc.), you can click on it to follow the corresponding link. Likewise, the link associated with each resource can be followed. From this sheet, you can also export the metadata according to different schemata (Dataverse, Zenodo, JSON-LD). See screenshot 1 & screenshot 2.

7 - Finally, once you have decided to publish your metadata with your data, you can choose the repository that suits you (currently repositories based on Dataverse and Zenodo are supported).

"},{"location":"#additional-key-points","title":"Additional key points","text":"

"},{"location":"about/","title":"About","text":""},{"location":"about/#background","title":"Background","text":""},{"location":"about/#motives","title":"Motives","text":""},{"location":"about/#state-of-need","title":"State of need","text":""},{"location":"about/#proposed-approach","title":"Proposed approach","text":""},{"location":"about/#links","title":"Links","text":""},{"location":"about/#contacts","title":"Contacts","text":""},{"location":"about/#designers-developers","title":"Designers / Developers","text":""},{"location":"about/#contributors","title":"Contributors","text":"

"},{"location":"bloxberg/","title":"Bloxberg Blockchain","text":""},{"location":"bloxberg/#experimental-certification-of-metadata-file-on-the-bloxberg-blockchain","title":"EXPERIMENTAL - Certification of metadata file on the bloxberg blockchain","text":""},{"location":"bloxberg/#motivation","title":"Motivation","text":"

To guarantee the authenticity and integrity of a metadata file by recording it permanently and immutably on the bloxberg blockchain.

Indeed, a blockchain is a technology that keeps track of a set of transactions (writes to the chain) in a decentralized, secure and transparent manner, in the form of a chain of blocks. A blockchain can therefore be compared to a large, tamper-proof (public or private) register. It is used today in many fields because it provides solutions to many problems. In Higher Education and Research, for example, registering dataset metadata in a blockchain makes it possible to certify, in an inalienable, irrefutable and completely transparent manner, the ownership and authenticity of the data, as well as, for example, the license of use and the date of production of the data. Research stakeholders are then more open to the dissemination of their data (files, results, protocols, publications, etc.) since they know that, in particular, the ownership, content and conditions of use of the data cannot be altered.

The Maggot tool could thus serve as a gateway to certify its data with the associated metadata. The complete process is schematized by the following figure:
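Concretely, what gets anchored on a blockchain is not the metadata file itself but a cryptographic fingerprint of it. A minimal sketch of such a fingerprint is shown below; this is only an illustration of the principle, not the bloxberg API.

```python
import hashlib
import json

def metadata_fingerprint(meta: dict) -> str:
    """Canonical SHA-256 fingerprint of a metadata document. It is this
    kind of hash, not the file itself, that a blockchain records: any
    later change to the metadata produces a different fingerprint."""
    canonical = json.dumps(meta, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```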

"},{"location":"bloxberg/#about-bloxberg","title":"About bloxberg","text":"

bloxberg is one of the most important blockchain projects in science. It was founded in 2019 by MPDL, looking for a way to store research results and make them available to other researchers. In this sense, bloxberg is a decentralized register in which results can be stored in a tamper-proof way, with a time stamp and an identifier.

bloxberg is based on the Ethereum Blockchain. However, it makes use of a different consensus mechanism: instead of \u201cProof of Stake\u201d used by Ethereum since 2022, bloxberg validates blocks through \u201cProof of Authority\u201d. Each node is operated by one member. All members of the association are research institutions and are known in the network. Currently, bloxberg has 49 nodes. It is an international project with participating institutions from all over the world.

"},{"location":"bloxberg/#how-to-process","title":"How to process ?","text":"

You will need an Ethereum address and an API key (to be requested via bloxberg-services (at) mpdl.mpg.de). See an example of pushing a metadata file to the bloxberg blockchain using Maggot.

"},{"location":"bloxberg/#useful-links","title":"Useful links","text":""},{"location":"configuration/","title":"Configuration","text":""},{"location":"configuration/#terminology-configuration","title":"Terminology configuration","text":"

A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces are completely generated from this definition file, which defines each field, its input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, or a drop-down list of fixed terms). This is why a configuration and conversion step into JSON format is essential to configure all the other modules (for example, creating the MongoDB database schema when the application starts, before filling it).
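A minimal sketch of the TSV-to-JSON conversion step (simplified; the real converter handles the full set of columns and features):

```python
import csv
import io
import json

def tsv_to_json(tsv_text: str) -> list[dict]:
    """Parse a config_terms.txt-style TSV into a list of field
    definitions, dropping empty cells (simplified sketch)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [{k: v for k, v in row.items() if v} for row in reader]

# Two illustrative columns; the real file has more (Search, features, ...).
sample = "Field\tSection\tRequired\tType\ntitle\tdefinition\tY\ttextbox\n"
fields = tsv_to_json(sample)
print(json.dumps(fields, indent=2))
```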

"},{"location":"configuration/#tsv-to-json","title":"TSV to JSON","text":""},{"location":"configuration/#tsv-to-doc","title":"TSV to DOC","text":""},{"location":"configuration/#json-to-tsv","title":"JSON to TSV","text":""},{"location":"dictionaries/","title":"Dictionaries","text":""},{"location":"dictionaries/#presentation","title":"Presentation","text":""},{"location":"dictionaries/#the-people-dictionary","title":"The people dictionary","text":"

"},{"location":"dictionaries/#other-dictionaries","title":"Other dictionaries","text":"

"},{"location":"gant/","title":"Gant","text":""},{"location":"gant/#gantt-diagrams-of-the-developments","title":"Gantt diagrams of the developments","text":"gantt dateFormat YYYY-MM-DD axisFormat %Y-%m title Diagrammes de Gantt pr\u00e9visionnel des d\u00e9veloppements section MongoDB 1: des1, 2023-11-01,60d 2: des2, 2023-12-01,90d 3: des3, 2023-12-01,90d section Couche API 4: des4, 2024-01-01,120d 5: des5, 2024-04-01,60d section Interface Web 6a: des6, 2024-06-01,60d 6b: des7, 2024-07-01,60d 6c: des8, 2024-09-01,60d"},{"location":"infrastructure/","title":"Infrastructure","text":""},{"location":"infrastructure/#infrastructure-local-remote-or-mixed","title":"Infrastructure : Local, Remote or Mixed","text":"

The necessary infrastructure involves 1) a machine running a Linux OS and 2) a dedicated storage space.

1 - The machine will most often be a virtual machine, because it is simpler to deploy, either locally (with VM providers such as VirtualBox, VMware Workstation or MS Hyper-V) or remotely (e.g. VMware ESXi, OpenStack: example of deployment). Moreover, the OS of your machine must support the deployment of Docker containers; for more details, see “What is Docker”. The minimum characteristics of the VM are: 2 CPUs, 2 GB RAM, 8 GB disk.

2 - The dedicated storage space can be either in the local space of the VM or in a remote place on the network.

"},{"location":"installation/","title":"Installation","text":""},{"location":"installation/#install-on-your-linux-computer-or-linux-unix-server","title":"Install on your linux computer or linux / unix server","text":"

Requirements: the installation must be carried out on a (virtual) machine with a recent Linux OS that supports Docker (see Infrastructure).

"},{"location":"installation/#retrieving-the-code","title":"Retrieving the code","text":"

Go to the destination directory of your choice, then clone the repository and cd into it:

```bash
git clone https://github.com/inrae/pgd-mmdt.git pgd-mmdt
cd pgd-mmdt
```

"},{"location":"installation/#installation-of-docker-containers","title":"Installation of Docker containers","text":"

Maggot uses 3 Docker images for 3 distinct services:

"},{"location":"installation/#configuration","title":"Configuration","text":"

See Configuration settings

Warning: be sure to use the same MongoDB settings in all of the configuration files above. It is best not to change anything. A single configuration file would have been preferable, but this has not yet been done given the different languages involved (bash, javascript, python, PHP). To be done!

Note: if you want to run multiple instances, you will need to change, in the run file, i) the container names, ii) the data path, and iii) the MongoDB volume name.

The following two JSON files are defined by default but can be easily configured from the web interface. See the Terminology Definition section.

"},{"location":"installation/#commands","title":"Commands","text":"

The run shell script allows you to perform multiple actions by specifying an option:

```bash
cd pgd-mmdt
sh ./run <option>
```

Options:

"},{"location":"installation/#starting-the-application","title":"Starting the application","text":"

"},{"location":"installation/#launching-the-web-application-in-the-web-browser","title":"Launching the web application in the web browser","text":"
```
CONTAINER ID  IMAGE          COMMAND                 CREATED          STATUS         PORTS                                  NAMES
5914504f456d  pgd-mmdt-web   "docker-php-entrypoi."  12 seconds ago   Up 10 seconds  0.0.0.0:8087->80/tcp, :::8087->80/tcp  mmdt-web
226b13ed9467  pgd-mmdt-scan  "cron -f"               12 seconds ago   Up 11 seconds                                         mmdt-scan
81fecbb56d23  pgd-mmdt-db    "docker-entrypoint.s."  13 seconds ago   Up 12 seconds  27017/tcp                              mmdt-db
```

"},{"location":"installation/#stoping-the-application","title":"Stoping the application","text":""},{"location":"installation/#updating-the-application","title":"Updating the application","text":"

When updating the application, it is imperative to preserve a whole set of configuration files as well as the content of certain directories (dictionaries, javascripts dedicated to vocabularies, etc.). An update script is available (./etc/update-maggot.sh), preferably placed under /usr/local/bin. To preserve your configuration, it is recommended to create local configuration files.

"},{"location":"installation/#architecture-diagram","title":"Architecture diagram","text":"

Note: See how to proceed with the configuration steps.

"},{"location":"installation/#file-browser","title":"File Browser","text":"

You can provide access to your data via a file browser. This application must be installed separately but can be connected to Maggot by specifying the corresponding URL in the configuration file. Users and their rights are managed in the filebrowser application. You can also create password-free links to the data; these links can usefully be declared as external resources in the metadata managed by Maggot.

See how to install it on GitHub.

"},{"location":"private-access/","title":"Private access","text":""},{"location":"private-access/#private-access-key-management","title":"Private access key management","text":""},{"location":"private-access/#motivation","title":"Motivation","text":"

Although the Maggot tool is designed to foster the sharing of metadata within a collective, it may be necessary to temporarily privatize access to the metadata of an ongoing project with confidentiality constraints. In that case, even within the collective, access to the metadata must be restricted to authorized users only.

"},{"location":"private-access/#implementation","title":"Implementation","text":"

The choice not to manage users within the Maggot tool was made so that metadata are completely open by default within a collective. Furthermore, access rights to the storage space are managed independently of Maggot by the administrator of that space. It is therefore through the storage space that access to the metadata via the web interface is granted or denied.

The chosen mechanism for privatizing access is described below. It has the dual advantage of being simple to implement and simple to use.

  1. First, generate a file containing the encrypted key for private access. This file must be generated from the web interface, then downloaded as shown in the figure below, and finally deposited manually in the data directory of the dataset whose access you wish to privatize. The presence of this file within a directory is enough to block access to the metadata and data by default. Note that the same file containing the encrypted private key can be placed in several data directories (e.g. within the same project). The deposit must be done by hand because the Maggot tool only has read access to the storage space; this also guarantees that the user has write access to this space, without having to manage user accounts on the Maggot side.

    By default, the 'untwist1' metadata are not accessible to anyone

  2. When you want to access the metadata of this dataset, simply enter the private key in the current session. This unlocks access to the metadata via the web interface only for the current session of your web browser, which means the private key must be re-entered for each session (by default, a session lasts a maximum of 1 hour).

    Now the 'untwist1' metadata are accessible only to us

  3. When you want to give the entire collective access to the metadata, simply delete the private access file (named META_auth.txt by default) from the data directory concerned.
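The mechanism above can be sketched in a few lines. The actual encryption scheme used by Maggot is not documented here, so the SHA-256 hashing below is an assumption for illustration only; the file name META_auth.txt is the documented default.

```python
import hashlib
from pathlib import Path

AUTH_FILE = "META_auth.txt"  # default name of the private access file

def make_auth_file(dataset_dir: Path, private_key: str) -> None:
    """Write the hashed key; the file's mere presence privatizes the
    dataset. (SHA-256 is an assumption, not Maggot's actual scheme.)"""
    digest = hashlib.sha256(private_key.encode()).hexdigest()
    (dataset_dir / AUTH_FILE).write_text(digest)

def is_private(dataset_dir: Path) -> bool:
    """A dataset is private as soon as the auth file exists."""
    return (dataset_dir / AUTH_FILE).exists()

def unlock(dataset_dir: Path, entered_key: str) -> bool:
    """Compare the key entered in the session with the stored hash."""
    stored = (dataset_dir / AUTH_FILE).read_text().strip()
    return hashlib.sha256(entered_key.encode()).hexdigest() == stored
```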

"},{"location":"settings/","title":"Configuration settings","text":""},{"location":"settings/#configuration-settings_1","title":"Configuration settings","text":"

Here is the list of all the files whose parameters may need to be adjusted for the needs of an instance.

"},{"location":"settings/#dockerscanpartscriptsconfigpy","title":"dockerscanpart/scripts/config.py","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters normally do not need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

| Parameter | Description | Default value |
|---|---|---|
| dbserver | Name of the MongoDB server | mmdt-db |
| database | Name of the MongoDB database | pgd-db |
| dbport | Port of the MongoDB server | 27017 |
| username | Username of the Mongo database pgd-db with Read/Write access | userw-pgd |
| password | Password corresponding to the username of the Mongo DB pgd-db | wwwww |

"},{"location":"settings/#incconfigmongodbinc","title":"inc/config/mongodb.inc","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters normally do not need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

| Parameter | Description | Default value |
|---|---|---|
| docker_mode | Indicates whether the installation uses docker containers; in that case, the Mongo DB IP address differs from 127.0.0.1 | 1 |
| uritarget | The Mongo DB IP address | mmdt-db (docker_mode=1) or 127.0.0.1 (docker_mode=0) |
| database | Name of the MongoDB database | pgd-db |
| collection | Name of the MongoDB collection | metadata |
| port | Port of the MongoDB server | 27017 |
| username | Username of the Mongo database pgd-db with Read access only | userr-pgd |
| password | Password corresponding to the username of the Mongo DB pgd-db | rrrrr |
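For illustration, these settings assemble into a standard MongoDB connection string. The PHP code builds its own connection from the parameters, so the sketch below is only illustrative of how they fit together (read-only defaults in docker mode):

```python
def mongo_uri(username: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a standard MongoDB connection string from the
    parameters listed in the table above."""
    return f"mongodb://{username}:{password}@{host}:{port}/{database}"

# Defaults from inc/config/mongodb.inc in docker mode (read-only account):
uri = mongo_uri("userr-pgd", "rrrrr", "mmdt-db", 27017, "pgd-db")
```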

"},{"location":"settings/#incconfigconfiginc","title":"inc/config/config.inc","text":"

This file defines parameters related to i) the web interface and ii) the functionalities allowed for users. Only the parameters that may usefully be changed for the needs of an instance are described here.

| Parameter | Description | Default value |
|---|---|---|
| EXTERN | Indicates if the use of the tool is only for external use, i.e. without using a storage space. | 0 |
| PRIVATE_ACCESS | Gives the possibility of managing private access to metadata | 0 |
| ZOOMWP | Zoom level of the web interface. By reducing the size slightly, you get a better layout. | 90% |
| RESMEDIA | Gives the possibility of putting a MIME type on each resource in the metadata. | 1 |
| TITLE | Title to display in the main banner | Metadata management |
| FILEBROWSER | Indicates whether the file browser is used. This assumes it is installed. | 0 |
| URL_FILEBROWSER | File browser URL. It can be absolute or relative. | /fb/ |
| APPNAME | Name given in the URL to access the web interface. | maggot |
| dataverse_urls | Array of Dataverse repository URLs where you can upload metadata and data | - |
| zenodo_urls | Array of Zenodo repository URLs where you can upload metadata and data | - |
| SERVER_URL | Default Dataverse repository URL | https://entrepot.recherche.data.gouv.fr |
| ZENODO_SERVER_URL | Default Zenodo repository URL | https://zenodo.org |
| export_dataverse | Indicates whether the Dataverse feature is enabled | 1 |
| export_zenodo | Indicates whether the Zenodo feature is enabled | 1 |
| export_jsonld | Indicates whether the JSON-LD feature is enabled | 1 |
| export_oai | Indicates whether the OAI-PMH feature is enabled | 0 |
| export_bloxberg | Indicates whether the Bloxberg Blockchain feature is enabled (experimental) | 0 |
| cvdir | Relative path of the controlled vocabulary lists (cvlist) | cvlist/ |
| maggot_fulltitle | Maggot name of the field corresponding to the title in Dataverse/Zenodo | fulltitle |
| auth_senddata_file | Name of the file that must be present in the data directory to authorize the transfer of the data file | META_datafile_ok.txt |
| private_auth_file | Name of the private access file | META_auth.txt |
| sendMail | Configuration of the messaging for sending metadata to data managers (see below) | NULL |

The messaging configuration is done with the following array in the inc/config/config.inc file (or, more judiciously, in inc/config/local.inc so that it is preserved during an update). To understand how it works, see Send Emails using PHPmailer.

```php
$sendMail['smtpHost'] = 'smtp.example.org';        //  Set the SMTP server to send through
$sendMail['smtpSecure'] = 'tls';                   //  Enable TLS encryption
$sendMail['smtpPort'] = 587;                       //  Set the TCP port to connect to
$sendMail['CheckEmail'] = 'maggot@exemple.org';    //  Email address authorized to send emails
$sendMail['CheckPass'] = 'password';               //  The corresponding password
$sendMail['CheckName'] = 'Maggot';                 //  Alias name
$sendMail['UserEmail'] = 'admin@exemple.org';      //  Email of data managers, separated by a comma
```

"},{"location":"settings/#run","title":"run","text":"

This file contains the essential parameters to be set before any use.

| Parameter | Description | Default value |
|---|---|---|
| WEB_PORT | Local HTTP port for the web application | 8087 |
| DATADIR | Path to the data | /opt/data/ |
| DB_IMAGE | Docker image name of the MongoDB | pgd-mmdt-db |
| SCAN_IMAGE | Docker image name of the scan process | pgd-mmdt-scan |
| WEB_IMAGE | Docker image name of the web interface | pgd-mmdt-web |
| DB_CONTAINER | Docker container name of the MongoDB | mmdt-db |
| SCAN_CONTAINER | Docker container name of the scan process | mmdt-scan |
| WEB_CONTAINER | Docker container name of the web interface | mmdt-web |
| MONGO_VOL | Volume name for MongoDB | mmdt-mongodb |
| USER | Admin user in the htpasswd file | admin |

"},{"location":"definitions/","title":"Definition Files","text":""},{"location":"definitions/#metadata-definition-files","title":"Metadata definition files","text":"

The Maggot tool offers great flexibility in configuration. It allows you to choose all the metadata with which you want to describe your data. You can start from an existing metadata schema, invent your own, or, more pragmatically, mix one or more schemas while introducing some metadata specific to your field of application. However, keep in mind that if you want to add descriptive metadata to your data, a certain amount of information is expected. A completely different use of the tool is nevertheless possible; it's up to you.

There are two levels of definition files, as shown in the figure below:

1 - The first level concerns the definition of the terminology (metadata), similar to a descriptive metadata plan. This category is closer to configuration files: they represent the heart of the application, around which everything else is built. The input and search interfaces are completely generated from these definition files (especially the web/conf/config_terms.txt file), which define each field, its input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, or a drop-down list of fixed terms). This is why a configuration step is essential to configure all the other modules.

2 - The second level concerns the definitions of the mapping to a differently structured metadata schema (a metadata crosswalk, i.e. a specification for mapping one metadata standard to another), used either i) for metadata export to a remote repository (e.g. Dataverse, Zenodo), or ii) for metadata harvesting (e.g. JSON-LD, OAI-PMH). Simply place the definition files in the configuration directory (web/conf) for them to be taken into account, provided you have adjusted the configuration (web/inc/config/config.inc).

All definition files are made using a simple spreadsheet then exported in TSV format.

The list of definition files in Maggot is given below. All of them must be placed under the directory web/conf.

See an example online: https://pmb-bordeaux.fr/maggot/config/view and the corresponding form based on these definition files.

"},{"location":"definitions/config_terms/","title":"Terminlogy Definition","text":""},{"location":"definitions/config_terms/#example-of-a-terminlogy-definition-file","title":"Example of a Terminlogy Definition file","text":"Field Section Required Search ShortView Type features Label Predefined terms title definition Y N 1 textbox width=350px Short name fulltitle definition Y Y 2 textbox Full title subject definition Y Y checkbox open=0 Subject Agricultural Sciences,Arts and Humanities,Astronomy and Astrophysics,Business and Management,Chemistry,Computer and Information Science,Earth and Environmental Sciences,Engineering,Law,Mathematical Sciences,Medicine Health and Life Sciences,Physics,Social Sciences,Other description definition Y Y areabox rows=6,cols=30 Description of the dataset note definition N Y areabox rows=4,cols=30 Notes status status N Y 3 dropbox width=350px Status of the dataset Processed,In progress,Unprocessed access_rights status N Y 4 dropbox width=350px Access rights to data Public,Mixte,Private language status N Y checkbox open=0 Language Czech,Danish,Dutch,English,Finnish,French,German,Greek,Hungarian,Icelandic,Italian,Lithuanian,Norwegian,Romanian,Slovenian,Spanish,Swedish lifeCycleStep status N Y multiselect autocomplete=lifecycle,min=1 Life cycle step license status N Y textbox autocomplete=license,min=1 License datestart status N Y datebox width=350px Start of collection dateend status N Y datebox width=350px End of collection dmpid status N Y textbox DMP identifier contacts management Y Y multiselect autocomplete=people,min=1 Contacts authors management Y Y multiselect autocomplete=people,min=1 Authors collectors management N Y multiselect autocomplete=people,min=1 Data collectors curators management N Y multiselect autocomplete=people,min=1 Data curators members management N Y multiselect autocomplete=people,min=1 Project members leader management N Y multiselect autocomplete=people,min=1 Project leader wpleader management N Y 
multiselect autocomplete=people,min=1 WP leader depositor management N Y textbox Depositor producer management N Y multiselect autocomplete=producer,min=1 Producer grantNumbers management N Y multiselect autocomplete=grant,min=1 Grant Information kindOfData descriptors Y Y checkbox open=0 Kind of Data Audiovisual,Collection,Dataset,Event,Image,Interactive Resource,Model,Physical Object,Service,Software,Sound,Text,Workflow,Other keywords descriptors N Y multiselect autocomplete=bioportal,onto=EFO:JERM:EDAM:MS:NMR:NCIT:OBI:PO:PTO:AGRO:ECOCORE:IOBC:NCBITAXON Keywords topics descriptors N Y multiselect autocomplete=VOvocab Topic Classification dataOrigin descriptors N Y checkbox open=0 Data origin observational data,experimental data,survey data,analysis data,text corpus,simulation data,aggregate data,audiovisual corpus,computer code,Other experimentfactor descriptors N Y multiselect autocomplete=vocabulary,min=1 Experimental Factor measurement descriptors N Y multiselect autocomplete=vocabulary,min=1 Measurement type technology descriptors N Y multiselect autocomplete=vocabulary,min=1 Technology type publication_citation descriptors N Y areabox rows=5,cols=30 Publication - Citation publication_idtype descriptors N Y dropbox width=200px Publication - ID Type -,ark,arXiv,bibcode,doi,ean13,eissn,handle,isbn,issn,istc,lissn,lsid,pmid,purl,upc,url,urn publication_idnumber descriptors N Y textbox width=400px Publication - ID Number publication_url descriptors N Y textbox Publication - URL comment other N Y areabox rows=15, cols=30 Additional information"},{"location":"definitions/dataverse/","title":"Dataverse Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/dataverse/#dataverse-definition-file_1","title":"Dataverse definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Dataverse. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Dataverse, knowing that this metadata crosswalk was made possible by choosing the right metadata schema upstream.

Since the structure of the Dataverse JSON output file is known internally, only a minimal amount of information is needed to carry out the mapping.

The file must have 4 columns with headers defined as follows:

Below is an example of a Dataverse definition file (TSV).

Example of a Dataverse JSON file generated from the definition file given as an example above.
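In spirit, the crosswalk boils down to renaming and restructuring fields. A minimal sketch in Python follows; the field names and the flat output are illustrative only, since the real export produces Dataverse's nested blocks rather than a flat dictionary:

```python
def crosswalk(meta: dict, mapping: dict) -> dict:
    """Rename source (Maggot) fields to target-schema fields; fields
    without a mapping entry are simply dropped. The real Dataverse
    export additionally restructures values into nested blocks."""
    return {target: meta[src] for src, target in mapping.items() if src in meta}

# Illustrative field names on both sides of the mapping:
maggot_meta = {"fulltitle": "Example dataset", "description": "Demo"}
dataverse_like = crosswalk(maggot_meta, {"fulltitle": "title", "description": "dsDescription"})
```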

"},{"location":"definitions/json-ld/","title":"JSON-LD Definition File","text":""},{"location":"definitions/json-ld/#json-ld-definition-file_1","title":"JSON-LD definition File","text":"

This definition file will allow harvesters to collect structured metadata based on a semantic schema, i.e. the fields themselves, and not just their content, can be associated with a semantic definition (an ontology, for example), which then facilitates the link between the metadata and therefore the data (JSON-LD). The chosen semantic schema is based on several metadata schemas.

The full workflow to \"climb the Linked Open Data mountain\" is summarized in the figure below:

Metadata schemas used to build the model proposed by default:

Definition of the JSON-LD context using the metadata schemas proposed by default

Since the structure of the JSON-LD output is not known internally, information about its structure is needed to carry out the mapping.

Example of JSON-LD definition file (partial) using the metadata schemas proposed by default (TSV)

Example of a JSON-LD file generated from the definition file given above.
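The idea of a JSON-LD context can be sketched as follows (illustrative Python; the field-to-term mapping here is hypothetical, whereas Maggot builds it from the JSON-LD definition file):

```python
import json

# Illustrative context mapping local field names to schema.org terms;
# the real context comes from the JSON-LD definition file.
CONTEXT = {'@vocab': 'http://schema.org/', 'title': 'name', 'summary': 'description'}

def to_jsonld(meta):
    # Attach the semantic context and type to the plain metadata dict.
    doc = {'@context': CONTEXT, '@type': 'Dataset'}
    doc.update(meta)
    return doc

print(json.dumps(to_jsonld({'title': 'My dataset', 'summary': 'A test dataset'}), indent=2))
```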

"},{"location":"definitions/mapping/","title":"Mapping Definition File","text":""},{"location":"definitions/mapping/#mapping-definition-file_1","title":"Mapping definition File","text":"

As its name indicates, the mapping file is used to match a term chosen by the user during entry with another term from an ontology or a thesaurus, thereby providing a URL that will be used for referencing. It can be used for any metadata crosswalk requiring such a mapping (e.g. to the Dataverse, Zenodo or JSON-LD format).

The role of this definition file is illustrated in the figure above.

The file must have 5 columns with headers defined as follows:

Below is an example of a Mapping definition file (TSV)

"},{"location":"definitions/oai-pmh/","title":"OAI-PMH Definition File","text":"

OAI-PMH is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives.

"},{"location":"definitions/oai-pmh/#oai-pmh-definition-file_1","title":"OAI-PMH definition File","text":"

This definition file will allow harvesters to collect metadata structured according to a standard schema (OAI-DC).

Since the structure of the OAI-PMH output file is known internally, only a minimum of information is needed to carry out the mapping.

Example of OAI-PMH definition file (TSV)

Another example of an OAI-PMH definition file (TSV) with identifiers & vocabulary mapping
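For illustration, a harvester consuming the OAI-DC output could extract fields like this (a minimal Python sketch on a hand-written sample record, not actual Maggot output):

```python
import xml.etree.ElementTree as ET

# A tiny OAI-DC fragment, as a harvester might receive it (sample values only).
SAMPLE = '''<record xmlns:oai_dc='http://www.openarchives.org/OAI/2.0/oai_dc/'
                    xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <oai_dc:dc>
    <dc:title>My dataset</dc:title>
    <dc:creator>Doe, Jane</dc:creator>
  </oai_dc:dc>
</record>'''

# Dublin Core elements live in the standard dc namespace.
NS = {'dc': 'http://purl.org/dc/elements/1.1/'}
root = ET.fromstring(SAMPLE)
print(root.find('.//dc:title', NS).text)  # My dataset
```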

"},{"location":"definitions/terminology/","title":"Terminology","text":""},{"location":"definitions/terminology/#definition-of-terminology","title":"Definition of terminology","text":"

There are two definition files to set up.

Each time there is a change in these two definition files, it is necessary to convert them so that they are taken into account by the application.

Terminology is the set of terms used to define the metadata of a dataset. A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces (e.g. screenshot) are completely generated from this definition file, thus defining i) each of the fields and their input type (checkbox, dropbox, textbox, ...) and ii) the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list according to a list of fixed terms).

The metadata schema proposed by default is mainly established according to the DDI (Data Documentation Initiative) schema, which also corresponds to that adopted by the Dataverse software.

Terminology is organised in several sections. By default 6 sections are proposed, but you can redefine them as you wish:

For each section, fields are then defined. These fields are defined according to the way they will be entered via the web interface. There are 6 different input types: check boxes (checkbox), drop-down lists (dropbox), single-line text boxes (textbox), single-line text boxes with an additional box for multiple selection from a catalog of terms (multiselect), date pickers (datebox) and multi-line text boxes (areabox).

For two types (checkbox and dropbox), it is possible to define the values to be selected (predefined terms).

"},{"location":"definitions/terminology/#structure-of-the-terminology-definition-file-tsv","title":"Structure of the Terminology definition file (TSV)","text":"

The file must have 9 columns with headers defined as follows:

Below is an example of a Terminology definition file (TSV)

Example of Maggot JSON file generated based on the same definition file

"},{"location":"definitions/terminology/#structure-of-the-terminology-documentation-file-tsv","title":"Structure of the Terminology documentation file (TSV)","text":"

The documentation definition file is used to provide online help for each field (a small icon placed next to each label on the form). It should therefore only be modified when a field is added, deleted, or moved to another section. This file is then used to generate the online metadata documentation as shown in the figure below (See Configuration to find out how to carry out this transformation).

The file must have 3 columns with headers defined as follows:

Below is an example of a Terminology documentation file (TSV)

The same example as above converted to HTML using Markdown

"},{"location":"definitions/vocabulary/","title":"Vocabulary","text":""},{"location":"definitions/vocabulary/#vocabulary_1","title":"Vocabulary","text":"

1 - Vocabulary based on a list of terms fixed in advance (checkbox with feature open=0)

2 - Vocabulary open for addition (checkbox with feature open=1)

3 - Vocabulary based on a web API in a text field (textbox)

4 - Vocabulary based on a dictionary with multiple selection (multiselect)

5 - Vocabulary based on a SKOSMOS Thesaurus with multiple selection (multiselect)

6 - Vocabulary based on an OntoPortal with multiple selection (multiselect)

"},{"location":"definitions/zenodo/","title":"Zenodo Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/zenodo/#zenodo-definition-file_1","title":"Zenodo definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Zenodo. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Zenodo.

Since the structure of the Zenodo JSON output file is not known internally, information about its structure is needed to carry out the mapping.

Below is an example of a Zenodo definition file (TSV)

Example of a Zenodo JSON file generated from the definition file given above.

"},{"location":"publish/","title":"Publish Metadata","text":""},{"location":"publish/#publish-metadata_1","title":"Publish Metadata","text":"

"},{"location":"publish/#httpswwwgooglecomsearchqmetadatacrosswalkdefinitionoqmetadatacrosswalk","title":"https://www.google.com/search?q=metadata+crosswalk+definition&oq=metadata+crosswalk","text":""},{"location":"publish/dataverse/","title":"Publish into Dataverse","text":""},{"location":"publish/dataverse/#publish-into-dataverse_1","title":"Publish into Dataverse","text":"

1 - To submit metadata to a Dataverse repository, you must first select either a dataset from the drop-down list of datasets listed on the data storage space, or a metadata file from your local disk.

2 - You then need to connect to the repository in order to retrieve the key (the API token) authorizing you to submit the dataset. This obviously assumes that you have the privileges (creation/modification rights) to do so.

3 - After choosing the repository URL, you must also specify the dataverse collection into which you want to deposit the dataset. As before, you must have write access to this dataverse collection.
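Under the hood, a deposit of this kind goes through the Dataverse native API; here is a minimal sketch of the request pieces involved (the server, collection and token values are placeholders, and the helper function is hypothetical):

```python
# Sketch of the request pieces for creating a dataset in a Dataverse
# collection via the native API ('/api/dataverses/{alias}/datasets').
def build_deposit_request(server_url, collection, api_token):
    url = f'{server_url}/api/dataverses/{collection}/datasets'
    # The API token retrieved from the repository goes in this header.
    headers = {'X-Dataverse-key': api_token, 'Content-Type': 'application/json'}
    return url, headers

url, headers = build_deposit_request('https://demo.dataverse.org', 'mycoll', 'xxxx-token')
# An actual deposit would then POST the generated Dataverse JSON file to `url`.
print(url)
```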

"},{"location":"publish/dataverse/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"publish/zenodo/","title":"Publish into Zenodo","text":""},{"location":"publish/zenodo/#publish-into-zenodo_1","title":"Publish into Zenodo","text":"

1 - To submit metadata to a Zenodo repository, you must first select either a dataset from the drop-down list of datasets listed on the data storage space, or a metadata file from your local disk.

2 - Unless you have previously saved your API token, you must create a new one and copy and paste it before validating it. Before validating, you must check the deposit:access and deposit:write boxes in order to obtain creation and modification rights with this token.

3 - After choosing the repository URL, you can optionally choose a community to which the dataset will be linked. By default, you can leave this field empty.
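Likewise, a Zenodo deposit goes through the Zenodo REST API; here is a minimal sketch of the request pieces (the token and community values are placeholders, and the helper function is hypothetical):

```python
# Sketch of the request pieces for creating a deposition via the Zenodo
# REST API ('/api/deposit/depositions'); the community is optional.
def build_zenodo_request(server_url, api_token, community=None):
    url = f'{server_url}/api/deposit/depositions'
    headers = {'Authorization': f'Bearer {api_token}'}
    metadata = {}
    if community:
        # Link the deposition to a community if one was chosen.
        metadata['communities'] = [{'identifier': community}]
    return url, headers, {'metadata': metadata}

url, headers, payload = build_zenodo_request('https://zenodo.org', 'xxxx-token')
# An actual deposit would then POST `payload` as JSON to `url` with `headers`.
print(url)
```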

"},{"location":"publish/zenodo/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"tutorial/","title":"Quick tutorial","text":""},{"location":"tutorial/#quick-tutorial_1","title":"Quick tutorial","text":"

This is a quick tutorial on how to use the Maggot tool in practice, primarily targeting the end user.

See a short Presentation and Poster if you want to have a more general overview of the tool.

"},{"location":"tutorial/#overview","title":"Overview","text":"

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below:

Configuration

This module mainly concerns the data manager and makes it possible to construct all the terminology definition files, i.e. the metadata and sources of associated vocabularies. See Definition files then Configuration.

Private Access

This module allows a data producer to temporarily protect access to metadata for as long as necessary before sharing it within their collective. See Private access key management.

Dictionaries

This module allows data producers to view the content of all dictionaries. It also allows the data steward to edit their content. See Dictionaries for technical details only.

Metadata Entry

This is the main module allowing the data producer to enter their metadata relating to a dataset. See the corresponding tutorial for Metadata Entry.

Search datasets

This module allows users to search datasets based on the associated metadata, to see all the metadata and possibly to have access to the data itself. This obviously assumes that the metadata files have been deposited in the correct directory in the storage space dedicated to data management within your collective. See Infrastructure.

File Browser

This module gives users access to a file browser provided that the data manager has installed it. See File Browser

Publication

This module allows either the data producer or the data steward to publish the metadata with possibly the corresponding data within the suitable data repository. See Publication

"},{"location":"tutorial/describe/","title":"Quick tutorial","text":""},{"location":"tutorial/describe/#metadata-entry","title":"Metadata Entry","text":"

The figures are given here for illustration purposes, but certain elements may differ for you since this depends on the configuration of your instance, in particular the choice of metadata and the associated vocabulary sources.

Indeed, the choice of vocabulary sources (ontologies, thesauri, dictionaries) as well as the choice of metadata fields to enter should in principle have been discussed between the data producers and the data manager when setting up the Maggot tool, in order to find the best compromise between the vocabulary sources and the scientific fields targeted (see Definition files). However, a later addition is always possible.

"},{"location":"tutorial/describe/#overview","title":"Overview","text":"

When you enter the metadata entry module you should see a page that looks like the figure below:

"},{"location":"tutorial/describe/#dictionaries","title":"Dictionaries","text":"

Dictionary-based metadata (e.g. people's names) can easily be entered by autocomplete in the 'Search value' box provided the name appears in the corresponding dictionary.

However, if the name does not yet appear in the dictionary, simply enter the full name (first name & last name) in the main box, making sure to separate each name with a comma and then a space as shown in the figure below.

Then you can request to add the additional person name(s) to the dictionary later as described below:

Please proceed in the same way for all dictionaries (people, funders, producer, vocabulary)

"},{"location":"tutorial/describe/#controlled-vocabulary","title":"Controlled Vocabulary","text":"

Depending on the configuration of your instance, it is very likely that certain fields (e.g. keywords) are connected to a controlled vocabulary source (e.g. ontology, thesaurus). Vocabulary based on ontologies, thesauri or even dictionaries can easily be entered by autocompletion in the \"search for a value\" box, provided that the term exists in the corresponding vocabulary source.

If a term cannot be found by autocomplete, you can enter the term directly in the main box, making sure to separate each term with a comma and a space as shown in the figure below.

The data steward will later try to link it to a vocabulary source that may be suitable for the domain in question. Furthermore, even if the choice of vocabulary sources was made before the tool was put into service, a later addition is always possible. You should make the request to your data manager.

"},{"location":"tutorial/describe/#resources","title":"Resources","text":"

Data is often scattered across various platforms, databases, and file formats, making it challenging to locate and access; this is called data fragmentation. The Maggot tool therefore allows you to specify resources, i.e. data in the broader sense, whether external or internal, centralizing all links towards the data.

Four fields must be filled in:

"},{"location":"tutorial/metadata/","title":"Quick tutorial","text":""},{"location":"tutorial/metadata/#metadata-file","title":"Metadata File","text":"

Once the form has been completed, even partially (at least the mandatory fields, marked with a red star), you can export your metadata as a file. The file is in JSON format and must have the prefix 'META_'.

By clicking on the \"Generate the metadata file\" button, you can save it on your disk space.

Furthermore, if email sending has been configured (see settings), you have the possibility of sending the metadata file to the data managers for safekeeping, and possibly also to support its storage on the data disk space if specific rights are required.

In the (most common) case where you want to save the metadata file to your disk space, two uses of this file are possible:

1. The first use is the recommended one because it allows metadata management within your collective.

You drop the metadata file directly under the data directory corresponding to the metadata. Indeed, when installing the tool, a storage space dedicated to the tool had to be provided for this purpose. See infrastructure. Once deposited, you just have to wait around 30 minutes at most so that the tool has had time to scan the root of the data directories for new files and update the database. After this period, the description of your dataset will be visible from the interface, and search criteria can be selected to narrow the search.

You will then have the possibility to publish the metadata later with possibly the corresponding data in a data repository such as Dataverse or Zenodo.
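The periodic scan mentioned above can be pictured with a minimal Python sketch (the real scanner runs as a cron job inside a container; the function name and return shape here are hypothetical):

```python
from pathlib import Path
import json

def scan(root):
    '''Collect every META_*.json file found under the storage root.'''
    records = []
    # Walk the data directories recursively, as the periodic scan does.
    for path in sorted(Path(root).rglob('META_*.json')):
        with open(path) as fh:
            records.append((str(path), json.load(fh)))
    return records
```

In the real tool, the parsed records are then loaded into the MongoDB database that backs the search interface.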

2. The second use is only to deposit the metadata into a data repository

Whether with Dataverse or Zenodo, you have the possibility of publishing metadata directly in one or other of these repositories without using the storage space.

Please note that you cannot also deposit the data files in this way. You will have to do this manually for each of them directly online in the repository.

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\\\s\\\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"

An ecosystem for sharing metadata

"},{"location":"#foster-good-data-management-with-data-sharing-in-mind","title":"Foster good data management, with data sharing in mind","text":"

Sharing descriptive Metadata is the first essential step towards Open Scientific Data. With this in mind, Maggot was specifically designed to annotate datasets by creating a metadata file to attach to the storage space. Indeed, it allows users to easily add descriptive metadata to datasets produced within a collective of people (research unit, platform, multi-partner project, etc.). This approach fits perfectly into a data management plan as it addresses the issues of data organization and documentation, data storage and frictionless metadata sharing within this same collective and beyond.

"},{"location":"#main-features-of-maggot","title":"Main features of Maggot","text":"

The main functionalities of Maggot were established according to a well-defined need (See Background).

  1. Document with Metadata your datasets produced within a collective of people, thus making it possible to:
  2. Search datasets by their metadata
  3. Publish the metadata of datasets along with their data files into a Europe-approved repository

See a short Presentation and Poster for a quick overview.

"},{"location":"#overview-of-the-different-stages-of-metadata-management","title":"Overview of the different stages of metadata management","text":"

Note: The step numbers indicated in the figure correspond to the different points developed below

1 - First you must define all the metadata that will be used to describe your datasets. All metadata can be defined using a single file (in TSV format, therefore editable in a spreadsheet). This is an unavoidable step because both the input and search interfaces are completely generated from these definition files, which thus define each field along with its input type as well as the associated controlled vocabulary (ontology, thesaurus, dictionary, list of fixed terms). The metadata proposed by default was mainly established according to the DDI (Data Documentation Initiative) metadata schema. This schema also largely corresponds to that adopted by the Dataverse software. See the Terminology Definition section.

2 - Entering metadata is greatly facilitated by the use of dictionaries. The dictionaries offered by default are: people, funders, data producers, as well as a vocabulary dictionary allowing you to mix ontologies and thesauri from several sources. Each of these dictionaries allows users, by entering a name by autocompletion, to associate information which will then be added when exporting the metadata, either to a remote repository or for metadata harvesting. Thus this information, once entered into a dictionary, will not need to be re-entered.

3 - The web interface for entering metadata is entirely built on the basis of definition files. The metadata are distributed according to the different sections chosen, each constituting a tab (see screenshot). Mandatory fields are marked with a red star and must be documented in order to be able to generate the metadata file. The entry of metadata governed by a controlled vocabulary is done by autocompletion from term lists (dictionary, thesaurus or ontology). We can also define external resources (URL links) relating to documents, publications or other related data. Maggot thus becomes a hub for your datasets connecting different resources, local and external. Once the mandatory fields (at least) and other recommended fields (at best) have been entered, the metadata file can be generated in JSON format.

4 - The file generated in JSON format must be placed in the storage space reserved for this purpose. The role played by this metadata file can be seen as a README file adapted for machines, but also readable by humans. With an internal structure, it offers coherence and consistency of information that a simple README file with a completely free and therefore unstructured text format does not allow. Furthermore, the central idea is to use the storage space as a local data repository, so that the metadata should go to the data and not the other way around.

5 - A search of the datasets can thus be carried out on the basis of the metadata. Indeed, all the JSON metadata files are scanned and parsed at a fixed time interval (30 min) and then loaded into a database. This allows you to perform searches based on the predefined metadata. The search form, in a compact layout, is almost the same as the entry form (see a screenshot). Depending on the search criteria, a list of datasets is provided, each with a link pointing to its detailed sheet.

6 - The detailed metadata sheet provides all the metadata divided by section. Unfilled metadata does not appear by default. When a URL can be associated with information (ORCID, Ontology, web site, etc.), you can click on it to go to the corresponding link. Likewise, it is possible to follow the associated link on each of the resources. From this sheet, you can also export the metadata according to different schemata (Dataverse, Zenodo, JSON-LD). See screenshot 1 & screenshot 2.

7 - Finally, once you have decided to publish your metadata with your data, you can choose the repository that suits you (currently repositories based on Dataverse and Zenodo are supported).

"},{"location":"#additional-key-points","title":"Additional key points","text":"

"},{"location":"about/","title":"About","text":""},{"location":"about/#background","title":"Background","text":""},{"location":"about/#motives","title":"Motives","text":""},{"location":"about/#state-of-need","title":"State of need","text":""},{"location":"about/#proposed-approach","title":"Proposed approach","text":""},{"location":"about/#links","title":"Links","text":""},{"location":"about/#contacts","title":"Contacts","text":""},{"location":"about/#designers-developers","title":"Designers / Developers","text":""},{"location":"about/#contributors","title":"Contributors","text":"

"},{"location":"bloxberg/","title":"Bloxberg Blockchain","text":""},{"location":"bloxberg/#experimental-certification-of-metadata-file-on-the-bloxberg-blockchain","title":"EXPERIMENTAL - Certification of metadata file on the bloxberg blockchain","text":""},{"location":"bloxberg/#motivation","title":"Motivation","text":"

To guarantee the authenticity and integrity of a metadata file by recording it permanently and immutably on the bloxberg blockchain.

Indeed, a blockchain is a technology that keeps track of a set of transactions (writes to the chain) in a decentralized, secure and transparent manner. A blockchain can therefore be compared to a large (public or private) unfalsifiable register. Blockchain is used today in many fields because it provides solutions to many problems. In Higher Education and Research, for example, registering dataset metadata in the blockchain makes it possible to certify, in an inalienable, irrefutable and completely transparent manner, the ownership and authenticity of the data as well as, for example, the license of use and the date of production of the data. Research stakeholders are then more open to the dissemination of their data (files, results, protocols, publications, etc.) since they know that, in particular, the ownership, content and conditions of use of the data cannot be altered.

The Maggot tool can thus serve as a gateway for certifying data together with the associated metadata. The complete process is shown schematically in the following figure:

"},{"location":"bloxberg/#about-bloxberg","title":"About bloxberg","text":"

bloxberg is the most important blockchain project in science. It was founded in 2019 by MPDL, which was looking for a way to store research results and make them available to other researchers. In this sense, bloxberg is a decentralized register in which results can be stored in a tamper-proof way with a time stamp and an identifier.

bloxberg is based on the Ethereum Blockchain. However, it makes use of a different consensus mechanism: instead of \u201cProof of Stake\u201d used by Ethereum since 2022, bloxberg validates blocks through \u201cProof of Authority\u201d. Each node is operated by one member. All members of the association are research institutions and are known in the network. Currently, bloxberg has 49 nodes. It is an international project with participating institutions from all over the world.

"},{"location":"bloxberg/#how-to-process","title":"How to process ?","text":"

You will need an Ethereum address and an API key (which must be requested via bloxberg-services (at) mpdl.mpg.de). See an example of pushing a metadata file to the bloxberg blockchain using Maggot.

"},{"location":"bloxberg/#useful-links","title":"Useful links","text":""},{"location":"configuration/","title":"Configuration","text":""},{"location":"configuration/#terminology-configuration","title":"Terminology configuration","text":"

A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces are completely generated from this definition file, thus defining each of the fields, their input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list according to a list of fixed terms). This is why a configuration and conversion step into JSON format is essential in order to be able to configure all the other modules (example: creation of the MongoDB database schema when starting the application before filling it).
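The TSV-to-JSON conversion step can be sketched as follows (the column names here are hypothetical; the real config_terms.txt has 9 columns):

```python
import csv, io, json

# Hypothetical 3-column excerpt; the real config_terms.txt has 9 columns.
TSV = 'field\tsection\tlabel\ntitle\tdescription\tDataset title\n'

# Each TSV row becomes a JSON object keyed by the header line.
rows = list(csv.DictReader(io.StringIO(TSV), delimiter='\t'))
print(json.dumps(rows, indent=2))
```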

"},{"location":"configuration/#tsv-to-json","title":"TSV to JSON","text":""},{"location":"configuration/#tsv-to-doc","title":"TSV to DOC","text":""},{"location":"configuration/#json-to-tsv","title":"JSON to TSV","text":""},{"location":"dictionaries/","title":"Dictionaries","text":""},{"location":"dictionaries/#presentation","title":"Presentation","text":""},{"location":"dictionaries/#the-people-dictionary","title":"The people dictionary","text":"

"},{"location":"dictionaries/#other-dictionaries","title":"Other dictionaries","text":"

"},{"location":"gant/","title":"Gant","text":""},{"location":"gant/#gantt-diagrams-of-the-developments","title":"Gantt diagrams of the developments","text":"gantt dateFormat YYYY-MM-DD axisFormat %Y-%m title Diagrammes de Gantt pr\u00e9visionnel des d\u00e9veloppements section MongoDB 1: des1, 2023-11-01,60d 2: des2, 2023-12-01,90d 3: des3, 2023-12-01,90d section Couche API 4: des4, 2024-01-01,120d 5: des5, 2024-04-01,60d section Interface Web 6a: des6, 2024-06-01,60d 6b: des7, 2024-07-01,60d 6c: des8, 2024-09-01,60d"},{"location":"infrastructure/","title":"Infrastructure","text":""},{"location":"infrastructure/#infrastructure-local-remote-or-mixed","title":"Infrastructure : Local, Remote or Mixed","text":"

The required infrastructure involves 1) a machine running a Linux OS and 2) a dedicated storage space.

1 - The machine will most often be of the \"virtual\" type because it is simpler to deploy, either locally (with VM providers such as VirtualBox, VMware Workstation or MS Hyper-V) or remotely (e.g. VMware ESXi, OpenStack: example of deployment). Moreover, the OS of your machine must allow the deployment of Docker containers. See \u201cWhat is Docker\u201d for more details. The minimum characteristics of the VM are: 2 CPUs, 2 GB RAM, 8 GB of disk.

2 - The dedicated storage space can be either in the local space of the VM or in a remote location on the network.

"},{"location":"installation/","title":"Installation","text":""},{"location":"installation/#install-on-your-linux-computer-or-linux-unix-server","title":"Install on your linux computer or linux / unix server","text":"

Requirements: The installation must be carried out on a (virtual) machine with a recent Linux OS that supports Docker (see Infrastructure).

"},{"location":"installation/#retrieving-the-code","title":"Retrieving the code","text":"

Go to the destination directory of your choice then clone the repository and cd to your clone path:

git clone https://github.com/inrae/pgd-mmdt.git pgd-mmdt\ncd pgd-mmdt\n

"},{"location":"installation/#installation-of-docker-containers","title":"Installation of Docker containers","text":"

MAGGOT uses 3 Docker images for 3 distinct services:

"},{"location":"installation/#configuration","title":"Configuration","text":"

See Configuration settings

Warning: make sure to use the same MongoDB settings in all the above configuration files. It is best not to change anything. A single configuration file would have been preferable, but this has not yet been done given the different languages involved (bash, javascript, python, PHP). To be done!

Note: If you want to run multiple instances, you will need to change, in the run file, i) the container names, ii) the data path, and iii) the MongoDB volume name.

The following two JSON files are defined by default but can be easily configured from the web interface. See the Terminology Definition section.

"},{"location":"installation/#commands","title":"Commands","text":"

The run shell script allows you to perform multiple actions by specifying an option:

cd pgd-mmdt\nsh ./run <option>\n

Options:

"},{"location":"installation/#starting-the-application","title":"Starting the application","text":"

"},{"location":"installation/#launching-the-web-application-in-the-web-browser","title":"Launching the web application in the web browser","text":"
\n   CONTAINER ID  IMAGE          COMMAND                 CREATED          STATUS         PORTS                                  NAMES\n   5914504f456d  pgd-mmdt-web   \"docker-php-entrypoi.\"  12 seconds ago   Up 10 seconds  0.0.0.0:8087->80/tcp, :::8087->80/tcp  mmdt-web\n   226b13ed9467  pgd-mmdt-scan  \"cron -f\"               12 seconds ago   Up 11 seconds                                         mmdt-scan\n   81fecbb56d23  pgd-mmdt-db    \"docker-entrypoint.s.\"  13 seconds ago   Up 12 seconds  27017/tcp                              mmdt-db\n

"},{"location":"installation/#stoping-the-application","title":"Stoping the application","text":""},{"location":"installation/#updating-the-application","title":"Updating the application","text":"

When updating the application, it is imperative to preserve a whole set of configuration files as well as the content of certain directories (dictionaries, javascripts dedicated to vocabularies, etc.). An update script is available (./etc/update-maggot.sh), preferably placed under '/usr/local/bin'. To preserve your configuration, it is recommended to create local configuration files.

"},{"location":"installation/#architecture-diagram","title":"Architecture diagram","text":"

Note: See how to proceed for the configuration steps.

"},{"location":"installation/#file-browser","title":"File Browser","text":"

You can provide access to your data via a file browser. This application must be installed separately but can be connected to Maggot by specifying the corresponding URL in the configuration file. Users and their rights are managed in the filebrowser application. Likewise, password-free links to the data can also be created; such links can usefully be specified as external resources in the metadata managed by Maggot.

See how to install it on GitHub.

"},{"location":"private-access/","title":"Private access","text":""},{"location":"private-access/#private-access-key-management","title":"Private access key management","text":""},{"location":"private-access/#motivation","title":"Motivation","text":"

Although the Maggot tool is designed to foster the sharing of metadata within a collective, it may be necessary to temporarily privatize access to the metadata of an ongoing project with confidentiality constraints. So even within our own collective, access to metadata must be restricted to authorized users only.

"},{"location":"private-access/#implementation","title":"Implementation","text":"

The choice not to manage users in the Maggot tool was made in order to keep the metadata completely open by default within a collective. Furthermore, access rights to the storage space are managed independently of the Maggot tool by the administrator of this space. It is therefore through the storage space that access to the metadata via the web interface is granted or denied.

The chosen mechanism for privatizing access is described below. It has the dual advantage of being simple to implement and simple to use.

  1. First we have to generate a file containing the encrypted key for private access. This file must be generated from the web interface and then downloaded, as shown in the figure below. It must then be manually deposited in the data directory corresponding to the dataset whose access we wish to privatize. The presence of this file within a directory is enough to block access to the metadata and data by default. Note that this same file containing the encrypted private key can be placed in several data directories (for example, within the same project). The deposit must be done by hand because the Maggot tool only has read access to the storage space. This also guarantees that the user has write rights to this space, without user accounts having to be managed on the Maggot side.

    By default, \u2018untwist1\u2019 metadata are not accessible to anyone

  2. When we want to access the metadata of this dataset, we simply enter the private key in the current session. This unlocks access to the metadata via the web interface only for the current session of our web browser. This means that the private key must be entered again for each session (by default, a session lasts a maximum of 1 hour).

    Now the \u2018untwist1\u2019 metadata are accessible only to us

  3. When we want to give the entire collective access to the metadata, we simply delete the private access file (named 'META_auth.txt' by default) from the data directory concerned.
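The mechanism above can be sketched in a few lines; this is a minimal illustration of the described behavior (the presence of the file blocks access, a key entered in the session unlocks it), not Maggot's actual PHP implementation, and the function and parameter names are invented:

```python
from pathlib import Path

AUTH_FILE = "META_auth.txt"  # default name of the private access file

def dataset_is_accessible(data_dir, session_keys):
    """Return True if the dataset's metadata may be shown in the current session.

    Illustrative sketch only: the mere presence of the private access file
    in a data directory blocks access by default, unless the session holds
    the matching key.
    """
    auth_file = Path(data_dir) / AUTH_FILE
    if not auth_file.exists():
        return True  # no private access file: metadata open to the collective
    # unlocked only if the key stored in the file was entered in this session
    return auth_file.read_text().strip() in session_keys
```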

"},{"location":"settings/","title":"Configuration settings","text":""},{"location":"settings/#configuration-settings_1","title":"Configuration settings","text":"

Here is the list of all files that may be subject to adjustment of certain parameters according to the needs of the instance site.

"},{"location":"settings/#dockerscanpartscriptsconfigpy","title":"dockerscanpart/scripts/config.py","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters should not normally need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

The parameters are as follows (default value in parentheses):

- dbserver: name of the MongoDB server (mmdt-db)
- database: name of the MongoDB database (pgd-db)
- dbport: port of the MongoDB server (27017)
- username: username of the Mongo database pgd-db with read/write access (userw-pgd)
- password: password corresponding to that username (wwwww)
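As a sketch, dockerscanpart/scripts/config.py with the documented default values might look like the following; the exact variable names used in the real file are an assumption here:

```python
# Illustrative sketch of dockerscanpart/scripts/config.py
# (documented defaults; variable names are assumptions)
dbserver = "mmdt-db"    # name of the MongoDB server
database = "pgd-db"     # name of the MongoDB database
dbport = 27017          # port of the MongoDB server
username = "userw-pgd"  # user of the pgd-db database with read/write access
password = "wwwww"      # password corresponding to that username
```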

"},{"location":"settings/#incconfigmongodbinc","title":"inc/config/mongodb.inc","text":"

This file defines the connection parameters to the Mongo database. Since this database is only accessible internally, these parameters should not normally need to be changed.

Note: These settings must be the same as defined in dockerdbpart/initialisation/setupdb-js.template

The parameters are as follows (default value in parentheses):

- docker_mode: indicates whether the installation uses docker containers; in that case the Mongo DB IP address will differ from 127.0.0.1 (1)
- uritarget: the Mongo DB IP address (mmdt-db with docker_mode=1, 127.0.0.1 with docker_mode=0)
- database: name of the MongoDB database (pgd-db)
- collection: name of the MongoDB collection (metadata)
- port: port of the MongoDB server (27017)
- username: username of the Mongo database pgd-db with read access only (userr-pgd)
- password: password corresponding to that username (rrrrr)

"},{"location":"settings/#incconfigconfiginc","title":"inc/config/config.inc","text":"

This file defines parameters related to i) the web interface and ii) the functionalities allowed for users. Only the parameters that may usefully be changed for the needs of an instance are described here.

The parameters are as follows (default value in parentheses):

- EXTERN: indicates that the tool is for external use only, i.e. without a storage space (0)
- PRIVATE_ACCESS: enables management of private access to metadata (0)
- ZOOMWP: zoom level of the web interface; reducing the size slightly gives a better layout (90%)
- RESMEDIA: allows a MIME type to be set on each resource in the metadata (1)
- TITLE: title displayed in the main banner (Metadata management)
- FILEBROWSER: indicates whether the file browser is used; this assumes it is installed (0)
- URL_FILEBROWSER: file browser URL, absolute or relative (/fb/)
- APPNAME: name given in the URL to access the web interface (maggot)
- dataverse_urls: array of Dataverse repository URLs where you can upload metadata and data (-)
- zenodo_urls: array of Zenodo repository URLs where you can upload metadata and data (-)
- SERVER_URL: default Dataverse repository URL (https://entrepot.recherche.data.gouv.fr)
- ZENODO_SERVER_URL: default Zenodo repository URL (https://zenodo.org)
- export_dataverse: indicates whether the Dataverse feature is enabled (1)
- export_zenodo: indicates whether the Zenodo feature is enabled (1)
- export_jsonld: indicates whether the JSON-LD feature is enabled (1)
- export_oai: indicates whether the OAI-PMH feature is enabled (0)
- export_bloxberg: indicates whether the Bloxberg blockchain feature is enabled, experimental (0)
- cvdir: relative path of the controlled vocabulary lists (cvlist/)
- maggot_fulltitle: Maggot name of the field corresponding to the title in Dataverse/Zenodo (fulltitle)
- auth_senddata_file: name of the file that must be present in the data directory to authorize the transfer of the data files (META_datafile_ok.txt)
- private_auth_file: name of the private access file (META_auth.txt)
- sendMail: messaging configuration for sending metadata to data managers, see below (NULL)

The messaging configuration is done using the following array in the inc/config/config.inc file (or, more judiciously, in inc/config/local.inc so that it is preserved during an update). To understand how it works, see Send Emails using PHPmailer.

$sendMail['smtpHost'] = 'smtp.example.org';        //  Set the SMTP server to send through
$sendMail['smtpSecure'] = 'tls';                   //  Enable TLS encryption
$sendMail['smtpPort'] = 587;                       //  Set the TCP port to connect to
$sendMail['CheckEmail'] = 'maggot@example.org';    //  Email address authorized to send emails
$sendMail['CheckPass'] = 'password';               //  The corresponding password
$sendMail['CheckName'] = 'Maggot';                 //  Alias name
$sendMail['UserEmail'] = 'admin@example.org';      //  Emails of data managers, separated by a comma

"},{"location":"settings/#run","title":"run","text":"

This file contains the essential parameters to be set before any use.

The parameters are as follows (default value in parentheses):

- WEB_PORT: local HTTP port for the web application (8087)
- DATADIR: path to the data (/opt/data/)
- DB_IMAGE: Docker image name of the MongoDB (pgd-mmdt-db)
- SCAN_IMAGE: Docker image name of the scan process (pgd-mmdt-scan)
- WEB_IMAGE: Docker image name of the web interface (pgd-mmdt-web)
- DB_CONTAINER: Docker container name of the MongoDB (mmdt-db)
- SCAN_CONTAINER: Docker container name of the scan process (mmdt-scan)
- WEB_CONTAINER: Docker container name of the web interface (mmdt-web)
- MONGO_VOL: volume name for MongoDB (mmdt-mongodb)
- USER: admin user in the htpasswd file (admin)

"},{"location":"definitions/","title":"Definition Files","text":""},{"location":"definitions/#metadata-definition-files","title":"Metadata definition files","text":"

The Maggot tool offers great flexibility in configuration. It allows you to choose exactly which metadata you want to describe your data with. You can rely on an existing metadata schema, invent your own schema or, more pragmatically, mix one or more schemas while introducing some metadata specific to your field of application. Keep in mind, however, that if you want to add descriptive metadata to your data, a certain amount of information is expected. A completely different use of the tool is nevertheless possible; it is up to you.

There are two levels of definition files, as shown in the figure below:

1 - The first level concerns the definition of terminology (metadata), similar to a descriptive metadata plan. This category is closer to configuration files. They represent the heart of the application around which everything else is built. The input and search interfaces are completely generated from these definition files (especially the web/conf/config_terms.txt file), which define each of the fields, their input type (checkbox, dropbox, textbox, ...) and the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list based on a list of fixed terms). This is why a configuration step is essential before all the other modules can be configured.

2 - The second level concerns the definitions of the mapping to a differently structured metadata schema (metadata crosswalk, i.e. a specification for mapping one metadata standard to another), used either i) for metadata export to a remote repository (e.g. Dataverse, Zenodo) or ii) for metadata harvesting (e.g. JSON-LD, OAI-PMH). Simply place the definition files in the configuration directory (web/conf) for them to be taken into account, provided you have adjusted the configuration (web/inc/config/config.inc).

All definition files are made using a simple spreadsheet then exported in TSV format.
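Since every definition file is a plain TSV with a header row, it can be read with any spreadsheet or a few lines of code. A minimal sketch (the file path and the column names shown in the comment follow the terminology file described in this documentation, but treat the details as assumptions):

```python
import csv

def load_definition(path):
    """Read a Maggot definition file (tab-separated, first row = headers)
    into a list of dicts, one per defined field."""
    with open(path, newline="", encoding="utf-8") as fh:
        return [row for row in csv.DictReader(fh, delimiter="\t")]

# e.g. fields = load_definition("web/conf/config_terms.txt")
# -> [{'Field': 'title', 'Section': 'definition', 'Required': 'Y', ...}, ...]
```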

The list of definition files in Maggot is given below. All must be placed under the directory web/conf.

See an example online: https://pmb-bordeaux.fr/maggot/config/view and the corresponding form based on these definition files.

"},{"location":"definitions/config_terms/","title":"Terminlogy Definition","text":""},{"location":"definitions/config_terms/#example-of-a-terminlogy-definition-file","title":"Example of a Terminlogy Definition file","text":"Field Section Required Search ShortView Type features Label Predefined terms title definition Y N 1 textbox width=350px Short name fulltitle definition Y Y 2 textbox Full title subject definition Y Y checkbox open=0 Subject Agricultural Sciences,Arts and Humanities,Astronomy and Astrophysics,Business and Management,Chemistry,Computer and Information Science,Earth and Environmental Sciences,Engineering,Law,Mathematical Sciences,Medicine Health and Life Sciences,Physics,Social Sciences,Other description definition Y Y areabox rows=6,cols=30 Description of the dataset note definition N Y areabox rows=4,cols=30 Notes status status N Y 3 dropbox width=350px Status of the dataset Processed,In progress,Unprocessed access_rights status N Y 4 dropbox width=350px Access rights to data Public,Mixte,Private language status N Y checkbox open=0 Language Czech,Danish,Dutch,English,Finnish,French,German,Greek,Hungarian,Icelandic,Italian,Lithuanian,Norwegian,Romanian,Slovenian,Spanish,Swedish lifeCycleStep status N Y multiselect autocomplete=lifecycle,min=1 Life cycle step license status N Y textbox autocomplete=license,min=1 License datestart status N Y datebox width=350px Start of collection dateend status N Y datebox width=350px End of collection dmpid status N Y textbox DMP identifier contacts management Y Y multiselect autocomplete=people,min=1 Contacts authors management Y Y multiselect autocomplete=people,min=1 Authors collectors management N Y multiselect autocomplete=people,min=1 Data collectors curators management N Y multiselect autocomplete=people,min=1 Data curators members management N Y multiselect autocomplete=people,min=1 Project members leader management N Y multiselect autocomplete=people,min=1 Project leader wpleader management N Y 
multiselect autocomplete=people,min=1 WP leader depositor management N Y textbox Depositor producer management N Y multiselect autocomplete=producer,min=1 Producer grantNumbers management N Y multiselect autocomplete=grant,min=1 Grant Information kindOfData descriptors Y Y checkbox open=0 Kind of Data Audiovisual,Collection,Dataset,Event,Image,Interactive Resource,Model,Physical Object,Service,Software,Sound,Text,Workflow,Other keywords descriptors N Y multiselect autocomplete=bioportal,onto=EFO:JERM:EDAM:MS:NMR:NCIT:OBI:PO:PTO:AGRO:ECOCORE:IOBC:NCBITAXON Keywords topics descriptors N Y multiselect autocomplete=VOvocab Topic Classification dataOrigin descriptors N Y checkbox open=0 Data origin observational data,experimental data,survey data,analysis data,text corpus,simulation data,aggregate data,audiovisual corpus,computer code,Other experimentfactor descriptors N Y multiselect autocomplete=vocabulary,min=1 Experimental Factor measurement descriptors N Y multiselect autocomplete=vocabulary,min=1 Measurement type technology descriptors N Y multiselect autocomplete=vocabulary,min=1 Technology type publication_citation descriptors N Y areabox rows=5,cols=30 Publication - Citation publication_idtype descriptors N Y dropbox width=200px Publication - ID Type -,ark,arXiv,bibcode,doi,ean13,eissn,handle,isbn,issn,istc,lissn,lsid,pmid,purl,upc,url,urn publication_idnumber descriptors N Y textbox width=400px Publication - ID Number publication_url descriptors N Y textbox Publication - URL comment other N Y areabox rows=15, cols=30 Additional information"},{"location":"definitions/dataverse/","title":"Dataverse Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/dataverse/#dataverse-definition-file_1","title":"Dataverse definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Dataverse. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Dataverse, this metadata crosswalk having been made possible by choosing the right metadata schema upstream.

Since the structure of the Dataverse JSON output file is known internally, only a minimum of information is necessary to carry out the correspondence.
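At its core, such a crosswalk renames the fields of one record according to a correspondence table. The sketch below shows only that principle, under stated assumptions: real Dataverse JSON is nested, the actual correspondence is driven by the TSV definition file, and the two mapping entries shown are hypothetical field names:

```python
def crosswalk(record, mapping):
    """Rename the fields of a flat metadata record according to a
    source-to-target mapping: the core idea of a metadata crosswalk.
    Deliberately simplified; fields with no mapping are dropped."""
    return {target: value
            for field, value in record.items()
            if (target := mapping.get(field)) is not None}

# hypothetical correspondence between Maggot and Dataverse field names
mapping = {"fulltitle": "title", "description": "dsDescription"}
```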

The file must have 4 columns with headers defined as follows:

Below is an example of a Dataverse definition file (TSV)

Example of Dataverse JSON file generated based on the definition file itself given as an example above.

"},{"location":"definitions/json-ld/","title":"JSON-LD Definition File","text":""},{"location":"definitions/json-ld/#json-ld-definition-file_1","title":"JSON-LD definition File","text":"

This definition file will allow harvesters to collect structured metadata based on a semantic schema, i.e. the fields themselves, and not just their content, can be associated with a semantic definition (an ontology for example), which will then facilitate the link between the metadata and therefore the data (JSON-LD). The chosen semantic schema is based on several metadata schemas.

The full workflow to \"climb the Linked Open Data mountain\" is summarized in the figure below:

Metadata schemas used to build the model proposed by default:

Definition of the JSON-LD context using the metadata schemas proposed by default

Since the structure of the JSON-LD is not known internally, information on the structure is necessary to carry out the correspondence.

Example of JSON-LD definition file (partial) using the metadata schemas proposed by default (TSV)

Example of JSON-LD file generated based on the definition file itself given as an example above.

"},{"location":"definitions/mapping/","title":"Mapping Definition File","text":""},{"location":"definitions/mapping/#mapping-definition-file_1","title":"Mapping definition File","text":"

The mapping file is used as indicated by its name to match a term chosen by the user during entry with another term from an ontology or a thesaurus and therefore to obtain a URL which will be used for referencing. It can be used for each metadata crosswalk requiring such a mapping (e.g. to the Dataverse, Zenodo or JSON-LD format).

The role of this definition file is illustrated in the figure above.

The file must have 5 columns with headers defined as follows:

Below is an example of a Mapping definition file (TSV)

"},{"location":"definitions/oai-pmh/","title":"OAI-PMH Definition File","text":"

OAI-PMH is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives.

"},{"location":"definitions/oai-pmh/#oai-pmh-definition-file_1","title":"OAI-PMH definition File","text":"

This definition file will allow harvesters to collect metadata structured according to a standard schema (OAI-DC).

Since the structure of the OAI-PMH output file is known internally, only a minimum of information is necessary to carry out the correspondence.

Example of OAI-PMH definition file (TSV)

Another example of an OAI-PMH definition file (TSV) with identifiers & vocabulary mapping

"},{"location":"definitions/terminology/","title":"Terminology","text":""},{"location":"definitions/terminology/#definition-of-terminology","title":"Definition of terminology","text":"

There are two definition files to set up.

Each time there is a change in these two definition files, it is necessary to convert them so that they are taken into account by the application.

Terminology is the set of terms used to define the metadata of a dataset. A single file (web/conf/config_terms.txt) contains all the terminology. The input and search interfaces (e.g. screenshot) are completely generated from this definition file, thus defining i) each of the fields, their input type (checkbox, dropbox, textbox, ...) and ii) the associated controlled vocabulary (ontology and thesaurus by autocompletion, drop-down list based on a list of fixed terms).

The metadata schema proposed by default is mainly based on the DDI (Data Documentation Initiative) schema, which also corresponds to the one adopted by the Dataverse software.

Terminology is organised in several sections. By default 6 sections are proposed, but you can redefine them as you wish:

For each section, fields are then defined. These fields are defined according to the way they will be entered via the web interface. There are 6 different input types: check boxes (checkbox), drop-down lists (dropbox), single-line text boxes (textbox), single-line text boxes with an additional box for multiple selection from a catalog of terms (multiselect), date pickers (datebox) and multi-line text boxes (areabox).

For two types (checkbox and dropbox), it is possible to define the values to be selected (predefined terms).

"},{"location":"definitions/terminology/#structure-of-the-terminology-definition-file-tsv","title":"Structure of the Terminology definition file (TSV)","text":"

The file must have 9 columns with headers defined as follows:

Below is an example of a Terminology definition file (TSV)

Example of Maggot JSON file generated based on the same definition file

"},{"location":"definitions/terminology/#structure-of-the-terminology-documentation-file-tsv","title":"Structure of the Terminology documentation file (TSV)","text":"

The documentation definition file is used to provide online help for each field (a small icon placed next to each label on the form). It should therefore only be modified when a field is added, deleted, or moved to another section. This file is then used to generate the online metadata documentation according to the figure below (see Configuration to find out how to carry out this transformation).

The file must have 3 columns with headers defined as follows:

Below is an example of a Terminology documentation file (TSV)

Same example as above converted to HTML format using Markdown format

"},{"location":"definitions/vocabulary/","title":"Vocabulary","text":""},{"location":"definitions/vocabulary/#vocabulary_1","title":"Vocabulary","text":"

1 - Vocabulary based on a list of terms fixed in advance (checkbox with feature open=0)

2 - Vocabulary open for addition (checkbox with feature open=1)

3 - Vocabulary based on a web API in a text field (textbox)

4 - Vocabulary based on a dictionary with multiple selection (multiselect)

5 - Vocabulary based on a SKOSMOS Thesaurus with multiple selection (multiselect)

6 - Vocabulary based on an OntoPortal with multiple selection (multiselect)

"},{"location":"definitions/zenodo/","title":"Zenodo Definition File","text":"

Open source research data repository software, approved by Europe.

"},{"location":"definitions/zenodo/#zenodo-definition-file_1","title":"Zenodo definition File","text":"

This definition file will allow Maggot to automatically export the dataset into a data repository based on Zenodo. The approach consists of starting from the Maggot metadata file in JSON format and transforming it into another JSON format compatible with Zenodo.

Since the structure of the Zenodo JSON output file is not known internally, information on the structure is necessary to carry out the correspondence.

Below is an example of a Zenodo definition file (TSV)

Example of Zenodo JSON file generated based on the definition file itself given as an example above.

"},{"location":"publish/","title":"Publish Metadata","text":""},{"location":"publish/#publish-metadata_1","title":"Publish Metadata","text":"

"},{"location":"publish/#httpswwwgooglecomsearchqmetadatacrosswalkdefinitionoqmetadatacrosswalk","title":"https://www.google.com/search?q=metadata+crosswalk+definition&oq=metadata+crosswalk","text":""},{"location":"publish/dataverse/","title":"Publish into Dataverse","text":""},{"location":"publish/dataverse/#publish-into-dataverse_1","title":"Publish into Dataverse","text":"

1 - To submit metadata to a Dataverse repository, you must first select a dataset, either from the drop-down list corresponding to the datasets listed on the data storage space, or as a metadata file from your local disk.

2 - You then need to connect to the repository in order to retrieve the key (the API token) authorizing you to submit the dataset. This obviously assumes that you have the privileges (creation/modification rights) to do so.

3 - After choosing the repository URL, you must also specify in which dataverse collection you want to deposit the datasets. As previously, you must have write rights to this dataverse collection.

"},{"location":"publish/dataverse/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"publish/zenodo/","title":"Publish into Zenodo","text":""},{"location":"publish/zenodo/#publish-into-zenodo_1","title":"Publish into Zenodo","text":"

1 - To submit metadata to a Zenodo repository, you must first select a dataset, either from the drop-down list corresponding to the datasets listed on the data storage space, or as a metadata file from your local disk.

2 - Unless you have previously saved your API token, you must create a new one and copy-paste it. Before validating it, you must check the deposit:access and deposit:write boxes in order to obtain creation and modification rights with this token.

3 - After choosing the repository URL, you can optionally choose a community to which the dataset will be linked. By default, you can leave this field empty.

"},{"location":"publish/zenodo/#deposit-data-files","title":"Deposit data files","text":"

"},{"location":"tutorial/","title":"Quick tutorial","text":""},{"location":"tutorial/#quick-tutorial_1","title":"Quick tutorial","text":"

This is a quick tutorial of how to use the Maggot tool in practice and therefore preferably targeting the end user.

See a short Presentation and Poster if you want to have a more general overview of the tool.

"},{"location":"tutorial/#overview","title":"Overview","text":"

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below:

Configuration

This module mainly concerns the data manager and makes it possible to construct all the terminology definition files, i.e. the metadata and sources of associated vocabularies. See Definition files then Configuration.

Private Access

This module allows a data producer to temporarily protect access to metadata for as long as necessary before sharing it within their collective. See Private access key management.

Dictionaries

This module allows data producers to view the content of all dictionaries. It also allows the data steward to edit their content. See Dictionaries for technical details.

Metadata Entry

This is the main module allowing the data producer to enter their metadata relating to a dataset. See the corresponding tutorial for Metadata Entry.

Search datasets

This module allows users to search for datasets based on the associated metadata, to view all the metadata and possibly to access the data itself. This obviously assumes that the metadata files have been deposited in the correct directory of the storage space dedicated to data management within your collective. See Infrastructure.

File Browser

This module gives users access to a file browser, provided that the data manager has installed it. See File Browser.

Publication

This module allows either the data producer or the data steward to publish the metadata, possibly with the corresponding data, into a suitable data repository. See Publication.

"},{"location":"tutorial/describe/","title":"Quick tutorial","text":""},{"location":"tutorial/describe/#metadata-entry","title":"Metadata Entry","text":"

The figures are given here for illustration purposes, but certain elements may differ for you, since they depend on the configuration of your instance, in particular the choice of metadata and the associated vocabulary sources.

Indeed, the choice of vocabulary sources (ontologies, thesauri, dictionaries) as well as the choice of metadata fields to enter should in principle have been discussed between the data producers and the data manager during the implementation of the Maggot tool, in order to find the best compromise between the choice of sources and all the scientific fields targeted (see Definition files). However, a later addition is always possible.

"},{"location":"tutorial/describe/#overview","title":"Overview","text":"

When you enter the metadata entry module you should see a page that looks like the figure below:

"},{"location":"tutorial/describe/#dictionaries","title":"Dictionaries","text":"

Dictionary-based metadata (e.g. people's names) can easily be entered by autocomplete in the 'Search value' box provided the name appears in the corresponding dictionary.

However, if the name does not yet appear in the dictionary, simply enter the full name (first name & last name) in the main box, making sure to separate each name with a comma and then a space as shown in the figure below.

Then you can request to add the additional person name(s) to the dictionary later as described below:

Please proceed in the same way for all dictionaries (people, funders, producer, vocabulary).
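The "comma then a space" convention described above makes a multi-value entry unambiguous to split, since a full name itself contains no such separator. A small sketch (the helper name and the example names are invented for illustration):

```python
def split_multivalue(entry):
    """Split a multi-value field on the 'comma then a space' separator
    used in Maggot entry boxes, keeping each full name whole."""
    return [value.strip() for value in entry.split(", ") if value.strip()]
```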

"},{"location":"tutorial/describe/#controlled-vocabulary","title":"Controlled Vocabulary","text":"

Depending on the configuration of your instance, it is very likely that certain fields (e.g. keywords) are connected to a controlled vocabulary source (e.g. ontology, thesaurus). Vocabulary based on ontologies, thesauri or even dictionaries can easily be entered by autocompletion in the \"search for a value\" box, provided that the term exists in the corresponding vocabulary source.

If a term cannot be found by autocomplete, you can enter the term directly in the main box, making sure to separate each term with a comma and a space as shown in the figure below.

The data steward will later try to link it to a vocabulary source suitable for the domain in question. Furthermore, even though the choice of vocabulary sources was made before the tool was put into service, a later addition is always possible; you should make the request to your data manager.

"},{"location":"tutorial/describe/#resources","title":"Resources","text":"

Data is often scattered across various platforms, databases, and file formats, making it challenging to locate and access; this is called data fragmentation. The Maggot tool therefore allows you to specify resources, i.e. data in the broad sense, whether external or internal, making it possible to centralize all links to the data.

Four fields must be filled in:

"},{"location":"tutorial/metadata/","title":"Quick tutorial","text":""},{"location":"tutorial/metadata/#metadata-file","title":"Metadata File","text":"

Once the form has been completed, even partially (at least the fields which are mandatory, marked with a red star), you can export your metadata as a file. The file is in JSON format and must have the prefix 'META_'.

By clicking on the \"Generate the metadata file\" button, you can save it on your disk space.

Furthermore, if email sending has been configured (see settings), you have the possibility of sending the metadata file to the data managers for safekeeping, and possibly also to support its storage on the data disk space if specific rights are required.

In the (most common) case where you want to save the metadata file to your disk space, two uses of this file are possible:

1. The first use is the recommended one because it allows metadata management within your collective.

You drop the metadata file directly into the data directory to which the metadata corresponds. Indeed, when installing the tool, a storage space dedicated to the tool had to be provided for this purpose (see Infrastructure). Once the file is deposited, you just have to wait up to about 30 minutes, so that the tool has had time to scan the root of the data directories for new files and update the database. After this period, the description of your dataset will be visible from the interface, and search criteria can be selected to narrow the search.
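The periodic scan described above essentially looks for metadata files (JSON, prefixed with 'META_') in the dataset directories under the data root. A minimal sketch of that lookup, assuming datasets sit directly under the root; the real scanner also updates the MongoDB database:

```python
from pathlib import Path

def find_metadata_files(data_root):
    """List metadata files in the dataset directories under data_root;
    per this tutorial, metadata files are JSON files prefixed 'META_'.
    Sketch only: Maggot's actual scanner also tracks the database state."""
    return sorted(Path(data_root).glob("*/META_*.json"))
```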

You will then be able to publish the metadata later, possibly with the corresponding data, in a data repository such as Dataverse or Zenodo.

2. The second use is only to deposit the metadata into a data repository

Whether with Dataverse or Zenodo, you can publish metadata directly in one or the other of these repositories without using the storage space.

Please note that the data files themselves cannot be deposited in this way: you will have to upload each of them manually, directly online in the repository.

"}]} \ No newline at end of file diff --git a/sitemap.xml.gz b/sitemap.xml.gz index 63dc2ef..caf5dbd 100755 Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ diff --git a/tutorial/index.html b/tutorial/index.html index c4594fa..2a07a08 100755 --- a/tutorial/index.html +++ b/tutorial/index.html @@ -763,7 +763,7 @@

Quick tutorialQuick tutorial

This is a quick tutorial of how to use the Maggot tool in practice and therefore preferably targeting the end user.

-

See a short Presentation and Poster if you want to have a more general overview of the tool..

+

See a short Presentation and Poster if you want to have a more general overview of the tool.


Overview

The Maggot tool is made up of several modules, all accessible from the main page by clicking on the corresponding part of the image as shown in the figure below: