ELN API #35
Replies: 15 comments 34 replies
-
Potentially relevant resources:
-
Fully support this idea, and would also extend the discussion to material/chem-specific archives that may want to support rendering or indexing on the data that these ELNs could provide in such a format. So, I will also cc some other initiatives that may be interested: MDF (@blaiszik), MaterialsCloud (of course overlapping with AiiDAlab too, @giovannipizzi), the work being done through BIG-MAP (@eibfl-dtu), FAIRSpec (@BobHanson) and the design discussions for the UK's PSDI (@colessj).
My key interest would be a minimal structural/syntactic standard for serving data alongside any schemas used by the ELNs themselves. For example, an export from my ELN (or even a manually created data export) would list some entities (say, samples), the measurements attached to them, links to raw data files (either external or contained within the repository), and descriptions of the fields used (both their literal type and any attached free-text descriptions). The metadata schema can then be driven by commonalities across ELNs (e.g. recording the user, dates etc. as mentioned in the eLabFTW talk).
It may be easier to work in reverse: agree to implement an existing standard for parcelling data with its context as data packages (e.g., frictionless data, or something lower-level like JSONSchema) and then try to find commonalities after the fact that can be used to define an API standard (see Q2 below).
My additional questions:
-
Hello, IMHO the file format for exported ELN entries should be a ZIP archive with a JSON file describing all the text data/fields. It seems a safe and reasonable choice. It could have a Regarding the question "Who would be interested?", I think the whole research community would be a target. The ability to have portable ELN entries would be great. That means exporting an entry on one side and importing it on the other. With eLabFTW you can already do that: export as a ZIP and reimport it in another eLabFTW instance. In order to import it into another ELN I see two possibilities:
I believe the second option could be explored. Imagine a piece of software able to translate things from one ELN to the other (like So basically a Rosetta stone of ELN entries. That might be a more approachable goal than to unify all ELN structures (even if it's only a facade).
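A minimal sketch of the ZIP-plus-JSON idea, using only the Python standard library. The file name `entry.json`, the `attachments/` folder, and the field names are illustrative assumptions, not the actual eLabFTW export layout:

```python
import io
import json
import zipfile


def export_entry(entry: dict, attachments: dict[str, bytes]) -> bytes:
    """Pack one ELN entry and its attachments into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The JSON file carries all text fields plus the list of attached files.
        manifest = dict(entry, attachments=sorted(attachments))
        archive.writestr("entry.json", json.dumps(manifest, indent=2))
        for name, payload in attachments.items():
            archive.writestr(f"attachments/{name}", payload)
    return buffer.getvalue()


def import_entry(blob: bytes) -> tuple[dict, dict[str, bytes]]:
    """Read the JSON description and the attachments back from the archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        manifest = json.loads(archive.read("entry.json"))
        files = {
            name: archive.read(f"attachments/{name}")
            for name in manifest["attachments"]
        }
    return manifest, files


# Hypothetical entry, exported on one side and imported on the other.
entry = {"title": "Test run", "author": "A. Researcher", "date": "2022-02-14"}
blob = export_entry(entry, {"spectrum.csv": b"1,2\n"})
manifest, files = import_entry(blob)
print(manifest["title"], list(files))
```

A "Rosetta stone" translator would then only need to map the keys of one ELN's JSON description onto another's, leaving the archive layout untouched.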
-
I like @NicolasCARPi's POV. A unified description of data fields can live in the cloud, i.e. a namespace, that you can import for each ELN and comply with. Here is an interesting yet simple data schema for experimental data that could be extended into a more complete ELN namespace: https://github.com/SINTEF/dlite
-
Maybe we can discuss this with respect to three main aspects; the discussions above are somewhat convoluted and mix up several aspects, imo:
-
I have come across this: https://www.researchobject.org/ro-crate/ Is anyone using it?
-
On Mon, Feb 14, 2022 at 10:30 AM Kevin Jablonka wrote:
Not aware of any ELN that uses it, but I know that @djeanner
<https://github.com/djeanner>, who's part of the FAIRSpec team, had some
thoughts about it. @BobHanson <https://github.com/BobHanson>
what were your thoughts about this when you looked at it prior to designing
FAIRSpec? I understand that there's currently not a direct compatibility,
what's the reason for that?
Looking at it again, it seems like an interesting starting point for the
topmost level about the really basic metadata.
—
Yes, Damien and I looked fairly closely at RO-Crate early on in our
project. I went to an RO-Crate monthly meeting. Here are my notes from
that. Consider the fact that this was two years ago, and I have not
followed them lately. In the end, we realized that our mission was not to
define a standard for data packaging, so we did not go any further with
this.
<notes by=hansonr date=Feb_2020>
- I got plenty of names and contacts -- two or three main people and 8 more
secondaries and a few others such as me (for example, from DataVerse).
- They are strictly JSON-LD now, but they have decided to scrap the Linked
Data idea and make that just JSON.
- They are aware of the domain-specific issue, but they don't have a
solution for it right now. There was quite a discussion about this, and in
two weeks the meeting will focus on domain-specific issues.
- I think they see something like "RO-Bagit" as well, but I don't think
that is actually defined.
- They talked about building automated deposit mechanisms for Zenodo and
DataCite. It's not clear to me that they understand how Zenodo works ---
for example, there was no discussion about DOIs, and that Zenodo
automatically creates the DOIs that are registered with DataCite.
- It's not 100% clear to me what the advantage of RO-Crate is. Basically it
is a set of metadata key/values that are registered with schema.org.
</notes>
About a year ago our group also discussed this RO-Crate presentation from
ELIXIR:
https://www.dropbox.com/s/wzh2v6knlpzzvq0/2021-02-25-ro-crate-fdo-FINAL.pptx?dl=0
That's all I know.
Bob
-
Here are the notes for our discussion today: https://docs.google.com/document/d/1drhkr54WT2HoPqXSkPCMw4boHr7BpOo3-yoMhWRitxo/edit?usp=sharing Some action points we discussed:
-
One point we had in the last discussion was to figure out which part of the pipeline we are interested in. For this we made a rough sketch here https://excalidraw.com/#room=b3e4efc0c6347e682439,wucc4hpQvI5KQVZYgNvdIg -- perhaps you can put your name on one of the arrows you would like to focus on.
-
Hello, all. I have been lurking here, but I think I might have something
interesting to contribute very soon to this discussion. I have appreciated
all the discussion re JSON-LD, RO-Crate, BagIt, and other container
formats. These are all topics our IUPAC project
<https://iupac.org/projects/project-details/?project_nr=2019-031-1-024>
task group has discussed over the past couple of years (sigh!). At this
particular moment I do not have time to contribute fully to this
discussion. (I need to get my bicycle out of the basement and meet some
friends for breakfast.)
However, I would like to announce the acceptance of our first major
contribution in this area --
The principles associated with this work are elaborated
upon in the Pure and Applied Chemistry article IUPAC Specification for the
FAIR Management of Spectroscopic Data in Chemistry (IUPAC FAIRSpec) -
Guiding Principles
<https://github.com/IUPAC/IUPAC-FAIRSpec/blob/main/documents/publications/2022.03.13%20PAC%20-%20FAIRSpec%20Guiding%20Principles%20-%20accepted.pdf>
(accepted Mar 13, 2022). Perhaps I can buy some time with that, as you read
it. :)
In addition, I am hard at work preparing for an ACS presentation on our
digital object model at the upcoming meeting in San Diego (aka Zoom) next
Monday. I am keenly interested in this discussion, as I believe we have
something for you. It is the IUPAC FAIRSpec Finding Aid and its associated
IUPAC FAIRSpec Digital Object Model and IUPAC FAIRSpec Metadata Model. At
least in super-alpha it is proving to be quite capable of capturing the
relationships among experiments, samples, spectral data, chemical
structures, and post-acquisition analyses. It's too sketchy to deliver
today, but we are starting to get pretty excited about it and would love to
share it with this crowd ASAP. I think you will find it intriguing and,
hopefully, quite useful and relatively easy to implement. I very much look
forward to getting your feedback on this, as the ELN point is the starting
point for everything we are talking about.
Sorry for the mystery and the tease. That is not my intent, just the way it
is at this particular moment today.
Bob Hanson
On Thu, Mar 17, 2022 at 3:33 AM Steffen Brinckmann wrote:
Thank you for your points of view: I just differ on the "they clearly
haven't solved the problem" of @sphuber <https://github.com/sphuber>.
Standardization works best if there are many identical objects which are
labeled differently. In science, we have labs where one researcher has built
up an experimental setup and computer codes that are internationally
unique. How do you want to standardize that?
RO-Crate therefore has an ad-hoc local context, which is great, as it allows
the above-mentioned researcher to contribute. But RO-Crate also opens the door
to labeling everything ad-hoc local, as that is convenient and follows the
"standard", possibly with empty definitions. It thereby makes itself an
unnecessary overhead.
All ELNs I know of are connected to a database. If that is the case, then one
can serialize that information and save it.
I will have to integrate ~20 years of research data from a medium research
group using an Oracle SQL dump (because of a proprietary database) in the
future. That 1990s data structure has no context, but it should still be
integrated in modern ELNs. I currently am of the opinion that I can
'clearly' integrate it. But again, points of view can differ.
--
Robert M. Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
If nature does not answer first what we want,
it is better to take what answer we get.
-- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900
*We stand on the homelands of the Wahpekute Band of the Dakota Nation. We
honor with gratitude the people who have stewarded the land throughout the
generations and their ongoing contributions to this region. We acknowledge
the ongoing injustices that we have committed against the Dakota Nation,
and we wish to interrupt this legacy, beginning with acts of healing and
honest storytelling about this place.*
-
Sure, @ptrxyz. So perhaps what we are working on is tangential, but we
think that standardizing the data exchange bit of it -- which seems to be
the point of this discussion -- even just among ELNs, has particular
interest to us. ELNs are where it all starts. We need ELNs to inform us on
the experiment/protocol/procedure/sample parts of the equation that will be
a key part of the standard. This is critical. So we hope that this group
will be early adopters of our recommendations and work with us closely to
finalize them.
Correct me if I'm wrong.
Bob
ps -- sorry, I don't know who ptrxyz is.
On Thu, Mar 17, 2022 at 8:29 AM ptrxyz wrote:
I tried to put my name there, for Chemotion speaking, we are mostly
working on communication/exchange formats with other ELNs and 3rd party
apps right now, as for us a repo is pretty much "included" already.
-
Frank, can you give us some links to actual RO-crate objects? I'd like to
learn more about what those are, particularly in this context. And to see
how hard it would be to fashion an IUPAC FAIRData Finding Aid
<https://docs.google.com/document/d/1PmqYur26JnnAytC4n4_zFwY1efggvL09YTU2zgF9Gkk/edit?usp=sharing>
around one.
Are they all specific protocols that just need "filling in the blanks"? Or
were they more free-form experimental procedures as is common in organic
chemistry?
How did you/would you handle accompanying spectroscopic analyses?
Bob Hanson
On Wed, Apr 27, 2022 at 9:39 AM Sebastiaan Huber wrote:
Thanks a lot for the comment @f-krueger <https://github.com/f-krueger>.
Very good to know that RO-Crate can indeed successfully be used for ELN
exports. I will have a look at the linked reference.
-
Very interesting, Frank. Can I ask some questions as to how this works? If
I have this right...
-- you captured the ELN sample and process metadata entirely in
ro-crate-metadata.json.
-- ro-crate-metadata.json then also ties these to specific digital
representations -- the images, primarily
-- the ro-crate JSON @graph array
   -- manifests the digital representations in the package ("@type": "File")
   -- adds a set of "Activities" related to the process
So, for example, for the image
Data/03_Zeitserie-nach-Stimulation_Kontrolle_screen.jpg:
{
  "@id": "Data/03_Zeitserie-nach-Stimulation_Kontrolle_screen.jpg",
  "@type": "File",
  "contentSize": 137776,
  "encodingFormat": "None",
  "foaf:name": {
    "@language": "en",
    "@value": "Data/03_Zeitserie-nach-Stimulation_Kontrolle_screen.jpg"
  },
  "https://schema.org/dateModified": {
    "@type": "xsd:dateTime",
    "@value": "2021-03-03T16:57:33"
  },
  "sha512": "d529...4380ba6bd738f757f203c"
},
along with some sort of (standardized? vendor-specific?) metadata:
{
  "@id": "Data/03_Zeitserie-nach-Stimulation_Kontrolle_screen.jpg_metadata.xml",
  "@type": "File",
  "contentSize": 752,
  "encodingFormat": "text/xml",
  "foaf:name": {
    "@language": "en",
    "@value": "Data/03_Zeitserie-nach-Stimulation_Kontrolle_screen.jpg_metadata.xml"
  },
  "https://schema.org/dateModified": {
    "@type": "xsd:dateTime",
    "@value": "2021-03-03T16:57:47"
  },
  "sha512": "9bda1870...e083bfed75e"
},
What I don't understand is how the metadata associates that file
Data/03_Zeitserie-nach-Stimulation_Kontrolle_screen.jpg with some sort of
context -- what it was from, where it fits into the procedure, etc.
Is that somewhere in this package?
Am I right that siegfried_output.json is just an automated analysis of the
files in order to get some idea of what their natures are and how that was
determined?
Bob Hanson
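One way to probe this question programmatically is to look for entities in the @graph array whose properties point at the file's @id. Below is a rough Python sketch, assuming references are written as {"@id": ...} objects, as is usual in JSON-LD. The sample graph, including the "#step-3" CreateAction entity and the short file names, is made up for illustration and is not taken from the crate discussed here:

```python
def referencing_entities(graph: list[dict], target_id: str) -> list[str]:
    """Return the @id of every entity with a property pointing at target_id."""

    def points_at(value) -> bool:
        # A reference looks like {"@id": "..."}; it may be nested or in a list.
        if isinstance(value, dict):
            if value.get("@id") == target_id:
                return True
            return any(points_at(v) for v in value.values())
        if isinstance(value, list):
            return any(points_at(v) for v in value)
        return False

    # An entity's own "@id" is a plain string, so it never counts as a reference.
    return [
        entity["@id"]
        for entity in graph
        if any(points_at(v) for v in entity.values())
    ]


# Hypothetical graph: one file linked to a process step, one file left unlinked.
graph = [
    {"@id": "Data/image.jpg", "@type": "File"},
    {"@id": "Data/orphan.xml", "@type": "File"},
    {"@id": "#step-3", "@type": "CreateAction", "result": {"@id": "Data/image.jpg"}},
]

print(referencing_entities(graph, "Data/image.jpg"))   # ['#step-3']
print(referencing_entities(graph, "Data/orphan.xml"))  # []
```

A file that comes back with an empty list is only manifested in the package, with no recorded link to the procedure.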
-
Frank,
Excellent! I enjoyed reading your paper. Our group had talked about
RO-Crate early on (2019-2020) but decided we could get back to that later
(now?) after first working out the detailed metadata object model that
would underpin the whole process.
This sounds almost precisely how my extractor works for ACS supporting
information data files in order to create IUPAC FAIRData Finding Aids.
Semi-automatic in my case, as each of the authors had their own
interpretation of what the guidelines meant (or to what extent they could
meet them), so it comes down to a small JSON template that is used to scan
the aggregation (our term for "a bunch of unidentified digital objects")
and turn those into an IUPAC FAIRData Collection. (This also required some
correction of mistakes in the way Bruker datasets were packaged by authors.)
It's a bit of a hack, but a necessary hack, since post-process extraction
is not the ultimate goal. Our goal is to work with ELNs to create the
finding aid on the fly. (Likewise, I am sure, for you.)
My reading of this is that if you can get the RO-Crate working from the
front end rather than the back end, one element that could be added
would be an IUPAC FAIRData Finding Aid
<https://docs.google.com/document/d/1PmqYur26JnnAytC4n4_zFwY1efggvL09YTU2zgF9Gkk/edit?usp=sharing>.
I would be interested to know whether you are open to that idea.
The finding aid accomplishes something possibly quite complementary to what
you have described. It is a highly object-oriented (meaning subclassible
and extendable) description of the contents of a collection, including
relationships -- collections, associations, multi-object analyses -- as
well as key discipline-specific "high-FAIR-value" metadata. It can be
customized to suit quite a wide context.
What we would say is that an RO-Crate is one particular *representation* of
a digital collection, and the rocrate.json digital object is essentially a
digital representation of a *finding aid*. (I think it is particularly
interesting that both of us have tapped the digital archival community for
ideas here. You for using Siegfried, and I for the finding aid idea.
Right?)
Q: Do you conceptualize what you are doing in object-oriented terms?
(classes, fields, methods, subclasses, instantiation and such?) If so, have
you found that helpful?
Bob Hanson
chair, IUPAC 2019-031-1-024
<https://iupac.org/projects/project-details/?project_nr=2019-031-1-024>
On Thu, Apr 28, 2022 at 1:11 AM Frank Krüger wrote:
Great questions, Bob. You are right about the siegfried_output.json. The
siegfried tool was used to determine some information about the files.
With respect to the actual data files, the users of the ELN attached all
files that originated from the experiment (incl. data files and metadata
files) to the ELN. Some files (mostly data files) are mentioned in the
textual description of the protocol, some not. We followed two objectives
here:
1.) Add all files to the ro-crate
2.) Describe as much as we can about the process and the resulting files
Whenever a file was mentioned in the protocol, we were able to link the
file to the corresponding step of the experimental procedure. When the file
was not mentioned in the protocol at all, we just included it in the
ro-crate, but did not link them to any activity.
The result is that we satisfy the minimal requirements of ro-crate in that
we list the content of the package along with some metadata, but we also
used the "higher level" features for the description of the provenance of
some files.
-
This https://github.com/TheELNConsortium/TheELNFileFormat It's pretty cool to be able to transfer entries from one ELN to another without losing data on the road. Example with eLabFTW: https://github.com/TheELNConsortium/TheELNFileFormat/tree/master/examples/elabftw
-
Initial motivation (see the attached first pitch that @ml-evs and I had for this workshop), also brought up by @deltablot in his talk.
Can we come up, following the example of Optimade, with a common API for ELN/LIMS/... ?
Objectives:
Non-objectives:
Initial questions:
I made a Doodle here https://doodle.com/poll/pnby5rwf6q29v9kw?utm_source=poll&utm_medium=link. If all those dates seem unreasonable, please suggest new ones.
For more informal discussion, hop on the eln-api channel on the conference Discord.
cc'ing @lpatiny, @helgestein, @nicolejung, @yakutovicha, @eibfl-dtu (couldn't find anyone from openbis here)
optimade_eln_abstract.pdf