Hacking problems #33
Replies: 7 comments 27 replies
-
This might also be a useful resource: https://github.com/linkml/linkml-model-template
-
As discussed on Discord, we could reach out to the developers of web-based XAS tools such as https://times-webxrs.stanford.edu/ to see whether we can wrap their code for interoperability.
-
Another vague idea:
-
FYI we will have a "hacking room" on Gather (see emails/Discord for links) that will run throughout the week.
-
Hacking idea from the last breakout (typing on my phone, but I can tidy this up if there is interest). We discussed how parsing unstructured logs/output files is still very common. Everyone writes their own parser that maps to their own data model. Usually these parsers are bundled into larger analysis packages (e.g. pymatgen, ASE, AiiDA in computational materials science) and rewritten in multiple languages (e.g. the modular cheminfo parsers in JS). Do we think a simple registry/framework for code objects that operate on files and return structured data would be a useful investment? For example, a Docker image per parser with a unified interface that also emits a schema for the parsed data? Do such things already exist? Does this go any way towards tackling the scalability of our current ecosystem, or does it just create more laborious work? Such a registry could motivate the original creators of the raw files, like instrument manufacturers and code authors, to develop parsers themselves. The parsers could then be employed across multiple ELN/repository services and used for ETL in perhaps a more scalable way than is currently available. Given the wealth of existing parsers, it would be easy to test this out quite quickly, and there is potential for nice integration with many existing services present at the workshop.
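To make the registry idea a bit more concrete, here is a minimal Python sketch, not an existing framework. Every name in it (`REGISTRY`, `register`, `run_parser`, the `toy-energy-log` format and its schema) is hypothetical and invented for illustration. It only shows the shape of a unified interface: each parser takes raw text and returns structured data, and the registry hands back a JSON-Schema-style description alongside the parsed result. A Docker image per parser would wrap the same contract behind a process boundary.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ParserEntry:
    """One registered parser plus the schema describing its output."""
    name: str
    parse: Callable[[str], Dict[str, Any]]
    schema: Dict[str, Any]


# Hypothetical in-memory registry; a real one might be a package index or a web service.
REGISTRY: Dict[str, ParserEntry] = {}


def register(name: str, schema: Dict[str, Any]):
    """Decorator that adds a parser function to the registry under `name`."""
    def decorator(func: Callable[[str], Dict[str, Any]]):
        REGISTRY[name] = ParserEntry(name, func, schema)
        return func
    return decorator


# JSON-Schema-style description of the toy parser's output (illustrative only).
ENERGY_SCHEMA = {
    "type": "object",
    "properties": {
        "total_energy_ev": {"type": "number"},
        "converged": {"type": "boolean"},
    },
    "required": ["total_energy_ev", "converged"],
}


@register("toy-energy-log", ENERGY_SCHEMA)
def parse_toy_energy_log(text: str) -> Dict[str, Any]:
    """Parse a made-up log format; real parsers would wrap existing code."""
    energy = None
    converged = False
    for line in text.splitlines():
        if line.startswith("TOTAL ENERGY"):
            # Take the first token after "=", e.g. "-12.5" from "= -12.5 eV".
            energy = float(line.split("=")[1].split()[0])
        elif line.strip() == "SCF CONVERGED":
            converged = True
    if energy is None:
        raise ValueError("no TOTAL ENERGY line found")
    return {"total_energy_ev": energy, "converged": converged}


def run_parser(name: str, text: str) -> Dict[str, Any]:
    """Unified entry point: return the parsed data together with its schema."""
    entry = REGISTRY[name]
    return {"parser": entry.name, "schema": entry.schema, "data": entry.parse(text)}


result = run_parser("toy-energy-log", "step 1\nTOTAL ENERGY = -12.5 eV\nSCF CONVERGED\n")
print(result["data"])
```

An ELN or repository service would only need to know `run_parser` and the schema format, not the internals of any individual parser.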
-
10+ years ago I wrote JUMBO-Converters as a declarative approach to parsing. I was delighted to see that Carlo had extended this. Declarative parsers can be implemented with a small amount of code and maintained by people with marginal programming skills. I'd be happy to chat about this.
-
This is all wonderful. One of the key features of JUMBO-Converters is that it (hopefully) chunks the output hierarchically into nested objects. (This can be hard with procedural code: you have to "read-to-next-section".) J-C also finds all the instances of sub-objects (e.g. matrices) without you having to anticipate them.
First, this makes me very happy! I will wear my JUMBO t-shirt today. J-C is a very good way of identifying the sub-objects that could/should be in data dictionaries and ontologies.
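For readers unfamiliar with declarative parsing, the following Python sketch illustrates the general idea only. It is not JUMBO-Converters' actual template syntax or implementation; the `TEMPLATE` layout, the `parse_with_template` function, and the toy log format are all assumptions made up for this example. The format is described as data (a few regular expressions), and one small generic interpreter chunks the text into sections and collects every matrix-like block it encounters inside each section, without the template saying how many there will be.

```python
import re
from typing import Any, Dict, List

# Declarative description of a made-up log format: only patterns, no control flow.
TEMPLATE = {
    "section_start": r"^== (?P<title>.+) ==$",
    "fields": {
        "energy": r"^energy\s*=\s*(?P<value>-?\d+(?:\.\d+)?)$",
    },
    # A matrix row is any line made of two or more whitespace-separated numbers.
    "matrix_row": r"^\s*-?\d+(?:\.\d+)?(?:\s+-?\d+(?:\.\d+)?)+\s*$",
}


def parse_with_template(text: str, template: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Generic interpreter: chunk text into sections with nested matrices."""
    start = re.compile(template["section_start"])
    fields = {name: re.compile(p) for name, p in template["fields"].items()}
    row = re.compile(template["matrix_row"])

    sections: List[Dict[str, Any]] = []
    current = None  # section being filled
    matrix = None   # matrix being filled, if the previous line was a row

    for line in text.splitlines():
        match = start.match(line)
        if match:
            # A new section header closes whatever was open before it.
            current = {"title": match.group("title"), "fields": {}, "matrices": []}
            sections.append(current)
            matrix = None
            continue
        if current is None:
            continue  # ignore preamble before the first section
        if row.match(line):
            if matrix is None:
                # First row of a new matrix: attach it to the current section.
                matrix = []
                current["matrices"].append(matrix)
            matrix.append([float(x) for x in line.split()])
            continue
        matrix = None  # any non-row line ends the current matrix
        for name, pattern in fields.items():
            field_match = pattern.match(line)
            if field_match:
                current["fields"][name] = float(field_match.group("value"))
    return sections


LOG = """== step 1 ==
energy = -1.5
1 0
0 1
== step 2 ==
energy = -2.0
"""

parsed = parse_with_template(LOG, TEMPLATE)
print(parsed)
```

Supporting a new output format in this style means editing `TEMPLATE` rather than the interpreter, which is the property that lets non-programmers maintain such parsers.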
-
Motivation: create a place where we can start writing code now, so that something can last beyond the workshop.