-
I am so excited to be invited to this meeting and am making notes here before my presentation on 2022-02-07, "The Language of Materials". For background: I'm a crystallographer (and have been on the CIF project for roughly 30 years!) and, with Henry Rzepa, have developed Chemical Markup Language, which can support many concepts in materials. Also working with Nico Adams on materials ontology and polymers. However, I've mainly been working in bioscience and "came back" last year through Matthew Evans and Matthew Dunstan, and through being invited to OMDI2021 on materials ontology. I'm impressed with the advances in the communal approach to ontologies and the desire for unifying infrastructure, and in these notes I'll share some experiences. These are not prescriptive; I hope they are useful guidelines. I will be supremely ignorant of recent developments and look forward to being updated. I believe we need a language-based approach to semantic science and I'll explain why.
-
I believe we need a language for scientific knowledge. Like other languages this can be built on linguistic rules that we already understand and should reflect the way we already talk about science. There's a rough hierarchy (easy->hard) of:
It should be possible to explore 1, 2 (maybe 3) for materials.
-
Requirements for a successful language

(Gathered from several disciplines over many years.) The formal languages that have taken off are generally based on current practice at the time and supported by a critical mass of community. CIF is a pre-eminent example: it has involved the whole community, is not unbalanced or undermined by commercial players, and has always had useful working implementations. The IETF motto "Rough Consensus and Running Code" is a valuable guide. The principles (not the syntax) of XML may be useful; don't get hung up on XML itself:
Most of these (1996) have stood the test of time. I stressed the importance of easy implementation and suggested we have running editors and parsers (and these were created by Tim Bray, Jim Clark and David Megginson). Some language features were dropped because they were too complex, too fuzzy or otherwise unimplementable. (In contrast, XML Schema created an overly complex system based on abstract design when simpler methods would have been preferable.) So running code and examples are critical. (Please prove me wrong on this para!) I believe that the creation of Open, valuable, unique data will be an important part of developing the new knowledge. I think and hope that this can be done by extraction from existing publications (more later).
-
Nouns and identifiers

Nouns are the building blocks of languages and they seem hardwired into our thinking. There are tools for extracting parts of speech from natural language, and words or phrases can be identified as nouns even if their meaning is unknown. But there is ambiguity in human speech and writing ("lead" is a metal and a verb; "tin" is a metal and a generic container, etc.), so we need an identifier system: "Tins (https://www.wikidata.org/wiki/Q2846150) are no longer made of tin (https://www.wikidata.org/wiki/Q1096)". There is also synonymy ("tin" and "can" are synonyms). By adding unique, permanent, maintained identifiers we can avoid much of the ambiguity. Over the last 10 years we've seen the rapid rise of Wikidata (wikidata.org), and my judgment is that it will become the universal identifier system for the Open web. (That's why we recently requested that Crystallography Open Database identifiers be given a property: https://www.wikidata.org/wiki/Property:P9824.) One of the most cost-effective actions for MADICES would be to get materials, and their properties, into Wikidata.
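To make the "tin" example concrete, here is a minimal sketch of identifier-based disambiguation. The lookup table, the sense labels ("metal", "container") and the function names are hypothetical and only for illustration; the two Wikidata identifiers are the ones quoted above. A real system would query Wikidata rather than use a hard-coded table.

```python
# Minimal sketch: resolve ambiguous nouns to Wikidata identifiers.
# SENSES, the sense labels and the function names are illustrative
# assumptions; only the two QIDs come from the text above.

WIKIDATA_PREFIX = "https://www.wikidata.org/wiki/"

# (surface form, intended sense) -> Wikidata identifier
SENSES = {
    ("tin", "metal"): "Q1096",
    ("tin", "container"): "Q2846150",
    ("can", "container"): "Q2846150",  # synonym of "tin" in the container sense
}


def identify(word, sense):
    """Return the Wikidata URL for a word in a given sense, or None if unknown."""
    qid = SENSES.get((word.lower(), sense))
    return None if qid is None else WIKIDATA_PREFIX + qid


def same_concept(word_a, sense_a, word_b, sense_b):
    """True when both (word, sense) pairs resolve to the same known identifier."""
    id_a = identify(word_a, sense_a)
    id_b = identify(word_b, sense_b)
    return id_a is not None and id_a == id_b


if __name__ == "__main__":
    print(identify("tin", "metal"))
    print(identify("tin", "container"))
    print(same_concept("tin", "container", "can", "container"))
```

The same word resolves to different identifiers depending on sense (ambiguity), while different words can resolve to the same identifier (synonymy); once text carries the identifier, neither problem remains.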
-
Here's a diagram with a number of nouns (we are extracting the data - more later).
-
Discussion for Peter Murray-Rust's talk on "The Language of Materials" on the first day