Enhance sch:let to support functions, accumulators and keys #45

Open
rjelliffe opened this issue Aug 2, 2022 · 0 comments

This is a proposal to enhance sch:let so that it can be used for

  • function definition
  • xsl:key functionality
  • xsl:accumulator functionality

This current email only looks at two issues:

  • First, the difference between denotative languages like Schematron and functional languages like XSLT.
  • Second, a syntax for functions that would work in any XSLT implementation.

A desirable side-effect might be that, with viable "Schematron-y" ways to do functions etc, XSLT namespace elements could be disallowed in Schematron schemas (unless pulled in as an external library.)

Background

During the initial standardization, we discussed whether Schematron should have functions. Because it was not clear then that XSLT would be as common as XPath implementations or what was happening with data typing, we decided to keep things minimal, and to allow piggybacking constrained by the QLB.

Providing them was prudent, practical and generous for a new standard for a juvenile language, especially one aiming to be as minimal as possible. However, perhaps we should now be looking at how to evolve Schematron to have denotative support for them: it may be low-hanging fruit!

The context of this is the discussions on libraries and which XSLT elements should be allowed in the standard QLBs.

Denotative versus Functional Languages

  • XSLT is a functional language. So it is natural that XSLT developers will want to implement things in a functional kind of way: with functions. ISO Schematron allows some xslt:* elements to help, but I think it is reasonable that it keeps them to a minimum, otherwise it stops being a thing in its own right.

    • An example I saw of this was when I started working at a large organization that had hired a Java/XSLT developer to make its Schematron schemas. The developer was very smart (and I appreciated working with her) but had decided that all assertions should be either external Java function calls or XSLT functions that usually called Java. So rather than take advantage of XPaths, XPaths were in effect banned in assertions (and were simple in the template matches). Thinking that this would be easier to write and maintain, she ended up not with a bad Schematron schema but with a bad Java program.
    • Now actually in this case the underlying reason was that the code needed a lot of database calls; and the wrong call had been made architecturally, so that instead of a single coarse-grain call to the database at the start, storing into a variable, each function had to make its own fine-grain enquiry.
    • So this is an example of how what people think of as language-capability issues can very often be architecture issues: how you have gone about solving the problem. When you find you have to jump through hoops to do something, you need to check that there is not a better way.
  • However, Schematron is not a functional language. It is a denotative language.

    • See Peter Landin's The Next 700 Programming Languages (1966) https://dl.acm.org/doi/10.1145/365230.365257, which introduced the term "denotative language" and explained why it differs from the other LISPy approaches of "declarative" and "functional" programming.
    • A denotative language is concerned with "characterizing some ... system as a set of entities and functional relations between them." To do this, he makes heavy use of "where" clauses (and their named versions, "let" clauses).
      • For example, let's say we want to write some complex transform functionally: we may have a chain of functions, each one working on the output of the one before it, like a pipeline.

      • Or we may have a list of functions then apply them with some higher-order function (like map or fold).

      • We may adopt a discipline that we alternate between "mapping" or "reduction" functions, and so end up with a "map-reduce" system.

      • But if we want to write it denotatively then we think in terms of a sequence of scoped things (with names), what holds for them (type), what is the relationship from its sources (functional).

        • You could consider it a succession of views.
    • The advantage of having named objects (for documentation, incremental programming, program proving and efficiency) should be obvious.
      • Landin also said the advantage of denotative programming is that the represented entities can be optimized better: in XSLT systems this might be just lazy evaluation; in symbolic programming systems it can enable (in the practical or encouragement sense) re-arrangement of terms.
  • So in terms of Schematron, a functional programmer would be asking "how can I make functions that do useful things?" while a denotative programmer would be asking "how can I make variables that name useful things?"

    • Functional programming has a reputation of risking unmaintainability without extreme discipline: the more that generic functions are used, the less that the code gives a bird's-eye view of intent. Denotative programming requires that the developer think in terms of neat, well-described stages.
    • Indeed, isn't this exactly the same discipline that Schematron itself attempts for XML documents: to prevent some brilliant and elegant abstraction (e.g. grammars) from producing artifacts that are divorced from intent? So to figure out CALS tables, you would create a series of variables, each with whatever information is needed to make explicit some hidden structure:
      • For example, you might have one variable with the max number of columns, and another containing a list of each spanned and/or merged cell, which your code then looks up.
      • You might say that the idiom is extracting out-of-band markup and derived features, rather than successive functional transformations of a document.
  • So the more that a functional programmer attempts to do functional programming in a denotative language like Schematron, the more frustrated they will be that Schematron only provides such half-assed support.

    • Including libraries and so on becomes an issue: making Schematron more and more like XSLT becomes the supposed solution.
    • The issue of how to include foreign functional code (e.g. XSLT generated by REx) becomes the focus, rather than how to generate denotative Schematron variables (and e.g. abstract rules or patterns to get parameters if desired).
    • I note that more and more developers are using nested XPath for-each iterators. It is nice to have iterators; however, when the expression gets complicated it runs the risk of obscuring the intent. Refactoring to use either XPath variables or Schematron variables where possible can allow cake + eating.
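
As a minimal sketch of the denotative idiom described above for CALS tables, the rule below names a few derived facts with sch:let and then asserts against the names. It assumes the xslt2 query binding and the CALS element names tgroup, colspec, thead, tbody, tfoot, row and entry; the variable names are made up for illustration, and spans (namest/nameend, morerows) are ignored to keep it short.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>
  <sch:pattern id="cals-columns">
    <sch:rule context="tgroup">
      <!-- Named, derived facts about this table group (illustrative names) -->
      <sch:let name="declared-cols" value="xs:integer(@cols)"/>
      <sch:let name="colspec-count" value="count(colspec)"/>
      <!-- Largest number of entry children in any row; 0 if there are no rows -->
      <sch:let name="widest-row"
               value="max((0, (thead | tbody | tfoot)/row/count(entry)))"/>

      <sch:assert test="$colspec-count le $declared-cols">
        A table group should not declare more colspec elements than columns.
      </sch:assert>
      <sch:assert test="$widest-row le $declared-cols">
        No row should contain more entries than the declared number of columns.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Each assertion reads against a name rather than against a nested expression, so the intent stays visible even when the underlying XPath grows.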

Suggestion: parameterized variables = functions

Should Schematron variables be improved to take parameters of some type (these being implemented by constructing XPath or XSLT functions)? Perhaps all we need is that a Schematron variable can be bound to an XPath (or XSLT) function. This would allow any function that can be defined entirely with XPaths. (XPath3 allows variables, comments, etc., so it is not so bad.)
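
One reading of "bound to an XPath function" can be sketched with an XPath 3 inline function expression as the value of an ordinary sch:let. This assumes an implementation that accepts an XPath 3 query binding (written here as xslt3) and passes the value through unchanged to the underlying processor; the moneyList and sum-children names are borrowed from the example further down, and ctx is an illustrative parameter name.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt3">
  <sch:pattern>
    <sch:rule context="moneyList">
      <!-- The variable holds a function item, not a node or atomic value -->
      <sch:let name="sum-children"
               value="function($ctx as element()) { sum($ctx/*) }"/>

      <!-- Dynamic function call through the variable -->
      <sch:assert test="$sum-children(.) > 0">
        The children of a money list should sum to more than 0.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

This route is unavailable to XSLT 1 and XSLT 2 bindings, which is one reason to consider a syntax that can be compiled to a named function instead.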

Formal parameters are provided with sch:param child elements. This allows the let to act as a function declaration. (It might be that all the functions have a built-in parameter to get the context as their first parameter, defaulted. That might allow some simplification.)

<sch:rule context="moneyList" id="r1">
    <sch:let name="sum-children" value="$myContext/sum(*)" as="function(*)">
        <sch:param name="myContext"/>
    </sch:let>

    <sch:assert test="r1:sum-children(.) > 0">
        A money list's subelement should be greater than 0
    </sch:assert>
</sch:rule>

N.B. The idea of assigning a function to a variable may be strange, but it is supported in e.g. LISP, C, Java, JavaScript etc. However, in XSLT it would be implemented as an xsl:function with a prefix added if needed. (I.e. it would probably not be implemented as an xsl:variable bound to an XPath3 inline function declaration.)
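
A rough sketch of what an XSLT 2 implementation might generate for the sum-children declaration follows. The namespace URI bound to the r1 prefix is purely hypothetical (the proposal does not say how the prefix would be derived from the rule id), and the generated template is reduced to a bare test with no reporting.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:r1="urn:example:schematron:rule:r1"
                exclude-result-prefixes="r1">

  <!-- Generated from sch:let name="sum-children" as="function(*)" -->
  <xsl:function name="r1:sum-children">
    <!-- Generated from the sch:param child -->
    <xsl:param name="myContext"/>
    <xsl:sequence select="$myContext/sum(*)"/>
  </xsl:function>

  <!-- Generated from the rule; the assertion failure is emitted as plain text -->
  <xsl:template match="moneyList">
    <xsl:if test="not(r1:sum-children(.) > 0)">
      <xsl:text>A money list's subelement should be greater than 0</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Because xsl:function requires a namespaced name, some prefix has to be supplied by the implementation, which matches the "prefix added if needed" remark.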

That it is a function would be known by the sch:let/@as statement. The QLBs for XSLT1 and XSLT2 would be revised to allow sch:let/@as but only for "function(*)" because they don't need to (or cannot) support XSD typing.

Misc

Should Schematron variables be improved to subsume XSLT3 accumulators and xslt:key? (Can such things be implemented in XSLT 1 and 2 with extra modes?)
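
For comparison, this is roughly what the xsl:key piggybacking that a sch:let-based mechanism would replace looks like today. The item and itemref element names and the key name are illustrative assumptions. The proposal does not define how a sch:let would express the match and use parts, so no replacement syntax is shown.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            queryBinding="xslt">

  <!-- XSLT-namespace element placed before the patterns -->
  <xsl:key name="item-by-id" match="item" use="@id"/>

  <sch:pattern>
    <sch:rule context="itemref">
      <!-- key() returns the matching item elements; an empty result is false -->
      <sch:assert test="key('item-by-id', @ref)">
        An itemref should point to an existing item.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```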

You can see the direction here: do away with XSLT elements in Schematron.

P.S. Interesting to see, in that paper from 56 years ago, various people discussing whether using indentation rather than delimiters is good for programming languages...
P.P.S. Schematron's element is named "let" not "variable" as a reference to that paper, and so that it could be developed independently of xslt:variable. The other thing that I took from the paper (which we were taught in the Computer Languages course at Uni) is the idea that we want to think in terms of families of languages: so in Landin's terminology Schematron is the family, and its XML syntax and Query Language Binding specify the member.
