Standard Built-In Macro Argument Splitting

Introduction

Some Built-in macros use special syntax. Others split the input into parts and use the parts to work with. The splitting is standardized and is implemented by Jamal to support these macros. This way these macros use the same input splitting algorithm.

In other words, the algorithm described in this document is how many built-in macros get their arguments. For example the macro if can be used as follows:

Jamal source

{@if /true/then/else}

which will result in the output

output

then

In this case the input /true/then/else is split into three parts true, then and else. The separation is done using the first character following the macro. This is the most common and simplest way to separate the arguments, but there is more to it.

The Algorithm

The algorithm is defined in the following way:

If the first non-white space character is a letter or digit, then the input is split along the white space characters. Multiple adjacent white space characters are counted as one. The splitting does not create empty parameters.
Jamal source
```
..{@if true then else}..
```
output
```
..then..
```
Note
No spaces around the output then.
If the first non-white space character is backtick (`), then the parsing expects a regular expression between backticks. After the regular expression and after the closing backtick, the rest of the input is split up using the regular expression as separator. Backticks inside the regular expression have to be doubled.
Jamal source
```
{@if `\s*THEN\s*|\s*ELSE\s*` true THEN then ELSE else}
```
output
```
then
```
If the first non-whitespace character is not a letter or digit or backtick (`), then the first non-whitespace character is the separator.
Jamal source
```
..{@if /true/  then  /else}..
```
output
```
..  then  ..
```
Note
There are spaces around the output then.
If the macro $REGEX is defined (no arguments) then its content will be used as regular expression to split the input.
Jamal source
```
{@define $REGEX=\s*THEN\s*|\s*ELSE\s*}
{@if  true THEN then ELSE else}
```
output
```
then
```

Some macros may decide to use the splitting in a way that ignores the last rule. Unless it is documented so, you should assume that these four rules apply.

Java Background

Note	You can ignore this part if you are not interested in the Java details. This section is for those who want to implement their own macros in Kotlin or Java.

Built-in macros are implemented in Java or Kotlin. Jamal itself is implemented in Java. When you develop macros in Kotlin, you can call the Java methods, or you can use the Jamal Kotlin library. This library provides a lightweight facáde to the Java implementation enabling Kotlin style development.

Each macro is a class implementing the interface Macro. The interface has the method named evaluate(). This method is invoked to process the input and result in the output.

Technically the input parameter of the method evaluate() is not a string. It is an instance of some class implementing the interface Input. The interface input extends the Java interface CharSequence. That way, input behaves like a string.

The Java code of the macro is free to interpret this string the way it wants. Different macros implement their syntax analysis differently.

To manage the input and ease the syntax analysis and interpretation of the input, there is a utility class named InputHandler. This class defines a method named getParts() which does a simple analysis. It splits the input into an array of strings in a "standard" way, as described above.

This method is used, for example, by the implementation of the if built-in macro. I recommend using this method when there is no special requirement for a macro. Using this method provides a concise way for macro argument separation.

The method getParts() is overloaded. It has four versions:

getParts(Input input, Processor processor) — This is the most general version. It splits the input to as many parts as possible according to the rules described above. It also uses the last rule looking up the macro $REGEX.
getParts(Input input, Processor processor, int limit) — This version splits the input as many partds as spossible, but not more than limit parts. That way, the last part may contain separators.
getParts(Input input) — Same as the one with the Processor argument, but it does not care the $REGEX macro.
getParts(Input input, int limit) — Same as the one with the Processor argument, but it does not care the $REGEX macro.

It is recommended to use one of the versions that do use the $REGEX macro.

Some of the built-in macros use this method to split the value of some parops. For example, the macros replace, and replaceLine use the method to split the value of the parop replace into parts. They enclose the string value of the parop in an Input object before calling getParts().

For the Kotlin support methods have look at the Kotlin file

../jamal-kotlin/src/main/kotlin/javax0/jamal/kotlin/InputSupport.kt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ARGSPLIT.adoc

ARGSPLIT.adoc

Standard Built-In Macro Argument Splitting

Introduction

The Algorithm

Java Background

Files

ARGSPLIT.adoc

Latest commit

History

ARGSPLIT.adoc

File metadata and controls

Standard Built-In Macro Argument Splitting

Introduction

The Algorithm

Java Background