Malte Brunnlieb edited this page May 3, 2018 · 23 revisions

Text Merger Plug-in

The Text Merger Plug-in merges generated free-text documents into existing free-text documents. Because free text offers no structure to rely on, the algorithms are very rudimentary.

Merger extensions

There are currently three main merge strategies that apply for the whole document:

  • merge strategy textmerge (appends the text directly to the end of the existing document) Remark: If no anchors are defined, this will simply append the patch.

  • merge strategy textmerge_appendWithNewLine (appends the text after adding a new line break to the existing document) Remark: Since v1.0.1, an empty patch no longer results in a new line being appended. Remark: Only applicable if no anchors are defined.

  • merge strategy textmerge_override (replaces the contents of the existing file with the patch) Remark: If anchors are defined, override is set as the default mergestrategy for every text block if not redefined in an anchor specification.
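The three document-level strategies can be pictured with a minimal Python sketch. This is not the plug-in's implementation: the function name merge_whole_document is hypothetical, and "\n" is assumed as the line break.

```python
def merge_whole_document(existing: str, patch: str, strategy: str) -> str:
    """Illustrative only: apply a document-level strategy when no anchors are defined."""
    if strategy == "textmerge":
        # Append the patch directly to the end of the existing document.
        return existing + patch
    if strategy == "textmerge_appendWithNewLine":
        # An empty patch adds nothing, not even the line break (behaviour since v1.0.1).
        if not patch:
            return existing
        return existing + "\n" + patch  # "\n" is an assumed line separator
    if strategy == "textmerge_override":
        # Replace the contents of the existing file with the patch.
        return patch
    raise ValueError(f"unknown merge strategy: {strategy}")
```

For example, merge_whole_document("a", "b", "textmerge_appendWithNewLine") yields "a\nb", while the same call with an empty patch returns "a" unchanged.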

The following is Work In Progress

Anchor functionality

If a template contains text that matches the form anchor:${documentpart}:${mergestrategy}:anchorend, or more specifically the regular expression \n.* anchor:.* :.* :anchorend\n, additional functionality becomes available for controlling how specific parts of the incoming text are merged with the existing text. An anchor always applies to the text that follows it, up to the next anchor; text before it is ignored.

If no anchors are defined, the complete patch will be appended depending on your choice for the template in the templates.xml.
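As a rough illustration of how a text could be cut into anchored blocks, the following Python sketch scans for anchor lines and collects the text up to the next anchor. The pattern ANCHOR_LINE and the function split_by_anchors are assumptions made for this example and are not taken from the plug-in. The sketch only handles the full form with both fields present, and it assumes that text before the first anchor is dropped.

```python
import re

# Assumed pattern: the anchor sits on its own line, optionally after a comment opener.
ANCHOR_LINE = re.compile(
    r"^.*anchor:([^:\n]*):([^:\n]*):anchorend[ \t]*$", re.MULTILINE
)

def split_by_anchors(text: str) -> list[tuple[str, str, str]]:
    """Return (documentpart, mergestrategy, body) for every anchor found in text."""
    matches = list(ANCHOR_LINE.finditer(text))
    blocks = []
    for i, match in enumerate(matches):
        # The body runs from the end of this anchor line to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip("\n")
        blocks.append((match.group(1), match.group(2), body))
    return blocks

sample = (
    "ignored\n"
    "// anchor:header::anchorend\n"
    "Title\n"
    "// anchor:table:replace:anchorend\n"
    "row 1\n"
    "row 2\n"
)
blocks = split_by_anchors(sample)
```

With the sample above, blocks contains one entry for header (empty mergestrategy, body "Title") and one for table (mergestrategy replace, body "row 1\nrow 2"). A text without anchors yields an empty list, which corresponds to the case where the whole patch is handled by the document-level strategy.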

Anchor Definition

Anchors should always be defined as a comment in the language the template generates, because you do not want them to appear in the readable output. They cannot be defined as comments in the FreeMarker template itself, or the merger will not know about them. The merger does not check whether an anchor is inside a comment: it can merge many types of text-based languages, which makes it practically impossible to filter for the correct comment syntax. That is why anchors always have to be surrounded by line breaks. This gives a universal way to distinguish anchors that should have anchor functionality from ones that should appear in the text. Remark: If the resulting language has closing tags for comments, they have to appear on the next line.

Documentparts

In general, ${documentpart} is an id that marks a part of the document. It tells the merger which parts of the existing text to merge with which parts of the patch (e.g. if the existing text contains anchor:table:${}:anchorend, that part will be merged with the part of the patch tagged anchor:table:${}:anchorend).

Defining the same documentpart multiple times can lead to errors, so instead of defining table twice, use table1, table2, table3, etc.

If a ${documentpart} is defined in the document but not in the patch, and both are in the same position, the existing documentpart comes first. For example, if the document defines only the documentparts header, test and footer, in that order, and the patch contains header, order and footer, the resulting order will be header, test, order, footer.
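The ordering example can be reproduced with a short Python sketch. The rule used here (insert each new patch part just before the next patch part that already exists in the document) is only one rule that gives the order described above; the plug-in may resolve positions differently, and merge_part_order is a hypothetical name.

```python
def merge_part_order(existing: list[str], patch: list[str]) -> list[str]:
    """Illustrative only: combine documentpart orders, existing parts first."""
    result = list(existing)
    for i, part in enumerate(patch):
        if part in result:
            continue
        # Find the next patch part that is already placed and insert before it,
        # so parts that exist only in the document keep their earlier position.
        following = next((p for p in patch[i + 1:] if p in result), None)
        index = result.index(following) if following is not None else len(result)
        result.insert(index, part)
    return result

order = merge_part_order(["header", "test", "footer"], ["header", "order", "footer"])
```

Here order is header, test, order, footer, matching the example in the text.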

The following documentparts have default functionality

  1. anchor:header:${mergestrategy}:anchorend marks the beginning of a header, that will be added once when the document is created, but not again. Remark: This is only done once, if you have header in another anchor, it will be treated as default text

  2. anchor:footer:${mergestrategy}:anchorend marks the beginning of a footer, that will be added once when the document is created, but not again. Once this is invoked, all following text will be included in the footer, including other anchors.

  3. anchor:default::anchorend is a documentpart that will be placed in the beginning of the document if anchors have been defined but there is none at the beginning of the document.

  4. anchor:${mergestrategy}:anchorend (${documentpart} left out) means that the following text is default text that will simply be appended in the place it is defined.

Mergestrategies

Mergestrategies are only relevant in the patch, as the merger is only interested in how text in the patch should be managed, not how it was managed in the past.

  1. anchor:${documentpart}::anchorend or anything in place of ${mergestrategy} means that the text should only be appended.

  2. anchor:${documentpart}:${}newline:anchorend or anchor:${documentpart}:newline${}:anchorend states that a new line should be appended before or after this anchor's text, depending on where the newline is (before or after the mergestrategy). anchor:${documentpart}:newline:anchorend puts a new line after the anchor's text. Remark: Only works with appending strategies, not merging/replacing ones.

  3. anchor:${documentpart}:replace:anchorend means that the new text of this documentpart will replace the existing one completely.

  4. anchor:${documentpart}:appendbefore:anchorend or anchor:${documentpart}:appendafter:anchorend specifies whether the text of the patch should come before the existing text or after. Default is appendafter, as specified in 1.

  5. anchor:${documentpart}:anchorend will simply append the text, disregarding the documentpart, as the merger will read it as anchor::${documentpart}:anchorend instead of anchor:${documentpart}::anchorend.
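The strategies listed above can be pictured with the following Python sketch for a single documentpart. Several points are assumptions rather than documented behaviour: the newline marker is taken to be written directly next to the strategy name (for example newlineappendafter), the line break is placed before or after the patch text, unrecognized strategies fall back to appending, and apply_merge_strategy is a hypothetical name.

```python
def apply_merge_strategy(existing: str, patch: str, mergestrategy: str) -> str:
    """Illustrative only: merge one documentpart's patch text into its existing text."""
    core = mergestrategy
    newline_before = newline_after = False
    if core == "newline":
        # A bare "newline" puts the line break after the anchor's text.
        core, newline_after = "", True
    elif core.startswith("newline"):
        core, newline_before = core[len("newline"):], True
    elif core.endswith("newline"):
        core, newline_after = core[:-len("newline")], True

    if core == "replace":
        # Newline markers only apply to appending strategies, so they are ignored here.
        return patch
    if newline_before:
        patch = "\n" + patch
    if newline_after:
        patch = patch + "\n"
    if core == "appendbefore":
        return patch + existing
    # Empty strategy, "appendafter", and (by assumption) anything else: append.
    return existing + patch
```

For instance, apply_merge_strategy("old", "new", "appendbefore") gives "newold", and the strategy newlineappendafter gives "old\nnew" under the assumptions stated above.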
