Preamble

SEP: 0011
Title: Txrep: human-readable low-level representation of Stellar transactions
Author: David Mazières
Status: Active
Created: 2018-08-31

Simple Summary

Txrep is a human-readable representation of Stellar transactions that functions like an assembly language for XDR.

Abstract

This document specifies txrep, a human-readable format for Stellar transactions. Txrep is unambiguous and machine-parsable. Binary XDR transactions can be decompiled into txrep format and recompiled to the exact same binary bytes. Txrep is designed to be parsed directly into data structures generated by XDR compilers, ideally by the very XDR code generated by those compilers.

Motivation

Bug reports, test vectors, semantic specifications in CAP documents, and Stellar documentation all need a way to talk concretely about the contents of transactions. Without a canonical textual format, different documents will each devise their own way of describing transactions, sometimes ambiguously, making it hard to relate information from multiple sources.

Furthermore, advanced Stellar users and developers need a way to craft arbitrary transactions, including potentially invalid ones for test cases. This functionality is currently available from the Stellar Laboratory transaction builder, but that is not a good solution for high security, as it requires a browser (and in practice requires trusting an HTTPS certificate). Moreover, the transactions one builds in a web browser cannot easily be audited before signing, or described in documentation, or scripted, or placed under version control.

Finally, "dumb" ink-on-paper contracts may need to specify Stellar transactions unambiguously. For example, a legal contract promising to deliver tokens with a lock-up period in exchange for a wire transfer may want to specify the exact lock-up mechanism used through a human readable description of the Stellar transactions involved. Using txrep, this description can be unambiguous.

Specification

A txrep file consists of a number of lines, each describing the value of a field. We describe the line format using a simple BNF-like notation in which brackets ([...]) indicate optional contents, pipe (|) indicates alternatives, and an asterisk (*) indicates zero or more repetitions of the previous symbol. Literal text (e.g., :, ., [, ], ._present, .len, or _) indicates the occurrence of those specific characters in the txrep source.

Each line of txrep has the following format:

line = field : SP* value [comment] LF | : comment LF | LF

comment = zero or more characters other than LF

SP = space (ASCII 32)

LF = newline (ASCII 10)

Blank lines are allowed, and any line starting with a colon (:) is a full-line comment. The field and value formats are described below.
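The line grammar above can be illustrated with a minimal Python sketch. The function name `split_line` is hypothetical and not part of this specification. Because a field never contains a colon, splitting at the first colon is enough to isolate it. The sketch deliberately returns the value together with any trailing comment, since telling the two apart depends on the XDR type of the field.

```python
from typing import Optional, Tuple


def split_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one txrep line into (field, rest).

    Returns None for blank lines and full-line comments. `rest` is the
    value followed by any trailing comment; separating those two needs
    the field's XDR type, so that step is left to the caller.
    """
    line = line.rstrip("\n")
    if line == "" or line.startswith(":"):
        return None
    # Fields consist of tags, dots, and bracketed integers, so the first
    # colon always ends the field (values such as assets may contain more).
    field, sep, rest = line.partition(":")
    if not sep:
        raise ValueError(f"missing ':' in line: {line!r}")
    return field, rest.lstrip(" ")
```

For example, the line `tx.fee: 100` yields the pair `("tx.fee", "100")`, while a line starting with a colon yields `None`.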

Fields

A field has the following syntax:

field = tag selector* [pseudoselector]

selector = . tag | [ integer ]

tag = letter wordchar*

pseudoselector = . _inner* _present | .len

integer = A decimal integer

letter = Any letter

wordchar = Any letter, any digit, or _

Disregarding pseudoselector, each field names a field in an XDR TransactionEnvelope data structure. The name is generated by joining XDR field names with a period (.), similar to the syntax used for accessing nested fields in C++ or Go representations of XDR data structures. Array elements (for both fixed- and variable-length arrays) are indexed using square brackets. As in C, arrays are 0-based. Pseudoselectors of the form _inner_present (or _inner_inner_present, etc.) are reserved for fields of type pointer to pointer. Such types do not appear in the current version of TransactionEnvelope, but might in the future, or txrep might be applied to other XDR data types that contain such nested pointers.

As an example, the field tx.timeBounds.minTime names the minTime field in the tx field of the TransactionEnvelope structure, which looks like this:

typedef uint64 TimePoint;
struct TimeBounds {
    TimePoint minTime;
    TimePoint maxTime; // 0 here means no maxTime
};
struct Transaction {
    /* ... */
    TimeBounds* timeBounds;
    Operation operations<100>;
    /* ... */
};
struct TransactionEnvelope {
    Transaction tx;
    DecoratedSignature signatures<20>;
};
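As a rough illustration of the field grammar, the following Python sketch breaks a field name into its path components. The name `parse_field` is hypothetical, and the sketch assumes "letter" means an ASCII letter, which the grammar does not state. A `.len` pseudoselector is returned as the plain string `"len"`, because it is syntactically indistinguishable from a tag.

```python
import re
from typing import List, Union

# Leading tag: a letter followed by word characters (ASCII assumed).
_TAG = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# A selector is ".tag", a "._present"-style pseudoselector, or "[integer]".
_SELECTOR = re.compile(
    r"\.(?P<name>[A-Za-z][A-Za-z0-9_]*|(?:_inner)*_present)"
    r"|\[(?P<index>[0-9]+)\]"
)


def parse_field(field: str) -> List[Union[str, int]]:
    """Return the components of a txrep field; array indices become ints."""
    m = _TAG.match(field)
    if not m:
        raise ValueError(f"field must start with a tag: {field!r}")
    parts: List[Union[str, int]] = [m.group(0)]
    pos = m.end()
    while pos < len(field):
        m = _SELECTOR.match(field, pos)
        if not m:
            raise ValueError(f"bad selector at offset {pos} in {field!r}")
        if m.group("index") is not None:
            parts.append(int(m.group("index")))
        else:
            parts.append(m.group("name"))
        pos = m.end()
    # Tags cannot start with "_", so such a component is a pseudoselector,
    # and a pseudoselector may only appear at the end of the field.
    for part in parts[:-1]:
        if isinstance(part, str) and part.startswith("_"):
            raise ValueError(f"pseudoselector must come last: {field!r}")
    return parts
```

With this sketch, `tx.operations[0].body.type` becomes `["tx", "operations", 0, "body", "type"]`.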

Pointers and variable-length arrays use pseudoselectors to describe their state.

Pointers use the ._present pseudoselector with value true or false to indicate that the field is present or NULL, respectively. For example, a transaction with timebounds might be specified like this (using comments to annotate the times):

tx.timeBounds._present: true
tx.timeBounds.minTime: 1535756672 (Fri Aug 31 16:04:32 PDT 2018)
tx.timeBounds.maxTime: 1567292672 (Sat Aug 31 16:04:32 PDT 2019)

A transaction without timebounds would contain this line:

tx.timeBounds._present: false

For fields of type pointer-to-pointer, the pseudoselector ._present describes whether the outermost pointer is NULL or not. If ._present is true, then ._inner_present describes whether the nested pointer is NULL or not. _inner can be repeated as many times as needed for arbitrarily nested pointers.

Variable-length arrays use the pseudo-selector .len with an integer value to indicate the number of elements of the variable-length array. Only indices from 0 to len-1 are meaningful. For example, a transaction with one operation might look like this:

tx.operations.len: 1
tx.operations[0].sourceAccount._present: false
tx.operations[0].body.type: PAYMENT
...

As with any XDR code, implementations must not pre-allocate arrays of size len, since doing so invites memory exhaustion attacks. They should instead repeatedly double arrays while deserializing txrep, thereby achieving amortized linear running time without allocating memory that is more than a constant factor larger than the size of the txrep.
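One way to follow this advice is sketched below in Python. All names (`read_array`, `read_elem`, `max_len`) are hypothetical. The sketch assumes the txrep has already been loaded into a field-to-value map, that the caller knows the XDR-declared bound (for example 100 for `operations<100>`), and that `.len` is written in decimal. The list starts empty and grows by appending; CPython over-allocates lists proportionally on append, which gives amortized linear behavior comparable to doubling.

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def read_array(
    fields: Dict[str, str],
    name: str,
    max_len: int,
    read_elem: Callable[[Dict[str, str], str], T],
) -> List[T]:
    """Read the variable-length array `name` from a field -> value map."""
    # An unspecified .len means a zero-length array.
    tokens = fields.get(name + ".len", "0").split()
    n = int(tokens[0]) if tokens else 0
    # Reject lengths beyond the XDR-declared bound before doing any work.
    if n < 0 or n > max_len:
        raise ValueError(f"{name}.len = {n} exceeds bound {max_len}")
    out: List[T] = []  # start empty; never reserve n slots up front
    for i in range(n):
        out.append(read_elem(fields, f"{name}[{i}]"))
    return out
```

The `read_elem` callback receives the map and the element prefix (such as `tx.operations[0]`) and is expected to build one element from the fields under that prefix.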

Values

Most XDR types are rendered using C syntax. Specifically:

  • All integers (signed and unsigned, 32- and 64-bit) are represented as C integers (decimal by default, but prefix 0x can be used for hex and 0 for octal).

  • bool values are true or false

  • Enums are represented by the bare keyword of the value. They can also be specified numerically as "Type#Number" (e.g., MemoType#3 is equivalent to MEMO_HASH).

  • string values are represented as double-quoted interpreted string literals, in which non-ASCII bytes are represented with hex escapes ("\xff"), the " and \ characters can be escaped with another \ (e.g., "\\"), and \n designates a newline.

  • opaque values are represented as an unquoted hexadecimal string (using lower-case a...f) with an even number of digits. An exception is that the 0-length opaque vector is represented as "0" (a single digit). Implementations are encouraged to add the comment "bytes" so that it reads "0 bytes" to further distinguish the 0-length vector from the vector with a single byte 0x00 (rendered "00").
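The value rules above can be sketched in Python as follows. The function names are hypothetical. The sketch makes a few assumptions beyond the text: integers may carry a leading minus sign, opaque values are accepted only in lower case, and string literals may contain only ASCII characters outside of escapes. Python's `int(s, 0)` is not used because it rejects C-style octal such as `017`.

```python
import re

_INT = re.compile(r"(-?)(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)\Z")
_HEX = re.compile(r"(?:[0-9a-f]{2})+\Z")


def parse_int(token: str) -> int:
    """Parse a C-style integer: decimal, 0x-prefixed hex, or 0-prefixed octal."""
    m = _INT.match(token)
    if not m:
        raise ValueError(f"bad integer: {token!r}")
    sign, body = m.groups()
    if body[:2].lower() == "0x":
        value = int(body[2:], 16)
    elif body.startswith("0"):
        value = int(body, 8)  # also covers the literal "0"
    else:
        value = int(body, 10)
    return -value if sign == "-" else value


def parse_opaque(token: str) -> bytes:
    """Parse an opaque value; the single digit "0" means zero bytes."""
    if token == "0":
        return b""
    if not _HEX.match(token):
        raise ValueError(f"bad opaque value: {token!r}")
    return bytes.fromhex(token)


def parse_string(rest: str) -> bytes:
    """Decode the double-quoted literal at the start of `rest` into bytes."""
    if not rest.startswith('"'):
        raise ValueError("string value must start with a double quote")
    out = bytearray()
    i = 1
    while i < len(rest):
        c = rest[i]
        if c == '"':  # closing quote; anything after it is a comment
            return bytes(out)
        if c == "\\":
            nxt = rest[i + 1:i + 2]
            if nxt == "x":
                digits = rest[i + 2:i + 4]
                if not re.fullmatch(r"[0-9a-fA-F]{2}", digits):
                    raise ValueError("bad \\x escape")
                out.append(int(digits, 16))
                i += 4
            elif nxt == "n":
                out.append(0x0A)
                i += 2
            elif nxt in ('"', "\\"):
                out.append(ord(nxt))
                i += 2
            else:
                raise ValueError("unknown escape")
        elif c.isascii():
            out.append(ord(c))
            i += 1
        else:
            raise ValueError("non-ASCII bytes must use hex escapes")
    raise ValueError("unterminated string literal")
```

For integer and opaque fields, the value is the first whitespace-delimited token of the text after the colon, and the remainder is the comment.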

A few aggregate values are special-cased:

  • The Asset type is rendered as native (or any string up to 12 characters not containing an unescaped colon) for the native asset, and a string of the form "Code:IssuerAccountID" for issued assets. "Code" must consist of printable ASCII characters (octets 0x21 through 0x7e). The sequence \x introduces a hex escape sequence, e.g., \x00 to introduce a 0-valued byte. Otherwise, \ escapes the next character, so \\ is required to introduce a backslash. Although Stellar disallows assets of type ASSET_TYPE_CREDIT_ALPHANUM12 that have fewer than 5 bytes, such assets can be represented in binary XDR, and so in txrep they are rendered with trailing \x00 (escaped NUL bytes) so as to make the length 5; e.g., the 12-byte asset code ABC is rendered ABC\x00\x00. Note that stellar-core disallows non-ASCII bytes in AssetCode fields, so the primary use of this feature is to construct or examine invalid transaction test cases.

  • The asset field of AllowTrustOp is rendered the same as the Code in Asset, only without the trailing ":IssuerAccountID" (since the issuer is the sourceAccount of the operation).

  • PublicKey and SignerKey are rendered as unquoted strings in strkey format, described below.
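The asset code rules can be illustrated with a short Python sketch. The function name `render_asset_code` is hypothetical. It takes the raw 4- or 12-byte code field as stored in XDR, drops the zero padding, and escapes backslash, colon, and any byte outside 0x21 through 0x7e. For a 12-byte code with fewer than 5 significant bytes, it keeps enough NUL bytes to reach length 5, so the rendered length still identifies the asset type.

```python
def render_asset_code(code: bytes) -> str:
    """Render a raw 4- or 12-byte asset code field as txrep text."""
    if len(code) not in (4, 12):
        raise ValueError("asset code field must be 4 or 12 bytes")
    body = code.rstrip(b"\x00")  # drop the zero padding
    if len(code) == 12 and len(body) < 5:
        body = code[:5]  # keep NULs so the rendered code has 5 bytes
    out = []
    for b in body:
        if b in (0x5C, 0x3A):  # backslash or colon: escape with a backslash
            out.append("\\" + chr(b))
        elif 0x21 <= b <= 0x7E:  # printable ASCII is shown as-is
            out.append(chr(b))
        else:  # everything else becomes a hex escape
            out.append(f"\\x{b:02x}")
    return "".join(out)
```

Under these assumptions, the 4-byte code `USD` followed by one NUL renders as `USD`, and the 12-byte code `ABC` renders as `ABC\x00\x00`.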

Any fields in the XDR TransactionEnvelope structure that are not specified in a txrep description are to be interpreted as false (for bool), zero (for numeric values, enums, and fixed-length opaque), or zero-length (for strings, variable-length arrays, and variable-length opaque). The _present pseudo-selector, if unspecified, defaults to true if any field or value of the pointer's data structure is specified, and otherwise defaults to false.
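The defaulting rule for `_present` might be implemented along the following lines. This is a Python sketch with a hypothetical function name, assuming the txrep has already been read into a field-to-value map whose values may still carry trailing comments.

```python
from typing import Dict


def is_present(fields: Dict[str, str], ptr: str) -> bool:
    """Decide whether the pointer field `ptr` is non-NULL."""
    explicit = fields.get(ptr + "._present")
    if explicit is not None:
        # Use the first token so that a trailing comment is ignored.
        token = (explicit.split() or [""])[0]
        if token not in ("true", "false"):
            raise ValueError(f"bad bool for {ptr}._present: {explicit!r}")
        return token == "true"
    # No explicit ._present: the pointer counts as present if anything
    # underneath it (a nested field or an array element) is specified.
    prefixes = (ptr + ".", ptr + "[")
    return any(key.startswith(prefixes) for key in fields)
```

For instance, a map containing only `tx.timeBounds.minTime` makes `tx.timeBounds` present, while a map that never mentions `tx.timeBounds` leaves it NULL.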

Strkey format

Strkey provides a compact ASCII format for ED25519 public keys, ED25519 private keys (also known as seeds), pre-authorized transaction hashes, and hash-x signers (which provide signing authority upon revelation of a SHA-256 preimage). Each of these four types has a corresponding version byte, which determines the first character of the strkey encoding:

Key type               Version byte  First char
STRKEY_PUBKEY_ED25519  6 << 3        G
STRKEY_SEED_ED25519    18 << 3       S
STRKEY_PRE_AUTH_TX     19 << 3       T
STRKEY_HASH_X          23 << 3       X

The following steps transform a binary key into a strkey:

  1. Prepend the appropriate version byte to the binary key (so, e.g., a 32-byte ED25519 public key becomes 33 bytes when you prepend the byte 48).

  2. Compute a 16-bit CRC16 checksum of the combined version byte and binary key (using polynomial x^16 + x^12 + x^5 + 1). Append the two-byte checksum to the version byte and binary key (e.g., producing a 35-byte quantity for an ED25519 public key).

  3. Encode the concatenated Version-byte, Binary-key, and CRC16 using RFC4648 base-32 encoding.

Note that the version bytes all consist of values shifted left by three because the first character of base-32-encoded output is determined by the most significant 5 bits of the first byte.
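The three steps can be sketched in Python as shown here. The names `crc16_xmodem` and `strkey_encode` are hypothetical. Two details are assumptions that the text above does not spell out: the CRC starts from an initial value of zero (the XMODEM variant of this polynomial), and the checksum is appended low byte first, matching existing Stellar implementations.

```python
import base64

# Version bytes from the table above.
VERSION_BYTES = {
    "STRKEY_PUBKEY_ED25519": 6 << 3,
    "STRKEY_SEED_ED25519": 18 << 3,
    "STRKEY_PRE_AUTH_TX": 19 << 3,
    "STRKEY_HASH_X": 23 << 3,
}


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial x^16 + x^12 + x^5 + 1 (0x1021), initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = (crc << 1) ^ 0x1021
            else:
                crc <<= 1
            crc &= 0xFFFF
    return crc


def strkey_encode(version: int, key: bytes) -> str:
    """Encode a binary key as a strkey string."""
    payload = bytes([version]) + key                         # step 1
    checksum = crc16_xmodem(payload).to_bytes(2, "little")   # step 2 (byte order assumed)
    return base64.b32encode(payload + checksum).decode("ascii")  # step 3
```

A 32-byte key produces 35 bytes before encoding, which is exactly 56 base-32 characters with no padding. Since each version byte has its significant bits in the top five positions, the first output character depends only on the key type.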

The strkey format is already widely used for Stellar. Example implementations exist in Go, C++, and JavaScript.

Normalized txrep

Fields in txrep are specified in an order-independent way. If a field appears twice, the second value overrides the first. This allows one to update a txrep-format transaction by appending lines to a file. However, in some cases, such as when disassembling binary transactions, it is useful to transform transactions into normalized form, for instance so two transactions can be more easily compared, or so users inspecting a transaction see a more predictable format.
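The override rule falls out naturally when txrep is loaded into a map, as this minimal Python sketch suggests. The function name `load_txrep` is hypothetical, and values are stored together with any trailing comment.

```python
from typing import Dict


def load_txrep(text: str) -> Dict[str, str]:
    """Load txrep text into a field -> value map; later lines win."""
    fields: Dict[str, str] = {}
    for line in text.split("\n"):
        if line == "" or line.startswith(":"):
            continue  # blank line or full-line comment
        field, sep, rest = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in line: {line!r}")
        # Plain assignment means a repeated field replaces the earlier value.
        fields[field] = rest.lstrip(" ")
    return fields
```

Loading a template that sets `tx.fee: 100` followed by an appended line `tx.fee: 200` leaves `tx.fee` mapped to `200`.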

Normalized txrep format is a txrep format with the following additional restrictions:

  • Every field and pseudofield in the binary XDR transaction must appear exactly once in the description. Extraneous fields or fields that do not appear because of NULL pointers or incompatible union discriminants must not appear.

  • Fields in structs and unions must appear in the exact order they appear in the XDR file, which is also the order in which they are marshaled for XDR binary format. In particular, this also requires array elements to be listed in order from index 0 to len-1.

  • Pseudo-fields must appear immediately before the values they affect. In particular, the ptr._present: true field must appear immediately before the value of ptr, and vector.len must appear immediately before vector[0].

  • Codes in Asset and AllowTrustOp should escape \, : and any byte outside the range 0x21-0x7e, but no other bytes. Trailing \x00 should not be shown except as needed to show an asset of type ASSET_TYPE_CREDIT_ALPHANUM12 with fewer than 5 non-zero bytes.

  • Enums must be shown as their symbolic value, rather than "Type#Number", unless there is no symbolic value.

  • The native asset should be rendered as XLM for the Stellar public network, TestXLM for the Stellar test network, and, in the absence of a convention for any other network, native.

One possible use of non-normalized txrep is to allow users to discover missing fields. A tool that allows users to construct transactions in txrep format can translate the transactions to normalized format to highlight any missing fields along with their default values. In contexts where users should not leave any fields missing, tools can refuse to accept non-normalized txrep.

Rationale

Txrep is designed to be easy to read into and generate from structures created by XDR compilers using code generated by XDR compilers. Writing txrep simply requires printing each field and value in an XDR structure instead of rendering it in binary format. Txrep can also trivially be read into a key-value map, as almost every language has functions to read files line-by-line and break strings at a specific character (to separate field and value at :). The map can then be consulted in XDR deserialization routines to read txrep.

Tailoring txrep specifically to XDR means it does not introduce a second native representation for transactions. Third-party libraries for other popular formats such as JSON and YAML generally cannot parse directly into a TransactionEnvelope structure generated by an XDR compiler, in part because of XDR's tagged unions. Moreover, the ability to parse and generate txrep with nothing more than an XDR implementation reduces the attack surface of programs that process txrep.

Because each line of txrep is entirely self-contained, one can excerpt any subset of a transaction with no ambiguity. Because txrep takes the last value of a field, one can overwrite any previously set transaction field. This is convenient for scripts that may wish to tweak values by appending lines to a transaction template.

Two special cases (for keys and assets) make the output easier for humans to process by providing compatibility with other tools.

The ._present pseudoselector was selected to avoid conflicting with fields in the same data structure, since the XDR specification disallows fields that start with _. There is no similar ambiguity possible for .len, whose only siblings are bracketed array indices.

Test Cases

The following binary transaction:

AAAAACsWS5BDhC5BjpKQtznHFJ3CkU6+XtWopW+t+Q9KoH7QAAAAZAClKY0AAAABAAAAAQAAAABbicmAAAAAAF1q/QAAAAABAAAAFkVuam95IHRoaXMgdHJhbnNhY3Rpb24AAAAAAAEAAAAAAAAAAQAAAABAXzbt2M8i77+AcrmFtqTAFVHDTdOME3rI1A1ALNH3tAAAAAFVU0QAAAAAADJSVDIhkp9uz61Ra68rs3ScZIIgjT8ajX8Kkdc1be0LAAAAABfXk6AAAAAAAAAAAUqgftAAAABA3vtPH60cJ5MntVrxhP3N33P096jLQOflNKcdc6BRJLo2nbem0xtHyv0RhZIkaoV15sJJq5TsN2je22KSIhzlDA==

can be rendered like this (note that comments are optional and may contain implementation-dependent information):

tx.sourceAccount: GAVRMS4QIOCC4QMOSKILOOOHCSO4FEKOXZPNLKFFN6W7SD2KUB7NBPLN
tx.fee: 100
tx.seqNum: 46489056724385793
tx.timeBounds._present: true
tx.timeBounds.minTime: 1535756672 (Fri Aug 31 16:04:32 PDT 2018)
tx.timeBounds.maxTime: 1567292672 (Sat Aug 31 16:04:32 PDT 2019)
tx.memo.type: MEMO_TEXT
tx.memo.text: "Enjoy this transaction"
tx.operations.len: 1
tx.operations[0].sourceAccount._present: false
tx.operations[0].body.type: PAYMENT
tx.operations[0].body.paymentOp.destination: GBAF6NXN3DHSF357QBZLTBNWUTABKUODJXJYYE32ZDKA2QBM2H33IK6O
tx.operations[0].body.paymentOp.asset: USD:GAZFEVBSEGJJ63WPVVIWXLZLWN2JYZECECGT6GUNP4FJDVZVNXWQWMYI
tx.operations[0].body.paymentOp.amount: 400004000 (40.0004e7)
tx.ext.v: 0
signatures.len: 1
signatures[0].hint: 4aa07ed0 (GAVRMS4QIOCC4QMOSKILOOOHCSO4FEKOXZPNLKFFN6W7SD2KUB7NBPLN signer for account GAVRMS4QIOCC4QMOSKILOOOHCSO4FEKOXZPNLKFFN6W7SD2KUB7NBPLN)
signatures[0].signature: defb4f1fad1c279327b55af184fdcddf73f4f7a8cb40e7e534a71d73a05124ba369db7a6d31b47cafd118592246a8575e6c249ab94ec3768dedb6292221ce50c

Implementation

Txrep is implemented by the Stellar transaction compiler, stc.