Bespoke-BNF

WIP

Bespoke-BNF (BBNF) is designed to be a more complete and extensible variant of BNF. Its syntax is designed with common conventions, compatibility, and readability in mind, while allowing a more complete, succinct, and descriptive specification of a language's syntax. It is based mostly on three BNF flavors (EBNF, ABNF, and XBNF), with some additional features and conventions.

Feature Comparisons

Features	Regex	BNF	EBNF	ABNF	XBNF	BBNF
Production Definition	`/y/`	`x ::= y`	`x ::= y`	`x = y`	`x = y`	`x ::= y` or `x = y` or `x: y`
Sequential Concatenation	`xy`	`x y`	`x , y`	`x y`	`x y`	`x y` or `x , y`
End of Rule Terminator	`/`	`;`	`;`	`;` or `\n\r` ¹	`;`	`;`
Rule Reference	X	`<rule>`	`rule`	`rule` or `<rule>`	`rule` ²	`rule` ² or `<rule>`
Token Reference	X	`<token>`	`token`	`token` or `<token>`	`TOKEN` ²	`TOKEN` ² or `<TOKEN>`
Character Literal	`c`	`"c"` or `'c'`	`"c"`	`"c"`	`"c"`	`'c'`
Unicode Literal	`\uHEX`	???	???	???	???	`'c'` or `\uHEX`
String Literal	`text`	`"text"` or `'text'`	`"text"` or `'text'`	`"text"` or `'text'`	`"text"`	`"text"`
Regex Literal	`regex`	X	`/regex/`	X	`/regex/`	`/regex/`
Decimal Literal	`123`	`123`	`123`	`123` or `%d123`	`123`	`123`
Hexadecimal Literal	`\xHEX`	???	???	`%x123` ³	???	`\xHEX`
Octal Literal	`\xOCT`	???	???	`%o123`	???	`\oOCT`
Numeric Range	???	X	X	`123-456`	X	`123-456` or `123..456`
Escape Sequence	???	`"\c"` or `'\c'`	`"\c"` or `'\c'`	`"\c"` or `'\c'`	`"\c"`	`\c` or `"\c"` or `'\c'`
Comment / Ignored	???	`comment` ⁴	`(* comment *)`	`; comment`	`# comment` or `/* comment */`	`# comment` ⁵ or `/* comment */`
Alternate Choice	???	`x | y` or: (`x = y` and `x = z`) ⁶	`x | y`	`x / y` or: (`x = y` and `x =/ z`) ⁷	`x | y`	`x | y` or `x|y` ⁸
Not / Inverse	???	X	X	X	X	`!x` or `x!`
Unordered Concatenation	???	X	X	X	X	`{x}`
Immediate Concatenation	???	X	X	X	`x . y`	`x . y`
Grouping	???	X	`( x y )`	`( x y )`	`( x y )`	`( x y )`
Optional	???	X	`[x]`	`[x]` or `*1x`	`x?`	`[x]` or `x?` or `?x`
Repeat Zero or More Times	???	`s = | i s`	`{ }`	`*x`	`x*`	`x*` or `*x`
Repeat One or More Times	???	`s = i | i s`	`{ }`	`1*x`	`x+`	`x+` or `+x`
Repeat Exactly N Times	???	X	X	`Nx`	X	`xN` or `Nx`
Repeat Between A and B Times	???	X	X	`A*Bx`	X	`xA-B` or `A..Bx` or `A-Bx` or `token..B*x` etc... ⁹
Repeat N or More Times	???	X	X	`N*x`	X	`xN+` or `N+x`
Repeat N or Less Times	???	X	X	`0*Nx`	X	`xN-` or `N-x`
Repeated and Comma Separated	???	X	X	`#x`	X	X
Special Sequence	???	X	X	`? x ?`	X	`{key}x`
Prose	???	`comment` ⁴	X	X	X	`; prose` ¹⁰
Spread Operator	???	X	X	X	X	`...rule`
Metadata Tags	???	X	X	X	X	`#tag` ⁵
Exceptions	???	X	X	`- error`	X	`#err`

Table Notes

X means that the feature is not explicitly defined in the given standard.
??? means that an investigation still needs to be done to determine the correct syntax.
All whitespace characters are optional.
In most³ examples, x, y, and z are placeholders for any valid expression.
c is a placeholder for any valid character.
s is a placeholder for a valid BNF expression representing some kind of sequence, and i is a placeholder for a valid BNF element within that sequence.
123, 456, N, A, and B are placeholders for any valid integer.
HEX is a placeholder for any valid hexadecimal number code.
OCT is a placeholder for any valid octal number code.
key, rule, token, text, tag, regex, comment, ...etc are also placeholders, but for their respective types.

Sources

https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form {BNF}
https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form {EBNF}
https://en.wikipedia.org/wiki/Augmented_Backus%E2%80%93Naur_form {ABNF}
http://www.faqs.org/rfcs/rfc2234.html {ABNF}
http://www.cs.man.ac.uk/~pjj/bnf/ebnf.html#NOTE {BNF, EBNF, ABNF, XBNF}
https://marketplace.visualstudio.com/items?itemName=Mai-Lapyst.xbnf {XBNF}
https://sabnf.com/docs/doc7.0/abnf.html {ABNF}
https://www.ietf.org/rfc/rfc1945.txt {ABNF}

Usage

Defining Grammars

Defining a grammar is done in two main steps: defining tokens and defining rules.

Defining Tokens

Defining Rules

Using Grammars

Syntax

Comments

Inline Comments

Inline comments are defined as starting with either:

a double forward slash (//),
or at least one hash symbol followed by a padding character, either a space or a tab (#+[ \t]), and then the comment text.

// This is an inline comment
# This is another inline comment
#  This is a third inline comment
## This is a fourth inline comment
rule ::= token // This is an inline comment at the end of a line
    ; // make sure you don't comment out the semicolon!
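
The two comment prefixes above can be expressed as one regular expression. The following is a minimal Python sketch, not the project's implementation: the names `INLINE_COMMENT` and `strip_inline_comment` are hypothetical, the padding character is assumed to be a space or a tab, and string and regex literals are not tracked, so a `//` or `# ` inside a literal would be cut as if it were a comment. Because an unspaced `#` introduces a tag in BBNF, `#tag` is left in place.

```python
import re

# Hypothetical pattern: `//`, or one or more `#` followed by a space or tab.
# Assumption: this sketch does not track string or regex literals.
INLINE_COMMENT = re.compile(r"//|#+[ \t]")


def strip_inline_comment(line: str) -> str:
    """Return `line` without its inline comment, trailing whitespace removed."""
    match = INLINE_COMMENT.search(line)
    if match is None:
        return line.rstrip()
    return line[: match.start()].rstrip()


print(repr(strip_inline_comment("rule ::= token // trailing comment")))
print(repr(strip_inline_comment("## a full-line comment")))
print(repr(strip_inline_comment("rule ::= token #tag")))
```

A full-line comment reduces to an empty string, which a caller could skip before handing the remaining lines to a parser.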

Literals

Groups

Operators

TODO

Syntax Features
- update regex literal to use / as delimiters.
- add block style comments.
- add other definition assignment operators.
- add special sequences.
- add full repeat syntax.
- allow all suffixes as prefixes in BBNF.
- Add Unordered Sequence using: {}.
- Add Spread (...) operator as just prefix for templating and splatting.
- add a general numeric range syntax.
- add negative to the numeric literals.
- make the numeric literal base type into the literal type.
- add plus or minus trailing syntax to repeat.
- add 'spaced' boolean to the or (|) operator.
- investigate and add hexadecimal and octal literals? (or at least tags).
Tests
- Add tests for Creating a set of Tokens and Grammar rules from code
  - add Token parser tests
  - add Rule parser tests
Grammar Parser Implementation
- implement each rule's expression parser {0%}
- implement comment handling
- implement the rule based parser logic, state, etc.
- implement Tags
  - implement the whitespace tags
  - implement other built-in tags
    - indexer tag
  - implement custom tags
Documentation
- Finish the usage docs
  - Finish the quick start guide
- Finish the syntax documentation

Footnotes

¹ ABNF - Newline and Carriage Return: In ABNF, a rule can be terminated by either a semicolon (;) or a newline that begins with no indentation (\n\r).
² EBNF - Capitalization vs Lowercase: In some BNF variants, the capitalization used to reference other productions is significant. Productions for tokens are written in all uppercase, while productions for rules are written in all lowercase.
³ ABNF - X Hex Keyword: In this ABNF example, the x is not a placeholder for a production but a literal part of the required syntax.
⁴ BNF - Non-Delimited Literals as Prose: In BNF, comments are not explicitly defined, but any text that is not part of a production is often considered prose. It can be a comment or a human-readable description of 'extraneous' logic that is not achievable with BNF's basic syntax and rules.
⁵ BBNF - Comments vs Tags: Both inline comments and tags in BBNF use the # symbol as a prefix. Inline comments require a whitespace character after the # and before the comment text (spaced), while tags use immediate syntax and require no whitespace between the symbol and the key (unspaced).
⁶ BNF - Alternate Same: In pure BNF, alternate choices for the same production can be defined by separate rules with the same name. This is not the case in most BNF variants.
⁷ ABNF - Alternate Equal: In ABNF, alternate choices are defined by separate rules with the same name, using the = and =/ operators.
⁸ BBNF - Alternate/Or Precedence: In BBNF, the spacing of the or/alternate operator (|) specifies the precedence of multiple or operations. The alternate/or operation has low precedence when spaced (x | y) and higher precedence when unspaced (x|y).
⁹ BBNF - Range Options: In BBNF, a range is defined using any combination of two numbers (e.g. 123, -4), characters (e.g. a, b), or token types that resolve to any of the previously mentioned types (token..543).
¹⁰ BBNF - Prose Comments: Unlike BNF, BBNF differentiates between common documentation comments and 'prose'. Prose can only come after a semicolon at the end of a rule and before the next newline. While comments are entirely descriptive and can be ignored, prose is human-readable plain text that contains extra logic for a rule that could not be handled by BBNF's syntax or easily described using custom tags.
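
The spacing rule for the alternate/or operator described in the footnotes can be illustrated with a small Python sketch. It is only one possible reading, not the project's parser: the function name `parse_alternation` is hypothetical, and it assumes that a spaced `|` binds loosest, whitespace concatenation binds next, and an unspaced `|` binds tightest. The notes only state that unspaced binds tighter than spaced, so the position of concatenation between the two is an assumption. Operands are plain names; groups, literals, and mixed spacing such as `a |b` are not handled.

```python
import re


def parse_alternation(expr: str):
    """Parse `expr` into nested tuples: ('or', ...), ('seq', ...), or a name.

    Assumptions of this sketch (not confirmed by the BBNF notes):
    - a spaced `|` binds loosest, then whitespace concatenation,
      then an unspaced `|` binds tightest;
    - operands are plain names, with no groups or literals.
    """
    def wrap(kind, parts):
        # A single operand needs no wrapper node.
        return parts[0] if len(parts) == 1 else (kind, *parts)

    alternatives = []
    # Split on the spaced operator first (lowest precedence).
    for alt in re.split(r"\s+\|\s+", expr.strip()):
        # Then split on whitespace, and finally on the unspaced operator.
        items = [wrap("or", item.split("|")) for item in alt.split()]
        alternatives.append(wrap("seq", items))
    return wrap("or", alternatives)


print(parse_alternation("a b|c | d"))
# prints ('or', ('seq', 'a', ('or', 'b', 'c')), 'd')
```

Under these assumptions, `a b|c | d` reads as "either `a` followed by one of `b` or `c`, or `d`", which would otherwise need explicit grouping.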
