WARNING: This document is a work in progress, just like JSONSelect itself. View or contribute to the latest version on GitHub.

JSONSelect

  1. introduction
  2. levels
  3. language overview
  4. grouping
  5. selectors
  6. pseudo classes
  7. combinators
  8. planned additions
  9. grammar
  10. conformance tests
  11. references

Introduction

JSONSelect defines a language very similar in syntax and structure to CSS3 Selectors. JSONSelect expressions are patterns which can be matched against JSON documents.

Potential applications of JSONSelect include:

  • Simplified programmatic matching of nodes within JSON documents.
  • Stream filtering, allowing efficient and incremental matching of documents.
  • As a query language for a document database.

Levels

The specification of JSONSelect is broken into three levels. Higher levels include more powerful constructs, and are correspondingly more complicated to implement and use.

JSONSelect Level 1 is a small subset of CSS3. Every feature is derived from a CSS construct that maps directly to JSON. A Level 1 implementation is not particularly complicated, yet it provides basic querying features.

JSONSelect Level 2 builds upon Level 1, adapting more complex CSS constructs that allow expressions to include constraints, such as patterns that match against values and patterns that consider a node's siblings. Level 2 is still a direct adaptation of CSS, but includes constructs whose semantic meaning is significantly changed.

JSONSelect Level 3 adds constructs which do not necessarily have a direct analog in CSS, and are added to increase the power and convenience of the selector language. These include aliases, wholly new pseudo class functions, and more blue sky dreaming.
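As an illustration of the sibling constraint mentioned for Level 2, here is a minimal Python sketch. It assumes the order-independent reading used in the language overview ("a node of type U with a sibling of type T"), which differs from CSS, where `~` only looks at preceding siblings. The helper names `json_type`, `children`, and `sibling_matches` are hypothetical and not part of any JSONSelect implementation.

```python
def json_type(value):
    """Map a decoded JSON value to one of the six JSONSelect type names."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def children(node):
    """Return the child nodes of an object or array, or [] for other types."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def sibling_matches(parent, t, u):
    """Sketch of `T ~ U` under one parent: children of type u that have
    at least one *other* child of type t alongside them (assumed semantics)."""
    kids = children(parent)
    result = []
    for i, child in enumerate(kids):
        if json_type(child) != u:
            continue
        if any(json_type(other) == t for j, other in enumerate(kids) if j != i):
            result.append(child)
    return result
```

Under this reading, `sibling_matches({"a": 1, "b": "x"}, "number", "string")` selects the string `"x"`, while a lone string child never matches `string ~ string` because a node is not its own sibling.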

Language Overview

| pattern | meaning | level |
|---|---|---|
| * | Any node | 1 |
| T | A node of type T, where T is one of string, number, object, array, boolean, or null | 1 |
| T.key | A node of type T which is the child of an object and is the value of its parent's key property | 1 |
| T."complex key" | Same as previous, but with the property name specified as a JSON string | 1 |
| T:root | A node of type T which is the root of the JSON document | 1 |
| T:nth-child(n) | A node of type T which is the nth child of an array parent | 1 |
| T:nth-last-child(n) | A node of type T which is the nth child of an array parent, counting from the end | 2 |
| T:first-child | A node of type T which is the first child of an array parent (equivalent to T:nth-child(1)) | 1 |
| T:last-child | A node of type T which is the last child of an array parent (equivalent to T:nth-last-child(1)) | 2 |
| T:only-child | A node of type T which is the only child of an array parent | 2 |
| T:empty | A node of type T which is an array or object with no children | 2 |
| T U | A node of type U with an ancestor of type T | 1 |
| T > U | A node of type U with a parent of type T | 1 |
| T ~ U | A node of type U with a sibling of type T | 2 |
| S1, S2 | Any node which matches either selector S1 or S2 | 1 |
| T:has(S) | A node of type T which has a child node satisfying the selector S | 3 |

NOTE: Not all of the constructs in the table above are necessarily implemented in the reference implementation at the moment.
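To make a few of the Level 1 patterns concrete, the following Python sketch matches pre-parsed selectors against a decoded JSON document. It is not the reference implementation: the selector representation (a list of dictionaries separated by `">"` or `" "`), the function names, and the sample document are all assumptions made for illustration, and only types, keys, `:root`, `:nth-child(n)` with a plain integer, and the two Level 1 combinators are covered.

```python
import json


def json_type(value):
    """Map a decoded JSON value to one of the six JSONSelect type names."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


def walk(value, key=None, index=None, path=()):
    """Yield the root-to-node path of every node, parents before children.
    Each frame is (value, key in object parent, 1-based index in array parent)."""
    path = path + ((value, key, index),)
    yield path
    if isinstance(value, dict):
        for k, child in value.items():
            yield from walk(child, key=k, path=path)
    elif isinstance(value, list):
        for i, child in enumerate(value, start=1):
            yield from walk(child, index=i, path=path)


def matches_simple(path, simple):
    """Test the last node of `path` against one simple selector sequence."""
    value, key, index = path[-1]
    if simple.get("type") not in (None, json_type(value)):
        return False
    if "key" in simple and simple["key"] != key:
        return False
    if simple.get("root") and len(path) != 1:
        return False
    if "nth_child" in simple and simple["nth_child"] != index:
        return False
    return True


def matches(path, selector):
    """Match right to left: [simple, combinator, simple, ...]."""
    *rest, last = selector
    if not matches_simple(path, last):
        return False
    if not rest:
        return True
    *rest, combinator = rest
    if combinator == ">":  # parent must match the remaining selector
        return len(path) > 1 and matches(path[:-1], rest)
    # descendant combinator: any proper ancestor may match
    return any(matches(path[:n], rest) for n in range(1, len(path)))


def select(doc, selector):
    """Return every node of `doc` matched by `selector`, in document order."""
    return [path[-1][0] for path in walk(doc) if matches(path, selector)]


# Hypothetical sample document.
doc = json.loads("""{
  "name": {"first": "Ada", "last": "Lovelace"},
  "languagesSpoken": [
    {"language": "Bulgarian", "level": "advanced"},
    {"language": "English", "level": "native"}
  ],
  "weight": 172
}""")

languages = select(doc, [{"type": "string", "key": "language"}])       # string.language
name_parts = select(doc, [{"key": "name"}, ">", {"type": "string"}])   # .name > string
spoken = select(doc, [{"key": "languagesSpoken"}, " ", {"type": "string"}])  # .languagesSpoken string
second = select(doc, [{"type": "object", "nth_child": 2}])             # object:nth-child(2)
```

Keeping the whole root-to-node path for each node makes the `>` and descendant combinators simple checks on a prefix of that path, at the cost of memory. A streaming matcher would need a different design.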

Grouping

Selectors

Pseudo Classes

Combinators

Planned Additions

  • (in level 3) A means of matching against node values, such as string:val("Bulgarian"), or even string:expr(x = "Bulgarian"), or maybe number:expr(10 < x < 10)
  • as little else as I can get away with.
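Since value matching is only planned, its semantics are not defined here. The Python sketch below shows one possible reading of `string:val(...)` and `number:expr(...)`, purely as an assumption: `val` compares a string node for equality, and `expr` applies a predicate to a numeric node bound to `x`. The function names are hypothetical.

```python
def walk(value):
    """Yield every node of a decoded JSON document, parents before children."""
    yield value
    if isinstance(value, dict):
        for child in value.values():
            yield from walk(child)
    elif isinstance(value, list):
        for child in value:
            yield from walk(child)


def select_string_val(doc, expected):
    """Assumed meaning of string:val(expected): string nodes equal to `expected`."""
    return [node for node in walk(doc) if isinstance(node, str) and node == expected]


def select_number_expr(doc, predicate):
    """Assumed meaning of number:expr(...): number nodes for which
    predicate(x) holds. Booleans are excluded because JSON treats them
    as a separate type, even though Python's bool subclasses int."""
    return [
        node
        for node in walk(doc)
        if isinstance(node, (int, float))
        and not isinstance(node, bool)
        and predicate(node)
    ]
```

For example, `select_string_val(doc, "Bulgarian")` would correspond to `string:val("Bulgarian")`, and `select_number_expr(doc, lambda x: 0 < x < 10)` to a range expression over `x`.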

Grammar

(Adapted from CSS3 and json.org)

selectors_group
  : selector [ `,` selector ]*
  ;

selector
  : simple_selector_sequence [ combinator simple_selector_sequence ]*
  ;

combinator
  : `>` | \s+
  ;

simple_selector_sequence
  /* why allow multiple HASH entities in the grammar? */
  : [ type_selector | universal ]
    [ hash | pseudo ]*
  | [ hash | pseudo ]+
  ;

type_selector
  : `object` | `array` | `number` | `string` | `boolean` | `null`
  ;

universal
  : `*`
  ;

hash
  : `.` name
  | `.` json_string
  ;

pseudo
  /* Note that pseudo-elements are restricted to one per selector and */
  /* occur only in the last simple_selector_sequence. */
  : `:` pseudo_class_name
  | `:` pseudo_function_name `(` expression `)`
  ;

pseudo_class_name
  : `root` | `first-child` | `last-child` | `only-child`
  ;

pseudo_function_name
  : `nth-child` | `nth-last-child`
  ;

expression
  /* expression is of the form "an+b" */
  : TODO
  ;

json_string
  : `"` json_chars* `"`
  ;

json_chars
  : any-Unicode-character-except-"-or-\-or-control-character
  |  `\"`
  |  `\\`
  |  `\/`
  |  `\b`
  |  `\f`
  |  `\n`
  |  `\r`
  |  `\t`
  |   \u four-hex-digits 
  ;

name
  : nmstart nmchar*
  ;

nmstart
  : escape | [_a-zA-Z] | nonascii
  ;

nmchar
  : [_a-zA-Z0-9-]
  | escape
  | nonascii
  ;

escape 
  : \\[^\r\n\f0-9a-fA-F]
  ;

nonascii
  : [^\0-\177]
  ;
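The grammar leaves `expression` as a TODO and notes only that it is of the form "an+b". As a sketch of what that could look like, the Python below parses such an expression and tests a 1-based child position against it. It assumes JSONSelect follows CSS3 here, including the `odd` and `even` keywords and the rule that a position matches when it equals a*n + b for some integer n >= 0. The function names are hypothetical.

```python
import re

# coefficient (possibly empty or just a sign), the letter n, optional signed offset
_AN_PLUS_B = re.compile(r"([+-]?\d*)n(?:([+-])(\d+))?")


def parse_an_plus_b(text):
    """Return (a, b) for an expression such as '2n+1', 'odd', '-n+3' or '4'."""
    s = text.replace(" ", "").lower()
    if s == "odd":
        return 2, 1
    if s == "even":
        return 2, 0
    if re.fullmatch(r"[+-]?\d+", s):  # a bare integer means a = 0
        return 0, int(s)
    m = _AN_PLUS_B.fullmatch(s)
    if not m:
        raise ValueError(f"not an an+b expression: {text!r}")
    coeff, sign, offset = m.groups()
    # '', '+' and '-' stand for coefficients 1, +1 and -1
    a = int(coeff + "1") if coeff in ("", "+", "-") else int(coeff)
    b = int(sign + offset) if offset else 0
    return a, b


def nth_matches(a, b, position):
    """True if position == a*n + b for some integer n >= 0 (1-based positions)."""
    if a == 0:
        return position == b
    n, remainder = divmod(position - b, a)
    return remainder == 0 and n >= 0
```

With these helpers, `nth_matches(*parse_an_plus_b("2n+1"), position)` holds for positions 1, 3, 5, and so on, which is the same set that `odd` selects.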

Conformance Tests

See https://github.com/lloyd/JSONSelect/tree/master/tests.

References

In no particular order.