One-escape TAO

Darius JJ Chuck

2021-07-16

Previously we defined TAO in one line of abstract grammar. Today we’ll instantiate that grammar as a variant of TAO in which only a single character ever needs to be escaped, reducing escape friction to a minimum:

TAO = *("`[" TAO "`]" / 1*symbol)
symbol = "`" any / (any - "`")
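To make the grammar concrete, here is a minimal sketch of a recursive-descent parser for it in JavaScript. The function name `parseTao`, the nested-array output shape, and the choice to decode escapes (dropping the backtick and keeping the character after it) are assumptions made for illustration, not part of TAO itself. The bracket digraphs are tried before the generic escape, which is how the ambiguity between a bracket digraph and an escaped symbol is resolved here.

```javascript
// Sketch of a parser for: TAO = *("`[" TAO "`]" / 1*symbol)
// Output (an assumption of this sketch): an array where strings are runs of
// symbols and nested arrays are bracketed sub-trees.
function parseTao(input) {
  let pos = 0;

  function parseLevel(nested) {
    const parts = [];
    let text = "";
    // Push the pending run of symbols, if any, as one string.
    const flush = () => {
      if (text !== "") {
        parts.push(text);
        text = "";
      }
    };

    while (pos < input.length) {
      const ch = input[pos];
      if (ch !== "`") {
        // symbol = any - "`"
        text += ch;
        pos += 1;
        continue;
      }
      const next = input[pos + 1];
      if (next === undefined) {
        throw new SyntaxError("dangling escape character at end of input");
      }
      pos += 2;
      if (next === "[") {
        // Opening digraph: parse the sub-tree recursively.
        flush();
        parts.push(parseLevel(true));
      } else if (next === "]") {
        // Closing digraph: only valid inside a bracketed sub-tree.
        if (!nested) throw new SyntaxError("unmatched closing bracket");
        flush();
        return parts;
      } else {
        // symbol = "`" any: keep the escaped character, drop the backtick.
        text += next;
      }
    }

    if (nested) throw new SyntaxError("missing closing bracket");
    flush();
    return parts;
  }

  return parseLevel(false);
}
```

With this sketch, `parseTao("age `[27`]")` yields `["age ", ["27"]]`, and an empty pair of brackets yields an empty nested array.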

The Data TAO example encoded with this variant would look like this:

first name `[John`]
last name `[Smith`]
is alive `[true`]
age `[27`]
address `[
  street address `[21 2nd Street`]
  city `[New York`]
  state `[NY`]
  postal code `[10021-3100`]
`]
phone numbers `[
  `[
    type `[home`]
    number `[212 555-1234`]
  `]
  `[
    type `[office`]
    number `[646 555-4567`]
  `]
`]
children `[`]
spouse `[`]

The clear disadvantage is that compactness suffers, because the brackets are now digraphs. In exchange, the parser becomes even more regular, and arbitrary strings of data can be inserted as leaves into TAO trees, with escaping done by the equivalent of a simple

string.replaceAll('`', '``') 
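As a small sketch of how that one-liner might be used, the helpers below escape an arbitrary string, wrap it as a leaf, and undo the escaping again. The names `escapeLeaf`, `leaf`, and `unescapeLeaf` are hypothetical, and `unescapeLeaf` is only meant as the inverse of `escapeLeaf`, not as a general decoder for every escape the grammar allows.

```javascript
// Escape a raw string for use as leaf content: double every backtick.
function escapeLeaf(raw) {
  return raw.replaceAll("`", "``");
}

// Wrap an arbitrary raw string as a bracketed leaf.
function leaf(raw) {
  return "`[" + escapeLeaf(raw) + "`]";
}

// Inverse of escapeLeaf. This relies on the input having been produced by
// escapeLeaf, so that backticks only ever occur in doubled pairs.
function unescapeLeaf(escaped) {
  return escaped.replaceAll("``", "`");
}

// Data that itself looks like a bracketed leaf stays inert once escaped.
const tricky = "`[x`]";
console.log(leaf(tricky));
console.log(unescapeLeaf(escapeLeaf(tricky)) === tricky); // true
```

Plain square brackets in the data need no treatment at all, since only the backtick digraphs act as brackets in this variant.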

Because we selected ` as the escape character, escaping is rarely needed in practice: on average it is the second least frequently used character. This is a major reason why it is part of the canonical grammar.

To go further and practically eliminate escape friction, we could use the ASCII Escape character (code 27 = 0x1B) as the escape character instead. This may suit serialization/deserialization with TAO when we are certain that this character can’t occur in our data. In that case escaping is not necessary at all, so we gain extra speed. The cost is dramatically reduced portability, because the character is non-printable and hard to copy-paste. Translation to a portable form is trivial, though.
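The translation between the two forms could look roughly like the sketch below. It assumes the Escape-based variant simply puts the Escape character where the backtick variant puts a backtick, and that Escape never occurs in the data. The names `toPortable` and `fromPortable` are hypothetical.

```javascript
const ESC = "\x1b"; // ASCII Escape, code 27

// Escape-based form -> portable backtick form.
// Literal backticks in the data are doubled first, and only then does every
// Escape character become a backtick.
function toPortable(escTao) {
  return escTao.replaceAll("`", "``").replaceAll(ESC, "`");
}

// Portable backtick form -> Escape-based form.
// Bracket digraphs get the Escape character back; any other escaped
// character is emitted without its backtick.
function fromPortable(portable) {
  return portable.replace(/`([\s\S])/g, (match, c) =>
    c === "[" || c === "]" ? ESC + c : c
  );
}

const compact = "note " + ESC + "[use `ls`" + ESC + "]";
console.log(toPortable(compact));
console.log(fromPortable(toPortable(compact)) === compact); // true
```

The order of the two replacements in `toPortable` matters: converting Escape to backtick first would make the bracket digraphs indistinguishable from data backticks when doubling.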

Extra-compact binary variants of TAO are not far off from here.
