TAO: comparisons and examples

Comparisons

With XML

A compact piece of XML like this¹:

<person 
    firstName="John" 
    lastName="Smith" 
    age="25" 
    address="New York" 
/>

may be expressed in TAO as:

first name [John] 
last name [Smith] 
age [25] 
address [New York]

Notice that a dictionary like this in TAO is open to simple extension by concatenation.

If we now want to extend the address value with more structured information, in XML we might do it like so:

<person 
    firstName="John" 
    lastName="Smith" 
    age="25"
>
    <address 
        streetAddress="21 2nd Street" 
        city="New York" 
        state="NY" 
        postalCode="10021" 
    />
</person>

We have to change the structure significantly because an XML attribute value is a flat string and cannot hold nested structure, so the address must become a child element.

In TAO this only requires adding another level of nesting inside the value:

first name [John] 
last name [Smith] 
age [25] 
address [
    street address [21 2nd Street] 
    city [New York] 
    state [NY] 
    postal code [10021]
]
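
The reading of these examples as nested dictionaries can be sketched in a few lines of Python. This is a minimal illustration, not the reference TAO parser: the function name parse_tao is hypothetical, and the code assumes only what the examples show, namely that "[" opens a tree, "]" closes it, a backtick escapes the character after it, and the text in front of a bracket is the key.

```python
def parse_tao(text):
    """Parse 'key [value]' TAO text into nested dicts with string leaves.

    Assumption (not from a spec): the text before '[' is the key, with
    surrounding whitespace stripped; duplicate keys overwrite earlier ones.
    """
    pos = 0

    def parse_body(top_level):
        nonlocal pos
        entries = {}  # key -> parsed value for each "key [value]" pair
        chunk = []    # characters collected since the last tree ended
        while pos < len(text):
            ch = text[pos]
            if ch == "`":  # escape: take the next character literally
                if pos + 1 >= len(text):
                    raise ValueError("dangling escape at end of input")
                chunk.append(text[pos + 1])
                pos += 2
            elif ch == "[":  # the collected text is this tree's key
                pos += 1
                key = "".join(chunk).strip()
                chunk = []
                entries[key] = parse_body(False)
            elif ch == "]":
                if top_level:
                    raise ValueError("unbalanced ']'")
                pos += 1
                break
            else:
                chunk.append(ch)
                pos += 1
        else:
            # Input ended without a closing bracket.
            if not top_level:
                raise ValueError("missing ']'")
        # A body without sub-trees is a plain string leaf.
        return entries if entries else "".join(chunk)

    return parse_body(True)


# Extension by concatenation: joining two TAO texts joins the dictionaries.
base = "first name [John]\nlast name [Smith]"
extra = "address [\n    city [New York]\n    state [NY]\n]"
person = parse_tao(base + "\n" + extra)
```

Every leaf comes back as a string, since the sketch attaches no types to values.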

With JSON

The same piece of data as above encoded in JSON:

{
  "first name": "John",
  "last name": "Smith",
  "age": 25,
  "address": {
    "street address": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postal code": "10021"
  }
}

It is an improvement upon XML in terms of compactness and extensibility.

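However, compared to TAO, JSON still has obtrusive syntactic elements such as quotes, colons, and commas, and it subtly couples syntax with semantics, for example by distinguishing numbers from strings at the syntax level.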

An object containing a list of elements in JSON:

{
    "songs": [
        {
            "title": "Scarborough Fair / Canticle",
            "length": "3:10"
        },
        {
            "title": "Patterns",
            "length": "2:45"
        },
        {
            "title": "Cloudy",
            "length": "2:15"
        }
    ]
}

can be encoded in TAO as:

songs [
    [
        title [Scarborough Fair / Canticle]
        length [3:10]
    ]
    [
        title [Patterns]
        length [2:45]
    ]
    [
        title [Cloudy]
        length [2:15]
    ]
]
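
The mapping used in this translation can be sketched as a small Python function. It is an illustration only: the names to_tao, bracket, and escape are hypothetical, and the sketch assumes that objects become "key [value]" entries, list items become unnamed trees, and scalars are written as plain text using Python's str().

```python
def escape(text):
    """Prefix each of TAO's three special symbols with a backtick."""
    return "".join("`" + ch if ch in "`[]" else ch for ch in text)


def bracket(value, depth):
    """Wrap a value in brackets; containers get their own indented lines."""
    if isinstance(value, (dict, list)):
        return "[\n" + to_tao(value, depth + 1) + "\n" + "    " * depth + "]"
    return "[" + to_tao(value) + "]"


def to_tao(value, depth=0):
    """Render JSON-like Python data (dict, list, scalars) as TAO text."""
    pad = "    " * depth
    if isinstance(value, dict):
        # One "key [value]" entry per line, with no separators between entries.
        return "\n".join(
            pad + escape(str(key)) + " " + bracket(item, depth)
            for key, item in value.items()
        )
    if isinstance(value, list):
        # List items become unnamed trees.
        return "\n".join(pad + bracket(item, depth) for item in value)
    # Scalars are written as plain text, so type information is not kept.
    return escape(str(value))


songs = {
    "songs": [
        {"title": "Scarborough Fair / Canticle", "length": "3:10"},
        {"title": "Patterns", "length": "2:45"},
    ]
}
print(to_tao(songs))
```

For this input the printed text follows the same layout as the TAO example above.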

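Because TAO needs no separators between entries or list items, adding or removing one touches only its own lines. TAO's encoding therefore tends to produce cleaner line-based diffs than JSON, where the comma on a neighboring line often has to change as well.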

With S-expressions

The following S-expression²:

(defun factorial (x)
   (if (zerop x)
       1
       (* x (factorial (- x 1)))))

can be translated into TAO as:

defun [factorial [x]
   if [zerop [x]
       1`
       * [x` factorial [-[x` 1]]]
    ]
]

TAO can be used to build a minimal programming language that addresses a syntactic problem of Lisp: the first element of a list (the car) is moved in front of the bracket, which gives a more natural and readable encoding of function application.

A TAO-based programming language can also address another major syntactic issue of Lisp and other programming languages by allowing spaces in identifiers, a natural naming convention that replaces the various compromise conventions such as camelCase, snake_case, or kebab-case.

At the same time TAO maintains the minimal spirit and power of S-expressions.
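
The idea of moving the car in front of the bracket can be sketched in Python for plain function-application forms. This is only an illustration under stated assumptions: the function name to_tao_call is hypothetical, S-expressions are modeled as nested Python lists of strings, and the backtick placed after an atom argument simply mirrors the example above without claiming what a TAO-based language would take it to mean. Special forms such as the defun header are not covered.

```python
def to_tao_call(expr):
    """Render a nested-list S-expression as 'head [args]' TAO text.

    Assumptions: every list is non-empty and starts with an atom (a string);
    an atom argument that is not last is followed by a backtick, as in the
    factorial example.
    """
    if isinstance(expr, str):
        return expr  # atoms are written as-is
    head, *args = expr  # the car moves in front of the bracket
    parts = []
    for index, arg in enumerate(args):
        piece = to_tao_call(arg)
        if isinstance(arg, str) and index < len(args) - 1:
            piece += "`"  # separator after a non-final atom
        parts.append(piece)
    return head + " [" + " ".join(parts) + "]"


# (* x (factorial (- x 1)))
product = ["*", "x", ["factorial", ["-", "x", "1"]]]
print(to_tao_call(product))  # prints: * [x` factorial [- [x` 1]]]
```

Apart from whitespace, the printed line matches the corresponding line of the translated factorial.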

With HTML

The first example above uses XML to encode raw data, a purpose for which XML, being a markup language, was not originally designed.

HTML is a close relative which prevails as a good enough solution for the markup syntax of the World Wide Web. This is due to its ability to naturally mix structure into freeform text with low escape friction.

An HTML fragment like this for example:

<p>TAO (<a href="tao.html">Tree Annotation Operator</a>) is a unique and extremely simple 
<a href="tas.html">Tree Annotation Syntax</a> which can be used in 
<a href="contexts.html">all kinds of contexts</a>. It is easily readable and writable 
by humans and machines and has only three basic constructs</p>

looks like freeform text with parts marked up as hyperlinks. It is still a tree (since a tree is the ultimate syntactic structure), but less obviously so than a typical piece of data.

In this domain HTML's appearance is a more natural fit. It still has problems, such as redundancy, which is one of the reasons it is being replaced in some contexts by more lightweight markup syntaxes. However, these are much more specialized and not suitable as the underlying syntax of the World Wide Web.

A lightweight and generic syntax would therefore be ideal. This is what TAO might offer:

[p`>TAO ([a` href[tao.html]`>Tree Annotation Operator]) is a unique and extremely simple 
[a` href[tas.html]`>Tree Annotation Syntax] which can be used in 
[a` href[contexts.html]`>all kinds of contexts]. It is easily readable and writable 
by humans and machines and has only three basic constructs]

This is the same fragment as above, closely translated to TAO, without changing the structure or trying for extra compactness. Even a simple translation like this removes some redundancy, such as the repeated tag names in closing tags, and opens up avenues for lifting restrictions and gradually evolving HTML into a better markup language.

Escaping and raw taos

In TAO only three special symbols ever need escaping, each by prefixing it with a backtick:

`` or `[ or `]

The equivalent JSON string:

"` or [ or ]"

Multiline strings are the default and only type in TAO. They are absent in JSON and are not part of standard S-expressions.
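
The escaping rule for the three special symbols can be sketched in Python as follows. The function names escape_tao and unescape_tao are hypothetical, and the sketch assumes that a backtick always makes the character after it literal.

```python
def escape_tao(text):
    """Prefix each backtick and square bracket with a backtick."""
    return "".join("`" + ch if ch in "`[]" else ch for ch in text)


def unescape_tao(text):
    """Undo escape_tao: drop each escaping backtick, keep the next character."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "`":
            ch = next(chars, "")  # a trailing lone backtick yields nothing
        out.append(ch)
    return "".join(out)


escaped = escape_tao("` or [ or ]")
print(escaped)  # prints: `` or `[ or `]
```

Multiline text passes through unchanged, because line breaks are not among the escaped symbols.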

When even this minimal escaping is too much, TAO offers the simple idea of raw taos, in which the special characters need to be escaped only in a few edge cases.

raw taos can be used to embed documents in any format
for example JSON`: raw [
{
  "id": 1,
  "tags": [
    "Foo"
  ]
}
]
or XML`: raw [
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"></xs:schema>
]
or S-expressions`: raw [
(defun factorial (x)
  (if (zerop x) 1
    (* x (factorial (- x 1)))))
]

Other examples

See the interactive TAO parser for some more examples and to try out your own.


  1. Based on an example from Wikipedia.

  2. Example from Wikipedia.