Syntax

Living in a structural world is nice, but sometimes we need to convert structural content to or from a plain-text representation, e.g. for file I/O and the system clipboard. Neomacs provides some low-level hooks and a table-driven parser framework for this conversion.

(De)serialization functions

Interface to parsers:

Variable: *dom-output*
A DOM node to accumulate result of parsers.
Function: read-dom (stream &optional recursive-p)
Read and build DOM nodes from STREAM using *syntax-table*.
If RECURSIVE-P is t, the call is expected to come from a function that was itself called from read-dom, for example a function bound in *syntax-table*.
If *dom-output* is bound, append the results as children of *dom-output*. Otherwise, return the results as a list of DOM nodes.
Function: read-dom-from-file (file)
Read and build DOM nodes from FILE using *syntax-table*.
Reading continues until all of FILE's content is consumed.
If *dom-output* is bound, append the results as children of *dom-output*. Otherwise, return the results as a list of DOM nodes.
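
The "append if bound, otherwise return" convention described above relies on a special variable that is declared but left unbound by default. The following is a minimal sketch of that convention in plain Common Lisp, not Neomacs's implementation: *sink*, emit-results, demo-unbound and demo-bound are hypothetical names, "nodes" are plain strings, and returning nil in the bound case is an assumption, since the text above does not say what read-dom returns then.

```lisp
;; Hypothetical stand-ins: *SINK* plays the role of *dom-output*, and
;; "nodes" are plain strings collected in a list.
(defvar *sink*)                      ; declared special, deliberately unbound

(defun emit-results (results)
  "Append RESULTS to *SINK* if it is bound; otherwise return RESULTS."
  (if (boundp '*sink*)
      (progn (setf *sink* (append *sink* results))
             nil)                    ; assumption: nothing useful is returned
      results))

(defun demo-unbound ()
  ;; *SINK* is unbound here, so the results come back as a list.
  (emit-results (list "a" "b")))

(defun demo-bound ()
  ;; The caller supplies the accumulator by binding *SINK* dynamically.
  (let ((*sink* (list "x")))
    (emit-results (list "a" "b"))
    *sink*))
```

In this model, (demo-unbound) returns the two results directly, while (demo-bound) returns the caller's accumulator with the two results appended after "x".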

The following hooks implement the actual (de)serialization:

Standard generic function: read-dom-aux (buffer stream)
Read and build DOM nodes from STREAM for BUFFER.
The resulting DOM nodes should be appended as children of *dom-output*. Reading can stop at whatever boundary makes sense, i.e. multiple children can be built and appended.
Standard generic function: write-dom-aux (buffer node stream)
Serialize DOM NODE to STREAM for BUFFER.
It should be possible to read the result back with read-dom-aux.

Note that parsers use append-child to append children to *dom-output* directly, i.e. they use low-level DOM edits instead of editing-primitives. It is therefore important to never bind *dom-output* to live DOM nodes in any buffer, which would result in corrupted state (inconsistency between Lisp-side and renderer-side DOM). Typically, one wants to call the parser to build DOM nodes outside any buffer, then use insert-nodes to insert them into some buffer if needed.

Table-driven parser

Neomacs provides a parser framework driven by syntax tables, which bind characters to functions that consume some (more) characters and build DOM nodes. The parser reads a character from the input stream, looks it up in *syntax-table*, and calls the bound function with two arguments: the stream and the character. If the character is not bound to any function, the special value t is looked up in *syntax-table* next, and the function (if any) bound to t is used as the default. If t has no binding either, an error is signaled.
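
The lookup rule above (character first, then t as the default, otherwise an error) can be modeled in a few lines of plain Common Lisp. This is only a sketch of the dispatch rule, not Neomacs's implementation: the table is an ordinary hash table, handlers return values instead of appending DOM nodes, and lookup-handler, dispatch-all, make-demo-table and demo-dispatch are hypothetical names.

```lisp
;; Simplified model of the dispatch rule only. All names are hypothetical.

(defun lookup-handler (table char)
  "Return the handler for CHAR in TABLE, falling back to the T entry."
  (or (gethash char table)
      (gethash t table)
      (error "No syntax binding for ~S and no default (t) binding." char)))

(defun dispatch-all (table stream)
  "Read STREAM to its end, calling one handler per dispatched character.
Return the list of values the handlers produced, in order."
  (loop for char = (read-char stream nil nil)
        while char
        collect (funcall (lookup-handler table char) stream char)))

(defun make-demo-table ()
  (let ((table (make-hash-table)))
    ;; A newline becomes a :br marker.
    (setf (gethash #\Newline table)
          (lambda (stream char)
            (declare (ignore stream char))
            :br))
    ;; Default entry: any other character becomes a one-character string.
    (setf (gethash t table)
          (lambda (stream char)
            (declare (ignore stream))
            (string char)))
    table))

(defun demo-dispatch (string)
  (with-input-from-string (stream string)
    (dispatch-all (make-demo-table) stream)))
```

Each handler receives the stream as well as the character, so a handler is free to consume further input before returning; the demo handlers above simply do not use that freedom.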

Function, setf-able: get-syntax-table (char table)
Get function bound to CHAR in TABLE.
Function: set-syntax-range (table beg end symbol)
Bind characters with char-codes between BEG and END in TABLE to SYMBOL.
Function: make-syntax-table (&rest bindings)
Make a syntax table using BINDINGS.
Variable: *syntax-table*
The syntax table currently in effect.

It's quite common to read consecutive characters of the same "category" as a string and process it as a whole, possibly building a DOM element or inserting it as text. The following functions help with such cases:

Function: read-constituent (stream symbol escape-chars)
Read consecutive characters from STREAM and return them as a string.
Read stops when a character not bound to SYMBOL in *syntax-table* is encountered, with one exception: a character in the list ESCAPE-CHARS makes the next character accepted unconditionally.
Function: append-text (parent string)
Append STRING as a text node to PARENT.
If PARENT already has a text node as its last child, concatenate STRING onto that text node instead.
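
The stopping rule of read-constituent can be sketched in plain Common Lisp as follows. This is a model under stated assumptions, not the Neomacs source: read-run and demo-category are hypothetical names, the CATEGORY-OF argument stands in for the *syntax-table* lookup, and the model assumes that the escape character itself is kept in the returned string regardless of its own binding, which the description above does not spell out.

```lisp
;; Hypothetical model. CATEGORY-OF stands in for the *syntax-table* lookup
;; and returns the symbol a character is bound to.
(defun read-run (stream symbol escape-chars category-of)
  "Read consecutive characters bound to SYMBOL from STREAM as a string.
Assumption: a character in ESCAPE-CHARS is kept, and the character after
it is accepted unconditionally."
  (with-output-to-string (out)
    (loop for char = (peek-char nil stream nil nil)
          while char
          do (cond ((member char escape-chars)
                    ;; Keep the escape character, then take the next
                    ;; character (if any) without checking its category.
                    (write-char (read-char stream) out)
                    (let ((next (read-char stream nil nil)))
                      (when next (write-char next out))))
                   ((eq (funcall category-of char) symbol)
                    (write-char (read-char stream) out))
                   ;; First character of another category: leave it unread.
                   (t (loop-finish))))))

(defun demo-category (char)
  "Toy classification: letters are WORD, everything else is OTHER."
  (if (alpha-char-p char) 'word 'other))
```

Because the model peeks before it reads, the character that ends the run stays in the stream for the next handler. For example, reading from the text ab\ cd ef with backslash as the only escape character consumes ab\ cd and leaves the following space unread.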

Some functions suitable for binding in syntax tables:

Function: read-newline (stream c)
Append a newline node (br element) to *dom-output*.
Function: read-text (stream c)
Read consecutive characters and append as text node to *dom-output*.
Function: read-ignore (stream c)
Do nothing.