Plenty has happened since I last blogged here, including banging my head against Haskell's text rendering libraries (hoped to be blogging about that...) & rejigging my plans for Rhapsode & similar browser engines. Now I'm actively taking steps towards turning Rhapsode into a decent browser you can install as an app from your package repositories, as opposed to a browser engine waiting for others to build upon. As part of that I'm commisioning art to have something to display onscreen!
This display will be delivered as a Parametric SVG file so I wrote some code to implement that SVG extension such that I can render it in a GTK UI. This code, written in Haskell to reuse the same XML parser I've been using & bridged over into Vala via their C foreign-function interfaces, lowers the parametric aspects before handing it off to GDK Pixbuf & rSVG.
It does so by traversing the XML tree (exposed as strings to Vala) interpreting every attribute with a supported namespace, generating a value to store in the corresponding unnamespaced attribute. Overriding the attribute if it already exists. The interpretor, where the bulk of the logic is, references a mapping of values populated from the Vala code.
Interpreting a formula is split into 3 phases: lex, parse, & evaluate. Lexing involves using reads
, span
, & character-checks to split the input text up into numbers, backtick-quoted strings, alphabetical identifiers, symbols, & equals-suffixed (comparison) symbols whilst ignoring whitespace. These all become strings again when parsed into an operator.
Lexing strings is more involved since:
To lex these string templates my implementation utilizes a co-recursive sublexer which treats this as syntactic sugar for a parenthesized #
concatenation operator. The normal lexer handles }
specially to hand the trailing text back to the template-string lexer.
Parsing is done using a Pratt/TDOP parser, which was a pleasure to write in Haskell! Since pattern-matching allowed me to more cleanly separate the layers-of-abstraction (parser vs lexer) than in a typical OO language, & Haskell's normal parsing weakness wasn't as much of a problem here.
A Top-Down Operator Precedance (TDOP) parser is a parsing technique very well suited to operator-heavy languages like mathematics (hence why I desugared template-strings!). It is centred around a core reentrant loop which parses an initial token & all subsequant tokens with a greater "binding power". Helper functions/methods are used to parse the initial token(s) (nud
), parse subsequant tokens (led
), & determine the binding power for those tokens (lbp
):
parse' rbp (tok:toks) = go $ nud tok toks
where
go (left, tok':toks')
| rbp < lbp tok' = go $ led left tok' toks'
| otherwise = (left, tok':toks')
This is paired with a helper function to simplify recursive calls:
parseWith rbp toks cb = let (expr, toks') = parse' rbp toks in (cb expr, toks')
Usually the led
function, & to a lesser degree the nud
function, wraps a recursive call back into that core loop whilst constructing an AST (Abstract Syntax Tree). Though there are exceptions for literal values, variable names, & postfix operators like !
for computing factorials. Also parentheses messes with the binding powers & expects to be closed, whether used as a prefix or (for function calls) infix operator.
To close all in-progress loops infinite EOF (which has the lowest binding power, alongside close-paren) tokens are appended to the lexing results. Unknown operators also have zero precedance such that they'll stop the parser & allow for their syntax errors to be detected.
That parser yields a binary tree to be postorder-traversed (if the parser didn't emit any Error
s) to yield boolean, floating point number, or string value. Or (upon failed typechecks or unknown variables) nulls. A Haskell-datatype is defined that can hold any of those. Numbers can be converted to strings, whilst booleans & nulls yields further nulls instead.
Usually it calls one of two helper function to map each operator/function to the corresponding Haskell-equivalent. Though the ternary (condition ? iftrue : iffalse
) operator is handled specially even if ?
& :
aren't by the parser. Same for the ,
(which I ensured was right-associative) & (
operators function calls consist of, these get restructured into a function name & list of pre-evaluated argument values to simplify the helper function.
The parser has minimal understanding of the different operators, mostly just enough to assign them a binding power.
I don't know whether Parametric SVG is a feature I'll ever expose to webdevs, but I enjoyed implementing it. And it makes it easier to design programmatic art which'll both make Rhapsode's visuals more interesting & more accessible. I look forward to sharing these visuals once they are ready!
Atypically Rhapsode's experience has been designed audio-first, with visuals only added due to the requirements of most app platforms!