
---
layout: post
title: Building a text layout engine in Haskell
author: Jaro
date: 2023-03-19 21:28:02+1300
---

For about two months now, I have been building Balkón, a Haskell library for inline text layout to become part of the Argonaut Stack.

Balkón has now reached its first milestone where it is able to correctly handle line breaking.

In this blog post, I will summarise what that entailed and what I learned along the way.

#The building blocks

Balkón uses the HarfBuzz library to handle text shaping, which is the process of taking a Unicode input text and converting it to appropriate font glyphs. It involves kerning, ligatures, cursive forms, and other complex stuff. Adrian provides the Haskell language bindings for HarfBuzz.

Additionally, the International Components for Unicode libraries are used for querying character metadata (writing direction and script) and for finding appropriate line break boundaries. The Haskell language bindings for ICU are provided by Bryan O'Sullivan.

Balkón itself is then expected to work alongside CatTrap, which defines the boxes that text will be rendered in, and Typograffiti, which finally paints the glyphs onto an output medium.

#The task

Before a piece of text can be passed to HarfBuzz, all of its characters must share the same properties -- font, language, script, and direction. Balkón has to break the text down into smaller runs where these properties are constant, pass each run to HarfBuzz, then put all the results together.
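
As a rough illustration of this splitting step, here is a minimal sketch that groups adjacent characters with equal properties into runs. The `Props` record and the `toRuns` function are hypothetical names made up for this example, and strings stand in for real font handles and script codes; Balkón's actual types may look quite different.

```haskell
import Data.Function (on)
import qualified Data.List.NonEmpty as NE

-- Hypothetical types for illustration only.
data Direction = LTR | RTL
  deriving (Eq, Show)

data Props = Props
  { propFont      :: String    -- stand-in for a font handle
  , propLanguage  :: String    -- e.g. a language tag such as "en"
  , propScript    :: String    -- e.g. a script code such as "Latn"
  , propDirection :: Direction
  } deriving (Eq, Show)

-- | Group adjacent characters that share the same properties into runs.
-- Each resulting run could then be shaped on its own.
toRuns :: [(Char, Props)] -> [(Props, String)]
toRuns = map toRun . NE.groupBy ((==) `on` snd)
  where
    -- Every group is non-empty, so taking its head is safe.
    toRun grp = (snd (NE.head grp), map fst (NE.toList grp))
```

Only adjacent characters are merged, so a text that switches from one script to another and back again yields three runs rather than two.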

In order to know how much space each text run takes up on a line, Balkón has to sum all glyph advances within each run.

When this sum of glyph advances exceeds a given maximum, Balkón needs to find an appropriate line break. This can happen at the boundary of two text runs, but it often happens in the middle of a text run, which then needs to be split in two.
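
The search for a break point can be sketched as follows. This is a simplified model, not Balkón's implementation: it assumes one glyph per character, integer advances, and break opportunities given as glyph indices (a break at index `i` puts the first `i` glyphs on the current line). The names `findBreak` and `breakLine` are invented for the example.

```haskell
-- | Largest allowed break index whose preceding glyphs still fit
-- within the maximum width, or Nothing if no allowed break fits.
findBreak :: Int -> [Int] -> [Int] -> Maybe Int
findBreak maxWidth advances breakOffsets =
  case filter fits breakOffsets of
    [] -> Nothing
    ok -> Just (maximum ok)
  where
    -- widths !! i is the total advance of the first i glyphs.
    widths = scanl (+) 0 advances
    fits i = i > 0 && i <= length advances && widths !! i <= maxWidth

-- | Split a run of advances into the part that stays on the line
-- and the part that moves to the next line.
breakLine :: Int -> [Int] -> [Int] -> ([Int], [Int])
breakLine maxWidth advances breakOffsets
  | sum advances <= maxWidth = (advances, [])
  | otherwise =
      case findBreak maxWidth advances breakOffsets of
        Just i  -> splitAt i advances
        Nothing -> (advances, [])  -- no usable break: let the line overflow
```

In a real engine the break opportunities come from the text (for example from ICU) rather than from glyph indices, so an extra step is needed to map text offsets to glyphs, and the second half of a split run generally has to be shaped again.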

When returning the shaped text to the caller, Balkón needs to use the glyph advances and the line breaking information to tell the caller how each run should be positioned in relation to the paragraph. This is necessary so that Typograffiti can draw each glyph in the correct position, as well as for hit testing.
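
For a single line, the positioning can be pictured as a running sum over run widths. The sketch below assumes that the runs are already in visual order from left to right and that widths are integers; `runOffsets` and `hitTest` are hypothetical helpers, not part of Balkón's API.

```haskell
-- | Pair each run width with the x offset of the run's left edge,
-- laying the runs out from left to right.
runOffsets :: [Int] -> [(Int, Int)]
runOffsets widths = zip (scanl (+) 0 widths) widths

-- | Hit testing: index of the run that contains the given x coordinate,
-- or Nothing if the coordinate falls outside all runs.
hitTest :: Int -> [Int] -> Maybe Int
hitTest x widths =
  case [ i
       | (i, (start, w)) <- zip [0 ..] (runOffsets widths)
       , x >= start
       , x < start + w
       ] of
    (i : _) -> Just i
    []      -> Nothing
```

The same offsets serve both purposes mentioned above: the renderer adds them to glyph positions, and hit testing searches them in the opposite direction.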

#Using Haskell for the job

The layout engine's tasks can be described as a pure function, one which takes a piece of text and some properties as input, and produces positioned glyphs and some additional position information as its output. This makes it a good fit for a purely functional programming language like Haskell, but that also brings its own challenges.

Pros: You do not have to worry (too much) about calculating values that will not be used. Just describe all the possible options, and with some care, Haskell's laziness will only evaluate what is needed. Automated tests are easy to write and quick to run because there are no side-effects.

Cons: It can be difficult to come up with appropriate data structures and use them well. Pure Haskell has no mutable structures. Sometimes you just want to edit the end of a list, but instead, you have to traverse a list from its beginning, reconstruct it with a changed value, and then make sure to propagate the changed list to wherever it will be needed. I know that there are some ways to make this easier, but you cannot beat random access.
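
To make the list problem concrete, here is a small sketch of changing the last element of a list, next to one of the ways to make it easier: `Data.Sequence` from the `containers` package, which gives cheap access to both ends. This only illustrates the trade-off; it is not a claim about which structure Balkón uses, and the function names are made up.

```haskell
import Data.Sequence (Seq, ViewR (..), fromList, viewr, (|>))

-- | Apply a function to the last element of a list.
-- The whole list is traversed and rebuilt along the way.
modifyLast :: (a -> a) -> [a] -> [a]
modifyLast _ []       = []
modifyLast f [x]      = [f x]
modifyLast f (x : xs) = x : modifyLast f xs

-- | The same operation on a Seq, which can look at its last
-- element without walking through the rest.
modifyLastSeq :: (a -> a) -> Seq a -> Seq a
modifyLastSeq f s =
  case viewr s of
    EmptyR    -> s
    rest :> x -> rest |> f x

-- | Example value: the last element is incremented.
example :: Seq Int
example = modifyLastSeq (+ 1) (fromList [1, 2, 3])
```

Either way, the result is a new value that still has to be passed along to whoever needs it, which is the part that mutable structures let you skip.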

#Cultural bias

Thanks to Unicode, computers can handle text in many different languages and scripts.

Still, programming languages and libraries often show a bias, primarily for English -- a language written from left to right using the Latin script. English is used as the primary language of the standards, the technical documentation, the mathematical papers behind Haskell, and indeed, even this very blog post.

The left-to-right bias has been the most noticeable when implementing Balkón. Even though HarfBuzz accepts text input in the order in which it is read, the list of glyphs that it outputs is always ordered visually from left to right. This is "a matter of design" according to the documentation, although no rationale is given.

Even Haskell is complicit in the LTR bias: infix operator associativity is either "left" or "right", which assumes left-to-right writing, and traversing a list is often interpreted as going from left to right, which is supposed to mean from its first element (head) to its last element.

I try to at least not make things worse than they are, so I make an effort to only use the words "left" and "right" to refer to output coordinates, not to memory structures.

#Confusing output

The glyph positions in the output from HarfBuzz (for horizontal text) initially confused me, for two reasons.

First, what units do the x_advance values use? Are these dependent on font size, or should I scale them myself?

As it turns out, both of these options are possible.

By default, HarfBuzz uses the font's internal units. These "font units" are defined by the font manufacturer and stored as the units-per-em or UPEM constant within the font face, with common values being 1000 or 2048, and used to define the grid that glyph outlines are placed on. I could read this font constant and scale the glyphs myself, but a better way is to set the scale on the font structure directly. You tell HarfBuzz how large you want the EM square to be, in whatever units you want to use, and HarfBuzz scales all values accordingly. This lets you forget about font units completely.
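
For comparison, the manual route is a simple proportion. The helper below is a hypothetical illustration of the arithmetic that setting the font scale makes unnecessary; it is not a HarfBuzz function.

```haskell
-- | Convert a value in font units to output units, given the font's
-- units-per-em (UPEM) and the desired size of the em square.
scaleFontUnits :: Int -> Double -> Int -> Double
scaleFontUnits upem emSize units =
  fromIntegral units * emSize / fromIntegral upem
```

With a UPEM of 2048 and an em square of 16 units, an advance of 1024 font units comes out as 1024 × 16 / 2048 = 8 output units.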

The second reason for confusion was that besides x_advance, all the other positioning values were zero. How do I know how much vertical space the glyphs should take?

HarfBuzz does not say. But you can query the font's intrinsic metrics through it, specifically its ascent and descent, which is what web browsers use to determine the "normal" line height.

You can either use the "normal" line height directly, or you can add space above and below to fit a line height set by the user.

In traditional CSS behaviour, the spaces above and below a line are each called a "half-leading" (pronounced /lɛdɪŋ/) and are equally sized. This is rather unconventional in the world of typography, and if you have ever struggled with aligning text in multiple fonts using CSS, this may be the reason. CSS has a new proposed property named text-edge, which should add more line positioning modes to make this easier for designers (and of course to make work more difficult for text engine developers).
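
The half-leading arithmetic can be sketched like this. The function names are hypothetical, and both ascent and descent are assumed to be positive distances from the baseline (font APIs often report the descent as a negative number, in which case it needs to be negated first).

```haskell
-- | Half-leading: the difference between the requested line height and
-- the font's ascent plus descent, split equally above and below.
-- The result is negative when the line height is smaller than the font.
halfLeading :: Double -> Double -> Double -> Double
halfLeading ascent descent lineHeight =
  (lineHeight - (ascent + descent)) / 2

-- | Distances from the baseline to the top and bottom of the line,
-- after the half-leading has been added on both sides.
lineExtents :: Double -> Double -> Double -> (Double, Double)
lineExtents ascent descent lineHeight =
  (ascent + hl, descent + hl)
  where
    hl = halfLeading ascent descent lineHeight
```

For a font with an ascent of 12 and a descent of 4, a line height of 20 gives a half-leading of 2, so the line extends 14 above the baseline and 6 below it.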

#Dependency management

Finding versions of software that work well together is an eternal problem, and Haskell is no exception.

One of the first things that Adrian and I had to do when starting up was to agree on what versions of GHC and HarfBuzz we would be using. The version of HarfBuzz from both Debian and Ubuntu's stable distributions turned out to be too old! We are currently trying to make it work in the unreleased Debian 12.

#Moving forward

Now that Balkón outputs some numbers that look right, it would certainly be nice to turn them into graphical output and make sure that they look like proper text.

The next feature to add to Balkón will be splitting paragraphs into "screenfuls".

#Acknowledgements and further reading

"Text rendering hates you" and "Text layout is a loose hierarchy of segmentation" are two articles that have served as important reading material. Adrian has also written a lengthy write-up on the subject of Text Layout & Rendering.