---
layout: post
title: Writing Language Bindings
author: Adrian Cochrane
date: 2023-02-05 18:18:37+1300
---

I may be striving for simplicity in my browser engine development, but I will not do so at the cost of inclusivity. As such I've spent the past couple of months (alongside coordinating a growing project) developing pure-functional language bindings for the best available font libraries covering the widest breadth of written languages, namely FontConfig & Harfbuzz. Excellent bindings already existed for LibICU, and I wrapped FreeType in an OpenGL-accelerated renderer.

FontConfig is the library used by your free desktop to query which fonts you have installed, selecting an appropriate font file for the family, style, size, etc. you've selected. Harfbuzz determines which "glyphs" from that font should represent each run of a few consecutive characters & positions them relative to the previous glyph. FreeType decodes a wide variety of font formats, and LibICU analyses Unicode text, providing input on how to break lines or normalize writing direction. Together, alongside the line-wrapping abstraction of Pango, these are the libraries used to render any UI text in your GTK apps. Not to mention their heavy use elsewhere!

The challenge is that these libraries are implemented in the imperative language C (or C++) whereas I'm using the functional language Haskell. So the experience was much like adapting a book into a movie, taking a well-loved story and attempting to tell it in a very different medium. Some things transliterated directly, other things didn't. I could have written straightforward imperative language bindings for these libraries to be used with Haskell's monads (like the FreeType language bindings I pulled in), or I could've even automatically generated such source code. But then I'd lose the joy of writing idiomatic Haskell!

Furthermore, when calling C code I lose the memory-safety guarantees provided by Haskell. I have to carefully avoid segfaults, etc., especially if I'm promising Haskell that this code is "pure", with no side effects. Thankfully all these APIs (except FreeType) promise us that they're thread-safe!
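To make that "pure" promise concrete, here is a minimal sketch of the general technique, not code from these bindings. It assumes libc's `strlen` as a stand-in for a real font-library call; the wrapper name `cLength` is made up for illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))
import System.IO.Unsafe (unsafePerformIO)

-- Raw import of libc's strlen (an assumption: it merely stands in for
-- a thread-safe, side-effect-free C function from a font library).
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical pure wrapper. unsafePerformIO is the promise to the
-- compiler that the C call has no observable side effects.
-- withCString copies the String into a temporary NUL-terminated
-- buffer which is released as soon as the callback returns.
cLength :: String -> Int
cLength str = unsafePerformIO $
  withCString str $ \ptr -> fromIntegral <$> c_strlen ptr
```

If the C function turned out to mutate shared state or to be thread-unsafe, this promise would be broken and the compiler would be free to reorder or duplicate the call, which is why the thread-safety guarantees matter.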

# Harfbuzz

Harfbuzz adapted very nicely to the functional paradigm, with its single central function doing all the complex computation! Even if it took some effort to safely & conveniently adapt its datastructures. Several were substituted with common Haskell datastructures (including those from the containers, text, & bytestring packages on Hackage) with conversion routines.

Buffers largely corresponded to the Lazy Text type, but they have several additional fields. So I declared a pure-Haskell equivalent which could be conveniently used via the Record syntax, with conversion to/from the C type, as well as routines to extract the attached output.
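As a rough illustration of that record-with-defaults style, the sketch below declares a pure-Haskell buffer. The type, field, and value names (`Buffer`, `bufText`, `defaultBuffer`, and so on) are hypothetical and chosen for this example; they are not necessarily the names the bindings export.

```haskell
import qualified Data.Text.Lazy as LT

-- Hypothetical subset of the extra per-buffer properties.
data Direction = LeftToRight | RightToLeft | TopToBottom | BottomToTop
  deriving (Eq, Show)

-- Hypothetical pure-Haskell buffer: lazy text plus optional properties.
data Buffer = Buffer
  { bufText      :: LT.Text
  , bufDirection :: Maybe Direction -- Nothing: let the shaper guess
  , bufLanguage  :: Maybe String    -- e.g. a BCP 47 tag such as "ar"
  , bufScript    :: Maybe String    -- e.g. an ISO 15924 tag such as "Arab"
  } deriving (Eq, Show)

-- Defaults, so callers only name the fields they care about.
defaultBuffer :: Buffer
defaultBuffer = Buffer
  { bufText      = LT.empty
  , bufDirection = Nothing
  , bufLanguage  = Nothing
  , bufScript    = Nothing
  }

-- Record-update syntax overrides just the relevant fields.
sample :: Buffer
sample = defaultBuffer
  { bufText      = LT.pack "hello"
  , bufDirection = Just LeftToRight
  , bufLanguage  = Just "en"
  }
```

The design choice here is that new fields can be added to the record later without breaking callers who start from `defaultBuffer`.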

I took care to keep the C & Haskell heaps largely separate, stack-allocating as much of the C data as I could, to ensure Haskell's GC didn't free data Harfbuzz was actively processing. This took some confidence-building as I explored which techniques were safe to use!
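One common way to get that kind of scoped, stack-like allocation in Haskell is `alloca` from `Foreign.Marshal.Alloc`. The sketch below assumes libm's `frexp` as a stand-in C function with an out-parameter; `splitFloat` is a made-up name, and nothing here is taken from the Harfbuzz bindings themselves.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..), CInt (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)
import System.IO.Unsafe (unsafePerformIO)

-- double frexp(double x, int *exp): writes the exponent through a pointer.
foreign import ccall unsafe "math.h frexp"
  c_frexp :: CDouble -> Ptr CInt -> IO CDouble

-- Hypothetical wrapper. The temporary C int exists only inside the
-- alloca callback; its contents are copied into an ordinary Haskell
-- value before the scope ends, so no pointer outlives the call.
splitFloat :: Double -> (Double, Int)
splitFloat x = unsafePerformIO $
  alloca $ \expPtr -> do
    mantissa <- c_frexp (realToFrac x) expPtr
    expo     <- peek expPtr
    pure (realToFrac mantissa, fromIntegral expo)
```

Because the pointer never escapes the callback, there is no window in which the garbage collector could release memory that the C side still holds.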

Though I do wrap Harfbuzz Fonts & Faces in garbage-collected "Foreign Pointers". I expose all their getter routines normally, but to enforce an illusion of Haskell-like immutability the setters can only be called upon construction via "Font Option" records.
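The shape of that design can be sketched as follows, under heavy assumptions: a `malloc`'ed C int stands in for the Harfbuzz font object, and `Font`, `FontOptions`, `optionPPEm`, `createFont`, and `fontPPEm` are hypothetical names for illustration only.

```haskell
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc)
import Foreign.Storable (peek, poke)
import System.IO.Unsafe (unsafePerformIO)

-- Stand-in for an opaque C object owned by the garbage collector.
newtype Font = Font (ForeignPtr CInt)

-- Hypothetical options record: the only way to reach the "setters".
data FontOptions = FontOptions
  { optionPPEm :: Maybe Int }

defaultFontOptions :: FontOptions
defaultFontOptions = FontOptions { optionPPEm = Nothing }

-- All mutation happens here, before the value is visible to anyone else.
createFont :: FontOptions -> Font
createFont opts = unsafePerformIO $ do
  ptr <- malloc
  poke ptr (maybe 0 fromIntegral (optionPPEm opts))
  -- finalizerFree calls C's free() once the Font becomes unreachable.
  Font <$> newForeignPtr finalizerFree ptr

-- Getters stay pure because nothing can change the object afterwards.
fontPPEm :: Font -> Int
fontPPEm (Font fptr) = unsafePerformIO $
  withForeignPtr fptr $ \ptr -> fromIntegral <$> peek ptr
```

A caller would write something like `createFont (defaultFontOptions { optionPPEm = Just 12 })` and from then on only ever read from the result.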

Harfbuzz exposes very few public C structs, most of which have fields all of the same bitwidth. This made language-binding them unusually trivial! Though for a couple I did resort to using the derive-storable package from Hackage.
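To show why uniform field widths make this easy, here is a hand-written `Storable` instance for a struct shaped like Harfbuzz's glyph position. It assumes the C layout is four 32-bit position fields followed by one 32-bit private slot; the Haskell names (`GlyphPos`, `xAdvance`, and so on) are hypothetical, and this is a sketch rather than the bindings' actual instance.

```haskell
import Data.Int (Int32)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical Haskell mirror of the public fields.
data GlyphPos = GlyphPos
  { xAdvance, yAdvance, xOffset, yOffset :: Int32 }
  deriving (Eq, Show)

-- With every field 32 bits wide, the struct can be treated as a plain
-- array of Int32 and indexed by position (assumed layout: 5 slots).
instance Storable GlyphPos where
  sizeOf _    = 5 * sizeOf (0 :: Int32)
  alignment _ = alignment (0 :: Int32)
  peek ptr = do
    let fields = castPtr ptr :: Ptr Int32
    GlyphPos <$> peekElemOff fields 0
             <*> peekElemOff fields 1
             <*> peekElemOff fields 2
             <*> peekElemOff fields 3
  poke ptr (GlyphPos xa ya xo yo) = do
    let fields = castPtr ptr :: Ptr Int32
    pokeElemOff fields 0 xa
    pokeElemOff fields 1 ya
    pokeElemOff fields 2 xo
    pokeElemOff fields 3 yo
    pokeElemOff fields 4 0 -- zero the private slot

-- Write a value to temporary C memory and read it straight back.
roundTrip :: GlyphPos -> IO GlyphPos
roundTrip pos = alloca $ \ptr -> poke ptr pos >> peek ptr
```

Structs with mixed field widths need per-field offsets and padding, which is where a deriving helper starts to pay off.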

# FontConfig

FontConfig is essentially a domain-specific database engine. It was a bit harder to develop language bindings for FontConfig, though the skills I learnt from Harfbuzz transferred over. Heck, it has some structs which I'm sure would've tripped up many language-binding autogenerators, requiring me to write a bit of C!

FontConfig consists mostly of imperative collection datatypes, for which I wrote converters to & from the datatypes of containers or linear. Alongside some more primitive types these are stored within a "Value" tagged-union, for which I wrote a typeclass converting from select static types to these dynamic types. This typeclass did prove convenient!
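A static-to-dynamic typeclass of that kind might look roughly like the sketch below. The `Value` constructors cover only a small subset of what FontConfig can store, and the names `ToValue`, `toValue`, and `fromValue` are assumptions made for this example.

```haskell
-- Hypothetical, reduced version of the dynamic tagged union.
data Value
  = ValueVoid
  | ValueInt Int
  | ValueDouble Double
  | ValueBool Bool
  deriving (Eq, Show)

-- Static Haskell types that know how to enter and leave the union.
class ToValue a where
  toValue   :: a -> Value
  fromValue :: Value -> Maybe a

instance ToValue () where
  toValue () = ValueVoid
  fromValue ValueVoid = Just ()
  fromValue _         = Nothing

instance ToValue Int where
  toValue = ValueInt
  fromValue (ValueInt n) = Just n
  fromValue _            = Nothing

instance ToValue Double where
  toValue = ValueDouble
  fromValue (ValueDouble d) = Just d
  fromValue _               = Nothing

instance ToValue Bool where
  toValue = ValueBool
  fromValue (ValueBool b) = Just b
  fromValue _             = Nothing
```

Callers then write `toValue (12 :: Int)` rather than choosing a constructor by hand, and a mismatched `fromValue` yields `Nothing` instead of crashing.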

In turn these Values are stored in multimap "Patterns", which may be gathered in a "Font Set". These also needed converters between the C types & their Haskell equivalents (note: there's a heisenbug in the fontset decoder: when I remove certain debugging statements, the code I was attempting to debug segfaults; help is appreciated!). Only once I had converters for all these types could I language-bind the functions I wished to use! I littered these bindings with calls to convert FontConfig's rare errors to exceptions.
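Converting C status codes to exceptions can be sketched like this. The numbering assumes FontConfig's result enum runs match, no match, type mismatch, no id, out of memory as 0 through 4; treat that ordering as an assumption to check against the header. `FcException`, `throwResult`, and `checked` are hypothetical names.

```haskell
import Control.Exception (Exception, throwIO, try)

-- Hypothetical exception type mirroring the non-success results.
data FcException
  = ErrNoMatch
  | ErrTypeMismatch
  | ErrNoId
  | ErrOutOfMemory
  | ErrUnknown Int
  deriving (Eq, Show)

instance Exception FcException

-- Turn a raw result code into either success or a thrown exception.
-- Code 0 is assumed to mean "match", i.e. success.
throwResult :: Int -> IO ()
throwResult 0 = pure ()
throwResult 1 = throwIO ErrNoMatch
throwResult 2 = throwIO ErrTypeMismatch
throwResult 3 = throwIO ErrNoId
throwResult 4 = throwIO ErrOutOfMemory
throwResult n = throwIO (ErrUnknown n)

-- Callers that want to handle a failure can recover it with try.
checked :: Int -> IO (Either FcException ())
checked = try . throwResult
```

Since these failures are rare, exceptions keep the common path free of error plumbing while still letting a caller catch a specific case.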

Furthermore, in these FontConfig language bindings I added a bridge from my own Haskell Stylist and bridges to both Harfbuzz (there's an unfortunate dependency conflict between a CSS lexer & my Harfbuzz language bindings over the Text datastructure) & FontConfig (based on FcFT's code).

# "Typograffiti" & FreeType

It wasn't until I hooked these language bindings up to a text renderer that I could actually see that they were working! Not that I wasn't testing beforehand, but I had to guess that the output they were giving me looked reasonable. As such I'm surprised they worked pretty much unmodified!

Typograffiti is a pre-existing optimized text renderer built on FreeType & OpenGL. A while back I attempted to contribute retrofits to Typograffiti to support a broader range of written languages via Harfbuzz. Now I've reimplemented its foundations to do so, and more-or-less resurrected the old high-level optimizations & API upon them. This exposes new font-specific text-formatting options offered by Harfbuzz, though a newer version of Harfbuzz was required to transfer some of this data to FreeType without incurring segfaults.

I made sure those new text-formatting APIs were convenient to use, though honestly I haven't tested them. Help would be appreciated there! As it would be for figuring out how to align glyph baselines so they look less ugly. And I might have found a GHC bug when testing FontConfig alongside SDL, OpenGL, & FreeType...

Typograffiti uses machine-generated OpenGL (with its own minor abstractions) & FreeType language bindings. I didn't feel a need to write more Haskelly language-bindings here, since I was using them for I/O where Haskell's relatively weak.

# Conclusion

I strived to make these language bindings' APIs feel like normal modules implemented entirely in Haskell, whilst being recognizable to developers who used the C APIs. I strived to make it safe to call these widely-used font libraries from pure-functional Haskell code. I am glad to see LibICU has already gotten similar treatment.

There are still issues that can be addressed, which I mentioned above, though seemingly not in the Harfbuzz bindings. Contributions are welcome!

But now me & Jaro can get on to enjoying writing pure-functional Haskell, benefitting from all the internationalization which has gone into these excellent libraries!