From 8eaa3d2e60e76db668986bd88471646157251500 Mon Sep 17 00:00:00 2001
From: Adrian Cochrane
Date: Sun, 5 Feb 2023 18:21:16 +1300
Subject: [PATCH] Blog about my language binding efforts!

---
 _posts/2023-02-05-language-bindings.md | 50 ++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 _posts/2023-02-05-language-bindings.md

diff --git a/_posts/2023-02-05-language-bindings.md b/_posts/2023-02-05-language-bindings.md
new file mode 100644
index 0000000..727d362
--- /dev/null
+++ b/_posts/2023-02-05-language-bindings.md
@@ -0,0 +1,50 @@
+---
+layout: post
+title: Writing Language Bindings
+author: Adrian Cochrane
+date: 2023-02-05 18:18:37+1300
+---
+
+I may be striving for [simplicity](https://www.codementor.io/@joniwieben/on-the-importance-of-simplicity-in-software-uj5y3tnw6) in my browserengine development, but I **will not** do so at the [cost of inclusivity](https://www.baldurbjarnason.com/2022/ootsc-introduction/). As such I've spent the past couple of months (alongside coordinating a growing project) developing [pure-functional](https://www.fpcomplete.com/blog/2017/04/pure-functional-programming/) [language bindings](https://book.realworldhaskell.org/read/interfacing-with-c-the-ffi.html) for the best available [font](https://www.computerhope.com/jargon/f/font.htm) [libraries](https://www.techopedia.com/definition/3828/software-library) covering the [widest breadth of written languages](https://invidious.snopyta.org/watch?v=0j74jcxSunY), namely [FontConfig](https://www.freedesktop.org/wiki/Software/fontconfig/) & [Harfbuzz](https://harfbuzz.github.io/). Excellent bindings [already existed](https://hackage.haskell.org/package/text-icu) for [LibICU](https://icu.unicode.org/), and I wrapped [FreeType](https://freetype.org/) in an [OpenGL](https://www.opengl.org/)-accelerated [renderer](https://faultlore.com/blah/text-hates-you/).
+
+FontConfig is the library used by your [free desktop](https://freedesktop.org/) to query which [fonts you have installed](https://linuxconfig.org/how-to-install-and-manage-fonts-on-linux), selecting an appropriate [fontfile](https://www.howtogeek.com/325644/whats-the-difference-between-a-font-a-typeface-and-a-font-family/) for the family, style, size, etc. you've requested. Harfbuzz determines which "[glyphs](https://www.creativelive.com/blog/6-typography-terms-that-get-confused/)" from that font [should represent](https://harfbuzz.github.io/what-is-harfbuzz.html#what-is-text-shaping) every few consecutive characters & positions them relative to the previous glyph. FreeType, meanwhile, can decode a [wide variety of fontformats](https://freetype.org/freetype2/docs/index.html), and LibICU analyses [Unicode text](https://www.unicode.org/standard/WhatIsUnicode.html), providing input on how to [break lines or normalize writing direction](https://raphlinus.github.io/text/2020/10/26/text-layout.html). Together, alongside the [line-wrapping abstraction](https://docs.gtk.org/Pango/pango_rendering.html) of [Pango](https://pango.gnome.org/), these are the libraries used to render any UI text in your [GTK](https://gtk.org/) apps. Not to mention their [heavy use](https://en.wikipedia.org/wiki/HarfBuzz#Users) elsewhere!
+
+The challenge is that these libraries are implemented in the [imperative language](https://www.zachgollwitzer.com/posts/imperative-programming) [C](https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html) (or [C++](https://isocpp.org/)) whereas I'm using the functional language [Haskell](https://haskell.org/). So the experience was much like [adapting](https://mckellen.com/cinema/lotr/journal.htm) a book into a movie, taking a well-loved story and attempting to tell it in [a very different medium](https://www.helpingwritersbecomeauthors.com/5-important-ways-storytelling-different-books-vs-movies/). Some things transliterated directly, others didn't. I could have written straightforward imperative language bindings for these libraries to be used with [Haskell's monads](http://www.learnyouahaskell.com/a-fistful-of-monads) (like the [FreeType language bindings](https://hackage.haskell.org/package/freetype2) I pulled in); I could've even [automatically generated](https://hackage.haskell.org/package/HSFFIG) such source code. But then I'd lose the [joy of writing idiomatic Haskell](https://www.haskellforall.com/2020/10/why-i-prefer-functional-programming.html)!
+
+Furthermore, when calling C code I lose the [memory-safety guarantees provided by Haskell](https://scribe.rip/pragmatic-programmers/haskell-safety-e7c8db58f542). I have to carefully [avoid segfaults, etc.](https://nerdyelectronics.com/most-common-pitfalls-in-c/) Especially if I'm promising Haskell that this code is "[pure](https://codesweetly.com/pure-function-vs-impure-function/)", with [no side-effects](https://dzone.com/articles/side-effects-1). Thankfully all these [APIs](https://www.techopedia.com/definition/24407/application-programming-interface-api) (except FreeType) promise us that they're [thread-safe](https://web.mit.edu/6.005/www/fa15/classes/20-thread-safety/)!
+
+## Harfbuzz
+[Harfbuzz adapted](https://hackage.haskell.org/package/harfbuzz-pure) *very* nicely to the functional paradigm, with its [single central function](https://harfbuzz.github.io/shaping-and-shape-plans.html) doing all the [complex computation](https://harfbuzz.github.io/what-does-harfbuzz-do.html)! Even if it took some effort to safely/conveniently adapt [its datastructures](https://harfbuzz.github.io/harfbuzz-hb-common.html). Several were substituted with common Haskell datastructures (including those from the [`containers`](https://hackage.haskell.org/package/containers), [`text`](https://hackage.haskell.org/package/text), & [`bytestring`](https://hackage.haskell.org/package/bytestring) Hackage packages) with conversion routines.
+
+[Buffers](https://harfbuzz.github.io/buffers-language-script-and-direction.html) largely corresponded to the [Lazy Text](https://hackage.haskell.org/package/text-2.0.1/docs/Data-Text-Lazy.html) type, but they have [several additional fields](https://harfbuzz.github.io/setting-buffer-properties.html). So I declared a pure-Haskell equivalent which could be conveniently used via the [Record syntax](https://www.haskellforall.com/2020/07/record-constructors.html), with conversion to/from the C type, as well as routines to extract the [attached output](https://harfbuzz.github.io/harfbuzz-hb-buffer.html#hb-buffer-get-glyph-infos).
+
+I took care to keep the C & Haskell heaps largely separate, [stack-allocating](https://hackage.haskell.org/package/base-4.17.0.0/docs/Foreign-Marshal-Alloc.html#v:alloca) as much of the C data as I could, to ensure [Haskell's GC](https://wiki.haskell.org/GHC/Memory_Management) didn't [free data Harfbuzz was actively processing](https://owasp.org/www-community/vulnerabilities/Using_freed_memory). This took some confidence-building as I explored which techniques were safe to use!
+
+Though I do wrap Harfbuzz [Fonts & Faces](https://harfbuzz.github.io/fonts-and-faces.html) in garbage-collected "[Foreign Pointers](https://hackage.haskell.org/package/base-4.17.0.0/docs/Foreign-ForeignPtr.html)". I expose all their getter routines normally, but to enforce an illusion of Haskell-like [immutability](https://mmhaskell.com/blog/2017/1/9/immutability-is-awesome) the setters can only be called upon construction via "[Font Option](https://hackage.haskell.org/package/harfbuzz-pure-1.0.2.0/docs/Data-Text-Glyphize.html#t:FontOptions)" records.
+
+Harfbuzz exposes very few public [C structs](https://www.programiz.com/c-programming/c-structures), most of which have fields all of the same bitwidth. This made language-binding them [unusually trivial](https://www.toolsqa.com/data-structures/array-in-programming/)! Though for a couple I did resort to using the [`derive-storable`](https://hackage.haskell.org/package/derive-storable) package.
+
+## FontConfig
+FontConfig is essentially a domain-specific database engine. It was a bit harder to develop [language bindings for FontConfig](https://hackage.haskell.org/package/fontconfig-pure), though the skills I learnt from Harfbuzz transferred over. Heck, it has [some structs](https://www.freedesktop.org/software/fontconfig/fontconfig-devel/x31.html#AEN61) which I'm sure [would've tripped up](https://www.freedesktop.org/software/fontconfig/fontconfig-devel/x31.html#AEN53) many language-binding autogenerators, [requiring me](https://www.freedesktop.org/software/fontconfig/fontconfig-devel/fcpatternadd.html) to write a bit of C!
+
+FontConfig consists mostly of [imperative collection datatypes](https://www.freedesktop.org/software/fontconfig/fontconfig-devel/x31.html), for which I wrote converters to & from the datatypes of `containers` or [`linear`](https://hackage.haskell.org/package/linear). Alongside some more primitive types these are stored within a "Value" [tagged-union](https://aayushacharya.com.np/blog/discriminated-unions-c/), for which I wrote a [typeclass](https://book.realworldhaskell.org/read/using-typeclasses.html) converting from select [static types to these dynamic types](https://instil.co/blog/static-vs-dynamic-types/). This typeclass *did* prove convenient!
+
+In turn these Values are stored in [multimap "Patterns"](https://www.freedesktop.org/software/fontconfig/fontconfig-devel/x31.html#AEN58), which may be gathered in a "Font Set". These also needed converters between the C types & Haskell equivalents (note: there's a [heisenbug](https://en.wikipedia.org/wiki/Heisenbug) in the fontset decoder: when I remove certain [debugging statements](https://jvns.ca/blog/2022/12/08/a-debugging-manifesto/), the bug I was attempting to squash [segfaults](https://kb.iu.edu/d/aqsj); help is appreciated!). Only once I had converters for all these types could I language-bind the functions I wished to use! I littered these bindings with calls to convert FontConfig's [rare errors](https://www.freedesktop.org/software/fontconfig/fontconfig-devel/x31.html#AEN93) to [exceptions](https://www.tweag.io/blog/2020-04-16-exceptions-in-haskell/).
+
+Furthermore, in these FontConfig language bindings I added a bridge from my own [Haskell Stylist](https://hackage.haskell.org/package/stylist-traits) to both Harfbuzz (there's an unfortunate dependency conflict between a [CSS lexer](https://hackage.haskell.org/package/css-syntax) & my Harfbuzz language bindings over the Text datastructure) & FontConfig (based on [FcFT](https://codeberg.org/dnkl/fcft/)'s code).
+
+## "Typograffiti" & FreeType
+It wasn't until I hooked these language-bindings up to a text-renderer that I could actually see that they were working! Not that I wasn't testing beforehand, but I had to guess that the output they were giving me **looked** reasonable. As such I'm surprised they worked *pretty much unmodified*!
+
+[Typograffiti](https://hackage.haskell.org/package/typograffiti) is a pre-existing optimized text renderer built on FreeType & OpenGL. I attempted to contribute retrofits to Typograffiti a while back to support a broader range of written languages via Harfbuzz. Now I've reimplemented its foundations to do so, and more-or-less resurrected the old high-level optimizations & API upon it. This exposes new font-specific text-formatting options offered by Harfbuzz, though a newer version of Harfbuzz was required to transfer some of this data to FreeType without incurring segfaults.
+
+I made sure those new text-formatting APIs were convenient to use, though honestly I haven't tested them. Help would be appreciated there! As it would for [figuring out how to align glyph baselines](https://stackoverflow.com/questions/62374506/how-do-i-align-glyphs-along-the-baseline-with-freetype#62375274) so they look less ugly. And I might have found a [GHC bug](https://downloads.haskell.org/~ghc/4.06/docs/users_guide/hard-core-debug.html) when testing FontConfig alongside [SDL](https://hackage.haskell.org/package/sdl2), OpenGL, & FreeType...
+
+Typograffiti uses machine-generated OpenGL (with its own minor abstractions) & FreeType language bindings. I didn't feel a need to write more Haskelly language-bindings here, since I was using them for [I/O where Haskell's relatively weak](https://book.realworldhaskell.org/read/io.html).
+
+## Conclusion
+I strived to make these language bindings' APIs feel like normal modules implemented entirely in Haskell, whilst being recognizable to developers who have used the C APIs. I strived to make it safe to call these widely-used font libraries from pure-functional Haskell code. I am glad to see LibICU has already gotten similar treatment.
+
+There are still issues to be addressed, which I mentioned above, though seemingly not in the Harfbuzz bindings. Contributions are welcome!
+
+But now Jaro & I can get on to enjoying writing pure-functional Haskell, benefitting from all the internationalization which has gone into these excellent libraries!
-- 
2.30.2