Planet Igalia

May 16, 2024

Andy Wingo

on hoot, on boot

I realized recently that I haven’t been writing much about the Hoot Scheme-to-WebAssembly compiler. Upon reflection, I have been too conscious of its limitations to give it verbal tribute, preferring to spend each marginal hour fixing bugs and filling in features rather than publicising progress.

In the last month or so, though, Hoot has gotten to a point that pleases me. Not to the point where I would say “accept no substitutes” by any means, but good already for some things, and worth writing about.

So let’s start today by talking about bootie. Boot, I mean! The boot, the boot, the boot of Hoot.

hoot boot: temporal tunnel

The first axis of boot is time. In the beginning, there was nary a toot, and now, through boot, there is Hoot.

The first boot of Hoot was on paper. Christine Lemmer-Webber had asked me, ages ago, what I thought Guile should do about the web. After thinking a bit, I concluded that it would be best to avoid compromises when building an in-browser Guile: if you have to pollute Guile to match what JavaScript offers, you might as well program in JavaScript. JS is cute of course, but Guile is a bit different in some interesting ways, the most important of which is control: delimited continuations, multiple values, tail calls, dynamic binding, threads, and all that. If Guile’s web bootie doesn’t pack all the funk in its trunk, probably it’s just junk.

So I wrote up a plan, something to which I attributed the name tailification. In retrospect, this is simply a specific flavor of a continuation-passing-style (CPS) transmutation, late in the compiler pipeline. I’ll elocute more in a future dispatch. I did end up writing the tailification pass back then; I could have continued to target JS, but it was sufficiently annoying and I didn’t prosecute. It sat around unused for a few years, until Christine’s irresistible charisma managed to conjure some resources for Hoot.

In the meantime, the GC extension for WebAssembly shipped (woot woot!), and to boot Hoot, I filled in the missing piece: a backend for Guile’s compiler that tailified and then translated primitive operations to snippets of WebAssembly.

It was, well, hirsute, but cute and it did compute, so we continued to boot. From this root we grew a small run-time library, written in raw WebAssembly, used for slow-paths for the various primitive operations that are part of Guile’s compiler back-end. We filled out Guile primcalls, in minute commits, growing the WebAssembly runtime library and toolchain as we went.

Eventually we started constituting facilities defined in terms of those primitives, via a Scheme prelude that was prepended to all programs, within a nested lexical environment. It was never our intention though to drown the user’s programs in a sea of predefined bindings, as if the ultimate program were but a vestigial inhabitant of the lexical lake—don’t dilute the newt!, we would often say [ed: we did not]— so eventually when the prelude became unmanageable, we finally figured out how to do whole-program compilation of a set of modules.

Then followed a long month in which I would uproot the loot from the boot: take each binding from the prelude and reattribute it into an appropriate module. User code could import all the modules that suit, as long as they were known to Hoot, but no others; it wasn’t until we added the ability for users to programmatically constitute an environment from their modules that Hoot became a language implementation of any repute.

Which brings us to the work of the last month, about which I cannot be mute. When you have existing Guile code that you want to distribute via the web, Hoot required you to transmute its module definitions into the more precise R6RS syntax. Precise, meaning that R6RS modules are static, in a way that Guile modules, at least in absolute terms, are not: Guile programs can use first-class accessors on the module system to pull out bindings. This is yet another example of what I impute as the original sin of 1990s language development, that modules are just mutable hash maps. You see it in Python, for example: because you don’t know for sure to what values global names are bound, it is easy for any discussion of what a particular piece of code means to end in dispute.

The question is, though, are the semantics of name binding in a language fixed and absolute? Once your language is booted, are its aspects definitively attributed? I think some perfection, in the sense of becoming more perfect or more like the thing you should be, is something to salute. Anyway, in Guile it would be coherent with Scheme’s lexical binding heritage to restitute some certainty as to the meanings of names, at least in a default compilation mode. Lexical binding is, after all, the foundation of the Macro Writer’s Statute of Rights. Of course if you are making a build for development purposes, not to distribute, then you might prefer a build that marks all bindings as dynamic. Otherwise I think it’s reasonable to require the user to explicitly indicate which definitions are denotations, and which constitute locations.

Hoot therefore now includes an implementation of the static semantics of Guile’s define-module: it can load Guile modules directly, and as a tribute, it also has an implementation of the ambient (guile) module that constitutes the lexical soup of modules that aren’t #:pure. (I agree, it would be better if all modules were explicit about the language they are written in, their imported bindings and so on, but there is an existing corpus to accommodate; the point is moot.)

The astute reader (whom I salute!) will note that we have a full boot: Hoot is a Guile. Not an implementation to substitute the original, but more of an alternate route to the same destination. So, probably we should scoot the two implementations together, to knock their boots, so to speak, merging the offshoot Hoot into Guile itself.

But do I circumlocute: I can only plead a case of acute Hoot. Tomorrow, we elocute on a second axis of boot. Until then, happy compute!

by Andy Wingo at May 16, 2024 08:01 PM

May 15, 2024

Manuel A. Fernandez Montecelo

WiFi back-ends in SteamOS 3.6

As of NetworkManager v1.46.0 (and for a very long while before that, since 1.12, released in 2018), there are two supported values for wifi.backend: wpa_supplicant (the default) and iwd (experimental).

iwd is still considered experimental after all these years; see the list of issues with the iwd backend, and especially this one.

In SteamOS, however, the default is iwd. It works relatively well in most scenarios, and its main advantage is that, when waking up the device (think of a Steam Deck returning from sleep, the "suspended to RAM" or S3 sleeping state), it regains network connectivity much faster -- typically 1-2 seconds vs. 5+ for wpa_supplicant.

However, if for some reason iwd doesn't work properly in your case, you can give wpa_supplicant a try.

With steamos-wifi-set-backend command

In 3.6, the back-end can be changed with a command.

3.6 has been available in Main channel since late March and in Beta/Preview for several days, for example 20240509.100.

If you haven't done this before: to be able to run the command, you must either log in via SSH (not available out of the box), or switch to desktop-mode, open a terminal and type the commands there.

Assuming that one wants to change from the default iwd to wpa_supplicant, the command is:

steamos-wifi-set-backend wpa_supplicant

The script tries to ensure that things are in place, stopping some systemd units and bringing up others as appropriate. After a few seconds, if it is able to connect correctly to the WiFi network, the device should get connectivity back (e.g. ping works). It is not necessary to re-enter the preferred networks and passwords; NetworkManager feeds the back-ends with the necessary configuration and credentials.

It is possible to switch back and forth between back-ends, alternating between iwd and wpa_supplicant as many times as desired, and without restarting the system.

The current back-end as per configuration can be obtained with:

steamos-wifi-set-backend --check

And finally, this is the output of changing to wpa_supplicant and back to iwd (i.e., the output that you should expect):

(deck@steamdeck ~)$ steamos-wifi-set-backend --check
iwd

(deck@steamdeck ~)$ steamos-wifi-set-backend wpa_supplicant
INFO: switching back-end to 'wpa_supplicant'
INFO: stopping old back-end service and restarting NetworkManager,
      networking will be disrupted (hopefully only momentary blip, max 10 seconds)...
INFO: restarting done
INFO: checking status of services ...
      (sleeping for 2 seconds to catch-up with state)
INFO: status OK

(deck@steamdeck ~)$ steamos-wifi-set-backend --check
wpa_supplicant

(deck@steamdeck ~)$ ping -c 3 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=56 time=16.4 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=56 time=24.8 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=56 time=16.5 ms

--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 16.370/19.195/24.755/3.931 ms

(deck@steamdeck ~)$ steamos-wifi-set-backend iwd
INFO: switching back-end to 'iwd'
INFO: stopping old back-end service and restarting NetworkManager,
      networking will be disrupted (hopefully only momentary blip, max 10 seconds)...
INFO: restarting done
INFO: checking status of services ...
      (sleeping for 2 seconds to catch-up with state)
INFO: status OK

(deck@steamdeck ~)$ steamos-wifi-set-backend --check
iwd

(deck@steamdeck ~)$ ping -c 3 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=56 time=16.0 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=56 time=16.5 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=56 time=21.3 ms

--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 15.991/17.940/21.295/2.382 ms

By hand

⚠️ IMPORTANT ⚠️ : Please do this only if you know what you are doing, you don't fear minor breakage, and you are confident that you can recover from the situation if something goes wrong. If unsure, it is better to wait until the channel that you run supports the command that handles this automatically.

If you are running a channel that does not yet have this command (e.g. Stable at the time of writing), you can achieve the same effect by executing the instructions run by this script by hand.

Note, however, that you can easily lose network connectivity if you make a mistake or something goes wrong. So be prepared to restore things by hand, by switching to desktop-mode and having a terminal ready.

Now, the actual instructions. In /etc/NetworkManager/conf.d/wifi_backend.conf, change the value of wifi.backend; for example, comment out the default line with iwd and add a line using wpa_supplicant as the value:

[device]
#wifi.backend=iwd
wifi.backend=wpa_supplicant
wifi.iwd.autoconnect=yes

⚠️ IMPORTANT ⚠️ : Leave the other lines alone. If you don't have experience with the format, you can easily cause the file to become unparsable, and then networking could stop working entirely.
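
If you want a simple safety net (a precaution of this write-up, not something the script or the official instructions require), keep a backup copy of the file before editing it, so that it can be restored from a terminal if networking breaks:

sudo cp /etc/NetworkManager/conf.d/wifi_backend.conf /etc/NetworkManager/conf.d/wifi_backend.conf.bak

# To restore the original configuration later:
sudo cp /etc/NetworkManager/conf.d/wifi_backend.conf.bak /etc/NetworkManager/conf.d/wifi_backend.conf
sudo systemctl restart NetworkManager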

After that, things get more complicated depending on the hardware on which the system runs.

If the device is a Steam Deck, there is a difference between the original LCD and the newer OLED model. With the OLED model, when you switch away from iwd, the network device is typically destroyed, so it's necessary to re-create it. This happens automatically when booting the system, so if the network doesn't come up within 15 seconds of following the instructions below, the easiest fix is to just restart the whole system.

If you're running SteamOS on Steam Deck models that don't exist at the time of writing, or on other types of computers, the situation may be similar to the OLED model's or even more complex, depending on the WiFi hardware and drivers. So, again, tread carefully.

In the simplest scenario, the LCD model, execute the following commands as root (or prepend sudo), where OLD_BACKEND is the back-end that you want to migrate away from, either iwd or wpa_supplicant:

systemctl stop NetworkManager
systemctl disable --now OLD_BACKEND
systemctl restart NetworkManager

If you have the OLED model and are switching from wpa_supplicant to iwd, that should be enough as well.

But if you have the OLED model and you changed from iwd to wpa_supplicant, as explained above, your system will not be able to connect to the network, and the easiest fix is to restart the system.

However, if you want to try to solve the problem by hand without restarting, then execute these commands as root/sudo:

systemctl stop NetworkManager
systemctl disable --now OLD_BACKEND

if ! iw dev wlan0 info &>/dev/null; then iw phy phy0 interface add wlan0 type station; fi

systemctl restart NetworkManager

In any case, if after trying to change the back-end by hand and rebooting the device is still not able to connect, the best thing to do is probably to restore the previous configuration, restart the system again, and wait until your channel is upgraded to include this command.

Change back-ends via UI?

In the future, it is conceivable that the UI will allow changing the WiFi back-end, with the help of this command or by other means, but this is not possible at the moment.

by Manuel A. Fernandez Montecelo at May 15, 2024 09:21 AM

May 13, 2024

Andy Wingo

partitioning pitfalls for generational collectors

You might have heard of long-pole problems: when you are packing a tent, its bag needs to be at least as big as the longest pole. (This is my preferred etymology; there are others.) In garbage collection, the long pole is the longest chain of object references; there is no way you can algorithmically speed up pointer-chasing of (say) a long linked-list.

As a GC author, some of your standard tools can mitigate the long-pole problem, and some don’t apply.

Parallelism doesn’t help: a baby takes 9 months, no matter how many people you assign to the problem. You need to visit each node in the chain, one after the other, and having multiple workers available to process those nodes does not get us there any faster. Parallelism does help in general, of course, but doesn’t help for long poles.

You can apply concurrency: instead of stopping the user’s program (the mutator) to enumerate live objects, you trace while the mutator is running. This can happen on mutator threads, interleaved with allocation, via what is known as incremental tracing. Or it can happen in dedicated tracing threads, which is what people usually mean when they refer to concurrent tracing. Though it does impose some coordination overhead on the mutator, the pause-time benefits are such that most production garbage collectors do trace concurrently with the mutator, if they have the development resources to manage the complexity.

Then there is partitioning: instead of tracing the whole graph all the time, try to just trace part of it and just reclaim memory for that part. This bounds the size of a long pole—it can’t be bigger than the partition you trace—and so tracing a graph subset should reduce total pause time.

The usual way to partition is generational, in which the main program allocates into a nursery, and objects that survive a few collections then get promoted into old space. But there may be other partitions, for example to put “large objects” (for example, bigger than a few virtual memory pages) in their own section, to be managed with their own algorithm.

partitions, take one

And that brings me to today’s article: generational partitioning has a failure mode which manifests itself as spurious out-of-memory. For example, in V8, running this small JavaScript program that repeatedly allocates a 1-megabyte buffer grows to consume all memory in the system and eventually panics:

while (1) new Uint8Array(1e6);

This is a funny result, to say the least. Let’s dig in a bit to see why.

First off, note that allocating a 1-megabyte Uint8Array makes a large object, because it is bigger than half a V8 page, which is 256 kB on most systems. There is a backing store for the array that will get allocated into the large object space, and then the JS object wrapper, which gets allocated in the young generation of the regular object space.

Let’s imagine the heap consists of the nursery, the old space, and a large object space (lospace, for short). (See the next section for a refinement of this model.) In the loop, we cause allocation in the nursery and the lospace. When the nursery starts to get full, we run a minor collection, to trace the newly allocated part of the object graph, possibly promoting some objects to the old generation.

Promoting an object adds to the byte count in the old generation. You could say that promoting causes allocation in the old-gen, though it might happen just on an accounting level if an object is promoted in place. In V8, old-gen allocation is subject to a limit, and that limit is set by a gnarly series of heuristics. Exceeding the limit will cause a major GC, in which the whole heap is traced.

So, minor collections cause major collections via promotion. But what if a minor collection never promotes? This can happen in request-oriented workflows, in which a request comes in, you handle it in JS, write out the result, and then handle the next. If the nursery is at least as large as the memory allocated in one request, no object will ever survive more than one collection. But in our new Uint8Array(1e6) example above, we are allocating to newspace and the lospace. If we never cause promotion, we will always collect newspace but never trigger the full GC that would also collect the large object space, triggering this out-of-memory phenomenon.
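
To make the accounting concrete, here is a toy model of the bookkeeping described above, in plain JavaScript (an illustration of the simplified heap model, not V8 code; the nursery size and wrapper size are made-up numbers):

let nurseryBytes = 0, oldGenBytes = 0, lospaceBytes = 0;
const NURSERY_LIMIT = 16e6; // assumed nursery size, for illustration only

function allocateTypedArray(backingSize) {
  nurseryBytes += 64;          // small JS wrapper object, allocated in the nursery
  lospaceBytes += backingSize; // large backing store, allocated in the lospace
  if (nurseryBytes > NURSERY_LIMIT) {
    // Minor GC: the nursery is emptied, but nothing in our loop survives,
    // so nothing is promoted and oldGenBytes never grows toward its limit.
    nurseryBytes = 0;
  }
}

for (let i = 0; i < 1e5; i++) allocateTypedArray(1e6);
console.log(oldGenBytes);  // still 0: a major GC is never triggered
console.log(lospaceBytes); // ~100 GB of uncollected lospace in this model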

partitions, take two

In general, the phenomenon we are looking for is nursery allocations that cause non-nursery allocations, where those non-nursery allocations will not themselves bring about a major GC.

In our example above, it was a typed array object with an associated lospace backing store, assuming the lospace allocation wouldn’t eventually trigger GC. But V8’s heap is not exactly like our model, for one because it actually has separate nursery and old-generation lospaces, and for two because allocation to the old-generation lospace does count towards the old-gen allocation limit. And yet, that simple loop does still blow through all memory. So what is the deal?

The answer is that V8’s heap now has around two dozen spaces, and allocations to some of those spaces escape the limits and allocation counters. In this case, V8’s sandbox makes the link between the JS typed array object and its backing store pass through a table of indirections. At each allocation, we make an object in the nursery, allocate a backing store in the nursery lospace, and then allocate a new entry in the external pointer table, storing the index of that entry in the JS object. When the object needs to get its backing store, it dereferences the corresponding entry in the external pointer table.
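
Schematically, the indirection looks something like the following (a sketch of the shape of the mechanism, not V8’s actual types or API):

// Illustration only: the nursery wrapper holds an index into the external
// pointer table rather than a raw pointer to its backing store.
const externalPointerTable = [];

function makeTypedArrayWrapper(backingStore) {
  const index = externalPointerTable.length;
  externalPointerTable.push(backingStore); // one new EPT entry per allocation
  return { backingStoreIndex: index };     // small wrapper object, in the nursery
}

function getBackingStore(wrapper) {
  // Dereferencing goes through the table, as described above.
  return externalPointerTable[wrapper.backingStoreIndex];
}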

Our tight loop above would therefore cause an allocation (in the external pointer table) that itself would not hasten a major collection.

Seen this way, one solution immediately presents itself: maybe we should find a way to make external pointer table entry allocations trigger a GC based on some heuristic, perhaps the old-gen allocated bytes counter. But then you have to actually trigger GC, and this is annoying and not always possible, as EPT entries can be allocated in response to a pointer write. V8 hacker Michael Lippautz summed up the issue in a comment that can be summarized as “it’s gnarly”.

In the end, it would be better if new-space allocations did not cause old-space allocations. We should separate the external pointer table (EPT) into old and new spaces. Because there is at most one “host” object that is associated with any EPT entry—if it’s zero objects, the entry is dead—then the heuristics that apply to host objects will do the right thing with regards to EPT entries.

A couple weeks ago, I landed a patch to do this. It was much easier said than done; the patch went through upwards of 40 revisions, as it took me a while to understand the delicate interactions between the concurrent and parallel parts of the collector, which were themselves evolving as I was working on the patch.

The challenge mostly came from the fact that V8 has two nursery implementations that operate in different ways.

The semi-space nursery (the scavenger) is a parallel stop-the-world collector, which is relatively straightforward... except that there can be a concurrent/incremental major trace in progress while the scavenger runs (a so-called interleaved minor GC). External pointer table entries have mark bits, which the scavenger collector will want to use to determine which entries are in use and which can be reclaimed, but the concurrent major marker will also want to use those mark bits. In the end we decided that if a major mark is in progress, an interleaved collection will neither mark nor sweep the external pointer table nursery; a major collection will finish soon, and reclaiming dead EPT entries will be its responsibility.

The minor mark-sweep nursery does not run concurrently with a major concurrent/incremental trace, which is something of a relief, but it does run concurrently with the mutator. When we promote a page, we also need to promote all EPT entries for objects on that page. To keep track of which objects have external pointers, we had to add a new remembered set, built up during a minor trace, and cleared when the page is swept (also lazily / concurrently). Promoting a page iterates through that set, evacuating EPT entries to the old space. This is additional work during the stop-the-world pause, but hopefully it is not too bad.

To be honest I don’t have the performance numbers for this one. It rides the train this week to Chromium 126 though, and hopefully it fixes this problem in a robust way.

partitions, take three

The V8 sandboxing effort has sprouted a number of indirect tables: external pointers, external buffers, trusted objects, and code objects. More recently, to better integrate with compressed pointers, a table was also added for V8-to-managed-C++ (Oilpan) references. I expect the Oilpan reference table will soon grow a nursery along the same lines as the regular external pointer table.

In general, if you have a generational partition of your main object space, it would seem that you almost always need a generational partition of every other space. Otherwise either you cause new allocations to occur in what is effectively an old space, perhaps needlessly hastening a major GC, or you forget to track allocations in that space, leading to a memory leak. If the generational hypothesis applies for a wrapper object, it probably also applies for any auxiliary allocations as well.

fin

I have a few cleanups remaining in this area but I am relieved to have landed this patch, and pleased to have spent time under the hood of V8’s collector. Many many thanks to Samuel Groß, Michael Lippautz, and Dominik Inführ for their review and patience. Until the next cycle, happy allocating!

by Andy Wingo at May 13, 2024 01:03 PM

May 12, 2024

Alex Bradbury

Notes from the Carbon panel session at EuroLLVM 2024

Last month I had the pleasure of attending EuroLLVM which featured a panel session on the Carbon programming language. It was recorded and of course we all know automated transcription can be stunningly accurate these days, yet I still took fairly extensive notes throughout. I often take notes (or near transcriptions) as I find it helps me process. I'm not sure whether I'm adding any value by writing this up, or just contributing entropy but here it goes. You should of course assume factual mistakes or odd comments are errors in my transcription or interpretation, and if you keep an eye on the LLVM YouTube channel you should find the session recording uploaded there in the coming months.

First, a bit of background on the session, Carbon, and my interest in it. Carbon is a programming language started by Google aiming to be used in many cases where you would currently use C++ (billed as "an experimental successor to C++"). The Carbon README gives a great description of its goals and purpose, but one point I'll highlight is that ease of interoperability with C++ is a key design goal and constraint. The project recognises that this limits the options for the language to such a degree that it explicitly recommends that if you are able to make use of a modern language like Go, Swift, Kotlin, or Rust, then you should. Carbon is intended to be there for when you need that C++ interoperability. Of course, (and as mentioned during the panel) there are parallel efforts to improve Rust's ability to interoperate with C++.

Whatever other benefits the Carbon project is able to deliver, I think there's huge value in the artefacts being produced as part of the design and decision making process so far. This has definitely been true of languages like Swift and especially Rust, where going back through the development history there's masses of discussion on difficult decisions e.g. (in the case of Rust) green threads, the removal of typestate, or internal vs external iterators. The Swift Evolution and Rust Internals forums are still a great read, but obviously there's little ability to revisit fundamental decisions at this point. I try to follow Carbon's development due to a combination of its openness, the fact it's early stage enough to be making big decisions (e.g. lambdas in Carbon), and also because they're exploring different design trade-offs than those other languages (e.g. the Carbon approach to definition-checked generics). I follow because I'm a fan of programming language design discussion, not necessarily because of the language itself, which may or may not turn out to be something I'm excited to program in. As a programming language development lurker I like to imagine I'm learning something by osmosis if nothing else - but perhaps it's just a displacement activity...

If you want to keep up with Carbon, then check out their recently started newsletter, issues/PRs/discussions on GitHub and there's a lot going on on Discord too (I find group chat distracting so don't really follow there). The current Carbon roadmap is of course worth a read too. In case I wasn't clear enough already, I have no affiliation to the project. From where I'm standing, I'm impressed by the way it's being designed and developed out in the open, more than (rightly or wrongly) you might assume given the Google origin.

Notes from the panel session

The panel "Carbon: An experiment in different tradeoffs" took place on April 11th at EuroLLVM 2024. It was moderated by Hana Dusíková and the panel members were Chandler Carruth, Jon Ross-Perkins, and Richard Smith. Rather than go through each question in the order they were asked I've tried to group together questions I saw as thematically linked. Anything in quotes should be a close approximation of what was said, but I can't guarantee I didn't introduce errors.

Background and basics of Carbon

This wasn't actually the first question, but I think a good starting point is 'how would you sell Carbon to a C++ user?'. Per Chandler, "we can provide a better language, tooling, and ecosystem without needing to leave everything you built up in C++ (both existing code and the training you had). The cost of leveraging Carbon should have as simple a ramp as possible. Sometimes you need performance, we can give more advanced tools to help you get the most performance from your code. In other cases, it's security and we'll be able to offer more tools here than C++. It's having the ability to unlock improvements in the language without having to walk away from your investments in C++, that of course isn't going anywhere."

When is it going to be done? Chandler: "When it's ready! We're trying to put our roadmap out there publicly where we can. It's a long term project. Our goal for this year is to get the toolchain caught up with the language design, get into building practical C++ interop this year. There are many unknowns in terms of how far we'll get this year, but next year I think you'll see a toolchain that works and you can do interesting stuff with and evaluate in a more concrete context."

As for the size of the team and how many companies are contributing, the conclusion was that Google is definitely the main backer right now but there are others starting to take a look. There are probably about 5-10 people active on Carbon in any different week, but it varies so this can be a different set of people from week to week.

Given we've established that Google are still the main backer, one line of questioning was about what Google see in it and how management were convinced to back it. Richard commented "I think we're always interested in exploring new possibilities for how to deal with our existing C++ codebase, which is a hugely valuable asset for us. That includes both looking at how we can keep the C++ language itself happy and our tools for it being good and maintainable, but also places we might take it in the future. For a long time we've been talking about if we can build something that works better for our use case than existing tools, and had an opportunity to explore that and went with it."

A fun question that followed later aimed to establish the favourite thing about Carbon from each of the panel members (of course, it's nice that much of this motivation overlaps with the reasons for my own interest!):

  • Jon: For me it's unlocking the ability to improve the developer experience. Improving the language syntax, making things more understandable. Also C++ is very slow to compile and we're hoping for 1/3rd of the compile time for a typical file vs Clang.
  • Richard: from a language design perspective, being able to go through and systematically evaluate every decision and look at them again using things that have been learned since then.
  • Chandler: It's not so much the systematic review for me. We're building this with an eye to C++, and to integrate with C++ need a fairly big complex language (generics, inheritance, overloading etc etc). But when we look at this stuff, we keep finding principled and small/focused designs that we can use to underpin the sort of language functionality that is in C++ but fits together in a coherent and principled way. And we write this down. I think a lot of people don't realise that Carbon is writing down all the design as we go rather than having some mostly complete starting point and then making changes to that using something like an RFC process.

Finally, (for this section) there was also some intriguing discussion about Carbon's design tradeoffs. This is covered mostly elsewhere, but I think Chandler's answer focusing on the philosophy of Carbon rather than technical implementation details fits well here: "One of the philosophical goals of our language design is we don't try to solve hard problems in the compiler. [In other languages] there's often a big problem and the compiler has to solve it, freeing the programmer from worrying about it. Instead we provide the syntax and ask the programmer to tell us. e.g. for type inference, you could imagine having Hindley-Milner type inference. That's a hard problem, even if it makes ergonomics better in some cases. So we use a simpler and more direct type system. You see this crop up all over the language design."

Carbon governance

The first question related to organisation of the project and its governance was about what has been done to make it easier for contributors to get involved. My interpretation of the responses is that although they've had some success in attracting new contributors, the feeling is that there's more that could be done here. e.g. from Jon "Things like maintaining the project documentation, writing a getting-started guide on how to build. We're making some different infrastructure choices, e.g. bazel, but in general we're trying to use things people are familiar with: GitHub, GH actions etc. There's probably still room for improvement. Listening to the LLVM newcomers session a couple of days ago gave me some ideas. Right now there's probably a lot of challenge in learning how the different model works in Carbon."

The governance model for Carbon in general was something Chandler had a lot to say about:

  • On getting it in place early: "I was keen we start off with it, seeing challenges with C++ or even LLVM where as a project gets big the governance model that worked well at the beginning might not work so well and changing it at that point can be really hard. So I wanted to set up something that could scale, and can be changed if needed."
  • Initial mistakes: "The initial model was terrible. It was very elaborate, a large core team of people that pulled in people that weren't even involved in the project. We'd have weekly/biweekly meetings, a complex consensus process. In some ways it worked, we were being intentional and it was clear there was a governance process. But it was so inefficient that people tried to avoid it. So we revamped it and simplified it, distilling it to something a little closer to the benevolent dictator for life model with minimal patches. Three leads rather than one to add some redundancy, one top-level governance body, will eventually have a rotation there."
  • The model now: "It's not that we'll try to agree on everything - if a change is uncontroversial then any one of the leads can approve, which gives us 3x bandwidth. Except in cases where there is controversy; then the three of us step back and discuss. I had my personal pet favourite feature of Carbon overturned by this governance model. The model allows us to make decisions quickly and easily, and most decisions are reversible if they turn out to be wrong. So we don't try to optimise for getting to the best outcome, rather to minimise cost for getting to an outcome. We're still early - we have a plan for scaling but that's more in the future."

Further questioning led to additional points being made about the governance model, such as the fact there are now several community members unaffiliated to Google with commit privileges, that the project goals are documented which helps contributors understand if a proposal is likely to be aligned with Carbon or not, and that these goals are potentially changeable if people come in with good arguments as to why they should be updated.

Carbon vs other languages

Naturally, there were multiple questions related to Carbon vs Rust. In terms of high-level comparisons, the panel members were keen to point out that Rust is a great language and is mature, while Carbon remains at an early stage. There are commonalities in terms of the aim to provide modern type system features to systems programmers, but the focus on C++ interoperability as a goal driving the language design is a key differentiator.

Thoughts about Rust naturally lead to questions about Carbon's memory safety story, and whether Carbon plans to have something like Rust's borrow checker. Richard commented "I think it's possible to get something like that. We're not sure exactly what it will look like yet, we're planning to look at this more between the 0.1 and 0.2 milestone. Borrow checking is an interesting option to pursue but there are some others to explore."

Concerns were also raised about whether, given that C++ interop is such a core goal of Carbon, it may have problems as C++ continues to evolve (perhaps with features that clash with features added in Carbon). Chandler's response was "As long as clang gets new C++ features, we'll get them. Similar to how Swift's interop works, linking the toolchain and using that to understand the functionality. If C++ starts moving in ways addressing the safety concerns etc that would be fantastic. We could shut down Carbon! I don't think that's very likely. In terms of functionality that would make interop not work well that's a bit of a concern, but of course if C++ adds something they need it to interop with existing C++ code so we face similar constraints." Richard commented "C++ modules is a big one we need to keep an eye on. But at the moment as modules adoption hasn't taken off in a big way yet, we're still targeting header files." There was also a brief exchange about what if Carbon gets reflection before C++, with the hope that if it did happen it could help with the design process by giving another example for C++ to learn from (just as Carbon learns from Rust, Swift, Go, C++, and others).

Compiler implementation questions

EuroLLVM is of course a compilers conference, so there was a lot of curiosity about the implementation of the Carbon toolchain itself. In discussing trade-offs in the design, Jon leapt into a discussion of the choice "to go for SemIR vs a traditional AST. In Clang the language is represented by an AST, which is what you're typically taught about how compilers work. We have a semantic IR model where we produce lex tokens, then a parse tree (slightly different to an AST) and then it becomes SemIR. This is very efficient and fast to process, lowers to LLVM IR very easily, but it's a different model and a different way to think about writing a compiler. To the earlier point about newcomers, it's something people have to learn and a bit of a barrier because of that. We try to address it, but it is a trade-off." Jon has since provided more insight on Carbon's implementation approach in a recent /r/ProgrammingLanguages thread.

Given what Carbon devs have learned about applying data oriented design to optimise the frontend (see talks like Modernizing Compiler Design for Carbon Toolchain), could the same ideas be applied to the backend? Chandler commented on the tradeoff he sees "By necessity with data-oriented design you have to specialise a lot. We think we can benefit from this a lot with the frontend at the cost of reusability. The trade-off might be different within LLVM."

When asked about whether the Carbon frontend may be reusable for other tools (reducing duplicated effort writing parsers for Carbon), Jon responded "I'm pretty confident at least through the parse tree it will be reusable. I'm more hesitant to make that claim through to SemIR. It may not be the best choice for everyone, and in some cases you may want more of an AST. But providing these tools as part of the language project like a formatter, migration tool [for language or API changes], something like clang-tidy, we're going to be taking on these costs ourselves, which is going to incentivise us to find ways of amortising this and finding the right balance."


Article changelog
  • 2024-05-13: Various phrasing tweaks and fixed improper heading nesting.
  • 2024-05-12: Initial publication date.

May 12, 2024 12:00 PM

May 09, 2024

Brian Kardell

Brian Specific Failures

Through my work in Interop, I've been learning stuff about WPT. Part of this has been driven by a desire to really understand the graph on the main page about browser specific failures (like: which tests?). You might think this is stuff I'd already know, but I didn't. So, I thought I'd share some of the things I learned along the way, just in case you didn't know either.

Did you know that wpt.fyi has lots of ways to query it? It does!

You can, for example, type nameOfBrowser:fail (or nameOfBrowser:pass) into the search box - and you can even string together several of them to ask interesting questions!

What tests fail in every browser?

It seemed like an interesting question: Are there tests that fail in every single browser?

A query for that might look like: chrome:fail firefox:fail edge:fail safari:fail (using the browsers that are part of Interop, for example). And, as of this writing, it tells me...

There are 2,724 tests (15,718 subtests) which fail in every single browser!

Wow. I didn't expect there to be zero, but that feels like kind of a lot of tests to fail in every browser, doesn't it? How can that even happen? To be totally honest, I don't know!

A chicken and egg problem?

The purpose of WPT is to test conformance to a standard, but this is... tricky. Tests can be used to discuss specs as they are being written - or highlight things that are yet to be implemented, and WPT doesn't have the same kind of guardrails as specs. Things can get committed there that aren't "real" or yet agreed upon. Frequently the first implementers and experimenters write tests. And, it's not uncommon that then things change - or sometimes the specs just never go anywhere. Maybe those tests sit in a limbo "hoping" to advance?

WPT asks that files for early things like this, which don't yet have agreement, be placed in a /tentative directory or have .tentative in their names.

Luckily, thanks to the query API, we can see which of the tests above fall into that category by adding is:tentative to our query (chrome:fail firefox:fail edge:fail safari:fail is:tentative). We can see that indeed

348 of the tests (1,911 subtests) that fail in every single browser are marked as tentative.

The query language supports negating parameters with an exclamation, so we can adjust the query (chrome:fail firefox:fail edge:fail safari:fail !is:tentative) and see...

2,372 tests (13,797 subtests) of the tests that fail in every single browser are not marked as tentative.

So, I guess .tentative doesn't explain most of it (or maybe some of those just didn't follow that guidance). What does? I don't know!

Poking around, I see there's even 1 ACID test that literally everyone fails. Maybe that's just a bad test at this point? Maybe all of the tests that match this query need to be reviewed? I don't know!

You can also query simply all(status:fail), which will show pretty much the same thing for whatever browsers you happen to be showing. I like the explicit version for introducing this, though, as it's very clear from the query itself which "all" I'm referring to.

Are there any tests that only pass in one browser?

I also wondered: Hmm... If there are tests that fail in exactly one browser, and we've just shown there's a bunch that pass in none, how many are there that pass in exactly one? That's pretty interesting to look at too:

Engines and Browsers: Tricky

Often when we're showing results we're using the 3 flagship browsers as a proxy for engine capabilities. We do that a lot - like in the Browser Specific Failures (BSF) graph at the top of wpt.fyi - but... It's an imperfect proxy.

For example, there are also tests that only fail in Edge. As of this writing 1,456 tests (1,727 subtests) of them.

Or, there are also tests that fail uniquely in WebKit GTK - 1,459 tests (1,740 subtests) of those.

Here's where there's a bit of a catch-22 though. BSF is really useful for the browser teams, so links like the above are actually very handy if you're on the Edge or GTK teams. But we can't add those to the BSF chart because it kind of really changes the meaning of the whole thing. What that kind of wants to be is "Engine Specific Failures", but that's not actually directly measurable.

Brian Specific Failures...

Below is an interesting table which links to query results that show actual tests that report as failing in only one of the three flagship browsers.

Browser   Non-tentative tests (subtests) which uniquely fail among these browsers
Chrome 1,345 tests (12,091 subtests)
Firefox 2,290 tests (16,791 subtests)
Safari 4,263 tests (14,867 subtests)

If that sounds like the data for the Browser Specific Failures (BSF) graph on the front page of wpt.fyi, yes. And no. I call this the "Brian Specific Failures" (BSF) table.

I think that this is about as close as we're probably going to get in the short term to a linkable set of tests that fail, if you'd like to explore them. The BSF graph is also, I believe, more discriminating than just the "pass" and "fail" that we're showing here. Like, what do you do if a test times out? Are there intermediate or currently unknown statuses?

It was also kind of interesting, for me at least, while looking at the data, to realize just how different the situation looks depending on whether you are looking at tests, or subtests. Firefox fails the most subtests, but Safari fails the most tests. BSF "scores" subtests as decimal values of a whole test.

It's pretty complicated to pull any of this off actually. Things go wrong with tests and runs, and so on. I also realized that all of these scores are inherently a little imperfect.

For example, if you look at the top of the page, it says (as I am writing this) that there are 55,176 tests (1,866,346 subtests).

But if you look to the bottom of the screen, the table has a summary with the passing/total subtests for each browser. As I am writing this it says:

chrome:  1,836,812 / 1,890,027
edge:    1,834,777 / 1,887,771
firefox: 1,786,949 / 1,868,316
safari:  1,787,112 / 1,879,926

You might note that none of those lists the same number of subtests!

Anyway, it was an interesting rabbit hole to dive into! I have further thoughts on the WPT.fyi dashboard page which I'll share in another post, but I'd love to hear of any interesting queries that you come up with or questions you have. I probably don't know the answers, but I'll try to help find them!

May 09, 2024 04:00 AM

Igalia Compilers Team

MessageFormat 2.0: A domain-specific language?

I recommend reading the prequel to this post, "MessageFormat 2.0: a new standard for translatable messages"; otherwise what follows won't make much sense!

This blog post represents the Igalia compilers team; all opinions expressed here are our own and may not reflect the opinions of other members of the MessageFormat Working Group or any other group.

What we've achieved #

In the previous post, I described the work done by the members of the MF2 Working Group to develop a specification for MF2, a process that spanned several years. When I joined the project in 2023, my task was to implement the spec and produce a back-end for the JavaScript Intl.MessageFormat proposal.

|-------------------------------|
| JavaScript code using the API |
|-------------------------------|
| browser engine                |
|-------------------------------|
| MF2 in ICU                    |
|-------------------------------|

For code running in a browser, the stack looks like this: web developers write JavaScript code that uses the API; their code runs in a JavaScript engine that's built into a Web browser; and that browser makes API calls to the ICU library in order to execute the JS code.

The bottom layer of the stack is what I've implemented as a tech preview. Since the major JavaScript engines are implemented in C++, I worked on ICU4C, the C++ version of ICU.

The middle level is not yet implemented. A polyfill makes it possible to try out the API in JavaScript now.
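
For a rough idea of what the top layer looks like to a web developer, here is a hypothetical usage sketch in JavaScript; the constructor and method signatures are still being worked out in the TC39 proposal, so the exact shapes below are assumptions rather than a stable API:

// Hypothetical sketch: build a formatter from MF2 source and format a message.
// The argument order and method names may differ in the final proposal.
const mf = new Intl.MessageFormat("Hello, {$name}!", "en");
console.log(mf.format({ name: "MessageFormat 2.0" }));
// expected output, roughly: "Hello, MessageFormat 2.0!"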

Understanding MF2 #

To understand the remainder of this post, it's useful to have read over a few more examples from the MF2 documentation.

Is MF2 a programming language? (And does it matter?) #

Different members of the working group have expressed different opinions on whether MF2 is a programming language:

  • Addison Phillips: "I think we made a choice (we can reconsider it if necessary, although I don't think we need to) that MF2 is not a resource format. I'm not sure if 'templating language' is the right characterization, but let's go with it for now." (comment on pull request 474, September 2023)
  • Stanisław Małolepszy: "...we’re neither resource format nor templating language. I tend to think we’re a storage format for variants." (comment on the same PR)
  • Mihai Niță: "For me, this is a DSL designed for i18n / l10n." (comment on PR 507)

MF2 lacks block structure and complex control flow: the .match construct can't be nested, and .local variable declarations are sequential rather than nested. Some people might expect a general-purpose programming language to include these features.
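
For instance, a complete message with sequential .local declarations followed by a .match looks something like this (an illustrative example of mine, not one from the spec; the variable names are made up):

.local $count = {$numUnread :number}
.local $title = {|Inbox|}
.match {$count :number}
0   {{{$title}: no unread messages.}}
one {{{$title}: {$count} unread message.}}
*   {{{$title}: {$count} unread messages.}}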

But in implementing it, I have found that several issues arise that are familiar from my experience studying, and implementing, compilers and interpreters for more conventional programming languages.

I wrote in part 1 that when using either MF1 or MF2, messages can be separated from code -- but it turns out that maybe, messages are code! But they constitute code in a different programming language, which can be cleanly separated from the host programming language, just as with uninterpreted strings.

Abstract syntax #

As with a programming language, the syntax of MF2 is specified using a formal grammar, in this case using Augmented Backus-Naur Form. In most compilers and interpreters, a parser (that's either generated from a specification of the formal grammar, or written by hand to closely follow that grammar) reads the text of the program and generates an abstract syntax tree (AST), which is an internal representation of the program.

MF2 has a data model, which is like an AST. Users can either write messages in the text-based syntax (probably easiest for most people) or construct a data model directly using part of the API (useful for things like building external tools).

Parsers and ASTs are familiar concepts for programming language implementors, and writing the "front-end" for MF2 (the parser, and classes that define the data model) was not that different from writing a front-end for a programming language. Because it works on ASTs, the formatter itself (the part that does all the work once everything is parsed) also looks a lot like an interpreter. The "back-end", producing a formatted result, is currently quite simple since the end result is a flat string -- however, in the future, "formatting to parts" (necessary to easily support features like markup) will be supported.

For more details, see Eemeli Aro's FOSDEM 2024 talk on the data model.

Naming #

MF2 has local variable declarations, like the let construct in JavaScript and in some functional programming languages. That means that in implementing MF2, we've had to think about evaluation strategies (lazy versus eager evaluation), scope, and mutability. MF2 settled on a spec in which all local variables are immutable and no name shadowing is permitted except in a very limited way, with .input declarations. Free variables are permitted, and don't need to be declared explicitly: these correspond to runtime arguments provided to a message. Eager and lazy implementations can both conform to the spec, since variables are immutable, and lazy implementations are free to use call-by-need semantics (memoization). These design choices weren't obvious, and in some cases, inspired quite a lot of discussion.
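
As a small illustration (again mine, not from the spec): in the message below, $user is a free variable supplied as a runtime argument, while $greeting is a local bound to a literal and immutable from then on:

.local $greeting = {|Hello|}
{{{$greeting}, {$user}!}}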

Naming can have subtle consequences. In at least one case, the presence of .local declarations necessitates a dataflow analysis (albeit a simple one). The analysis is to check for "missing selector annotation" errors; the details are outside the scope of this post, but the point is just that error checking requires "looking through" a variable reference, possibly multiple chains of references, to the variable's definition. Sounds like a programming language!
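
For example (an illustrative case), in the following message the selector $m carries no annotation of its own, so the checker has to follow the chain from $m back to $n, whose .input declaration carries the :number annotation, before it can decide whether to report a "missing selector annotation" error:

.input {$n :number}
.local $m = {$n}
.match {$m}
0 {{none}}
* {{some}}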

Functions #

Placeholders in MF2 may be expressions with annotations. These expressions resemble function calls in a more conventional language. Unlike in most languages, functions cannot be implemented in MF2 itself, but must be implemented in a host language. For example, in the ICU4C implementation, the functions would either be written in C++ or (less likely) another language that has a foreign function interface with C++.

Functions can be built-in (required in the spec to be provided), or custom. Programmers using an implementation of the MF2 API can write their own custom functions that format data ("formatters"), that customize the behavior of the .match construct ("selectors"), or do both. Formatters and selectors have different interfaces.

The term "annotation" in the MF2 spec is suggestive. For example, a placeholder like {$x :number} can be read: "x should be formatted as a number." But custom formatters and selectors may also do much more complicated things, as long as they conform to their interfaces. An implementor could define a custom :squareroot function, for example, such that {$x :squareroot} is replaced with a formatted number whose value is the square root of the runtime value of $x. It's unclear how common use cases like this will be, but the generality of the custom function mechanism makes them possible.

Wherever there are functions, functional programmers think about types, explicit or implicit. MF2 is designed to use a single runtime type: it's unityped. The spec uses the term "resolved value" to refer to the type of its runtime values. Moreover, the spec tries to avoid constraining the structure of "resolved values", to allow the implementation as much freedom as possible. However, the implementation has to call functions written in a host language that may have a different type system. The challenge comes in defining the "function registry", which is the part of the MF2 spec that defines both the names and behavior of built-in functions, and how to specify custom functions.

Different host languages have different type systems. Specifying the interface between MF2 and externally-defined functions is tricky, since the goal is to be able to implement MF2 in any programming language.

While a language that can call foreign functions but not define its own functions is unusual, defining foreign function interfaces that bridge the gaps between disparate programming languages is a common task when designing a language. This also sounds like a programming language.

Pattern matching #

The .match construct in MF2 looks a lot like a case expression in Haskell, or perhaps a switch statement in C/C++, depending on your perspective. Unlike switch, .match allows multiple expressions to be examined in order to choose an alternative. And unlike case, .match only compares data against specific string literals or a wildcard symbol, rather than destructuring the expressions being scrutinized.

The ability provided by MF2 to customize pattern-matching is unusual. An implementation of a selector function takes a list of keys, one per variant, and returns a sorted list of keys in order of preference. The list is then used by the pattern matching algorithm in the formatter to pick the best-matching variant given that there are multiple selectors. An abstract example is:

.match {$x :X} {$y :Y}
A1 A2 {{1}}
B1 B2 {{2}}
C1 * {{3}}
* C2 {{4}}
* * {{5}}

In this example, the implementation of X would be called with the list of keys [A1, B1, C1] (the * key is not passed to the selector implementation) and returns a list of the same keys, arranged in any order. Likewise, the implementation of Y would be called with [A2, B2, C2].

I don't know of any existing programming languages with a pattern-matching construct like this one; usually, the comparison between values and patterns is based on structural equality and can't be abstracted over. But the ability to factor out the workings of pattern matching and swap in a new kind of matching (by defining a custom selector function) is a kind of abstraction that would be found in a general-purpose programming language. Of course, the goal here is not to match patterns in general but to select a grammatically correct variant depending on data that flows in at runtime.

But is it Turing-complete? #

MF2 has no explicit looping constructs or recursion, but it does have function calls. The details of how custom functions are realized are implementation-specific; typically, using a general-purpose programming language. That means that MF2 can invoke code that does arbitrary computation, but not express it. I think it would be fair to say that the combination of MF2 and a suitable registry of custom functions is Turing-complete, but MF2 itself is not Turing-complete. For example, imagine a custom function named eval that accepts an arbitrary JavaScript program as a string, and returns its output as a string. This is not how MF2 is intended to be used; the spec notes that "execution time SHOULD be limited" for invocations of custom functions. I'm not aware of any other languages whose Turing-completeness depends on their execution environment. (Though there is at least one lengthy discussion of whether CSS is Turing-complete.)

If custom functions were removed altogether and the registry of functions was limited to a small built-in set, MF2 would look much less like a programming language; its underlying "instruction set" would be much more limited.

Code versus data #

There is an old Saturday Night Live routine: “It’s a floor wax! It’s a dessert topping! It’s both!” XML is similar. “It’s a database! It’s a document! It’s both!” -- Philip Wadler, "XQuery: a typed functional language for querying XML" (2002)

The line between code and data isn't always clear, and the MF2 data model can be seen as a representation of the input data, rather than as an AST representing a program. Likewise, the syntax can be seen as a serialized format for representing the data, rather than as the syntax of a program.

There is also a "floor wax / dessert topping" dichotomy when it comes to functions. Is MF2 an active agent that calls functions, or does it define data passed to an API, whose implementation is what calls functions?

"Language" has multiple meanings in software: it can refer to a programming language, but the "L" in "HTML" and "XML" stands for "language". We would usually say that HTML and XML are markup languages, not programming languages, but even that point is debatable. After all, HTML embeds JavaScript code; the relationship between HTML and JavaScript resembles the relationship between MF2 and function implementations. Languages differ in how much computation they can directly express, but there are common aspects that unite different languages, like having a formal grammar.

I view MF2 as a domain-specific language for formatting messages, but another perspective is that it's a representation of data passed to an API. An API itself can be viewed as a domain-specific language: it provides verbs (functions or methods that can be called) and nouns (data structures that the functions can accept.)

Summing up #

As someone with a background in programming language design and implementation, I'm naturally inclined to see everything as a language design problem. A general-purpose programming language is complex, or at least can be used to solve complex problems. In contrast, those who have worked on the MF2 spec over the years have put in a lot of effort to make it as simple and focused on a narrow domain as possible.

One of the ways in which MF2 has veered towards programming language territory is the custom function mechanism, which was added to provide extensibility. The mechanism is general, but if there was a less general solution that still supported the desired range of use cases, these years of effort would have unearthed one. The presence of name binding (with all the complexities that brings up) and an unusual form of pattern matching also suggest to me that it's appropriate to consider MF2 a programming language, and to apply known programming language techniques to its design and implementation. This shows that programming language theory has interesting applications in internationalization, which is a new finding as far as I know.

What's left to do? #

The MessageFormat spec working group welcomes feedback during the tech preview period. User feedback on the MF2 spec and implementations will influence its future design. The current tech preview spec is part of LDML 45; the beta version, to be included in LDML 46, may include improvements suggested by users and implementors.

Igalia plans to continue collaboration to advance the Intl.MessageFormat proposal in TC39. Implementation in browser engines will be part of that process.

Acknowledgments #

Thanks to my colleagues Ben Allen, Philip Chimento, Brian Kardell, Eric Meyer, Asumu Takikawa, and Andy Wingo; and MF2 working group members Eemeli Aro, Elango Cheran, Richard Gibson, Mihai Niță, and Addison Phillips for their comments and suggestions on this post and its predecessor. Special thanks to my colleague Ujjwal Sharma, whose work I borrowed from in parts of the first post.

May 09, 2024 12:00 AM

May 07, 2024

Melissa Wen

Get Ready to 2024 Linux Display Next Hackfest in A Coruña!

We’re excited to announce the details of our upcoming 2024 Linux Display Next Hackfest in the beautiful city of A Coruña, Spain!

This year’s hackfest will be hosted by Igalia and will take place from May 14th to 16th. It will be a gathering of minds from a diverse range of companies and open source projects, all coming together to share, learn, and collaborate outside the traditional conference format.

Who’s Joining the Fun?

We’re excited to welcome participants from various backgrounds, including:

  • GPU hardware vendors;
  • Linux distributions;
  • Linux desktop environments and compositors;
  • Color experts, researchers and enthusiasts;

This diverse mix of backgrounds is represented by developers from several companies working on the Linux display stack: AMD, Arm, BlueSystems, Bootlin, Collabora, Google, GravityXR, Igalia, Intel, LittleCMS, Qualcomm, Raspberry Pi, RedHat, SUSE, and System76. It’ll ensure a dynamic exchange of perspectives and foster collaboration across the Linux Display community.

Please take a look at the list of participants for more info.

What’s on the Agenda?

The beauty of the hackfest is that the agenda is driven by participants! As this is a hybrid event, we decided to improve the experience for remote participants by creating a dedicated space for them to propose topics and some introductory talks in advance. From those inputs, we defined a schedule that reflects the collective interests of the group, but is still open for amendments and new proposals. Find the schedule details in the official event webpage.

Expect discussions on:

KMS Color/HDR
  • Proposal with new DRM object type:
    • Brief presentation of GPU-vendor features;
    • Status update of plane color management pipeline per vendor on Linux;
  • HDR/Color Use-cases:
    • HDR gainmap images and how should we think about HDR;
    • Google/ChromeOS GFX view about HDR/per-plane color management, VKMS and lessons learned;
  • Post-blending Color Pipeline.
  • Color/HDR testing/CI
    • VKMS status update;
    • Chamelium boards, video capture.
  • Wayland protocols
    • color-management protocol status update;
    • color-representation and video playback.
Display control
  • HDR signalling status update;
  • backlight status update;
  • EDID and DDC/CI.
Strategy for video and gaming use-cases
  • Multi-plane support in compositors
    • Underlay, overlay, or mixed strategy for video and gaming use-cases;
    • KMS Plane UAPI to simplify the plane arrangement problem;
    • Shared plane arrangement algorithm desired.
  • HDR video and hardware overlay
Frame timing and VRR
  • Frame timing:
    • Limitations of uAPI;
    • Current user space solutions;
    • Brainstorm better uAPI;
  • Cursor/overlay plane updates with VRR;
  • KMS commit and buffer-readiness deadlines;
Power Saving vs Color/Latency
  • ABM (adaptive backlight management);
  • PSR1 latencies;
  • Power optimization vs color accuracy/latency requirements.
Content-Adaptive Scaling & Sharpening
  • Content-Adaptive Scalers on display hardware;
  • New drm_colorop for content adaptive scaling;
  • Proprietary algorithms.
Display Mux
  • Laptop muxes for switching of the embedded panel between the integrated GPU and the discrete GPU;
  • Seamless/atomic hand-off between drivers on Linux desktops.
Real time scheduling & async KMS API
  • Potential benefits: lower latency input feedback, better VRR handling, buffer synchronization, etc.
  • Issues around “async” uAPI usage and async-call handling.

In-person, but also geographically-distributed event

This year Linux Display Next hackfest is a hybrid event, hosted onsite at the Igalia offices and available for remote attendance. In-person participants will find an environment for networking and brainstorming in our inspiring and collaborative office space. Additionally, A Coruña itself is a gem waiting to be explored, with stunning beaches, good food, and historical sites.

Semi-structured structure: how the 2024 Linux Display Next Hackfest will work

  • Agenda: Participants proposed the topics and talks for discussion in sessions.
  • Interactive Sessions: Discussions, workshops, introductory talks and brainstorming sessions lasting around 1h30. There is always a starting point for discussions and new ideas will emerge in real time.
  • Immersive experience: We will have coffee-breaks between sessions and lunch time at the office for all in-person participants. Lunches and coffee-breaks are sponsored by Igalia. This will keep us sharing knowledge and in continuous interaction.
  • Spaces for all group sizes: In-person participants will find different room sizes that match various group sizes at Igalia HQ. Besides that, there will be some devices for showcasing and real-time demonstrations.

Social Activities: building connections beyond the sessions

To make the most of your time in A Coruña, we’ll be organizing some social activities:

  • First-day Dinner: In-person participants will enjoy a Galician dinner on Tuesday, after a first day of intensive discussions in the hackfest.
  • Getting to know a little of A Coruña: Finding out a little about A Coruña and current local habits.

Participants of a guided tour in one of the sectors of the Museum of Estrella Galicia (MEGA). Source: mundoestrellagalicia.es

  • On Thursday afternoon, we will close the 2024 Linux Display Next hackfest with a guided tour of the Museum of Galicia’s favorite beer brand, Estrella Galicia. The guided tour covers the eight sectors of the museum and ends with beer pouring and tasting. After this experience, a transfer bus will take us to the Maria Pita square.
  • At Maria Pita square we will see the charm of some historical landmarks of A Coruña, explore the casual and vibrant style of the city center and taste local foods while chatting with friends.

Sponsorship

Igalia sponsors lunches and coffee-breaks on hackfest days, Tuesday’s dinner, and the social event on Thursday afternoon for in-person participants.

We can’t wait to welcome hackfest attendees to A Coruña! Stay tuned for further details and outcomes of this unconventional and unique experience.

May 07, 2024 02:33 PM

May 06, 2024

Igalia Compilers Team

MessageFormat 2.0: a new standard for translatable messages

This blog post represents the Igalia compilers team; all opinions expressed here are our own and may not reflect the opinions of other members of the MessageFormat Working Group or any other group.

MessageFormat 2.0 is a new standard for expressing translatable messages in user interfaces, both in Web applications and other software. Last week, my work on implementing MessageFormat 2.0 in C++ was released in ICU 75, the latest release of the International Components for Unicode library.

As a compiler engineer, when I learned about MessageFormat 2.0, I began to see it as a programming language, albeit an unconventional one. The message formatter is analogous to an interpreter for a programming language. I was surprised to discover a programming language hiding in this unexpected place. Understanding my surprise requires some context: first of all, what "messages" are.

A story about message formatting #

Over the past 40 to 50 years, user interfaces (UIs) have grown increasingly complex. As interfaces have become more interactive and dynamic, the process of making them accessible in a multitude of natural languages has increased in complexity as well. Internationalization (i18n) refers to this general practice, while the process of localization (l10n) refers to the act of modifying a specific system for a specific natural language.

i18n history #

Localization of user interfaces (both command-line and graphical) began by translating strings embedded in code that implements UIs. Those strings are called "messages". For example, consider a text-based adventure game: messages like "It is pitch black. You are likely to be eaten by a grue." might appear anywhere in the code. As a slight improvement, the messages could all be stored in a separate "resource file" that is selected based on the user's locale. The ad hoc approaches to translating these messages and integrating them into code didn't scale well. In the late 1980s, the gettext() function appeared in C libraries (later shipping in glibc); it was never standardized but was widely adopted. This function primarily provided string replacement functionality. While it was limited, it inspired the work that followed. Microsoft and Apple operating systems had more powerful i18n support during this time, but that's beyond the scope of this post.

The rise of the Web required increased flexibility. Initially, static documents and apps with mostly static content dominated. Java, PHP, and Ruby on Rails all brought more dynamic content to the web, and developers using these languages needed to tinker more with message formatting. Their applications needed to handle not just static strings, but messages that could be customized to dynamic content in a way that produced understandable and grammatically correct messages for the target audience. MessageFormat arose as one solution to this problem, implemented by Taligent (along with other i18n functionality).

The creators of Java had an interest in making Java the preferred language for implementing Web applications, so Sun Microsystems incorporated Taligent's work into Java in JDK 1.1 (1997). This version of MessageFormat supported pattern strings, formatting basic types, and choice format.

The MessageFormat implementation was then incorporated into ICU, the widely used library that implements internationalization functions. Initially, Taligent and IBM (which was one of Taligent's parent companies) worked with Sun to maintain parallel i18n implementations in ICU and in the JDK, but eventually the two implementations diverged. Moving at a faster pace, ICU added plural selection to MessageFormat and expanded other capabilities, based on feedback from developers and translators who had worked with the previous generations of localization tools. ICU also added a C++ port of this Java-based implementation; hence, the main ICU project includes ICU4J (implemented in Java) and ICU4C (implemented in C++ with some C APIs).

Since ICU MessageFormat was introduced, many far more complex Web UIs, largely based on JavaScript, have appeared. Complex programs can generate much more interesting content, which implies more complex localization. This necessitates a different model of message formatting. An update to MessageFormat has been in the works at least since 2019, and that work culminates with the recent release of the MessageFormat 2.0 specification. The goals are to provide more modularity and extensibility, and easier adaptation for different locales.

For brevity, we will refer to ICU MessageFormat, otherwise known simply as "MessageFormat", as "MF1" throughout. Likewise, "MF2" refers to MessageFormat 2.0.

Why message formatting? (Not just for translation) #

The work described in this post is motivated by the desire to make it as easy as possible to write software that is accessible to people regardless of the language(s) they use to communicate. But message formatting is needed even in user interfaces that only support a single natural language.

Consider this C code, where number is presumed to be an unsigned integer variable that's already been defined:

printf("You have %u files", number);

We've probably all used applications that informed us that we "have 1 files". While it's probably clear what that means, why should human readers have to mentally correct the grammatical errors of a computer? A programmer trying to improve things might write:

printf("You have %u file%s", number, number == 1 ? "" : "s");

which handles the case where number is 1. However, English is full of irregular plurals: consider "bunny" and "bunnies", "life" and "lives", or "mouse" and "mice". The code would be easier to read and maintain if it was easier to express "print the plural of 'file'", instead of a conditional expression that must be scrutinized for its meaning. While "printf" is short for "print formatted", its formatting is limited to simpler tasks like printing numbers as strings, not complex tasks like pluralizing "mouse".

This is just one example; another example is messages including ordinals, like "1st", "2nd", or "3rd", where the suffix (like "st") is determined in a complicated way from the number (consider "11th" versus "21st").

We've seen how messages can vary based on user input in non-local ways: the bug in "You have %u files" is that not only does the number of files vary, so does the word "files". So you might begin to see the value of a specialized notation for expressing such messages, which would make it easier to notice bugs like the "1 files" bug and even prevent them in the first place. That notation might even resemble a domain-specific language.
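
As a preview of the kind of notation this post is heading toward, here is a hedged sketch of the files message in MF2 syntax (introduced properly later in this post); the selection is made on the plural category of the numeric argument, and the exact keys shown are illustrative:

.match {$count :number}
one {{You have {$count} file}}
*   {{You have {$count} files}}

Here the message itself, rather than the code around it, records the fact that the word "file" varies with the number.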

Why message formatting? #

Turning now to translation, you might be wondering why you would need a separate message formatter, instead of writing code in your programming language of choice that would look something like this (if your language of choice is C/C++):

// Supposing a char* "what" and int "planet" are already defined
printf("There was %s on planet %d", what, planet);

Here, the message is the first argument to printf and the other arguments are substituted into the message at runtime; no special message formatting functionality is used, beyond what the C standard library offers.

The problem is that if you want to translate the words in the message, translation may require changing the code, not just the text of the message. Suppose that in the target language, we needed to write the equivalent of:

printf("On planet %d, there was a %s", what, planet);

Oops! We've reordered the text, but not the other arguments to printf. When using a modern C/C++ compiler, this would be a compile-time error (since what is a string and not an integer). But it's easy to imagine similar cases where both arguments are strings, and a nonsense message would result. Message formatters are necessary not only for resilience to errors, but also for division of labor: people building systems want to decouple the tasks of programming and translation.

ICU MessageFormat #

In search of a better tool for the job, we turn to one of many tools for formatting messages: ICU MessageFormat (MF1). According to the ICU4C API documentation, "MessageFormat prepares strings for display to users, with optional arguments (variables/placeholders)".

The MF2 working group describes MF1 more succinctly as "An i18n-aware printf, basically."

Here's an expanded version of the previous printf example, expressed in MF1 (and taken from the "Usage Information" of the API docs):

"At {1,time,::jmm} on {1,date,::dMMMM}, there was {2} on planet {0,number}."

Typically, the task of creating a message string like this is done by a software engineer, perhaps one who specializes in localization. The work of translating the words in the message from one natural language to another is done by a translator, who is not assumed to also be a software developer.

The designers of MF1 made it clear in the syntax what content needs to be translated, and what doesn't. Any text occurring inside curly braces is non-translatable: for example, the string "number" in the last set of curly braces doesn't mean the word "number", but rather, is a directive that the value of argument 0 should be formatted as a number. This makes it easy for both developers and translators to work with a message. In MF1, a piece of text delimited by curly braces is called a "placeholder".

The numbers inside placeholders represent arguments provided at runtime. (Arguments can be specified either by number or name.) Conceptually, time, date, and number are formatting functions, which take an argument and an optional "style" (like ::jmm and ::dMMMM in this example). MF1 has a fixed set of these functions. So {1,time,::jmm} should be read as "Format argument 1 as a time value, using the format specified by the string ::jmm". (The details of how the format strings work aren't important for this explanation.) Since {2} has no formatting function specified, the value of argument 2 is formatted based on its type. (From context, we can suppose it has a string type, but we would need to look at the arguments to the message to know for sure.) The set of types that can be formatted in this way is fixed (users can't add new types of their own).

For brevity, I won't show how to provide the arguments that are substituted for the numbers 0, 1 and 2 in the message; the API documentation shows the C++ code to create those arguments.

MF1 addresses the reordering issue we noticed in the printf example. As long as they don't change what's inside the placeholders, a translator doesn't have to worry that their translation will disturb the relationship between placeholders and arguments. Likewise, a developer doesn't have to worry about manually changing the order of arguments to reflect a translation of the message. The placeholder {2} means the same thing regardless of where it appears in the message. (Using named rather than positional arguments would make this point even clearer.)
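
For instance, a named-argument version of the same message (the argument names here are invented for illustration) might read:

"At {startTime,time,::jmm} on {startTime,date,::dMMMM}, there was {event} on planet {planetNumber,number}."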

(The ICU User Guide contains more documentation on MF1. A tutorial by Mohammad Ashour also contains some useful information on MF1.)

Enter MessageFormat 2.0 #

To summarize a few of the shortcomings of MF1:

  • Lack of modularity: the set of formatters (mostly for numbers, dates, and times) is fixed. There's no way to add your own.
    • Like formatting, selection (choosing a different pattern based on the runtime contents of one or more arguments) is not customizable. It can only be done either based on plural form, or literal string matching. This makes it hard to express the grammatical structures of various human languages.
  • No separate data model: the abstract syntax is concrete syntax.
  • There is no way to declare a local variable so that the same piece of a message can be re-used without repeating it; all variables are external arguments.
  • Baggage: extending the existing spec with different syntax could change the meaning of existing messages, so a new spec is needed. (MF1 was not designed for forward-compatibility, and the new spec is backward-incompatible.)

MF2 represents the best efforts of a number of experts in the field to address these shortcomings and others.

I hope the brief history I gave shows how much work, by many tech workers, has gone into the problem of message formatting. Synthesizing the lessons of the past few decades into one standard has taken years, with seemingly small details provoking nuanced discussions. What follows may seem complex, but the complexity is inherent in the problem space that it addresses. The plethora of different competing tools to address the same set of problems is evidence for that.

"Inherent is Latin for not your fault." -- Rich Hickey, "Simple Made Easy"

A simple example #

Here's the same example in MF2 syntax:

At time {$time :datetime hour=numeric minute=|2-digit|}, on {$time :datetime day=numeric month=long},
there was {$what} on planet {$planet :number}

As in MF1, placeholders are enclosed in curly braces. Variables are prefixed with a $; since there are no local bindings for time, what, or planet, they are treated as external arguments. Although variable names are already delimited by curly braces, the $ prefix helps make it even clearer that variable names should not be translated.

Function names are now prefixed by a :, and options (like hour) are named. There can be multiple options, not just one as in MF1. This is a loose translation of the MF1 version, since in MF2, the skeleton option is not part of the alpha version of the spec. (It is possible to write a custom function that takes this option, and it may be added in the future as part of the built-in datetime function.) Instead, the :datetime formatter can take a variety of field options (shown in the example) or style options (not shown). The full options are specified in the Default Registry portion of the spec.

Literal strings that occur within placeholders, like 2-digit, are quoted. The MF2 syntax for quoting a literal is to enclose it in vertical bars (|). The vertical bars are optional in most cases, but are necessary for the literal 2-digit because it includes a hyphen. This syntax was chosen instead of literal quotation marks (") because some use cases for MF2 messages require embedding them in a file format that assigns meaning to quotation marks. Vertical bars are less commonly used in this way.

A shorter version of this example is:

At time {$time :time}, on {$time :date}, there was {$what} on planet {$planet :number}

:time formats the time portion of its operand with default options, and likewise for :date. Implementations may differ on how options are interpreted and their default values. Thus, the formatted output of this version may be slightly different from that of the first version.
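
One of the MF1 shortcomings listed earlier was the lack of local variables. In MF2, a declaration can bind a local name that the rest of the message reuses; here is a rough sketch (the binding name $when is invented for illustration):

.local $when = {$time :datetime month=long day=numeric}
{{On {$when}, there was {$what} on planet {$planet :number}}}

When a message contains declarations (or a matcher, as in the next example), the pattern itself is quoted with double curly braces.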

A more complex example: selection and custom functions #

So far, the examples are single messages that contain placeholders that vary based on runtime input. But sometimes, the translatable text in a message also depends on runtime input. A common example is pluralization: in English, consider "You have one item" versus "You have 5 items." We can call these different strings "variants" of the same message.

While you can imagine code that selects the right variant with a conditional, that violates the separation between code and data that we previously discussed.

Another motivator for variants is grammatical case. English has a fairly simple case system (consider "She told me" versus "I told her"; the "she" of the first sentence changes to "her" when that pronoun is the object of the transitive verb "tell".) Some languages have much more complex case systems. For example, Russian has six grammatical cases; Basque, Estonian, and Finnish all have more than ten. The complexity of translating messages into and between languages like these is further motivation for organizing all the variants of a message together. The overall goal is to make messages as self-contained as possible, so that changing them doesn't require changing the code that manipulates them. MF2 makes that possible in more situations. While MF1 supports selection based on plural categories, MF2 also supports a general, customizable form of selection.

Here's a very simple example (necessarily simple since it's in English) of using custom selectors in MF2 to express grammatical case:

.match {$userName :hasCase}
vocative {{Hello, {$userName :person case=vocative}!}}
accusative {{Please welcome {$userName :person case=accusative}!}}
* {{Hello!}}

The keyword .match designates the beginning of a matcher.

How to read a matcher #

Although MF2 is declarative, you can read this example imperatively as follows:

  1. Apply :hasCase to the runtime value of $userName to get a string c representing a grammatical case.
  2. Compare c to each of the keys in the three variants ("vocative", "accusative", and the wildcard "*", which matches any string.)
  3. Take the matching variant v and format its pattern.

This example assumes that the runtime value of the argument $userName is a structure whose grammatical case can be determined. This example also assumes that custom :hasCase and :person functions have been defined (the details of how those functions are defined are outside the scope of this post). In this example, :person is a formatting function, like :datetime in the simpler example. :hasCase is a different kind of function: a selector function, which may extract a field from its argument or do arbitrarily complicated computation to determine a value to be matched against.

In general: a matcher includes one or more selectors and one or more variants. A variant includes a list of keys, whose length must equal the number of selectors in the matcher; and a single pattern.

In this example, the selector of the matcher is the placeholder {$userName :hasCase}. Selectors appear between the .match keyword and the beginning of the first variant. There are three variants, each of which has a single key. The strings delimited by double sets of curly braces are patterns, which in turn contain other placeholders. The selectors are used to select a pattern based on whether the runtime value of the selector matches one of the keys.

Formatting the selected pattern #

Supposing that :hasCase maps the value of $userName onto "vocative", the formatted pattern consists of the string "Hello, " concatenated with the result of formatting this placeholder:

{$userName :person case=vocative}

concatenated with the string "!". (This also supposes that we are formatting to a single string result; future versions of MF2 may also support formatting to a list of "parts", in which case the result strings would be returned in a more complicated data structure, not concatenated.)

You can read this placeholder like a function call with arguments: call the :person function on the value of $userName, with an additional named argument that binds the name "case" to the string "vocative". We also elide the details of :person, but you can suppose it uses additional fields in $userName to format it as a personal name. Such fields might include a title, first name, and last name (incidentally, see "Falsehoods Programmers Believe About Names" for why formatting personal names is more complicated than you might think.)

Note that MF2 provides no constructs that mutate variables. Once a variable is bound, its value doesn't change. Moreover, built-in functions don't have side effects, so as long as custom functions are written in a reasonable way (that is, without side effects that cross the boundary between MF2 and function execution), MF2 has no side effects.

That means that $userName represents the same value when it appears in the selector and when it appears in any of the patterns. Conceptually, :hasCase returns a result that is used for selection; it doesn't change what the name $userName is bound to.

Abstraction over function implementations #

A developer could plug in an implementation of :hasCase that requires its argument to be an object or record that contains grammatical metadata, and then simply returns one of the fields of this object. Or we could plug in an implementation that can accept a string and uses a dictionary for the target language to guess its case. The message is structured the same way regardless, though to make it work as expected, the structure of the arguments must match the expectations of whatever functions consume them. Effectively, the message is parameterized over the meanings of its arguments and over the meanings of any custom functions it uses. This parameterization is a key feature of MF2.
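
As a rough illustration in TypeScript (the names and signatures are invented; the actual interface between MF2 and custom functions is implementation-defined), those two interchangeable implementations might look like:

// Implementation 1: the argument already carries grammatical metadata,
// so the selector just reads a field from it.
const hasCaseFromMetadata = (arg: { name: string; grammaticalCase: string }): string =>
  arg.grammaticalCase;

// Implementation 2: the argument is a plain string; guess its case from a
// per-language dictionary (a trivial stand-in here).
const caseDictionary: Record<string, string> = { Maria: "vocative" };
const hasCaseFromDictionary = (arg: string): string =>
  caseDictionary[arg] ?? "other";

Either implementation could back the :hasCase selector in the message above without changing the message itself, as long as the runtime argument matches what the chosen implementation expects.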

Summing up #

This example couldn't be expressed in MF1, since MF1 has no custom functions. The checking for different grammatical cases would be done in the underlying programming language, with ad hoc code for selecting between the different strings. This would be error-prone, and would force a code change whenever a message changes (just as in the simple example shown previously). MF1 does have an equivalent of .match that supports a few specific kinds of matching, like plural matching. In MF2, the ability to write custom selector functions allows for much richer matching.

Further reading (or watching) #

For more background both on message formatting in general and on MF2, I recommend my teammate Ujjwal Sharma's talk at FOSDEM 2024, on which portions of this post are based. A recent MessageFormat 2 open house talk by Addison Phillips and Elango Cheran also provides some great context and motivation for why a new standard is needed.

You can also read a detailed argument in favor of a new standard by visiting the spec repository on GitHub: "Why MessageFormat needs a successor".

Igalia's work on MF2, or: Who are we and what are we doing here? #

So far, I've provided a very brief overview of the syntax and semantics of MF2. More examples will be provided via links in the next post.

A question I haven't answered is why this post is on the Igalia compilers team blog. I'm a member of the Igalia compilers team; together with Ujjwal Sharma, I have been collaborating with the MF2 working group, a subgroup of the Unicode CLDR technical committee. The working group is chaired by Addison Phillips and has many other members and contributors.

Part of our work on the compilers team has been to implement the MF2 specification as a C++ API in ICU. (Mihai Niță at Google, also a member of the working group, implemented the specification as a Java API in ICU.)

So why would the compilers team work on internationalization?

Part of our work as a team is to introduce and refine proposals for new JavaScript (JS) language features and work with TC39, the JS standards committee, to advance these proposals with the goal of inclusion in the official JS specification.

One such proposal that the compilers team has been involved with is the Intl.MessageFormat proposal.

The implementation of MessageFormat 2 in ICU provides support for browser engines (the major ones are implemented in C++) to implement this proposal. Prototype implementations are part of the TC39 proposal process.

But there's another reason: as I hinted at the beginning, the more I learned about MF2, the more I began to see it as a programming language, at least a domain-specific language. In the next post on this blog, I'll talk about the implementation work that Igalia has done on MF2 and reflect on that work from a programming language design perspective. Stay tuned!

May 06, 2024 12:00 AM

May 04, 2024

Brian Kardell

Known Knowns

Known Knowns

Fun with the DOM, the parser, illogical trees and "unknowns"...

HTML has this very tricky parser that does 'corrections' and things on the fly. So if you create a page like:

<table>
   Hello
   <td>
      Look below...
   </td>
</table>

And load it in your browser, what you'll actually get parsed as a tree will be

  • HTML
    • HEAD
    • BODY
      • #text: Hello
      • TABLE
        • TBODY
          • TR
            • TD
              • #text: Look below...
            • #text:

Things can literally be moved around, implied elements added and so on.

Illogical trees

But it is possible to create trees that the parser itself could never produce, if you build them dynamically. With the DOM API, you can create whatever wild constructs you want: Paragraphs inside of paragraphs? Sure, why not.

Or, text that is a direct child of a table. Note this still renders the text in every browser engine.

You can even add children to 'void' elements that way too. Here's an interesting one: An <hr> with a text node child. Again, it renders the text in every browser engine (the color varies).

You can also put unknown elements into your markup and the text content is shown... By default, it is basically a <span>.

In most cases, HTML wants to show something... or at least leave CSS in control. For example, you can dynamically add children to a <script>. While that won't be shown by default, it's simply because the UA style sheet has script set to display: none;. If we change that, we can totally see it.

But this isn't universal: In some cases there are other renderers that are in control - mainly when it comes to form controls. But also, for example, if you switch the <hr> in the example above to a <br> it won't render the text. It doesn't generate a box that you can do anything with in CSS as far as I can tell, except make it display: none (useful if they’re in white-space: pre blocks to keep them from forcing open extra lines).

SVG

The HTML parser has fix-ups for embedding SVG too; there are integration points of a sort. But in SVG you can also have an unknown element... For example:

<svg>
    <unknown>Test</unknown>
    <rect width="150" height="150" x="10" y="10" style="fill:blue;stroke:pink;stroke-width:5;opacity:0.5" />
</svg>

The unknown element won't render the text, nor additional SVG children inside it. For example:

<svg id=one width="300" height="170">
  <unknown><ellipse cx="120" cy="80" rx="100" ry="50" style="fill:yellow;stroke:green;stroke-width:3" /></unknown>
  <rect width="150" height="150" x="10" y="10" style="fill:blue;stroke:pink;stroke-width:5;opacity:0.5" />
</svg>

MathML

As you might expect, there are parser integrations for MathML too, and you can have unknown elements in MathML as well. In MathML, all elements (including unknown ones) generate a "math content box", but only token elements (like <mi>, <mo>, <mn>) render text. For example, the <math> element itself - if you try to put text in it, the text won't render, but it will still generate a box and other content.

<math>
   Not a token. Doesn't render.
   <mi>X</mi>
</math>

MathML has other elements too like <mrow> and <mspace> and <mphantom> which are just containers. Same story there, if you try to put text in them, the text won't render...

<math>
   <mrow>Not a token. Doesn't render.</mrow>
   <mi>X</mi>
</math>

But if you put the text inside a token element (like <mi>) inside that same <mrow>, then the text will render.

<math>
   <mrow><mi>ok</mi></mrow>
   <mi>X</mi>
</math>

In MathML, unknown elements are basically treated just like <mrow>. In the above examples, you could replace <mrow> with <unknown> and it'd be the same.

Unknown Unknowns

Ok, here's something you don't think about every day: Given this markup:

<unknown id=one>One</unknown>
<math>
  <unknown id=two>Two</unknown>
</math>
<svg>
  <unknown id=three>Three</unknown>
</svg>

One, Two and Three are three different kinds of unknowns!

console.log(one.namespaceURI, one.constructor.name)
// logs 'http://www.w3.org/1999/xhtml HTMLUnknownElement'

console.log(two.namespaceURI, two.constructor.name)
// logs 'http://www.w3.org/1998/Math/MathML MathMLElement'

console.log(three.namespaceURI, three.constructor.name)
// logs 'http://www.w3.org/2000/svg SVGElement'

In CSS, these can also (theoretically) be styled via namespaces. The following will only style the first of those:

@namespace html url(http://www.w3.org/1999/xhtml);
html|unknown { color: blue; }

Under-defined Unknowns

Remember how in the beginning we created nonsensical constructs dynamically? Well, we can do that here too. We can move an unknown MathMLElement right into HTML, or an unknown HTMLElement right into MathML, and so on - and it's not currently well-defined, universal, or consistent what actually happens here.

Here's an interesting example that moves an unknown MathMLElement and an SVGElement into HTML, an HTMLElement into MathML, and so on:

See the Pen Untitled by вкαя∂εℓℓ (@briankardell) on CodePen.

Here's how that renders in the various browsers today:

Left to right: Chrome, Firefox, Safari all render differently

So, I guess we should probably fix that. I'll have to start creating some issues and tentative tests (feel free to beat me to it 😉)

Semi-related Rabbit Holes

Not specifically these issues, but related namespace stuff has caused a lot of problems that we're remedying as shown by a recent flurry of activity started by my colleague Luke Warlow to fix MathML-Core support in various libraries/frameworks:

Who's next? :)

May 04, 2024 04:00 AM

April 30, 2024

Enrique Ocaña

Dissecting GstSegments

During all these years using GStreamer, I’ve been having to deal with GstSegments in many situations. I’ve always had an intuitive understanding of the meaning of each field, but never had the time to properly write a good reference explanation for myself, ready to be checked at those times when the task at hand stops being so intuitive and the nuances start being important. I used the notes I took during an interesting conversation with Alba and Alicia about those nuances, during the GStreamer Hackfest in A Coruña, as the seed that evolved into this post.

But what actually are GstSegments? They are the structures that track the values needed to synchronize the playback of a region of interest in a media file.

GstSegments are used to coordinate the translation between Presentation Timestamps (PTS), supplied by the media, and Runtime.

PTS is the timestamp that specifies, in buffer time, when the frame must be displayed on screen. This buffer time concept (called buffer running-time in the docs) refers to the ideal time flow where rate isn’t taken into account.

Decode Timestamp (DTS) is the timestamp that specifies, in buffer time, when the frame must be supplied to the decoder. On decoders supporting P-frames (forward-predicted) and B-frames (bi-directionally predicted), the PTS of the frames reaching the decoder may not be monotonic, but the PTS of the frames reaching the sinks are (the decoder outputs monotonic PTSs).

Runtime (called clock running time in the docs) is the amount of physical time that the pipeline has been playing back. More specifically, the Runtime of a specific frame indicates the physical time that has passed or must pass until that frame is displayed on screen. It starts from zero.

Base time is the point when the Runtime starts with respect to the input timestamp in buffer time (PTS or DTS). It’s the Runtime of PTS=0.

Start, stop, duration: Those fields are buffer timestamps that specify when the piece of media that is going to be played starts, stops and how long that portion of the media is (the absolute difference between start and stop, and I mean absolute because a segment being played backwards may have a higher start buffer timestamp than what its stop buffer timestamp is).

Position is like the Runtime, but in buffer time. This means that in a video being played back at 2x, Runtime would flow at 1x (it’s physical time after all, and reality goes at 1x pace) and Position would flow at 2x (the video moves twice as fast than physical time).

The Stream Time is the position in the stream. Not exactly the same concept as buffer time. When handling multiple streams, some of them can be offset with respect to each other, not starting to be played from the beginning, or can even have loops (e.g. repeating the same sound clip from PTS=100 until PTS=200 indefinitely). In this case of repeating, the Stream Time would flow from PTS=100 to PTS=200 and then go back again to the start position of the sound clip (PTS=100). There’s a nice graphic in the docs illustrating this, so I won’t repeat it here.

Time is the base of Stream Time. It’s the Stream time of the PTS of the first frame being played. In our previous example of the repeating sound clip, it would be 100.

There are also concepts such as Rate and Applied Rate, but we didn’t get into them during the discussion that motivated this post.

So, for translating between Buffer Time (PTS, DTS) and Runtime, we would apply this formula:

Runtime = BufferTime * ( Rate * AppliedRate ) + BaseTime

And for translating between Buffer Time (PTS, DTS) and Stream Time, we would apply this other formula:

StreamTime = BufferTime * AppliedRate + Time

And that’s it. I hope these notes in the shape of a post serve me as reference in the future. Again, thanks to Alicia, and especially to Alba, for the valuable clarifications during the discussion we had that day in the Igalia office. This post wouldn’t have been possible without them.

by eocanha at April 30, 2024 06:00 AM

April 26, 2024

Gyuyoung Kim

Web Platform Test and Chromium

I’ve been working on tasks related to web platform tests since last year. In this blog post, I’ll introduce what web platform tests are and how the Chromium project incorporates these tests into its development process.

1. Introduction

The web-platform-tests project serves as a cross-browser test suite for the Web platform stack. By crafting tests that can run seamlessly across all browsers, browser projects gain assurance that their software aligns with other implementations. This confidence extends to future implementations, ensuring compatibility. Consequently, web authors and developers can trust the Web platform to fulfill its promise of seamless functionality across browsers and devices, eliminating the need for additional layers of abstraction to address gaps introduced by specification editors and implementors.

For your information, the Web Platform Test Community operates a dashboard that tracks the pass ratio of tests across major web browsers. The chart below displays the number of failing tests in major browsers over time, with Chrome showing the lowest failure rate.

[Caption] Test failures graph on major browsers
The chart below shows the interoperability of web platform technologies among the major browsers for 2023; interoperability has been improving.

[Caption] Interoperability among the major browsers 2023

2. Test Suite Design

The majority of the test suite is made up of HTML pages that are designed to be loaded in a browser. These pages may either generate results programmatically or provide a set of steps for executing the test and obtaining the outcome. Overall, the tests are concise, cross-platform, and self-contained, making them easy to run in any browser.

2.1 Test Layout

Most primary directories within the repository are dedicated to tests associated with specific web standards. For W3C specifications, these directories typically use the short name of the spec, which is the name used for snapshot publications under the /TR/ path. For WHATWG specifications, the directories are usually named after the spec’s subdomain, omitting “.spec.whatwg.org” from the URL. Other specifications follow a logical naming convention.

The css/ directory contains test suites specifically designed for the CSS Working Group specifications.

Within each specification-specific directory, tests are organized in one of two common ways: a flat structure, sometimes used for shorter specifications, or a nested structure, where each subdirectory corresponds to the ID of a heading within the specification. The nested structure, which provides implicit metadata about the tested section of the specification based on its location in the filesystem, is preferred for larger specifications.

For example, tests related to “The History interface” in HTML can be found in html/browsers/history/the-history-interface/.

Many directories also include a file named META.yml, which may define properties such as:
  • spec: a link to the specification covered by the tests in the directory
  • suggested_reviewers: a list of GitHub usernames for individuals who are notified when pull requests modify files in the directory
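
As a rough illustration, a META.yml might look like this (the values are invented for the example, not copied from the repository):

# Hypothetical META.yml for a directory of history-related tests
spec: https://html.spec.whatwg.org/multipage/nav-history-apis.html
suggested_reviewers:
  - some-github-username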
Various resources that tests rely on are stored in common directories, including images, fonts, media, and general resources.

2.2 Test Types

Tests in this project employ various approaches to validate expected behavior, with classifications based on how expectations are expressed:

  1. Rendering Tests:
    • Reftests: Compare the graphical rendering of two (or more) web pages, asserting equality in their display (e.g., A.html and B.html must render identically). This can be done manually by users switching between tabs/windows or through automated scripts.
    • Visual Tests: Evaluate a page’s appearance, with results determined either by human observation or by comparing it with a saved screenshot specific to the user agent and platform.
  2. JavaScript Interface Tests (testharness.js tests):
    • Ensure that JavaScript interfaces behave as expected. Named after the JavaScript harness used to execute them.
  3. WebDriver Browser Automation Protocol Tests (wdspec tests):
    • Written in Python, these tests validate the WebDriver browser automation protocol.
  4. Manual Tests rely on a human to run them and determine their result.

3. Web Platform Test in Chromium

The Chromium project conducts web tests utilized by Blink to assess various components, encompassing, but not limited to, layout and rendering. Generally, these web tests entail loading pages in a test renderer (content_shell) and comparing the rendered output or JavaScript output with an expected output file. The Web Tests include the “Web platform tests” (WPT) located at web_tests/external/wpt; other directories are for Chrome-specific tests only. In web_tests/external/wpt, there are currently around 50,000 web platform tests.

3.1 Web Tests in the Chromium development process

Every patch must pass all tests before it can be merged into the main source tree. In Gerrit, trybots execute a wide array of tests, including the Web Test, to ensure this. We can initiate these trybots with a specific patch, and each trybot will then run the scheduled tests using that patch.

[Caption] Trybots in Gerrit
For example, the linux-rel trybot runs the web tests in the blink_web_tests step as below,

[Caption] blink_web_tests in the linux-rel trybot

3.2 Internal sequence for running the Web Test in Chromium

Let’s take a brief look at how Chromium runs the web tests internally. Basically, there are two major components involved: run_web_tests.py, acting as a server, and content_shell, acting as a client that loads a given test and returns the result. The input and output of content_shell are assumed to follow the run_web_tests protocol over pipes that connect the stdin and stdout of run_web_tests.py and content_shell, as shown below:

[Caption] Sequence how to execute a test between run_web_test.py and content_shell

4. In conclusion

We just explored what Web Tests are and how Chromium runs these tests for web platform testing. Web platform tests play a crucial role in ensuring interoperability among various web browsers and platforms. In the following blog post, I will share how I added support for running web tests on iOS for the Blink port.


by gyuyoung at April 26, 2024 08:48 AM

April 24, 2024

Stephen Chenney

CSS Custom Properties in Highlight Pseudos

The CSS highlight inheritance model describes the process for inheriting the CSS properties of the various highlight pseudo elements:

  • ::selection controlling the appearance of selected content
  • ::spelling-error controlling the appearance of misspelled word markers
  • ::grammar-error controlling how grammar errors are marked
  • ::target-text controlling the appearance of the string matching a target-text URL
  • ::highlight defining the appearance of a named highlight, accessed via the CSS highlight API

The inheritance model is intended to generate more intuitive behavior for examples like the one below, where we would expect the second line to be entirely blue because the <em> is a child of the blue paragraph that typically would inherit properties from the paragraph:

<style>
::selection /* = *::selection (universal) */ {
color: lightgreen;
}
.blue::selection {
color: blue;
}
</style>
<p>Some <em>lightgreen</em> text</p>
<p class="blue">Some <em>lightgreen</em> text that one would expect to be blue</p>
<script>
range = new Range();
range.setStart(document.body, 0);
range.setEnd(document.body, 3)
document.getSelection().addRange(range);
</script>

Before this model was standardized, web developers achieved the same effect using CSS custom properties. The selection highlight could make use of a custom property defined on the originating element (the element that is selected). Being defined on the originating element tree, those custom properties were inherited, so a universal highlight would effectively inherit the properties of its parent. Here’s what it looks like:

<style>
:root {
--selection-color: lightgreen;
}
::selection /* = *::selection (universal) */ {
color: var(--selection-color);
}
.blue {
--selection-color: blue;
}
</style>
<p>Some <em>lightgreen</em> text</p>
<p class="blue">Some <em>lightgreen</em> text that is also blue</p>
<script>
range = new Range();
range.setStart(document.body, 0);
range.setEnd(document.body, 3)
document.getSelection().addRange(range);
</script>

In the example, all selection highlights use the value of --selection-color as the selection text color. The <em> element inherits the property value of blue from its parent <p> and hence has a blue highlight.

This approach to selection inheritance was promoted in Stack Overflow posts and other places, even in posts that did not discuss inheritance. The problem is that the new highlight inheritance model, as previously specified, broke this behavior in a way that required significant changes for web sites making use of the former approach. At the prompting of developer advocates, the CSS Working Group decided to change the specified behavior of custom properties with highlight pseudos to support the existing recommended approach.

Custom Properties for Highlights #

The updated behavior for custom properties in highlight pseudos is that highlight properties that have var(...) references take the property values from the originating element. In addition, custom properties defined in the highlight pseudo itself will be ignored. The example above works with this new approach, so content making use of it will not break when highlight inheritance is enabled in browsers.
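
A minimal sketch of what that means in practice (my own illustration, not an excerpt from the spec): if a rule like the following were added to the earlier example,

::selection {
  --selection-color: red;        /* ignored: custom properties declared in a highlight pseudo have no effect */
  color: var(--selection-color); /* resolves against the originating element's custom property instead */
}

the selected text would still be lightgreen (or blue inside .blue), because the var() reference takes its value from the originating element and the red declaration inside ::selection is ignored.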

Note that this explicitly conflicts with the previous guidance, which was to define the custom properties on the highlight pseudos, with only the root highlight inheriting custom properties from the document root element.

The change is implemented behind a flag in Chrome 126 and higher, and is expected to be enabled by default as early as Chrome 130. To see the effects right now, enable “Experimental Web Platform features” via chrome://flags in M126 Chrome (Canary at the time of writing).

A Unified Approach for Derived Properties in Highlights #

There are other situations in which properties are disallowed on highlight pseudos yet those properties are necessary to resolve variables. For example, lengths that use font-based units, such as 0.5em, cannot use font-size from a highlight pseudo because font-size is not allowed in highlights; the same applies to container-based units and viewport units. The general rule is that the highlight will get whatever value it needs from the originating element, and that now includes custom property values.

April 24, 2024 12:00 AM

April 22, 2024

Brian Kardell

Mirror Effects

Mirror Effects

Today I had this thought about "AI" that felt more like a short blog post than a social media thread, so I thought I'd share it.

I love learning and thinking about weird indirect connections like how thing X made things Y and Z suddenly possible, and given some time these had secondary or tertiary impacts that were really unexpected.

There are so many possible examples: We didn't really expect the web and social media to give rise to the Arab Spring. Nor to such disinformation. Nor the 2016 American election (or several others similarly around the world). Maybe Carrier didn't expect that the invention of air conditioning would help reshape the American political map, but it did.

One of the things that came to my mind today was a thing I read about mirrors. Ian Mortimer explains how improvements to and the spread of mirrors radically changed lots of things. Here's a quote from the piece:

The very act of a person seeing himself in a mirror or being represented in a portrait as the center of attention encouraged him to think of himself in a different way. He began to see himself as unique. Previously the parameters of individual identity had been limited to an individual’s interaction with the people around him and the religious insights he had over the course of his life. Thus individuality as we understand it today did not exist; people only understood their identity in relation to groups—their household, their manor, their town or parish—and in relation to God... The Mirror Effect

It's pretty interesting to look back and observe all of the changes that this began to stir - from the way people literally lived (more privacy), to changes in the types of writing and art, and how we thought about fitting in to the larger world (and thus the societies we built).

So, today I had this random thought that I wonder what sorts of effects like these will come from all of the "AI" focus. That is, not the common/direct sorts of things we're talking about but the ones that maybe have very little to do with any actual technology even. Which things will some future us look back on in 20 or 100 years and say "huh, interesting".

In particular, the thing that brought this to mind is that I am suddenly seeing lots more people having conversations like "What is consciousness though, really?" and "What is intelligence, really?". I don't mean from technologists; I just mean it seems to be causing lots more people to think about that kind of thing.

So it made me think: I wonder if we will see increased signups for philosophy courses? Or, sales of more books along these lines? Could that ultimately lead to another, similar sort of changing in how we collectively see ourselves? I wonder what effects this has in the long term on literature, film, or even science and government.

This isn't a "take" - it's not trying to be optimistic or pessimistic. It's more of a "huh... I hadn't thought much about that before, but it's kind of interesting to think about." Don't you think? Outside of the normal sorts of things we're obviously thinking about - what are some you could imagine?

As a kind of final interesting note: Steven Johnson is a great storyteller, and his work is full of connections like these. If you want to read more, check out his books. Part of the reason that I mention him is that, interestingly, it seems that he has recently gone to work with Google on an LM project about writing himself. I bet he's got some notes and ideas on this already too.

April 22, 2024 04:00 AM

April 16, 2024

Philippe Normand

From WebKit/GStreamer to rust-av, a journey on our stack’s layers

In this post I’ll try to document the journey starting from a WebKit issue and ending up improving third-party projects that WebKitGTK and WPEWebKit depend on.

I’ve been working on WebKit’s GStreamer backends for a while. Usually some new feature needed on the WebKit side will trigger work on GStreamer. That’s quite common and healthy, actually: by improving GStreamer (bug fixes or implementing new features) we make the whole stack stronger (hopefully). It’s not hard to imagine other web engines, such as Servo for instance, leveraging fixes made in GStreamer in the context of WebKit use-cases.

Sometimes though we have to go deeper and this is what this post is about!

Since version 2.44, WebKitGTK and WPEWebKit ship with a WebCodecs backend. That backend leverages the wide range of GStreamer audio and video decoders/encoders to give low-level access to encoded (or decoded) audio/video frames to Web developers. I delivered a lightning talk at gst-conf 2023 about this topic.

There are still some issues to fix regarding performance and some W3C web platform tests are still failing. The AV1 decoding tests were flagged early on while I was working on WebCodecs, I didn’t have time back then to investigate the failures further, but a couple weeks ago I went back to those specific issues.

The WebKit layout tests harness is executed by various post-commit bots, on various platforms. The WebKitGTK and WPEWebKit bots run on Linux. The WebCodec tests for AV1 currently make use of the GStreamer av1enc and dav1ddec elements. We currently don’t run the tests using the modern and hardware-accelerated vaav1enc and vaav1dec elements because the bots don’t have compatible GPUs.

The decoding tests were failing; this one, for instance (the ?av1 variant). That test exercises both encoding and decoding, but decoding was failing for a couple of reasons. Rabbit hole starts here. After debugging this for a while, it was clear that the colorspace information was lost between the encoded chunks and the decoded frames. The decoded video frames didn’t have the expected colorimetry values.

The VideoDecoderGStreamer class basically takes encoded chunks and hands decoded VideoFrameGStreamer objects up to the upper layers (JS) in WebCore. A video frame is basically a GstSample (Buffer and Caps) and we have code in place to interpret the colorimetry parameters exposed in the sample caps and translate those to the various WebCore equivalents. So far so good, but the caps set on the dav1ddec element didn’t have that information! I thought the dav1ddec element could be fixed, “shouldn’t be that hard”, and I knew that code because I wrote it in 2018 :)

So let’s fix the GStreamer dav1ddec element. It’s a video decoder written in Rust, relying on the dav1d-rs bindings of the popular C libdav1d library. The dav1ddec element basically feeds encoded chunks of data to dav1d using the dav1d-rs bindings. In return, the bindings provide the decoded frames using a Dav1dPicture Rust structure and the dav1ddec GStreamer element basically makes buffers and caps out of this decoded picture. The dav1d-rs bindings are quite minimal, we implemented API on a per-need basis so far, so it wasn’t very surprising that… colorimetry information for decoded pictures was not exposed! Rabbit hole goes one level deeper.

So let’s add colorimetry API in dav1d-rs. When working on (Rust) bindings of a C library, if you need to expose additional API the answer is quite often in the C headers of the library. Every Dav1dPicture has a Dav1dSequenceHeader, in which we can see a few interesting fields:

typedef struct Dav1dSequenceHeader {
...
    enum Dav1dColorPrimaries pri; ///< color primaries (av1)
    enum Dav1dTransferCharacteristics trc; ///< transfer characteristics (av1)
    enum Dav1dMatrixCoefficients mtrx; ///< matrix coefficients (av1)
    enum Dav1dChromaSamplePosition chr; ///< chroma sample position (av1)
    ...
    uint8_t color_range;
    ...
...
} Dav1dSequenceHeader;

After sharing a naive branch with rust-av co-maintainers Luca Barbato and Sebastian Dröge, I came up with a couple of pull requests that eventually were shipped in version 0.10.3 of dav1d-rs. I won’t deny that matching the primaries, transfer, matrix and chroma-site enum values to rust-av’s Pixel enum was a bit challenging :P Anyway, with dav1d-rs fixed up, the rabbit hole goes back up one level :)

Now, with the needed dav1d-rs API in place, the GStreamer dav1ddec element could be fixed. Again, matching the various enum values to their GStreamer equivalents was an interesting exercise. The merge request was merged, but to this date it’s not shipped in a stable gst-plugins-rs release yet. There’s one more complication here: the ABI broke between dav1d 1.2 and 1.4. The dav1d-rs 0.10.3 release expects the latter. I’m not sure how we will cope with that in terms of gst-plugins-rs release versioning…

Anyway, WebKit’s runtime environment can be adapted to ship dav1d 1.4 and a development version of the dav1ddec element, which is what was done in this pull request. The rabbit is getting out of his hole.

The WebCodec AV1 tests were finally fixed in WebKit, by this pull request. Beyond colorimetry handling a few more fixes were needed, but luckily those didn’t require any fixes outside of WebKit.

Wrapping up, if you’re still reading this post, I thank you for your patience. Working on inter-connected projects can look a bit daunting at times, but eventually the whole ecosystem benefits from cross-project collaborations like this one. Thanks to Luca and Sebastian for the help and reviews in dav1d-rs and the dav1ddec element. Thanks to my fellow Igalia colleagues for the WebKit reviews.

by Philippe Normand at April 16, 2024 08:14 PM

April 04, 2024

Jesse Alama

The decimals around us: Cataloging support for decimal numbers

A catalog of support for decimal numbers in various programming languages

Decimals are a data type that aims to exactly represent decimal numbers. Some programmers may not know, or fully realize, that, in most programming languages, the numbers that you enter look like decimal numbers but internally are represented as binary—that is, base-2—floating-point numbers. Things that are totally simple for us, such as 0.1, simply cannot be represented exactly in binary. The decimal data type—whatever its stripe or flavor—aims to remedy this by giving us a way of representing and working with decimal numbers, not binary approximations thereof. (Wikipedia has more.)
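To make the problem concrete, here is a quick illustration in JavaScript (plain binary floating point; no decimal type involved):

console.log(0.1 + 0.2);          // 0.30000000000000004
console.log(0.1 + 0.2 === 0.3);  // false
console.log(0.3 - 0.1);          // 0.19999999999999998

A decimal data type is meant to make computations like these come out exactly as written.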

To help with my work on adding decimals to JavaScript, I've gone through a list of popular programming languages, taken from the 2022 StackOverflow developer survey. What follows is a brief summary of where these languages stand regarding decimals. The intention is to keep things simple. The purpose is:

  1. If a language does have decimals, say so;
  2. If a language does not have decimals, but at least one third-party library exists, mention it and link to it. If a discussion is underway to add decimals to the language, link to that discussion.

There is no intention to filter out any language in particular; I'm just working with a slice of languages found in the StackOverflow list linked to earlier. If a language does not have decimals, there may well be multiple third-party decimal libraries. I'm not aware of all libraries, so if I have linked to a minor library and neglected to link to a more high-profile one, please let me know. More importantly, if I have misrepresented the basic fact of whether decimals exist at all in a language, send mail.

C

C does not have decimals. But they're working on it! The C23 standard (as in, 2023) proposes to add new fixed bit-width data types (32, 64, and 128 bits) for these numbers.

C#

C# has decimals in its underlying .NET subsystem. (For the same reason, decimals also exist in Visual Basic.)

C++

C++ does not have decimals. But—like C—they're working on it!

Dart

Dart does not have decimals. But a third-party library exists.

Go

Go does not have decimals, but a third-party library exists.

Java

Java has decimals.

JavaScript

JavaScript does not have decimals. We're working on it!

Kotlin

Kotlin does not have decimals. But, in a way, it does: since Kotlin is running on the JVM, one can get decimals by using Java's built-in support.

PHP

PHP does not have decimals. An extension exists and at least one third-party library exists.

Python

Python has decimals.

Ruby

Ruby has decimals. Despite that, there is some third-party work to improve the built-in support.

Rust

Rust does not have decimals, but a crate exists.

SQL

SQL has decimals (the DECIMAL data type). (Here is the documentation for, e.g., PostgreSQL, and here is the documentation for MySQL.)

Swift

Swift has decimals.

TypeScript

TypeScript does not have decimals. However, if decimals get added to JavaScript (see above), TypeScript will probably inherit decimals, eventually.

April 04, 2024 12:52 PM

Getting started with Lean 4, your next programming language

I had the pleasure of learning about Lean 4 with David Christiansen and Joachim Breitner at their tutorial at BOBKonf 2024. I'm planning on doing a couple of formalizations with Lean and would love to share what I learn as a total newbie, working on macOS.

Needed tools

I'm on macOS and use Homebrew extensively. My simple go-to approach to finding new software is to do brew search lean. This revealed lean, and also surfaced elan. Running brew info lean showed me that that package (at the time I write this) installs Lean 3. But I know, out-of-band, that Lean 4 is what I want to work with. Running brew info elan looked better, but the output reminds me that (1) the information is for the elan-init package, not the elan cask, and (2) elan-init conflicts with both elan and the aforementioned lean. Yikes! This strikes me as a potential problem for the community, because I think Lean 3, though it still works, is presumably not where new Lean development should be taking place. Perhaps the Homebrew formula for Lean should be renamed lean3, and a new lean4 package should be made available. I'm not sure. The situation seems less than ideal, but in short, I have been successful with the elan-init package.

After installing elan-init, you'll have the elan tool available in your shell. elan is the tool used for maintaining different versions of Lean, similar to nvm in the Node.js world or pyenv.

Setting up a blank package

When I did the Lean 4 tutorial at BOB, I worked entirely within VS Code (…) and created a new standalone package using some in-editor functionality. At the command line, I use lake init to manually create a new Lean package. At first, I made the mistake of running this command, assuming it would create a new directory for me and set up any configuration and boilerplate code there. I was surprised to find, instead, that lake init sets things up in the current directory, in addition to creating a subdirectory and populating it. Using lake --help, I read about the lake new command, which does what I had in mind. So I might suggest using lake new rather than lake init.

What's in the new directory? Doing tree foobar reveals

foobar
├── Foobar
│   └── Basic.lean
├── Foobar.lean
├── Main.lean
├── lakefile.lean
└── lean-toolchain

Taking a look there, I see four .lean files. Here's what they contain:

Main.lean

import «Foobar»

def main : IO Unit :=
  IO.println s!"Hello, {hello}!"

Foobar.lean

-- This module serves as the root of the `Foobar` library.
-- Import modules here that should be built as part of the library.
import «Foobar».Basic

Foobar/Basic.lean

def hello := "world"

lakefile.lean

import Lake
open Lake DSL

package «foobar» where
  -- add package configuration options here

lean_lib «Foobar» where
  -- add library configuration options here

@[default_target]
lean_exe «foobar» where
  root := `Main

It looks like there's a little module structure here, and a reference to the identifier hello, defined in Foobar/Basic.lean and made available via Foobar.lean. I'm not going to touch lakefile.lean for now; as a newbie, it looks scary enough that I think I'll just stick to things like Basic.lean.

There's also an automatically created .git directory there, not shown in the directory output above.

Now what?

Now that you've got Lean 4 installed and set up a package, you're ready to dive in to one of the official tutorials. The one I'm working through is David's Functional Programming in Lean. There's all sorts of additional things to learn, such as all the different lake commands. Enjoy!

April 04, 2024 01:48 AM

April 03, 2024

Brian Kardell

The Blessing of the Strings

The Blessing of the Strings

Trusted Types have been a proposal by Google for quite some time at this point, but it's currently getting a lot of attention and work in all browsers (Igalia is working on implementations in WebKit and Gecko, sponsored by Salesforce and Google, respectively). I've been looking at it a lot and thought it's probably something worth writing about.

The Trusted Types proposal is about preventing cross-site scripting (XSS). It rides atop Content Security Policy (CSP) and allows website maintainers to say "require trusted-types". Once required, lots of the Web Platform's dangerous API surfaces ("sinks") which currently require a string will now require... well, a different type.

myElement.innerHTML (and a whole lot of other APIs) for example, would now require a TrustedHTML object instead of just a string.

You can think of TrustedHTML as an interface indicating that a string has been somehow specially "blessed" as safe... Sanitized.

the Holy Hand grenade scene from Monty Python's Holy Grail
And Saint Attila raised the string up on high, saying, 'O Lord, bless this thy string, that with it we may trust that it is free of XSS...' [ref].

Granting Blessings

The interesting thing about this is how one goes about blessing strings, and how this changes the dynamics of development and safety to protect from XSS.

To start with, there is a new global trustedTypes object (available in both window and workers) with a method called .createPolicy which can be used to create "policies" for blessing various kinds of input (createHTML, createScript, and createScriptURL). Trusted Types comes with the concept of a default policy, and the ability for you to register a specially named "default"...

//returns a policy, but you 
// don't really need to do anything 
// with the default one
trustedTypes.createPolicy(
    "default", 
    {
      createHTML: s => { 
          return DOMPurify.sanitize(s) 
      } 
    }
);

And now, the practical upshot is that all attempts to set HTML will be sanitized... So if there's some code that tries to do:

// if str contains
// `<img src="no" onerror="dangerous code">`;
target.innerHTML = str;

Then the onerror attribute will be automatically stripped (sanitized) before .innerHTML gets it.

Hey that's pretty cool!

one of the scenes where the castle guard is mocking arthur and his men
It's almost like you just put defenses around all that stuff and can just peer over the wall at would be attackers and make faces at them....

But wait... can't someone come along then and just create a more lenient policy called default?

No! That will throw an exception!

Also, you don't have to create a default. If you don't, and someone tries to use one of those methods to assign a string, it will throw.
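As a rough sketch of what that looks like (assuming a page served with require-trusted-types-for 'script' and no default policy registered; someElement is just a stand-in, and the exact error type and message vary by browser):

try {
  someElement.innerHTML = "<b>plain string</b>"; // a raw string fed to a sink
} catch (e) {
  // enforcement rejects the assignment instead of injecting the markup
  console.log("Rejected by Trusted Types:", e);
}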

The only thing this enforcement cares about is that it is one of these "blessed" types. Website administrators can also provide (in the header) the names of one or more policies that are allowed to be created.

Any attempts to define a policy not in that list will throw (it's a bit more complicated than that, see Name your Policy below). Let's imagine that in the header we specified that a policy named "sanitize" is allowed to be created.

Maybe you can see some of why that starts to get really interesting. In order to use any of those APIs (at all), you'd need access to a policy in order to bless the string. But because the policy which can do that blessing is a handle, it's up to you which code you give it to...

{
  const sanitizerPolicy =
      trustedTypes.createPolicy(
        "sanitize",
        {
          createHTML: s => {
            return DOMPurify.sanitize(s)
          }
        }
      );


    // give someOtherModule access to a sanitization policy
    someOtherModule.init(sanitizerPolicy)

    // yetAnotherModule can't even sanitize, any use of those
    // APIs will throw
    yetAnotherModule.foo()
}

// Anything out here also doesn't have 
// access to a sanitization policy

What's interesting about this is that the thing doing the trusting is actually on the client as well - but the pattern ensures that this becomes a considerably more finite problem. It is much easier to audit whether the "trust" is warranted. That is, we can look at the above to see that there is only one policy and it only supports creating HTML. We can see that the trust there is placed in DOMPurify, and even that amount of trust is only provided to select modules.

Finally, most importantly: It is a pattern that is machine enforceable. Anything that tries to use any of those APIs without a blessed string (a Trusted Type) will fail... Unless you ask it not to.

Don't Throw, Just Help?

Shutting down all of those APIs after the fact is hard because all of those dangerous APIs are also really useful and therefore widely used. As I said earlier, auditing to find and understand all uses of them is pretty difficult. Chances are pretty good that there might just be a lot more unsafe stuff floating around in your site than you expected.

Instead of the Content-Security-Policy header, you can send Content-Security-Policy-Report-Only and include a report-to /csp-violation-report-endpoint/ directive, where /csp-violation-report-endpoint/ is an endpoint path (on the same origin). If set, whenever violations occur, browsers should send a request reporting the violation to that endpoint (JSON formatted with lots of data).

The general idea is that it is then pretty easy to turn this on and monitor your site to discover where you might have some problems, and begin to work through them. This should be especially good for your QA environment. Just keep in mind that the report doesn't actually prevent the potentially bad things from happening, it just lets you know they exist.

Shouldn't there just be a standard sanitizer too?

Yes!! That is also a thing that is being worked on.

Name Your Policy

I'm not going to lie, I found CSP headers a little confusing, both to read and to figure out how the pieces relate. You might see a header set up to report only...

Content-Security-Policy-Report-Only: report-uri /csp-violation-report-endpoint; default-src 'self'; require-trusted-types-for 'script'; trusted-types one two;

Believe it or not, that's a fairly simple one. Basically though, you split it up on semi-colons and each of those is a directive. The directive has a name like "report-uri" followed by whitespace and then a list of values (potentially containing only one) which are whitespace separated. There are also keyword values, which are quoted.
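Here's a tiny sketch of that mental model in JavaScript (not a spec-grade CSP parser, just an illustration of the shape):

const header = "report-uri /csp-violation-report-endpoint; default-src 'self'; require-trusted-types-for 'script'; trusted-types one two;";

const directives = header
  .split(";")                                   // one chunk per directive
  .map(chunk => chunk.trim())
  .filter(chunk => chunk.length > 0)
  .map(chunk => {
    const [name, ...values] = chunk.split(/\s+/); // name, then whitespace-separated values
    return { name, values };
  });

console.log(directives.find(d => d.name === "trusted-types").values); // ["one", "two"]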

So, the last two parts of this are about Trusted Types. The first, require-trusted-types-for is about what gets some kind of enforcement and really the only thing you can put there currently is the keyword 'script'. The second, trusted-types is about what policies can be created.

Note that I said "some kind of enforcement" because the above is "report only", which means those things will report, but not actually throw, while if we just change the name of the header from Content-Security-Policy-Report-Only to Content-Security-Policy lots of things might start throwing - which didn't greatly help my exploration. So, here's a little table that might help...

If the directives are... then...

  • (missing)
    You can create whatever policies you want (except duplicates), but they aren't enforced in any way.
  • require-trusted-types-for 'script';
    You can create whatever policies you want (except duplicates), and they are enforced. All attempts to assign strings to those sinks will throw. This means if you create a policy named default, it will 'bless' strings through that automatically, but it also means anyone can create any policy to 'bless' strings too.
  • trusted-types (with no value)
    You cannot create any policies whatsoever. Attempts to will throw.
  • trusted-types 'none'
    Same as with no value.
  • trusted-types a b
    You can call createPolicy with the names 'a' and 'b' exactly once each. Attempts to call with other names (including 'default'), or repeatedly, will throw.
  • trusted-types default
    You can call createPolicy with the name 'default' exactly once. Attempts to call with other names, or repeatedly, will throw.
  • require-trusted-types-for 'script'; trusted-types a
    You can call createPolicy with the name 'a' exactly once. Attempts to call with other names (including 'default'), or repeatedly, will throw. All attempts to assign strings to those sinks will throw unless they are 'blessed' by a function in a policy named 'a'.

April 03, 2024 04:00 AM

April 02, 2024

Maíra Canal

Linux 6.8: AMD HDR and Raspberry Pi 5

The Linux kernel 6.8 came out on March 10th, 2024, bringing brand-new features and plenty of performance improvements on different subsystems. As part of Igalia, I’m happy to be an active part of many features that are released in this version, and today I’m going to review some of them.

Linux 6.8 is packed with a lot of great features, performance optimizations, and new hardware support. In this release, we can check the Intel Xe DRM driver experimentally, further support for AMD Zen 5 and other upcoming AMD hardware, initial support for the Qualcomm Snapdragon 8 Gen 3 SoC, the Imagination PowerVR DRM kernel driver, support for the Nintendo NSO controllers, and much more.

Igalia is widely known for its contributions to Web Platforms, Chromium, and Mesa. But, we also make significant contributions to the Linux kernel. This release shows some of the great work that Igalia is putting into the kernel and strengthens our desire to keep working with this great community.

Let’s take a deep dive into Igalia’s major contributions to the 6.8 release:

AMD HDR & Color Management

You may have seen the release of a new Steam Deck last year, the Steam Deck OLED. What you may not know is that Igalia helped bring this product to life by putting some effort into the AMD driver-specific color management properties implementation. Melissa Wen, together with Joshua Ashton (Valve), and Harry Wentland (AMD), implemented several driver-specific properties to allow Gamescope to manage color features provided by the AMD hardware to fit HDR content and improve gamers’ experience.

She has explained all the features implemented in the AMD display kernel driver in two blog posts and a 2023 XDC talk.

Async Flip

André Almeida worked together with Simon Ser (SourceHut) to provide support for asynchronous page-flips in the atomic API. This feature targets users who want to present a new frame immediately, even after missing a V-blank. This feature is particularly useful for applications with high frame rates, such as gaming.

Raspberry Pi 5

Raspberry Pi 5 was officially released in October 2023 and Igalia was ready to bring top-notch graphics support for it. Although we still can’t use the RPi 5 with the mainline kernel, it is superb to see some pieces coming upstream. Iago Toral worked on implementing all the kernel support needed for the V3D 7.1.x driver.

With those kernel patches, by the time the RPi 5 was released, it already included a fully compliant OpenGL ES 3.1 and Vulkan 1.2 driver implemented by Igalia.

GPU stats and CPU jobs for the Raspberry Pi 4/5

Apart from the release of the Raspberry Pi 5, Igalia is still working on improving the whole Raspberry Pi environment. I worked together with José Maria “Chema” Casanova on implementing support for GPU stats in the V3D driver. This means that RPi 4/5 users can now access the GPU usage percentage, and they can access the statistics per process or globally.

I also worked together with Melissa on implementing CPU jobs for the V3D driver. As the Broadcom GPU isn’t capable of performing some operations, the Vulkan driver uses the CPU to compensate. In order to avoid stalls in job submission, CPU jobs are now part of the kernel and can be easily synchronized with synchronization objects.

If you are curious about the CPU job implementation, you can check this blog post.

Other Contributions & Fixes

Sometimes we don’t contribute to a major feature in the release, but we can still help by improving documentation and sending fixes. André also contributed to this release by documenting the different AMD GPU reset methods, making them easier for future users to understand.

During Igalia’s efforts to improve the general users’ experience on the Steam Deck, Guilherme G. Piccoli noticed a message in the kernel log and readily provided a fix for this PCI issue.

Outside of the Steam Deck world, we can check some of Igalia’s work on the Qualcomm Adreno GPUs. Although most of our Adreno-related work is located at the user-space, Danylo Piliaiev sent a couple of kernel fixes to the msm driver, fixing some hangs and some CTS tests.

We also had contributions from our 2023 Igalia CE student, Nia Espera. Nia’s project was related to mobile Linux and she managed to write a couple of patches to the kernel in order to add support for the OnePlus 9 and OnePlus 9 Pro devices.

If you are a student interested in open source and would like to have a first exposure to the professional world, check if we have openings for the Igalia Coding Experience. I was a CE student myself and being mentored by an Igalian was an incredible experience.

Check the complete list of Igalia’s contributions for the 6.8 release

Authored (57):

André Almeida (2)

Danylo Piliaiev (2)

Guilherme G. Piccoli (1)

Iago Toral Quiroga (4)

Maíra Canal (17)

Melissa Wen (27)

Nia Espera (4)

Signed-off-by (88):

André Almeida (4)

Danylo Piliaiev (2)

Guilherme G. Piccoli (1)

Iago Toral Quiroga (4)

Jose Maria Casanova Crespo (2)

Maíra Canal (28)

Melissa Wen (43)

Nia Espera (4)

Acked-by (4):

Jose Maria Casanova Crespo (2)

Maíra Canal (1)

Melissa Wen (1)

Reviewed-by (30):

André Almeida (1)

Christian Gmeiner (1)

Iago Toral Quiroga (20)

Maíra Canal (4)

Melissa Wen (4)

Tested-by (1):

Guilherme G. Piccoli (1)

April 02, 2024 11:00 AM

March 20, 2024

Jani Hautakangas

Bringing WebKit back to Android.

It’s been quite a while since the last blog post about WPE-Android, but that doesn’t mean WPE-Android hasn’t been in development. The focus has been on stabilizing the runtime and implementing the most crucial core features to make it easier to integrate new APIs into WPEView.

Main building blocks #

WPE-Android has three main building blocks:

  • Cerbero
    • Cross-platform build aggregator that is used to build WPE WebKit and all of its dependencies for Android.
  • WPEBackend-Android
    • Implements Android-specific graphics buffer support and buffer sharing between the WebKit UIProcess and WebProcess.
  • WPEView
    • Allows displaying web content in an activity layout using WPEWebKit.

WPE-Android high-level design

What’s new #

The list of all work completed so far would be quite long, as there have been no official releases or public announcements, with the exception of the last blog post. Since that update, most of the efforts have been focused on ‘under the hood’ improvements, including enhancing stability, adding support for some core WebKit features, and making general improvements to the development infrastructure.

Here is a list of new features worth mentioning:

  • Based on WPE WebKit 2.42.1, the project was branched for the 2.42.x series. Future work will continue in the main branch for the next release.
  • Dropped support for 32-bit platforms (x86 and armv7). Only arm64-v8a and x86_64 are supported
  • Integration to Android main loop so that WPE WebKit GLib main loop is driven by Android main loop
  • Process-Swap On Navigation aka PSON
  • Added ASharedMemory support to WebKit SharedMemory
  • Hardware-accelerated multimedia playback
  • Fullscreen support
  • Cookies management
  • ANGLE based WebGL
  • Cross process fence insertion for composition synchronization with Surface Flinger
  • WebDriver support
  • GitHub Actions build bots
  • GitHub Actions WebDriver test bots

Demos #

WPEWebKit-powered web view (WPEView) #

The demo uses the WPE-Android MiniBrowser sample application to show basic web page loading and touch-based scrolling, usage of a simple cookie manager to clear the page’s data usage, and finally loads the popular “Aquarium sample” to show a smooth (60 FPS) WebGL animation running thanks to HW acceleration support.

WebDriver #

The demo shows how to run a WebDriver test with the emulator. Detailed instructions on how to run WebDriver tests can be found in the README.md.

The test requests a website through the Selenium remote WebDriver. It then replaces the default behavior of window.alert on the requested page by injecting and executing a JavaScript snippet. After loading the page, it performs a click() action on the element that calls the alert. This results in the text ‘cheese’ being displayed right below the ‘click me’ link.
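The injected snippet is along these lines (a sketch only; the real test lives in the repository, and the selector here is made up for illustration):

// Replace the blocking alert() dialog: render the alert text in the page instead.
window.alert = (message) => {
  const note = document.createElement("p");
  note.textContent = message;                        // "cheese" in the demo
  document.querySelector("a#click-me").after(note);  // hypothetical selector for the 'click me' link
};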

The test contains three building blocks:

  • WPE WebDriver running on emulator
  • HTTP server serving test.html web page
  • Selenium python test script executing the test

What’s next #

We have ambitious plans for the coming months regarding the development of WPE Android, focusing mainly on implementing additional features, stabilization, and performance improvements.

As for implementing new features, now that the integration with the WPE WebKit runtime has reached a more robust state, it’s time to start adding more of the APIs that are still missing in WPEView compared to other webviews on Android and to enable other web-facing features supported by WebKit. This effort, along with adding support for features like HTTP/2 and the remote Web Inspector, will be a major focus.

As for stabilization and performance, having WebDriver support will be very helpful as it will enable us to identify and fix issues promptly and thus help make WPE Android more stable and feature-complete. We would also like to focus on conformance testing compared to other web views on Android, which should help us prioritize our efforts.

The broader goal for this year is to develop WPE-Android into an easy-to-use platform for third-party projects, offering a compelling alternative to other webviews on the Android platform. We extend many thanks to the NLNet Foundation, whose support for WPE Android through a grant from the NGI Zero Core fund will be instrumental in helping us achieve this goal!

Try it yourself #

WPE-Android is still considered a prototype, and it’s a bit difficult to build. However, if you want to try it out, you can follow the instructions in the README.md file. Additionally, you can use the project’s issue tracker to report problems or suggest new features.

March 20, 2024 12:00 AM

March 17, 2024

Qiuyi Zhang (Joyee)

Memory leak regression testing with V8/Node.js, part 3 - heap iteration-based testing

In the previous blog post, I described the heap snapshot trick as an “abuse” of the heap snapshot API, because heap snapshots are not designed to interact with the finalizers run in the heap

March 17, 2024 07:28 PM

Memory leak regression testing with V8/Node.js, part 1 - memory usage-based testing

Like many other relatively big pieces of software, Node.js is no stranger to memory leaks, and with them, fixes and regression tests

March 17, 2024 07:24 PM

Memory leak regression testing with V8/Node.js, part 2 - finalizer-based testing

In the previous blog post, I talked about how Node.js used memory usage measurement to test against memory leaks. Sometimes that’s good enough to provide valid tests

March 17, 2024 07:20 PM

March 13, 2024

Brian Kardell

How We Fund the Web Ecosystem

How We Fund the Web Ecosystem

On Tuesday (March 12th, 2024), Robin Berjon and Eric Meyer and I organized, led and scribed a session during W3C breakouts day about how we fund the web ecosystem…

Every year the W3C has a week long giant set of in-person meetings called TPAC. A nice feature of those meetings has always been “Breakouts Day” which is a day where people can propose sessions about pretty much anything and we try to organize a schedule around the ones that seem interesting to enough people.

This year, the W3C decided to try a second Breakouts Day that is not at the same time as TPAC, and was purely online.

Over the last several years, I’ve written several pieces about different aspects of the health of the web ecosystem and led a podcast series with quite a few episodes about that. In those pieces I’ve argued that while the web ecosystem has become the infrastructure for nearly everything, our models for funding and prioritization of the last 20 years have proven not only inadequate and problematic, but ultimately fragile; they cannot last. The only questions, I’ve argued, are how soon and what happens next. Are we ready for it? (hint: no).

So, I talked to a few people and we proposed this as a topic. It was well attended. We began with a short presentation (we made a very detailed outline together, but credit goes to Robin for the great slides).

We organized the presentation into sort of 2 parts. First I presented explaining the problems and why we believe this requires our attention and action. First we have to admit that we have a problem, right? And that this is a problem that we should be concerned with… If not us, who? If not now, when?

Then I outlined that there are many possible solutions and elements of solutions that we can discuss (or try), but all of them share some common elements:

  1. We need a way to take in common money, and a way to actually encourage money into the pot.
  2. We need a way to efficiently and fairly prioritize the money in the pot toward actual work.

I highlighted that there are existing things we can already try (and are trying), and that we should really start trying more.

After this, Robin presented a bigger possible vision we tried to lay out with lots of still fuzzy areas and questions - but effectively: We create an institution which is (through one of a few possibilities) able to compel participation into a system which enforces more sustainable (and fairer) characteristics which guarantee support for the infrastructure of the web.

You can get a very good idea of what was actually presented from the detailed outline that we shared too.

But all of this was only the initial short presentation which I think was only maybe 5-10 minutes. The rest was the point of the breakout: Actual discussion.

I think it was very positive actually. The main thing that impressed me is that there was seemingly no push back or questioning at all of the premise. We agree with the fundamentals - that as I explained in Webrise, it’s fragile from this perspective, and we need to care about it.

Rick Byers (from Google, but not speaking for Google) mentioned what they observed in Chromium contributions and that they also had concerns about diversity, both in terms of contributions to a single engine, and multiple engines. He also mentioned that Chromium was spinning up a new collective idea (not yet announced).

Just this morning, we helped launch the Servo collective. The timing is purely coincidental. I’d also note that in part of my presentation I mentioned exploring ways that governments can incentivize and forgot to mention that there have been interesting developments in some open source funding happening this way recently, and to note that the White House recently made a statement that Future Software Should Be Memory Safe. If anyone has a good ‘in’ at the White House please make the case that if you want to know a good place to invest to be sure a lot of future stuff is memory safe, it’s probably browsers - and especially the one written in Rust :). The collective would be happy to accept the White House’s check.

There were also some interesting questions about whether Web Monetization could be related to this, or is just a wholly separate problem, about how the advertising model is exceptionally progressive, and where other investment comes from currently.

Happily minutes are available if you’re interested - and we’ll be trying to organize some immediate discussions on where we go from here through this repo which also has a rough outline of how one solution might work.

March 13, 2024 04:00 AM

March 11, 2024

Brian Kardell

The Darkening

The Darkening

Some additions to my half-light library for styling Shadow DOM.

In case you haven't followed, a number of my recent posts have been thinking about how Shadow DOM falls short, and how we can use the tools we have today to explore potential solutions and improvements. This has led to good feedback and quick iteration on a tiny library called half-light.

This library lets page authors selectively "push" styles down into shadow roots as adopted @layers. It plugs into CSS's media queries, and allows selectivity on both ends: Which styles and which specific shadow roots. I'm not going to recap the whole concept and interface here because it would just be regurgitating the stuff that's already in the link above. What I want to write about is the latest set of improvements to half-light.

no-light

An important "feature" of half-light is that it doesn't require web components to be built in a special way in order to achieve this. Alternative/Previous ideas required elements to subclass a special element, for example, to say "Yes, I am styleable from above". However, I don't believe that that is practical, for reasons I explained in Lovely Trees. Effectively, it is very difficult to grow iterations on that approach naturally, and there are already so many wonderful components out there about which a number of people say "I wish I could use that, but alas I cannot provide some basic styling". This library just lets them do exactly that: Provide styling from the outside (with some caveats below), to see what it's like to actually live with that possibility. Do they love it? Ultimately regret it?

I don't believe that this is some kind of breach of contract to allow that kind of styling from the outside in open shadow roots. The simple fact that the library has to do hardly anything shows just how easy it is, technically, for any page author to do it already today. No "new powers" have been added.

It's harder to make this case with closed shadow roots. While it is entirely possible for page authors to take control of closed shadow roots too, they have to achieve this by changing their nature and effectively saying "sorry, no, your closed shadow roots will be open roots in my page". However much I am not a fan of closed roots, I do think that someone who made their root closed expressed a pretty strong opinion that you're not supposed to touch the inside, even if you think you want to.

What I hadn't considered is that open roots could exist inside of closed roots, and with my pattern you could still have styled them the same way. That felt wrong, so I fixed that by making the whole subtree "darkened". That is, half-light can't get any light past it.

I also made a check for an attribute (or property) called darkened which can achieve this for a light DOM as well. You set it on the shadow host. Thus, if you write myElement.darkened = true or <my-element darkened> it will prevent half-light from applying to the whole subtree. That trick can be used by both custom elements and page authors directly if they find it helpful. Similarly, it's benign otherwise, so component authors can start adding it if they want to and whether page authors happen to be using half-light or not doesn't matter.
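In other words, something like this (a sketch based on the description above; my-element is just a stand-in component name):

// Opt a whole subtree out of half-light's pushed-down styles.
const host = document.querySelector("my-element");
host.darkened = true;               // via the property, or...
host.setAttribute("darkened", "");  // ...equivalent to writing <my-element darkened> in markup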

Optimizations

The very first edition of half-light was a little greedy as to processing and probably was doing a little too much work. I didn't hear anyone say that they actually experienced a problem, but it was clearly not as efficient as it could be, so, I improved that generally.

But I also built half-light to be a little resilient to where in the head it was loaded, and to conveniently work with dev tools. That means you can go ahead and live-change the CSS; it's aware of that and will propagate changes down into your components. This is achieved via a MutationObserver.

Now, in practice I don't think this is likely to be much of a problem in most cases. Once the document has settled down enough to start rendering, I'm not sure how heavy head mutations are these days - but it seems pretty reasonable to be able to say "you don't have to keep observing", so I added that ability too. If you include the disable-live-half-light attribute on the script tag that you use to include half-light, it will stop monitoring.

There's a practical upshot to that as well: This means it can also disengage some bookkeeping which technically leaks memory (this won't practically cause you issues on your blog or something; it's really mainly if you are doing a lot of dynamic stuff in a long-lived application).

Feedback and evolution

What I've appreciated most about this effort is the feedback and iteration. It's kind of amazing to look back over such a brief period of time and see all of the evolution and improvements toward solving a problem. I hope that more and more people find it valuable to explore and let us know how it goes. Real world experimentation and feedback is so valuable toward ultimately developing a standard solution. Thanks to everyone who has reached out, filed issues, written a blog post, or discussed it on a podcast.

If you haven't already, please leave an emoji or a comment on this github issue to help me collect sentiment toward a solution like this in a central place.

March 11, 2024 04:00 AM

March 05, 2024

José Dapena

Maintaining downstreams of Chromium: why downstream?

Chromium, the web browser open source project Google Chrome is based on, can be considered nowadays the reference implementation of the web platform. As such, it is the first choice when implementing the web platform in a software platform or product.

Why is it like this? In this blog post I am going to introduce the topic, and then review the different reasons why a downstream Chromium is used in different projects.

A series of blog posts

This is the first of a series of blog posts, where I am going through several aspects and challenges of maintaining a downstream project using Chromium as its upstream project.

They will be mostly based on the discussions in two events. First, on the Web Engines Hackfest 2023 breakout session with the same title that I chaired in A Coruña. Then, on my BlinkOn 18 talk in November 2023, in Sunnyvale.

Some definitions

Before starting the discussion of the different aspects, let’s clarify how I will use several terms.

Repository vs. project

I am going to refer to a repository strictly as version-controlled storage of code. Then, a project (specifically, a software project) is the community of people that share goals and some kind of organization in order to maintain one or several software products.

So, a project may use several repositories for their goals.

In this discussion I will talk about Chromium, an open source project that targets the implementation of the web platform user agent, a web browser, for different platforms. As such, it uses a number of repositories (src, v8 and more).

Downstream and upstream

I will use the terms downstream and upstream to refer to the relationship between different software projects’ version control repositories.

If there is a software project repository (typically open source), and a new repository is created that contains all or part of the original repository, then:

  • Upstream project will be the original repository.
  • Downstream project is the new repository.

It is important to highlight that different things can happen to the downstream repository:

  • This copy could be a one time event, so the downstream repository becomes an independent fork, and there may be no interest in tracking the upstream evolution. This happens often for abandoned repositories, where a different set of people start an independent project. But there could be other reasons.
  • There is the intent to track the upstream repository changes. So the downstream repository evolves as the upstream repository does too, but with some specific differences maintained on top of the original repository.

Why using Chromium?

Nowadays, the web platform is a solid alternative for providing content to users. It allows modern user interfaces, based on well-known standards, and integrates well with local and remote services. The gap between native applications and web content has been reduced, so it is a good alternative quite often.

But, when integrating web contents, product integrators need an implementation of the web platform. It is no surprise that Chromium is the most used, for a number of reasons:

  • It is open source, with a license that allows adapting it to new product needs.
  • It is well maintained and up to date. Even pushing through standardization for improving it continuously.
  • It is secure, both from architecture and maintenance model point of view.
  • It provides integration points to tailor the implementation to one’s needs.
  • It supports the most popular software platforms (Windows, Android, Linux, …) for integrating new products.
  • On top of the web platform itself, it provides an implementation for many of the components required to build a modern web browser.

Still, there are other good alternative choices for integrating the web, such as WebKit (especially WPE for embedded use cases), or using the system-provided web components (Android or iOS web view, …).

Though, in this blog post I will focus on the Chromium case.

Why downstreaming Chromium?

But, why do different projects need to use downstream Chromium repositories?

The main reason a project needs a downstream repository is simple: to add changes that are not upstream. This can be for a variety of reasons:

  • Downstream changes that are not allowed by upstream, e.g. because they would make the upstream project harder to maintain, or would not be tested often.
  • Downstream changes that the downstream project does not want to add to upstream.

Let’s see some examples of changes of both types.

Hardware and OS adaptation

This is when downstream adds support for a hardware target or OS that is not originally supported in the upstream Chromium project.

Upstream Chromium provides an abstraction layer for that purpose named Ozone, which allows adapting it to the OS, desktop environment, and system graphics compositor. But there are other abstraction layers for media acceleration, accessibility or input methods.

The Wayland protocol adaptation started as a downstream effort, as upstream Chromium did not intend to support Wayland at that time. Eventually it evolved into an official upstream Ozone backend.

An example? LGE webOS Chromium port.

Differentiation

The previous case mostly forces you to have a downstream project or repository. But there are also some cases where this is intended: there is the will to have some features in the downstream repository and not in upstream, an intended differentiation.

Why would anybody want that? Some typical examples:

  • A set of features that the downstream project owners consider to make the project better in some way, and want them to be kept downstream. This can happen when a new browser is shipped, and it contains features that make the product offering different, and, in some ways, better, than upstream Chrome. That can be a different user experience, some security features, better privacy…
  • Adaptation to a different product brand. Each browser or browser-based product will want to have its specific brand instead of upstream Chromium brand.

Examples of this:

  • Brave browser, with completely different privacy and security choices.
  • ARC browser, with an innovative user experience.
  • Microsoft Edge, with tight Windows OS integration and corporate features.

Hybrid application runtimes

And one last interesting case: integrating the web platform for developing hybrid applications: those that mix parts of the user interface implemented in a native toolkit, and parts implemented using the web platform.

Though Chromium includes official support for hybrid applications on Android, with the Android Web View, other toolkits also provide web application support, and the integration of those belongs, in Chromium’s case, to downstream projects.

Examples?

What’s next?

In this blog post I presented different reasons why projects end up maintaining a downstream fork of Chromium.

In the next blog post I will present one of the main challenges when maintaining a downstream of Chromium: the different rebase and upgrade strategies.

by José Dapena Paz at March 05, 2024 03:54 PM

February 26, 2024

Andy Wingo

on the impossibility of composing finalizers and ffi

While poking the other day at making a Guile binding for Harfbuzz, I remembered why I don’t much do this any more: it is impossible to compose GC with explicit ownership.

Allow me to illustrate with an example. Harfbuzz has a concept of blobs, which are refcounted sequences of bytes. It uses these in a number of places, for example when loading OpenType fonts. You can get a peek at the blob’s contents back with hb_blob_get_data, which gives you a pointer and a length.

Say you are in LuaJIT. (To think that for a couple years, I wrote LuaJIT all day long; now I can hardly remember.) You get a blob from somewhere and want to get its data. You define a wrapper for hb_blob_get_data:

local hb = ffi.load("harfbuzz")
ffi.cdef [[
typedef struct hb_blob_t hb_blob_t;

const char *
hb_blob_get_data (hb_blob_t *blob, unsigned int *length);
]]

Presumably you then arrange to release LuaJIT’s reference on the blob when GC collects a Lua wrapper for a blob:

ffi.cdef [[
void hb_blob_destroy (hb_blob_t *blob);
]]

function adopt_blob(ptr)
  return ffi.gc(ptr, hb.hb_blob_destroy)
end

OK, so let’s say we get a blob from somewhere, and want to copy out its contents as a byte string.

function blob_contents(blob)
   local len_out = ffi.new('unsigned int')
   local contents = hb.hb_blob_get_data(blob, len_out)
   local len = len_out[0];
   return ffi.string(contents, len)
end

The thing is, this code is as correct as you can get it, but it’s not correct enough. In between the call to hb_blob_get_data and, well, anything else, GC could run, and if blob is not used in the future of the program execution (the continuation), then it could be collected, causing the hb_blob_destroy finalizer to release the last reference on the blob, freeing contents: we would then be accessing invalid memory.

Among GC implementors, it is a truth universally acknowledged that a program containing finalizers must be in want of a segfault. The semantics of LuaJIT do not prescribe when GC can happen and what values will be live, so the GC and the compiler are not constrained to extend the liveness of blob to, say, the entirety of its lexical scope. It is perfectly valid to collect blob after its last use, and so at some point a GC will evolve to do just that.

I chose LuaJIT not to pick on it, but rather because its FFI is very straightforward. All other languages with GC that I am aware of have this same issue. There are but two work-arounds, and neither are satisfactory: either develop a deep and correct knowledge of what the compiler and run-time will do for a given piece of code, and then pray that knowledge does not go out of date, or attempt to manually extend the lifetime of a finalizable object, and then pray the compiler and GC don’t learn new tricks to invalidate your trick.

This latter strategy takes the form of “remember-this” procedures that are designed to outsmart the compiler. They have mostly worked for the last few decades, but I wouldn’t bet on them in the future.

Another way to look at the problem is that once you have a system working—though, how would you know it’s correct?—then you either never update the compiler and run-time, or you become fast friends with whoever maintains your GC, and probably your compiler too.

For more on this topic, as always Hans Boehm has the first and last word; see for example the 2002 Destructors, finalizers, and synchronization. These considerations don’t really apply to destructors, which are used in languages with ownership and generally run synchronously.

Happy hacking, and be safe out there!

by Andy Wingo at February 26, 2024 10:05 AM

February 23, 2024

Manuel Rego

Servo at FOSDEM 2024

Following the trend started on my last blog post this is a blog post about Servo presence in a new event, this time FOSDEM 2024. This was my first time at FOSDEM, which is a special and different conference with lots of people and talks; it was quite an experience!

Picture of Rakhi during her Servo talk at FOSDEM with Tauri experiment slide in the background.
Servo talk by Rakhi Sharma at FOSDEM 2024

Embedding Servo in Rust projects #

My colleague Rakhi Sharma gave a great talk about Servo in the Rust devroom. The talk went deep into how Servo can be embedded by other Rust projects, showing the work on Servo’s Minibrowser, and the latest experiments related to the integration with Tauri through their cross-platform WebView library called wry. It’s worth highlighting that the latter has been sponsored by the NLnet Foundation, big thanks for your support! Attending FOSDEM was nice as we also met Daniel Thompson-Yvetot from Tauri, whom we have been working with as part of this collaboration, and discussed improvements and plans for the future.

Slides and video of the talk are already available online. Don’t miss it if you’re interested in the most recent news around Servo development and how to embed it in other projects.

Servo at Linux Foundation Europe stand #

Servo is a Linux Foundation Europe project, and they had a stand at FOSDEM. This allowed us to show Servo to people passing by and explain the ongoing development efforts on the project. There was also some nice Servo swag that people really liked. The stand gave us the chance to talk to different people about Servo. Most of them were very interested in the project’s revival and are hopeful about a promising future for the project.

Picture of Linux Foundation Europe stand at FOSDEM 2024 with different Servo swag (stickers, flyers, buttons, cups, caps, teddy bears). Also in the picture there is a screen showing a video of Servo rendering WebGPU content.
Servo swag at Linux Foundation Europe stand at FOSDEM 2024

As part of our attendance at FOSDEM we had the chance to meet some people we have been working with recently. We met Taym Haddadi, an active Servo contributor; it was nice to talk to people contributing to the project and meet them in real life. In addition, we also talked briefly with NLnet folks, who have one of the most popular stands, with stickers of all the projects they fund. Furthermore, we also had the chance to talk to Outreachy folks, as we’re working on getting Servo back into the Outreachy internship program this year.

Other Igalia talks at FOSDEM 2024 #

Finally, I'd like to highlight that Igalia's presence at FOSDEM was much bigger than just Servo-related stuff. There was a large group of Igalians participating in the event, and we gave a bunch of talks there.

February 23, 2024 12:00 AM

February 20, 2024

Carlos García Campos

A Clarification About WebKit Switching to Skia

In the previous post I talked about the plans of the WebKit ports currently using Cairo to switch to Skia for 2D rendering. Apple ports don't use Cairo, so they won't be switching to Skia. I understand the post title was confusing; I'm sorry about that. The original post has been updated for clarity.

by carlos garcia campos at February 20, 2024 06:11 PM

Alex Bradbury

Clarifying instruction semantics with P-Code

I've recently had a need to step through quite a bit of disassembly for different architectures, and although some architectures have well-written ISA manuals it can be a bit jarring switching between very different assembly syntaxes (like "source, destination" for AT&T vs "destination, source" for just about everything else) or tedious looking up different ISA manuals to clarify the precise semantics. I've been using a very simple script to help convert an encoded instruction to a target-independent description of its semantics, and thought I may as well share it as well as some thoughts on its limitations.

instruction_to_pcode

The script is simplicity itself, thanks to the pypcode bindings to Ghidra's SLEIGH library, which provides an interface to convert an input to the P-Code representation. Articles like this one provide an introduction, and there's the reference manual in the Ghidra repo, but it's probably easiest to just look at a few examples. P-Code is used as the basis of Ghidra's decompiler and provides a consistent, human-readable description of the semantics of instructions for supported targets.

Here's an example aarch64 instruction:

$ ./instruction_to_pcode aarch64 b874c925
-- 0x0: ldr w5, [x9, w20, SXTW #0x0]
0) unique[0x5f80:8] = sext(w20)
1) unique[0x7200:8] = unique[0x5f80:8]
2) unique[0x7200:8] = unique[0x7200:8] << 0x0
3) unique[0x7580:8] = x9 + unique[0x7200:8]
4) unique[0x28b80:4] = *[ram]unique[0x7580:8]
5) x5 = zext(unique[0x28b80:4])

In the above you can see that the disassembly for the instruction is dumped, and then 5 P-Code instructions are printed showing the semantics. These P-Code instructions directly use the register names for architectural registers (as a reminder, AArch64 has 64-bit GPRs X0-X30 with the bottom halves accessible through W0-W30). Intermediate state is stored in unique[addr:width] locations. So the above instruction sign-extends w20, adds it to x9, reads a 32-bit value from the resulting address, and then zero-extends it to 64 bits when storing into x5.

The output is somewhat more verbose for architectures with flag registers, e.g. cmpb $0x2f,-0x1(%r11) produces:

./instruction_to_pcode x86-64 --no-reverse-input "41 80 7b ff 2f"
-- 0x0: CMP byte ptr [R11 + -0x1],0x2f
0) unique[0x3100:8] = R11 + 0xffffffffffffffff
1) unique[0xbd80:1] = *[ram]unique[0x3100:8]
2) CF = unique[0xbd80:1] < 0x2f
3) unique[0xbd80:1] = *[ram]unique[0x3100:8]
4) OF = sborrow(unique[0xbd80:1], 0x2f)
5) unique[0xbd80:1] = *[ram]unique[0x3100:8]
6) unique[0x28e00:1] = unique[0xbd80:1] - 0x2f
7) SF = unique[0x28e00:1] s< 0x0
8) ZF = unique[0x28e00:1] == 0x0
9) unique[0x13180:1] = unique[0x28e00:1] & 0xff
10) unique[0x13200:1] = popcount(unique[0x13180:1])
11) unique[0x13280:1] = unique[0x13200:1] & 0x1
12) PF = unique[0x13280:1] == 0x0

But simple instructions that don't set flags do produce concise P-Code:

$ ./instruction_to_pcode riscv64 "9d2d"
-- 0x0: c.addw a0,a1
0) unique[0x15880:4] = a0 + a1
1) a0 = sext(unique[0x15880:4])

Other approaches

P-Code was an intermediate language I'd encountered before and of course benefits from having an easy to use Python wrapper and fairly good support for a range of ISAs in Ghidra. But there are lots of other options - angr (which uses Vex, taken from Valgrind) compares some options and there's more in this paper. Radare2 has ESIL, but while I'm sure you'd get used to it, it doesn't pass the readability test for me. The rev.ng project uses QEMU's TCG. This is an attractive approach because you benefit from more testing and ISA extension support for some targets vs P-Code (Ghidra support is lacking for RVV, bitmanip, and crypto extensions).

Another route would be to pull out the semantic definitions from a formal spec (like Sail) or even an easy to read simulator (e.g. Spike for RISC-V). But in both cases, definitions are written to minimise repetition to some degree, while when expanding the semantics we prefer explicitness, so would want to expand to a form that differs a bit from the Sail/Spike code as written.

February 20, 2024 12:00 PM

February 19, 2024

Carlos García Campos

WebKitGTK and WPEWebKit Switching to Skia for 2D Graphics Rendering

In recent years we have had an ongoing effort to improve graphics performance of the WebKit GTK and WPE ports. As a result of this we shipped features like threaded rendering, the DMA-BUF renderer, or proper vertical retrace synchronization (VSync). While these improvements have helped keep WebKit competitive, and even perform better than other engines in some scenarios, it has been clear for a while that we were reaching the limits of what can be achieved with a CPU based 2D renderer.

There was an attempt at making Cairo support GPU rendering, which did not work particularly well due to the library being designed around stateful operation based upon the PostScript model—resulting in a convenient and familiar API, great output quality, but hard to retarget and with some particularly slow corner cases. Meanwhile, other web engines have moved more work to the GPU, including 2D rendering, where many operations are considerably faster.

We checked all the available 2D rendering libraries we could find, but none of them met all our requirements, so we decided to try writing our own library. At the beginning it worked really well, with impressive results in performance even compared to other GPU based alternatives. However, it proved challenging to find the right balance between performance and rendering quality, so we decided to try other alternatives before continuing with its development. Our next option had always been Skia. The main reason why we didn’t choose Skia from the beginning was that it didn’t provide a public library with API stability that distros can package and we can use like most of our dependencies. It still wasn’t what we wanted, but now we have more experience in WebKit maintaining third party dependencies inside the source tree like ANGLE and libwebrtc, so it was no longer a blocker either.

In December 2023 we made the decision to give Skia a try internally and see if it would be worth the effort of maintaining the project as a third party module inside WebKit. In just one month we had implemented enough features to be able to run all MotionMark tests. The results on desktop were quite impressive, doubling the global MotionMark score. We still had to do more tests on embedded devices, which are the actual target of WPE, but it was clear that, at least on desktop, we got much better results even with this very initial implementation that was not yet optimized (we kept our current architecture, which is optimized for CPU rendering). We decided that Skia was the option, so we continued working on it and doing more tests on embedded devices. On the boards that we tried we also got better results than with CPU rendering, but the difference was not as big, which means that with less powerful GPUs, and with our current architecture designed for CPU rendering, we were not that far from CPU rendering. That's the reason why we managed to keep WPE competitive on embedded devices; but Skia will not only bring performance improvements, it will also simplify the code and allow us to implement new features. So, we already had enough data to make the final decision of going with Skia.

In February 2024 we reached a point at which our internal Skia branch was in an “upstreamable” state, so there was no reason to continue working privately. We met with several teams from Google, Sony, Apple and Red Hat to discuss our intention to switch from Cairo to Skia, upstreaming what we had as soon as possible. We got really positive feedback from all of them, so we sent an email to the WebKit developers mailing list to make it public. And again we only got positive feedback, so we started to prepare the patches to import Skia into WebKit, add the CMake integration, and land the initial Skia implementation for the WPE port, all of which have already landed in main.

We will continue working on the Skia implementation in upstream WebKit, and we also have plans to change our architecture to better support the GPU rendering case in a more efficient way. We don't have a deadline; it will be ready when we have implemented everything currently supported by Cairo, since we don't plan to switch with regressions. We are focused on the WPE port for now, but at some point we will start working on GTK too, and other ports using Cairo will eventually start getting Skia support as well.

by carlos garcia campos at February 19, 2024 01:27 PM

February 16, 2024

Melissa Wen

Keep an eye out: We are preparing the 2024 Linux Display Next Hackfest!

Igalia is preparing the 2024 Linux Display Next Hackfest and we are thrilled to announce that this year’s hackfest will take place from May 14th to 16th at our HQ in A Coruña, Spain.

This unconference-style event aims to bring together the most relevant players in the Linux display community to tackle current challenges and chart the future of the display stack.

Key goals for the hackfest include:

  • Releasing the power of collaboration: We’ll work to remove bottlenecks and pave the way for smoother, more performant displays.
  • Problem-solving powerhouse: Brainstorming sessions and collaborative coding will target issues like HDR, color management, variable refresh rates, and more.
  • Building on past commitments: Let’s solidify the progress made in recent years and push the boundaries even further.

The hackfest fosters an intimate and focused environment to brainstorm, hack, and design solutions alongside fellow display experts. Participants will dive into discussions, tinker with code, and contribute to shaping the future of the Linux display stack.

More details are available on the official website.

Stay tuned! Keep an eye out for more information, mark your calendars and start prepping your hacking gear.

February 16, 2024 05:25 PM

February 14, 2024

Lucas Fryzek

A Dive into Vulkanised 2024

Vulkanised sign at Google's office

Last week I had an exciting opportunity to attend the Vulkanised 2024 conference. For those of you not familiar with the event, it is “The Premier Vulkan Developer Conference” hosted by the Vulkan working group from Khronos. With the excitement out of the way, I decided to write about some of the interesting information that came out of the conference.

A Few Presentations

My colleagues Iago, Stéphane, and Hyunjun each had the opportunity to present on some of their work in the wider Vulkan ecosystem.

Stéphane and Hyunjun presenting

Stéphane & Hyunjun presented “Implementing a Vulkan Video Encoder From Mesa to Streamer”. They jointly talked about the work they performed to implement the Vulkan video extensions in Intel’s ANV Mesa driver as well as in GStreamer. This was an interesting presentation because you got to see how the new Vulkan video extensions affected both driver developers implementing the extensions and application developers making use of the extensions for real time video decoding and encoding. Their presentation is available on vulkan.org.

Iago presenting

Later my colleague Iago presented jointly with Faith Ekstrand (a well-known Linux graphics stack contributor from Collabora) on “8 Years of Open Drivers, including the State of Vulkan in Mesa”. They both talked about the current state of Vulkan in the open source driver ecosystem, and some of the benefits open source drivers have been able to take advantage of, like the common Vulkan runtime code and a shared compiler stack. You can check out their presentation for all the details.

Besides Igalia's presentations, there were several more which I found interesting, with topics such as Vulkan developer tools, experiences of using Vulkan in real-world applications, and even how to teach Vulkan to new developers. Here are highlights from some of them.

Using Vulkan Synchronization Validation Effectively

John Zulauf gave a presentation on the Vulkan synchronization validation layers that he has been working on. If you are not familiar with these, then you should really check them out. They work by tracking how resources are used inside Vulkan and providing error messages with some hints if you use a resource in a way where it is not synchronized properly. It can't catch every error, but it's a great tool in the toolbelt of Vulkan developers to make their lives easier when it comes to debugging synchronization issues. As John said in the presentation, synchronization in Vulkan is hard, and nearly every application he tested the layers on revealed a synchronization issue, no matter how simple it was. He can proudly say he is a vkQuake contributor now because of these layers.

6 Years of Teaching Vulkan with Example for Video Extensions

This was an interesting presentation from a professor at the University of Vienna about his experience teaching graphics as well as game development to students who may have little real programming experience. He covered the techniques he uses to make learning easier as well as resources that he uses. This would be a great presentation to check out if you're trying to teach Vulkan to others.

Vulkan Synchronization Made Easy

Another presentation focused on Vulkan sync, but instead of debugging it, Grigory showed how his graphics library abstracts sync away from the user without implementing a render graph. He presented an interesting technique that is similar to how the sync validation layers work when it comes to ensuring that resources are always synchronized before use. If you're building your own engine in Vulkan, this is definitely something worth checking out.

Vulkan Video Encode API: A Deep Dive

Tony at Nvidia did a deep dive into the new Vulkan Video extensions, explaining a bit about how video codecs work, and also including a roadmap for future codec support in the video extensions. Especially interesting for us was that he made a nice call-out to Igalia and our work on Vulkan Video CTS and open source driver support on slide (6) :)

Thoughts on Vulkanised

Vulkanised is an interesting conference that gives you the intersection of people working on Vulkan drivers, game developers using Vulkan for their graphics backend, visual FX tool developers using Vulkan-based tools in their pipeline, industrial application developers using Vulkan for some embedded commercial systems, and general hobbyists who are just interested in Vulkan. As an example of some of these interesting audience members, I got to talk with a member of the Blender foundation about his work on the Vulkan backend to Blender.

Lastly, the event was held at Google's offices in Sunnyvale, which I'm always happy to travel to, not just for the better weather (coming from Canada), but also for the amazing restaurants and food in the Bay Area!

Great Bay Area food

February 14, 2024 05:00 AM

February 13, 2024

Víctor Jáquez

GstVA library in GStreamer 1.22 and some new features in 1.24

I know it's old news, but I still had a pending post about the GstVA library to clarify its purpose and scope. I didn't want to have a library for GstVA, because maintaining a library is a tough duty; I learnt that the hard way with GStreamer-VAAPI. Therefore, I wanted GstVA to be as simple as possible, direct in its libva usage, and self-contained.

As far as I know, the main usage of the GStreamer-VAAPI library was to have access to the VASurface from the GstBuffer, for example with appsink. Since the beginning of GstVA, a mechanism to access the surface ID has been provided, just as GStreamer OpenGL offers access to the texture ID: through a flag when mapping a GstGL memory backed buffer. You can see how this mechanism is used in one of the GstVA sample apps.

Nevertheless, later another use case appeared which demanded a library for GstVA: other GStreamer elements needed to produce or consume VA surface backed buffers. Or, to say it more concretely, they needed to use the GstVA allocator and buffer pool, and the mechanism to share the GstVA context along the pipeline too. The plugins with those VA related elements are msdk and qsv, when they operate on Linux. Both plugins use Intel oneVPL, so they are basically competitors. The main difference is that the first is maintained by Intel and has more features, especially on Linux, whilst the latter is maintained by Seungha Yang and appears to be better tested on Windows.

These are the objects exposed by the GstVA library API:

GstVaDisplay #

GstVaDisplay represents a VADisplay, the interface between the application and the hardware accelerator. This class is abstract and is not supposed to be instantiated directly; instantiation has to go through its derived classes.

This class is shared among all the elements in the pipeline via GstContext, so all the plugged elements share the same connection to the hardware accelerator, unless a plugged element in the pipeline has a specific name, as in the case of multi-GPU systems.

Let’s talk a bit about multi GPU systems. This is the gst-inspect-1.0 output of a system with an Intel and an AMD GPU, both with VA support:

$ gst-inspect-1.0 va
Plugin Details:
Name va
Description VA-API codecs plugin
Filename /home/igalia/vjaquez/gstreamer/build/subprojects/gst-plugins-bad/sys/va/libgstva.so
Version 1.23.1.1
License LGPL
Source module gst-plugins-bad
Documentation https://gstreamer.freedesktop.org/documentation/va/
Binary package GStreamer Bad Plug-ins git
Origin URL Unknown package origin

vaav1dec: VA-API AV1 Decoder in Intel(R) Gen Graphics
vacompositor: VA-API Video Compositor in Intel(R) Gen Graphics
vadeinterlace: VA-API Deinterlacer in Intel(R) Gen Graphics
vah264dec: VA-API H.264 Decoder in Intel(R) Gen Graphics
vah264lpenc: VA-API H.264 Low Power Encoder in Intel(R) Gen Graphics
vah265dec: VA-API H.265 Decoder in Intel(R) Gen Graphics
vah265lpenc: VA-API H.265 Low Power Encoder in Intel(R) Gen Graphics
vajpegdec: VA-API JPEG Decoder in Intel(R) Gen Graphics
vampeg2dec: VA-API Mpeg2 Decoder in Intel(R) Gen Graphics
vapostproc: VA-API Video Postprocessor in Intel(R) Gen Graphics
varenderD129av1dec: VA-API AV1 Decoder in AMD Radeon Graphics in renderD129
varenderD129av1enc: VA-API AV1 Encoder in AMD Radeon Graphics in renderD129
varenderD129compositor: VA-API Video Compositor in AMD Radeon Graphics in renderD129
varenderD129deinterlace: VA-API Deinterlacer in AMD Radeon Graphics in renderD129
varenderD129h264dec: VA-API H.264 Decoder in AMD Radeon Graphics in renderD129
varenderD129h264enc: VA-API H.264 Encoder in AMD Radeon Graphics in renderD129
varenderD129h265dec: VA-API H.265 Decoder in AMD Radeon Graphics in renderD129
varenderD129h265enc: VA-API H.265 Encoder in AMD Radeon Graphics in renderD129
varenderD129jpegdec: VA-API JPEG Decoder in AMD Radeon Graphics in renderD129
varenderD129postproc: VA-API Video Postprocessor in AMD Radeon Graphics in renderD129
varenderD129vp9dec: VA-API VP9 Decoder in AMD Radeon Graphics in renderD129
vavp9dec: VA-API VP9 Decoder in Intel(R) Gen Graphics

22 features:
+-- 22 elements

As you can observe, each card registers its supported elements. The first GPU, the Intel one, doesn't insert renderD129 after the va prefix, while the AMD Radeon one does. As you could imagine, renderD129 is the device name in /dev/dri:

$ ls /dev/dri
by-path card0 card1 renderD128 renderD129

The appended device name expresses the DRM device the elements use, and the name is only appended starting from the second GPU. This allows the untagged elements to be used either for the first GPU or with a wrapped display injected by the user application, such as a VA X11 or Wayland connection.

Notice that sharing DMABuf-based buffers between different GPUs is theoretically possible, but not assured.

Keep in mind that, currently, nvidia-vaapi-driver is not supported by GstVA, and since 1.24 the driver is ignored at plugin registration.

VA allocators #

There are two types of memory allocators in GstVA:

  • VA surface allocator. It allocates a GstMemory that wraps a VASurfaceID, which represents a complete frame directly consumable by VA-based elements. Thus, a GstBuffer will hold one and only one GstMemory of this type. These buffers can be used just like system memory buffers, since the surfaces can generally map their content to CPU memory (unless they are encrypted, but GstVA hasn't been tested for that use case).
  • VA-DMABuf allocator. It descends from GstDmaBufAllocator, though it's not an allocator in the strict sense, since it only imports DMABufs to VA surfaces and exports VA surfaces to DMABufs. Notice that a single VA surface can be exported as multiple DMABuf-backed GstMemories and, conversely, a VA surface can be imported from multiple DMABufs.

Also, the VA allocators offer a couple of methods that can be useful for applications that use appsink, for example gst_va_buffer_get_surface and gst_va_buffer_peek_display.

One particularity of both allocators is that they keep an internal pool of the allocated VA surfaces, since the mapping of buffers to memories is not always 1:1 (as in the DMABuf case); this allows surfaces to be reused even if the buffer pool ditches them, and avoids surface leaks by keeping track of them all the time.

GstVaPool #

This class is only useful for other GStreamer elements capable of consuming VA surfaces, so that they can instantiate, configure and propose a VA buffer pool; it is not meant for applications.

And that's all for now. Thank you for bearing with me up to here. But remember, the API of the library is unstable, and it can change at any time. So, no promises are made :)

February 13, 2024 12:00 AM

February 12, 2024

Brian Kardell

StyleSheet Parfait

StyleSheet Parfait

In this post I'll talk about some interesting things (some people might pronounce this 'footguns') around adoptedStyleSheets, conversations and thoughts around open styling problems and @layer.

If you're not familiar with adopted stylesheets, they're a way that your shadow roots can share literal stylesheet instances by reference. That's a cool idea, right? If you have 10 fancy-inputs on the same page, it makes no sense for each of them to have their own copy of the whole stylesheet.

It's fairly early days for this still (in standards and support terms, at least) and we'll put improvements on top of this, but for now it is a pretty basic JavaScript API: Every shadow root now has an .adoptedStyleSheets property, which is an array. You can push stylesheets onto it or just assign an array of stylesheets. Currently those stylesheets have to be instances created via the recently introduced constructor new CSSStyleSheet().
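For instance, a minimal sketch (assuming open shadow roots on some fancy-input elements, and a made-up rule as the shared style) looks something like this:

const sheet = new CSSStyleSheet();
sheet.replaceSync('h1 { color: rebeccapurple; }');

// All ten fancy-inputs reference the very same sheet instance.
for (const el of document.querySelectorAll('fancy-input')) {
  el.shadowRoot.adoptedStyleSheets = [...el.shadowRoot.adoptedStyleSheets, sheet];
}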

Cool.

Steve Orvell opened an issue suggesting that I make half-light use adopted stylesheets. Sure, why not. In practice what this really saves is mainly the parse time, since browsers are pretty good at optimizing this otherwise, but that's still important and, in fact, it made the code more concise as well.

Whoopsie

However, there is an important bit about adopted stylesheets that I forgot about when I implemented this initially (which is strange because I am on the record discussing it in CSSWG): adopted stylesheets are treated as if they come after any stylesheets in the (shadow) root.

Previously, half-light (and earlier experiments) took great care to put stylesheets from the outer page before any that the component itself provides. That seems right to me, and what half-light was doing now with adopted stylesheets seemed wrong...

Enter: Layers

A fairly easy solution to this newly created problem is to wrap the rules that are adopted in @layer. Then, if your component has a style element, the rules in there will, by default, win. And that's true even if the rules that the page author pushed in have higher specificity! That's a pretty nice improvement. If some code helps you understand, here's a pen that illustrates it all:

See the Pen adoptedstylesheets and layers by вкαя∂εℓℓ (@briankardell) on CodePen.

Layers... Like an ogre.

Eric and I recently did a podcast on the Open Styleable shadow roots topic with Mia. Some of the thoughts she shared, and later conversations that followed with Westbrook and Nolan, convinced me to try to explore how we could use layers in half-light.

It had me thinking that the way that adopted stylesheets and layers both work seems to allow that we could develop some kind of 'shadow styling protocol' here. Maybe the simplest way to do this is to just give the layer a well-known name: I called it --crossroot

Now, when a page author uses half-light to set a style like:

@media --crossroot { 
  h1 { ... }
}

This is adopted into shadow roots as:

@layer --crossroot { 
  h1 { ... }
}

That is a lower layer than anything in the default layer of a component's shadow root (both stylesheets or adopted stylesheets).
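A rough sketch of that adoption step (not half-light's actual code; shadowRoot and the rule text the page pushed in are assumed to be in hand) might look like:

// Wrap the page author's rules in the well-known layer before adopting them,
// so anything the component declares in its own styles wins by default.
const pageRules = 'h1 { color: rebeccapurple; }'; // whatever the page pushed in
const sheet = new CSSStyleSheet();
sheet.replaceSync(`@layer --crossroot { ${pageRules} }`);
shadowRoot.adoptedStyleSheets = [...shadowRoot.adoptedStyleSheets, sheet];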

The maybe interesting part this adds is that the component itself can consciously manage its layers if it chooses to do so! For example...

this.shadowRoot.innerHTML = `
  <style>
    @layer base, --crossroot, main;
    ... 
    /* add rules to those layers  */
    ... 
  </style>
  ${some_html}
`

If that sounds a little confusing, it's really not too bad - what it means is that, going from least to most specific, rules would evaluate roughly like:

  1. User Agent styles.
  2. Page authored rules that inherit into the Shadow DOM.
  3. @layers in the shadow (including the --crossroot layer half-light provides)
  4. Rules in style elements in the shadow that aren't in a layer.

Combining ideas...

There was also some unrelated feedback from Nolan that this approach was a non-starter for some use cases - like pushing Bootstrap down to all of the shadow roots. The previous shadow-boxing library would have supported that better. Luckily, though, we've built this on Media Queries, and CSS has that pretty well figured out - all we have to do is add support to half-light to make Media Queries work in link and style tags (in the head) as well. Easy enough, and I agree that's a good improvement. So, now you can also write markup like this:

<link rel="stylesheet" href="../prism.css" media="screen, --crossroot"></link>

<!-- or to target shadows of specific elements, add a selector... -->
<link rel="stylesheet" href="../prism.css" media="screen, (--crossroot x-foo)"></link>

So... That's it, this is all in half-light now... And guess what? It's still only 95 lines of code. Thanks for all the feedback so far! So, wdyt? Don't forget to leave me an indication of how you're feeling about it with an emoji (and/or comment) on the Emoji sentiment or short comment issue.

Very special thanks to everyone who has commented and shared thoughts constructively along the way, even when they might not agree. If you actually voted in the emoji sentiment poll: ❤. Thanks especially to Mia who has been great to discuss/review and improve ideas with.

February 12, 2024 05:00 AM

February 05, 2024

Eric Meyer

Bookmarklet: Load All GitHub Comments

What happened was, Brian and I were chatting about W3C GitHub issues and Brian mentioned how really long issues are annoying to search and read, because GitHub has this thing where if there are too many comments on an issue, it snips out the middle with a “Load more…” button that’s very tastefully designed and pretty easy to miss if you’re quick-scrolling to try to catch up.  The squiggle-line would be a good marker, if it weren’t so tasteful as to blend into the background in a way that makes the Baby WCAG cry.

And what’s worse, from this perspective, is that if the issue has been discussed to a very particular kind of death, the “Load more…” button can have more “Load more…” buttons hiding within.  So even if you know there was an interesting comment, and you remember a word or two of it, page-searching in your browser will do no good if the comment in question is buried one or more XMLHTTPRequest calls deep.

“I really wish GitHub had an ‘expand all comments’ button at the top or something,” Brian said (or words to that effect).

Well, it was a Friday afternoon and I was feeling code-hacky, so I wrote a bookmarklet.  Here it is in easy-to-save hyperlink form:

GitHub issue loader

It waits half a second after you activate it to find all the buttons on the page (in my test runs, usually six hundred of them).  Then it looks through all the buttons to find the ones that have a textContent of “Load more…” and dispatches a click event to each one.  With that done, it waits five seconds and does it all again, waits five seconds to do it again, and so on.  Once it finds there are zero buttons with the “Load more…” textContent, it exits.  And, if five seconds is too quick due to slow loading times, you can always invoke the bookmarklet again should you come across a “Load more…” button.
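For readers who'd rather see the logic spelled out than squint at a javascript: URL, here's an uncompressed sketch of what the bookmarklet does (a reconstruction from the description above, not necessarily the exact code behind the link):

function start() {
  const pending = [...document.querySelectorAll('button')]
    .filter((b) => b.textContent.trim() === 'Load more…');
  for (const b of pending) {
    b.dispatchEvent(new MouseEvent('click', { bubbles: true }));
  }
  // Keep trying every five seconds until no "Load more…" buttons remain.
  if (pending.length > 0) setTimeout(start, 5000);
}
setTimeout(start, 500);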

If you want this ability for yourself, just drag the link above into your bookmark toolbar or bookmarks menu, and whenever you load up a mega-thread GitHub issue, fire the bookmarklet to load all the comments.  I imagine there may be cleaner ways to do this, but I was able to codeslam this in about 15 minutes using ViolentMonkey on live GitHub pages, and it does the thing.

I did consider complexifying the ViolentMonkey script so that any GitHub page is scanned for the “Load more…” button, and if one is present, then a “Load all comments” button is plopped into the top of the page, but I knew that would take at least another 15 minutes and my codeslam window was closing.  Also, it would require anyone using it to run ViolentMonkey (or equivalent) all the time, whereas the bookmarklet has zero impact unless the user invokes it.  If you want to extend this into something more than it is and share your solution with the world, by all means feel free.

The point of all this being, if you too wish GitHub had an easy way to load all the comments without you having to search for the “Load more…” button yourself, now there’s a bookmarklet made just for you.  Enjoy!


Have something to say to all that? You can add a comment to the post, or email Eric directly.

by Eric Meyer at February 05, 2024 02:49 PM

January 29, 2024

Stephen Chenney

The CSS Highlight Inheritance Model

NOTE: The information here regarding CSS Custom Properties and Highlight Pseudos is out of date. See Custom Properties for Highlight Pseudos

The CSS highlight inheritance model describes the process for inheriting the CSS properties of the various highlight pseudo elements:

  • ::selection controlling the appearance of selected content
  • ::spelling-error controlling the appearance of misspelled word markers
  • ::grammar-error controlling how grammar errors are marked
  • ::target-text controlling the appearance of the string matching a target-text URL
  • ::highlight defining the appearance of a named highlight, accessed via the CSS highlight API

The inheritance model described here was proposed and agreed upon in 2022, and is part of the CSS Pseudo-Elements Module Level 4 Working Draft Specification, but ::selection was implemented and widely used long before this, with browsers differing in how they implemented inheritance for selection styles. When this model is implemented and released to users, the inheritance of CSS properties for the existing pseudo elements, particularly ::selection will change in a way that may break sites.

The model is implemented behind a flag in Chrome 118 and higher, and is expected to be enabled by default as early as Chrome 123 in February 2024. To see the effects right now, enable “Experimental Web Platform features” via chrome://flags.

::selection, the Original highlight Pseudo #

Historically, ::selection, or even -moz-selection, was the only pseudo element available to control the highlighting of text, in this case the appearance of selected content. Sites use ::selection when they want to modify the default browser selection painting, most often adjusting the color to improve contrast or convey additional meaning to users. Sites also use the feature to work around selection painting problems in browsers, with the style .broken::selection { background-color: transparent; } and a .broken class on elements with broken selection painting. We’ll use this last situation as the example as we demonstrate the changes to property inheritance.

The ::selection pseudo element as implemented in most browsers uses originating element inheritance, meaning that the selection inherits properties from the style of the nearest element to which the selection is being applied. The inheritance behavior of ::selection has never been fully compatible among browsers, and has failed to comply with the CSS specification even as the spec has been updated over time. As a result, workarounds are required when a page-wide selection style is inadequate, as demonstrated with this example (works when highlight inheritance is not enabled):

<style>
::selection /* = *::selection (universal) */ {
background-color: lightgreen;
}
.broken::selection {
background-color: transparent;
}
</style>
<p>Some <em>not broken</em> text</p>
<p class="broken">Some <em>broken</em> text</p>
<script>
range = new Range();
range.setStart(document.body, 0);
range.setEnd(document.body, 3)
document.getSelection().addRange(range);
</script>

The intent of the page author is probably to have everything inside .broken be invisible for selection, but that does not happen. The problem here is that ::selection applies to all elements, including the <em> element. While the .broken element uses its own ::selection, that only applies to the element itself when selected, not its descendants.

You could always add more specific ::selection selectors, such as .broken >em::selection, but that is brittle to DOM changes and leads to style bloat.

You could also use a CSS custom property for the page-wide selection color, and just make it transparent for the broken elements, like this (works when highlight inheritance is not enabled):

<style>
:root {
--selection-color: lightgrey;
}
::selection {
background-color: var(--selection-color);
}
.broken {
--selection-color: transparent;
}
</style>
<p>Some <em>not broken</em> text</p>
<p class="broken">Some <em>broken</em> text</p>
<script>
range = new Range();
range.setStart(document.body, 0);
range.setEnd(document.body, 3)
document.getSelection().addRange(range);
</script>

In this case, we define a page-wide selection color as a custom property referenced in a ::selection rule that matches all elements. The more specific .broken rule re-defines the custom property to make the selection transparent. With originating inheritance, the ::selection for the word “broken” inherits the custom property value from the <em> element (its originating element), which in turn inherits the custom property value from the .broken element. This custom property workaround is how sites such as GitHub have historically worked around the problems with originating inheritance for ::selection.

New highlighting Features Change the Trade-Offs #

The question facing the CSS Working Group, the people who set the standards for CSS, was whether to make the specification match the implementations using originating inheritance, or require browsers to change to use the specified highlight inheritance model. The question sat on the back-burner for a while because web developers were not complaining very loudly (there was a workaround once custom properties were implemented) and there was uncertainty around whether the change would break sites. Some new features in CSS changed the calculus.

First came a set of additional CSS highlight pseudo elements that pushed for a resolution to the question of how they should inherit. Given that the new pseudos were to be implemented according to the spec, while ::selection was not, there was the possibility for ::selection inheritance behavior to be inconsistent with that of the new highlight pseudos:

  • ::target-text for controlling how URL text-fragments targets are rendered
  • ::spelling-error for modifying the appearance of browser detected spelling errors
  • ::grammar-error for modifying browser detected grammar errors
  • ::highlight for defining persistent highlights on a page

In addition, the set of properties allowed in ::selection and other highlight pseudos was expanded to allow for text decorations and control of fill and stroke colors on text, and maybe text-shadow. More properties make the CSS custom property workaround for originating inheritance unwieldy in practice because they expand the set of variables needed.

The decision was made to change browsers to match the spec, thus changing the way that ::selection inheritance works in browsers. ::selection was the only widely used highlight pseudo and authors expected it to use originating inheritance, so any change to the inheritance behavior was likely to cause problems for some sites.

What is highlight inheritance? #

The CSS highlight inheritance model starts with defining a highlight inheritance tree for each type of highlight pseudo. These trees are parallel to the DOM element tree with the same branching structure, except the nodes in the tree now represent the highlight styles for each element. The highlight pseudos inherit their styles through the ancestor chain in their highlight inheritance tree, with body::selection inheriting from html::selection and so on.

Under this model, the original example behaves the same because the ::selection rule still matches all elements:

<style>
::selection {
background-color: lightgreen;
}
.broken::selection {
background-color: transparent;
}
</style>
<p>Some <em>not broken</em> text</p>
<p class="broken">Some <em>broken</em> text</p>

However, with highlight inheritance it is recommended that page-wide selection styles be set in :root::selection or body::selection.

<style>
:root::selection {
background-color: lightgreen;
}
.broken::selection {
background-color: transparent;
}
</style>
<p>Some <em>not broken</em> text</p>
<p class="broken">Some <em>broken</em> text</p>
<script>
range = new Range();
range.setStart(document.body, 0);
range.setEnd(document.body, 3)
document.getSelection().addRange(range);
</script>

The :root::selection is at the root of the highlight inheritance tree, because it is the selection pseudo for the document root element. All selection pseudos in the document now inherit from their parent in the highlight tree, and do not need to match a universal ::selection rule. Furthermore, a change in ::selection matching a particular element will also be inherited by all its descendants.

In this example, there are no rules matching the <em> elements, so they look at their parent selection pseudo. The “not broken” text will be highlighted in lightgreen, inherited from the root via its parent chain. The “broken” text receives a transparent background inherited from its parent <p>, which matches its own rule.

The Custom Property Workaround Now Fails #

The most significant breakage caused by the change to highlight inheritance is due to sites employing the custom property workaround for originating inheritance. Consider our former example with that workaround:

<style>
:root {
--selection-color: lightgrey;
}
:root::selection {
background-color: var(--selection-color);
}
.broken {
--selection-color: transparent;
}
</style>
<p>Some <em>not broken</em> text</p>
<p class="broken">Some <em>broken</em> text</p>

According to the original specification, and the initial implementation in chromium, the default selection color is used everywhere in this example because the custom --selection-color: lightgrey; property is defined on :root which is the root of the DOM tree but not the highlight tree. The latter is rooted at :root::selection which does not define the property.

Many sites use :root as the location for CSS custom properties. Upvoted answers on Stack Overflow explicitly recommend doing so for selection colors. Unsurprisingly, many sites broke when chromium initially enabled highlight inheritance, including GitHub. To fix this, the specification was changed to require that :root::selection inherit custom properties from :root.

But all is not well. The <em> element’s ::selection has its background set to the custom property value, which is now inherited from :root via :root::selection. But because the inheritance chain does not involve .broken at all, the custom property is evaluated against the :root value. To make this work, the custom property must be overridden in a .broken::selection rule:

  :root {
--selection-color: lightgrey;
}
:root::selection {
background-color: var(--selection-color);
}
.broken::selection {
--selection-color: transparent;
background-color: var(--selection-color);
}

Note that there's really no point using custom property overrides for highlights, unless you desire different ::selection styles in different parts of the document and want them to use a common set of property values that you would define on the root.

The Current Situation (at the time of writing) #

At the time of writing, highlight inheritance is used for all highlight pseudos except ::selection and ::target-text in chromium-based browsers (other browsers are still implementing the additional pseudo elements). Highlight inheritance is also enabled for ::selection and ::target-text for users that have “Experimental Web Platform features” enabled via chrome://flags in Chrome.

Chrome is trying to enable highlight inheritance for ::selection. One attempt to enable for all users was rolled back, and another attempt will be made in an upcoming release. The current plan is to iterate until the change sticks. That is, until site authors notice that things have changed and make suitable site changes.

If inertia is too strong, additional changes to the spec may be needed to support, in particular, the continued use of the existing custom property workaround.

Thanks #

The implementation of highlight inheritance in chromium was undertaken by Igalia S.L. funded by Bloomberg L.P. Delan Azabani did most of the work, with contributions from myself.

January 29, 2024 12:00 AM

January 26, 2024

Stephen Chenney

CSS Spelling and Grammar Styling

CSS Spelling and Grammar features enable sites to customize the appearance of the markers that browsers render when a spelling or grammar error is detected, and to control the display of spelling and grammar markers that the site itself defines.

  • The text decoration spelling-error and grammar-error line types render native spelling or grammar markers controlled by CSS. The primary use case is for site-specific spell-check or grammar systems, allowing sites to render native looking error markers for the errors the site itself detects, rather than errors the browser detects. They are part of the CSS Text Decoration Module Level 4 Working Draft Specification.
  • The ::spelling-error and ::grammar-error pseudo elements allow sites to apply custom styling to the browser rendered spelling and grammar error markers. The target use case enables sites to override the native markers should custom markers offer a better user experience on the site. They are part of the CSS Pseudo-Elements Module Level 4 Working Draft Specification.

These features are available for all users in Chrome 121 and upward, and corresponding Edge versions. You can also enable “Experimental Web Platform features” with chrome://flags in earlier versions. The features should be coming soon to Safari and Firefox.

CSS Text Decorations for Spelling and Grammar Errors #

Imagine a site that uses a specialized dictionary for detecting spelling errors in user content. Maybe users are composing highly domain specific content, like botany plant names, or working in a language not well represented by existing spelling checkers. To provide an experience familiar to users, the site might wish to use the native spelling and grammar error markers with errors detected by the site itself. Before the error line types for CSS Text Decoration were available, only the browser could decide what received native markers.

Here’s an example of how you might control the rendering of spelling markers.

<style>
.spelling-error {
text-decoration-line: spelling-error;
}
</style>
<div id="content">Some content that has a misspeled word in it</div>
<script>
const range = document.createRange();
range.setStart(content.firstChild, 24);
range.setEnd(content.firstChild, 33);
const newSpan = document.createElement("span");
newSpan.classList.add("spelling-error");
range.surroundContents(newSpan);
</script>

How it works:

  • Create a style for the spelling errors. In this case a spelling error will have a text decoration added with the line type for native spelling errors.
  • As the user edits contents, JS can track the text, process it for spelling errors, and then trigger a method to mark the errors.
  • The script above is one way to apply the style to the error, assuming you have the start and end offsets of the error within its Text node. A range is created with the given offsets, and then the contents of the range are pulled out of the Text and put within a span, which is then re-inserted where the range once was. The span gets the spelling-error style.

The result will vary across browser and platform, depending on the native conventions for marking errors.

Using other text-decoration properties or, rather, not #

All other text decoration properties are ignored when text-decoration-line: spelling-error or grammar-error are used. The native markers rendered for these decorations are not, in general, styleable, so there would be no clear way to apply additional properties (such as text-decoration-color or text-decoration-thickness).

Spelling and Grammar Highlight Pseudos #

The ::spelling-error and ::grammar-error CSS pseudo-elements allow sites to customize the markers rendered by the browser for browser-detected errors. The native markers are not rendered, being replaced by whatever styling is defined by the pseudo-element applied to the text content for the error.

For the first example, let’s define a red background tint to replace the native spelling markers:

<style>
::spelling-error {
background-color: rgba(255,0,0,0.5);
text-decoration-line: none;
}
</style>
<textarea id="content">Some content that has a misspeled word in it</textarea>
<script>
content.focus();
</script>

Two things to note:

  • The example defines the root spelling error style, applying to all elements. In general, sites should use a single style for spelling errors to avoid user confusion.
  • The text-decoration-line property is set to none to suppress the native spelling error marker. Without this, the style would inherit the text-decoration-line property from the user agent, which defines text-decoration-line: spelling-error.

A limited range of properties may be used in ::spelling-error and ::grammar-error; see the spec for details. The limits address two potential concerns:

  • The spelling markers must not affect layout of text in any way, otherwise the page contents would shift when a spelling error was detected.
  • The spelling markers must not be too expensive to render; see Privacy Concerns below.

The supported properties include text decorations. Here’s another example:

<style>
::spelling-error {
text-decoration: 2px wavy blue underline;
}
textarea {
width: 200px;
height: 200px;
}
</style>
<textarea id="content">Some content that has a misspeled word in it</textarea>
<script>
content.focus();
</script>

One consequence of user agent styling is that you must provide a line type for any custom text decoration. Without one, the highlight will inherit the native spelling error line type. As discussed above, all other properties are ignored with that line type. For example, this renders the native spelling error, not the defined one:

<style>
::spelling-error {
text-decoration-style: solid;
text-decoration-color: green;
}
</style>
<textarea id="content">Some content that has a misspeled word in it</textarea>
<script>
content.focus();
</script>

Accessibility Concerns #

Modifying the appearance of user agent features, such as spelling and grammar errors, has significant potential to hurt users through poor contrast, small details, or other accessibility problems. Always maintain good contrast with all the backgrounds present on the page. Ensure that error markers are unambiguous. Include spelling and grammar errors in your accessibility testing.

CSS Spelling and Grammar styling is your friend when the native error markers pose accessibility problems, such as poor contrast. It’s arguably the strongest motivation for supplying the features in the first place.

Privacy Concerns #

User dictionaries contain names and other personal information, so steps have been taken to ensure that sites cannot observe the presence or absence of spelling or grammar errors. Computed style queries in Javascript will report the presence of custom spelling and grammar styles regardless of whether or not they are currently rendered for an element.

There is some potential for timing attacks that render potential spelling errors and determine paint timing differences. This is mitigated in two ways: browsers do not report accurate timing data, and the time to render error markers is negligible in the overall workload of rendering the page. The set of allowed properties for spelling and grammar pseudos is also limited to avoid the possibility of very expensive rendering operations.

Thanks #

The implementation of CSS Spelling and Grammar Features was undertaken by Igalia S.L. funded by Bloomberg L.P.

January 26, 2024 12:00 AM

January 25, 2024

Brian Kardell

half-light

half-light

Evolving ideas on "open stylable" issues, a new proposal.

Recently I wrote a piece called Lovely Trees in which I described how the Shadow DOM puts a lot of people off because the use cases it's been designed around thus far aren't the ones most authors feel like they have. That's a real shame because there is a lot of usefulness there that is just lost. People seem to want something just a little less shadowy.

However, there are several slightly different visions for what it is we want, and how to achieve it. Each offers significantly different details and implications.

In that piece I also noted that we can make any of these possible through script in order to explore the space, gain practical experience with them, and iterate until we at least have a pretty good idea of what will work really well.

To this end I offered a library that I called "shadow-boxing" with several possible "modes" authors could explore.

"Feels like the wrong place"

Many people seemed to think this would be way better expressed in CSS itself somehow, rather than metadata in your HTML.

There is a long history here. Originally there were combinators for crossing the shadow boundary, but these were problematic and removed.

However, the fact that this kept coming up in different ways made me continue to discuss and bounce possible ideas around. I made a few rough takes, shared them with some people, and thanks to Mia and Dave Rupert for some good comments and discussion, today I'm adding a separate library which I think will make people much happier.

Take 2: half-light.js

half-light.js is a very tiny (~100 LoC) library that lets you express styles for shadow DOMs in your page in your CSS (it should be CSS inline or linked in the head). You can specify whether those rules apply to both your page and shadow roots, or just shadow roots, or just certain shadow roots, etc. All of these make use of CSS @media rules containing a custom --crossroot, which can be functional. The easiest way to understand it is with code, so let's have a look...

Rules for shadows, not the page...

This applies to <h1>'s in all shadow roots, but not in the light DOM of the page itself.

@media --crossroot { 
  h1 { ... }
}

Authors can also provide a selector filter to specify which elements should have their shadow roots affected. This can be, for example, a tag list. In the example below, it will style the <h2>'s in the shadows of <x-foo> or <x-bar> elements, and not those in the light DOM of your page itself...

@media --crossroot(x-foo, x-bar) { 
  h2 { ... }
}

Most selectors should work there, so you could also exclude if you prefer. The example below will style the <h3>'s in the shadows of all elements except those of <x-bat> elements ...

@media --crossroot(:not(x-bat)) {
  h3 { ... } 
}

Rules for shadows, and the page...

It's really just a trick of @media that we're tapping into: Begin any of the examples above with screen, and put the whole --crossroot in parentheses. The example below styles all the <h1>'s in both your light DOM and all shadows...

@media screen, (--crossroot) { 
  h1 { ... }
}

Or, to use the exclusion route from above, but to apply to all <h3>'s in the page, or of shadows of all elements except those of <x-bat> elements ...

@media screen, (--crossroot(:not(x-bat))) {
  h3 { ... } 
}

Play with it... There's a pen below. Once you've formed an impression, share it with me, even with a simple Emoji sentiment or short comment here.

See the Pen halflight by вкαя∂εℓℓ (@briankardell) on CodePen.

Is this a proposal?

No, not yet, it's just a library that should allow us to experiment with something "close enough" to what "features" a real proposal might need to support, and very vaguely what it might look like.

A real proposal, if it came from this, would certainly not use this syntax, which is simply trying to strike a balance between being totally valid and easy to process, and "close enough" for us to get an idea if it's got the right moving parts.

January 25, 2024 05:00 AM

January 22, 2024

Samuel Iglesias

XDC 2023: Behind the curtains

Time flies! Back in October, Igalia organized X.Org Developers Conference 2023 in A Coruña, Spain.

In case you don’t know it, X.Org Developers Conference, despite the X.Org in the name, is a conference for all developers working in the open-source graphics stack: anything related to DRM/KMS, Mesa, X11 and Wayland compositors, etc.

A Coruña's Orzán beach

This year, I participated in the organization of XDC in A Coruña, Spain (again!) by taking care of different aspects: from logistics in the venue (Palexco) to running it in person. It was a very tiring but fulfilling experience.

Sponsors

First of all, I would like to thank all the sponsors for their support, as without them, this conference wouldn’t happen:

XDC 2023 sponsors

They didn’t only give economic support to the conference: Igalia sponsored the welcome event and lunches; X.Org Foundation sponsored coffee breaks; Tourism Office of A Coruña sponsored the guided tour in the city center; and Raspberry Pi sent Raspberry Pi 5 boards to all speakers!

XDC 2023 Stats

XDC 2023 was a success on attendance and talks submissions. Here you have some stats:

  • 📈 160 registered attendees.
  • 👬 120 attendees picked their badge in person.
  • 💻 25 attendees registered as virtual.
  • 📺 More than 6,000 views on live stream.
  • 📝 55 talks/workshops/demos distributed across three days of conference.
  • 🧗‍♀️ There were 3 social events: a welcome event, a guided tour of the city center, and one unofficial climbing activity!

XDC 2023 welcome event

Was XDC 2023 perfect organization-wise? Of course… no! Like in any event, we had some issues here and there: one with the Wi-Fi network that was quickly detected and fixed; some issues with the meals and coffee breaks (food allergies mainly); a few seconds of lost audio from one talk in the live stream; and other minor things. Not bad for a community-run event!

Nevertheless, I would like to thank all the staff at Palexco for their quick response and their understanding.

Talk recordings & slides

XDC 2023 talk by André Almeida

Want to see again some talks? All conference recordings were uploaded to X.Org Foundation Youtube channel.

Slides are available to download in each talk description.

Enjoy!

XDC 2024

XDC 2024 will be in North America

We cannot tell yet where XDC 2024 is going to happen, other than it will be in North America… but I can tell you that it will be announced soon. Stay tuned!

Want to organize XDC 2025 or XDC 2026?

If we continue with the current cadence, 2025 would again be in Europe, and the 2026 event would be in North America.

There is a list of requirements here. Nevertheless, feel free to contact me, or the X.Org Board of Directors, to get first-hand experience and knowledge about what organizing XDC entails.

XDC 2023 audience

Thanks

Thanks to all volunteers, collaborators, Palexco staff, GPUL, X.Org Foundation and many other people for their hard work. Special thanks to my Igalia colleague Chema, who did an outstanding job organizing the event together with me.

Thanks to the sponsors for their extraordinary support of this conference.

Thanks to Igalia not only for sponsoring the event, but also for all the support I got during the past year. I am glad to be part of this company, and I am always surprised by how great my colleagues are.

And last, but not least, thanks to all speakers and attendees. Without you, the conference wouldn't exist.

See you at XDC 2024!

January 22, 2024 09:06 AM