GNOME Shell and Mutter: better, faster, cleaner

The very first update in the series is about GNOME Shell and Mutter. I’ve been increasingly involved with the development of those two core components of GNOME, and recently this has been the focus of my development time.

Fortunately, Endless allows me to use part of my work time to improve them. Naturally, I prioritize my upstream work considering what will impact Endless OS the most. So far, that has led to a series of very nice improvements to Mutter and GNOME Shell.

GNOME Shell

Most of my work time dedicated to GNOME Shell was oriented to performance and cleanup. At Endless, we have a modified GNOME Shell that constantly needs to be rebased. Since I’m taking care of these rebases now, it makes sense for me to also make myself familiar with the vanilla GNOME Shell codebase.

ShellGenericContainer

I’ll start with the work that makes me the proudest: removing the Shell.GenericContainer class.

First, a bit of history.

There was a time when GJS, the JavaScript engine that GNOME Shell is based on, did not support subclassing GObjects and overriding virtual functions. We could only instantiate GObject-based classes, and subclass them, all thanks to GObject-Introspection, but not override their virtual functions. This made, for example, implementing ClutterContent in JavaScript impossible.

For that reason, GNOME Shell developers created ShellGenericContainer: an actor that sends signals for various virtual functions. Because GJS supports signals, that worked well.

There are a few problems with that approach though:

  • Signals are slow, and should not be used on hot paths like layout or rendering;
  • Going in and out of JavaScript territory is expensive;
  • It makes the JavaScript code slightly more complicated.
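
To make the first point a bit more concrete, here is a toy model in plain C of the two dispatch styles. It is only a sketch: `ToyActor` and every function below are made-up names, not GObject, Clutter, or GNOME Shell API, and real GObject signal emission does considerably more work than this loop (and, in GNOME Shell, each handler is a JavaScript function on top of that).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 8

typedef struct ToyActor ToyActor;
typedef void (*AllocateFunc) (ToyActor *actor, int width, int height);

/* Hypothetical actor, for illustration only. */
struct ToyActor {
  /* Virtual function slot: a subclass overrides it by storing its own function. */
  AllocateFunc allocate;

  /* Signal-style handlers, connected at run time. */
  AllocateFunc handlers[MAX_HANDLERS];
  size_t       n_handlers;

  int width;
  int height;
};

/* Example implementation that simply records the allocated size. */
static void
toy_actor_store_size (ToyActor *actor, int width, int height)
{
  actor->width = width;
  actor->height = height;
}

/* Virtual function path: a single indirect call. */
static void
toy_actor_allocate (ToyActor *actor, int width, int height)
{
  assert (actor->allocate != NULL);
  actor->allocate (actor, width, height);
}

/* Connects a handler; returns 0 on success, -1 when the list is full. */
static int
toy_actor_connect (ToyActor *actor, AllocateFunc handler)
{
  if (actor->n_handlers == MAX_HANDLERS)
    return -1;

  actor->handlers[actor->n_handlers++] = handler;
  return 0;
}

/* Signal path: every emission walks the handler list.
 * Returns the number of handlers that ran. */
static size_t
toy_actor_emit_allocate (ToyActor *actor, int width, int height)
{
  size_t i;

  for (i = 0; i < actor->n_handlers; i++)
    actor->handlers[i] (actor, width, height);

  return i;
}
```

Even in this simplified form, the signal path needs bookkeeping and a loop on every emission, while the virtual function path is one call through a pointer.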

Thanks to the fantastic work by Jasper St. Pierre, GJS now supports overriding virtual functions. And that made Shell.GenericContainer obsolete. So I spent quite some time untangling it from GNOME Shell, and results were positive:

https://gitlab.gnome.org/GNOME/gnome-shell/uploads/b61233544ada29773b87de08e48beeb8/ShellGenericContainer.png
In general, running GNOME Shell without Shell.GenericContainer (blue line) led to more stable framerates compared to the current state (red line).

This is now merged and will be available with GNOME Shell 3.32, to be released in March 2019.

Improvements to the texture cache

After various investigations, another potential improvement that showed up was in StTextureCache. Textures (icons, image files, etc.) are cached in GNOME Shell by StTextureCache, which worked by keeping a ClutterTexture object alive.

That turned out to be a problem.

ClutterTexture is deprecated. Clutter has a new interface for drawing the contents of an actor: ClutterContent. It does not necessarily make the code faster, but it allows different actors to share a single ClutterContent without having to override ClutterActor.paint(). In other words, it is a nice and sane abstraction layer to control what an actor is drawing.

So I went ahead and wiped out ClutterTexture from StTextureCache. Then wiped it out entirely from GNOME Shell.

Unexpectedly, it made a small but noticeable difference! Icons are now slightly faster to load, but the most visible impact was in the startup animation.

Mutter

I did not know how fun and exciting compositors could be. It definitely is a new passion of mine, working on Mutter! So much has happened that it’ll be hard to summarize.

Goodbye, Autotools

During last year’s GUADEC, Jonas Ådahl worked on a Meson port of Mutter. After a series of reviews, and a few follow-up fixes, it reached almost complete feature parity with Autotools – the only exception being installed tests.

So I went ahead and added installed tests to the Meson build too.

And also removed Autotools.

Naturally, builds are much faster now, saving us a few minutes per day.

Wayland vs X11

Another area that was interesting to work on was untangling X11-specific code from Wayland, and vice-versa. There are a handful of developers working on that already, and I had my fair share in better splitting X11 and Wayland code paths in Mutter.

Specifically, I worked on splitting X11-specific code from MetaWindowActor into subclasses. Mutter already handles different surfaces correctly; on X11 sessions, all surfaces are MetaSurfaceActorX11, and under Wayland, MetaSurfaceActorWayland.

MetaWindowActor now has the same split: Wayland windows have an associated MetaWindowActorWayland, while X11 windows have a MetaWindowActorX11.

Interestingly, Xwayland windows are X11 windows with a Wayland surface. You can check that using GNOME Shell’s Looking Glass:

Example of an Xwayland window: it has a MetaSurfaceActorWayland surface and an associated MetaWindowActorX11 actor.

There’s a lot more happening on this front, but I’ll spare the words for now. You’ll hear more about it in the future (and not necessarily from me).

CPU-side picking

More recently, I’ve been experimenting with the Cogl journal and ironing out a few bugs that are preventing a completely CPU-side picking implementation.

Picking is the process of figuring out which element is beneath the cursor. There are two big approaches: geometry-based and color-based. In games, the latter is the usual approach: each object in the scene is drawn with a plain color, and the final image is read to find out the color beneath a point. Geometry-based picking is what browsers usually do, and it’s basically math around rectangles.
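
As a rough illustration of the "math around rectangles" idea, here is a minimal sketch of geometry-based picking in C. It assumes axis-aligned rectangles listed in paint order (back to front); `PickRect`, `pick_rect_contains()` and `pick_topmost()` are hypothetical names, not Clutter or Mutter API, and real actors can also be transformed and clipped, which this ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical axis-aligned rectangle; not a Clutter or Mutter type. */
typedef struct {
  int x;
  int y;
  int width;
  int height;
} PickRect;

/* True when (px, py) lies inside the rectangle.
 * The right and bottom edges are excluded. */
static bool
pick_rect_contains (const PickRect *rect, int px, int py)
{
  return px >= rect->x && px < rect->x + rect->width &&
         py >= rect->y && py < rect->y + rect->height;
}

/* Rectangles are assumed to be in paint order (back to front), so the
 * topmost hit is the last rectangle containing the point.
 * Returns its index, or -1 when nothing is under the point. */
static int
pick_topmost (const PickRect *rects, size_t n_rects, int px, int py)
{
  assert (rects != NULL || n_rects == 0);

  for (size_t i = n_rects; i > 0; i--)
    {
      if (pick_rect_contains (&rects[i - 1], px, py))
        return (int) (i - 1);
    }

  return -1;
}
```

No pixels are drawn or read back here: the answer comes from a handful of comparisons on the CPU, which is why this style of picking is attractive when it is applicable.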

Clutter uses color-based picking, but has a nice feature around that: a journal that tracks drawing operations and, under some conditions, hits an optimized path and does geometry-based picking. This is interesting for Mutter and GNOME Shell because it avoids sending draw operations to the GPU unnecessarily when picking, reducing resource usage.

Unfortunately, due to various bugs and implementation details, we do not hit this optimization, causing GPU commands to be issued when they could be avoided.

Figuring out these bugs is what I’ve been experimenting with lately.

There’s much more that happened, so I will probably do a part 2 of this article soon. But those are big points already, and the post is becoming lengthy.

Many of these experiments and investigations have already landed, and will be available with GNOME 3.32. This is all valuable work that is partially sponsored by my employer, Endless, and I’m happy to keep working on it!


My Perspective on This Year’s GUADEC

Greetings GNOMEies

This year, I had the pleasure of attending GUADEC in Almería, Spain. Lots of things happened, and I believe some of them are important enough to be shared with the greater community.

GUADEC

This year’s GUADEC happened in Almería, Spain. It turns out Almería is a lovely city! Small and safe, locals were friendly and I managed to find pretty good vegan food with my broken Spanish.

I was particularly happy whenever locals noticed my struggle with the language, and helped and taught me some handy words. This alone was worth the entire trip!

Getting there was slightly complicated: there were no direct flights, nor single-connection routes. I ended up taking a route with four connections, and it was somewhat exhausting. Apparently other people also had troublesome journeys.

The main accommodation and the main venue could have been closer, but commuting was not a problem whatsoever because the GUADEC team scheduled a morning bus. A well-handled situation, I must say. It turns out that commuting with other GNOME folks sparked interesting discussions, and we had some interesting ideas there. The downside is that, if anyone wanted the GNOME Project to die, we were basically in a single bus 😛

Talks

There were quite a few interesting talks this year. My personal highlights:

BoFs

To me, the BoFs were the best part of this year’s GUADEC. The number of things that happened and the hard talks we had were all extremely valuable. I think I made a good selection of BoFs to attend: decisions were made, discussions were held, and overall it was productive.

I was particularly involved in five major areas: GNOME Shell & Mutter, GJS, GTK, GNOME Settings, and GNOME To Do.

GNOME Shell & Mutter

A big cleanup was merged during GUADEC. This probably will mean small adaptations in extensions, but I don’t particularly think it’s groundbreaking.

On the second BoF day, Jonas Ådahl and I dived into the Remote Desktop on Wayland work to figure out a few bugs we were having. Fortunately, PipeWire devs were present and we figured out some deadlocks in the code. Jonas also gave a small lecture on how the KMS-based renderer of the Wayland code path works (thanks!), and I feel I’m more educated in that somewhat complex part of the code.

As of today, Carlos Garnacho’s paint volume rework was merged too, after months of extensive testing. It was high-impact work, and it certainly reduces Mutter’s CPU usage in certain situations.

On the very last day, we talked about various ideas for further performance improvements and cleanups in Mutter and GNOME Shell. I myself am on the last steps of working on one of these ideas, and will write about it later.

As a side note, I would like to add that I can only work on that because Endless is sponsoring me to do that. Because


Exciting times for GNOME Shell ahead!

GJS

GJS git master received a bunch of memory optimizations. In my very informal testing, I could measure a systematic 25~33% reduction in the memory usage of every GJS-based application (Maps, Polari and GNOME Shell). However, I can’t guarantee the precision of these results. They’re just casual observations.

Unfortunately, this rework was making GNOME Shell crash immediately on startup. Philip Chimento tricked me into fixing that issue, and so this happened! I’m very happy with the result, and it looks like it’ll be an exciting release for GJS too!

Thanks Philip for helping me deep dive into the code.

GTK

Matthias already wrote an excellent write-up about the GTK BoF, and I won’t duplicate it. Check his blog post if you want to learn more about what was discussed, and what was decided.

GNOME Settings

Finally, a dedicated Settings BoF happened on the last day of the conference. It had a surprisingly high number of attendees, more than I was expecting! A few points on our agenda that were addressed:

  • Maintainership: GNOME Settings has a shared maintainership model with different levels of power. We’ll add all the maintainers to the DOAP file so that anyone knows who to ping when opening a merge request against GNOME Settings.
  • GitLab: we want to finish the move to GitLab, so we’ll do like other big modules and triage Bugzilla bugs before moving them to GitLab. With that, the GitLab migration will be over.
  • Offloading Services to Systemd: Iain Lane has been working on starting sessions with systemd, and that means that we’ll be able to drop a bunch of code from GNOME Settings Daemon.
  • Future Plans: we’ve spent a good portion of this cycle cleaning up code. Before the final stable release, we’ll need to do some extensive testing on GNOME Settings. A bit of help from tech enthusiasts would be fantastic!

We should all thank Robert Ancell for proposing and organizing this BoF. It was important to get together and make some decisions for once! Also, thanks Bastien for being present and elucidating our problems with historical context – it certainly wouldn’t be the same without you!

GNOME To Do

Besides these main tracks, Tobias and I could finally sit down and review GNOME To Do’s new layout. Delegating work to those who know best is a good technique:

Tobias’ GNOME To Do mockups in my engineering notebook.

I was also excited to see GNOME To Do stickers there:

Sexy GNOME To Do stickers, a courtesy of Jakub

It’s fantastic to see how GNOME To Do is gaining momentum these days. I certainly did not expect it three years ago, when I bootstrapped it as a small app to help me deal with my Google Summer of Code project on Nautilus. It’s just getting out of control.

Epilogue

Even though I was reluctant to go, this GUADEC turned out to be an excellent and productive event. Thanks to all the organizers and volunteers who worked hard on making it happen – you all deserve a drink and a hug!

I was proudly sponsored by the GNOME Foundation.


Leak Hunting and Mutter Hacking

Greetings GNOMErs!

Last week, when I upgraded to GNOME 3.28, I was sad to notice an extremely annoying bug in Mutter/GNOME Shell: every once in a while, a micro-stutter happened. This was in addition to another bug that had been disappointing me for quite a while: the tiling/maximize/unmaximize animations were not working on Wayland.

About the former, it may not look like the end of the world, but trust me when I say that a split-second delay every ~10s is the perceived difference between a butter-smooth and a trashy experience.

Of course, this is free software, we are free people, and I have this habit of fixing whatever is bothering me, so naturally I decided to fix both. I also decided to document my journey so that people who want to try Mutter/GNOME Shell development are less scared.

Animations

I decided to start working on the animations, since there was a comment written by Jonas Ådahl on bug 780292 that was a leading clue to whatever the issue was. Time to open GNOME Builder, and clone Mutter.

Mutter + Wayland + (Mutter + Wayland + (App))

While testing these changes, I obviously needed to run Mutter and see what was happening. Since we’re talking about Wayland, I was specifically interested in seeing which messages were being sent by the application and which messages Mutter was receiving.

To dump the Wayland calls made by an application, we can just use the WAYLAND_DEBUG env var, like this:

$ WAYLAND_DEBUG=1 <application>

This should dump a lot of information into the terminal. This might or might not be useful to you.

One obvious way to test changes in Mutter is to build and install Mutter system-wide, then reboot. Rebooting takes almost 5 minutes for me. Clearly not a good approach. But Mutter has a nested mode where it can run inside another Mutter session.

To run a nested Mutter Wayland session:

$ mutter --nested --wayland

If your changes are making Mutter crash, you might want to run it with GDB. But Mutter is built with Autotools, which of course makes every single thing more complicated than it should be. You’ll notice that src/mutter is not an executable, but a wrapper script. To run Mutter under GDB, do this:

$ libtool --mode=execute gdb mutter

(gdb opens)

> r --nested --wayland

This will open a window with your new raw Mutter session. To run any graphical application against this new nested Mutter, as long as the toolkit supports Wayland, run:

$ WAYLAND_DISPLAY=wayland-1 <application>

Inspecting Mutter and Wayland

This was the trickiest part for me. Mutter has some env vars to control debugging, but I could not use them properly. Either they would dump too much info, or nothing useful at all.

I then decided to go the dumb way and just add dozens of prints around the code.

If you’re aware of any better way to do that, please leave a comment!

The Issue

The root of the issue was in this function:

static void
zxdg_surface_v6_set_window_geometry (struct wl_client   *client,
                                     struct wl_resource *resource,
                                     int32_t             x,
                                     int32_t             y,
                                     int32_t             width,
                                     int32_t             height)
{
  MetaWaylandSurface *surface = surface_from_xdg_surface_resource (resource);

  surface->pending->has_new_geometry = TRUE;
  surface->pending->new_geometry.x = x;
  surface->pending->new_geometry.y = y;
  surface->pending->new_geometry.width = width;
  surface->pending->new_geometry.height = height;
}

Can you spot the issue here?

Look again.

Notice that Mutter is accepting whatever the new geometry is. It doesn’t check if the new geometry differs from the current one. When the geometry doesn’t change, we should not report anything to the compositor. If the compositor is GNOME Shell, things get even worse: we go through the JS trampoline, which is slow, when we could have avoided it.

Apparently, GTK reports geometry changes even when they don’t happen, e.g. when hovering over any area of the window. Every single one of these hundreds of non-changes per second would go through IPC to Mutter, which would mindlessly jump into the compositor’s JS trampoline just to do… nothing. Because the geometry didn’t actually change.

This was fixed by this commit.

Stuttering

The second point that was actually freaking me out was that Mutter was waking up my discrete GPU quite often. On a PRIME system, the discrete GPU is put to sleep after a few seconds without being used, so every wakeup would produce an incredibly annoying stutter.

This was only happening on Wayland.

After further investigation, I came up with this temporary fix until Mutter becomes smarter about how it should handle GPUs. This one is already merged, and will be available on the next GNOME 3.28 release!

Memory Leak

Oh, dear, the infamous memory leak… I’ll just leave this link to the GitLab comment. Go figure.

Improved half tiling available in Mutter 3.26.1

A late night announcement: the improved tiling patches (shown in a previous blog post) were merged in Mutter and GTK+3, and will be available in GNOME 3.26.1 / GTK 3.22.23 (not yet released; should be available this week).

I’d like to thank Florian Muellner, Matthias Clasen, Jonas Adahl and AlexGS for all their support, time, code reviews and testing.

Have a wonderful night!

Smarter half tiling in GNOME Shell/Mutter

Hello GNOMErs,

I think that, at this point, at least a good part of the community is aware of the many new features that are planned to arrive with GNOME 3.26.

I’m particularly looking forward to a better tiling story in GNOME Shell and Mutter.

And, y’know, I’m not exactly a reference in being passive about my own personal technological wishes. Heck, I love hacking stuff so much that it naturally happens even when I’m sleepless and have a headache. Perhaps we can call that organic hacking? 🙂

Anyway, I can’t just sit down and keep waiting for something I could work on, right?

And that’s why this happened:

This is obviously a work in progress. You can track the progress of this smarter half tiling in bug 645153. But, sssshhh don’t tell anyone, this is actually part of the future quarter tiling feature!

Have a wonderful night! o/