Hello my GNOMEish friends!
This afternoon, I felt an urge to hear some classical music. Perhaps because I’m overworking a lot these days, I wanted to grab a good hot tea, and listen to relaxing music, and rest for a few minutes.
My player of choice is GNOME Music.
In the past, I couldn’t use it. It was way too slow to be usable. After a round of improvements during a sleepless night, however, Music became usable for me again.
But it was not fast enough for me.
It was taking 15~20 seconds just to show the albums. That’s unacceptable!
Thanks to Christian Hergert we have a beautiful and super useful Sysprof app! After running Music under Sysprof, I got this:
Clearly, there’s an area where Music hits the CPU (the area that is selected in the picture above). And, check it out, in this area, the biggest offenders were libjpeg, libpixman and libavcodec. After digging through the code, there it was.
The performance issue was caused by the Album Art loading code.
Looking at the code, I made a simple experiment: I tried to see how many parallel lookups (i.e. asynchronous calls) Music was performing.
The number is shocking: Music was running almost 1200 asynchronous operations in parallel.
These operations would be fired almost at the same time, would load Zeus knows how many album covers, and return almost at the same time. Precisely when these lookups finished, Music had that performance hit.
The solution, however, was quite simple: limit the number of active lookups, and queue them if needed. But, limit to what? 64 parallel lookups? Or perhaps 32?
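To make the idea concrete, here is a minimal sketch of "limit and queue" in plain Python. It is only an illustration, not the actual patch that went into Music: the `LookupLimiter` class, its method names and the callback convention are all made up for this example, and it assumes each lookup reports completion by calling a `finish` callback from the main loop.

```python
from collections import deque


class LookupLimiter:
    """Hypothetical helper: run at most `limit` async lookups at once, queue the rest."""

    def __init__(self, limit):
        self.limit = limit
        self.active = 0          # lookups currently in flight
        self.pending = deque()   # lookups waiting for a free slot (FIFO)

    def submit(self, start_lookup, on_done):
        """Queue a lookup.

        `start_lookup(finish)` must kick off the asynchronous work and arrange
        for `finish(result)` to be called later, when the work completes.
        """
        self.pending.append((start_lookup, on_done))
        self._pump()

    def _pump(self):
        # Start queued lookups while there are free slots.
        while self.active < self.limit and self.pending:
            start_lookup, on_done = self.pending.popleft()
            self.active += 1
            start_lookup(lambda result, cb=on_done: self._finished(cb, result))

    def _finished(self, on_done, result):
        # A slot was freed: deliver the result, then start the next queued lookup.
        self.active -= 1
        on_done(result)
        self._pump()
```

With something like this, firing 1200 lookups only starts `limit` of them right away. Every finished lookup frees a slot for the next one in the queue, so the covers trickle in instead of landing all at once.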
I needed data.
DISCLAIMER: I know very well that the information below is not scientific data, nor a robust benchmark. It’s just a simple comparison.
I decided to try out a few lookup limits, and see what works best. I have a huge collection, big enough to squeeze Music. I’m on an i7 with 8GB of RAM, 7200RPM spinning hard drive.
I measured (i) the time it took for the album list to show, (ii) the time for all album covers to be loaded, and (iii) a quick score I made up on the fly. For all of them, lower is better. I ran each limit 10 times and used the average of the results.
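The averaging step can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the script I actually used: `average_runs` and `run_once` are hypothetical names, `run_once` is assumed to return the two timings in seconds for a single run, and the made-up score is left out because its formula isn't given here.

```python
import statistics


def average_runs(run_once, runs=10):
    """Average the two timings over `runs` runs.

    `run_once()` is a hypothetical callable returning a tuple
    (seconds_until_album_list_shown, seconds_until_all_covers_loaded).
    """
    samples = [run_once() for _ in range(runs)]
    list_times = [sample[0] for sample in samples]
    cover_times = [sample[1] for sample in samples]
    return statistics.mean(list_times), statistics.mean(cover_times)
```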
The “No Limits” columns represent what Music does now. It takes a long time to show up, but the album covers are visible almost immediately after.
First conclusion: limiting the number of lookups always performs better than not. That said, the problem was just a matter of finding the optimal value.
After some trial and error, I found that 24 is an excellent limit.
In general, the initial loading of albums with the performance improvement is around 73% faster than without it. That’s quite a gain!
But words cannot express the satisfaction of seeing this:
15 thoughts on “Even faster GNOME Music”
Great work. Thanks a lot. Gnome Music is getting better and better!
What about reading the visible covers first? And from the top further to the bottom, so when scrolling down, the next albums are already loaded.
Nice. Are those asynchronous operations implemented using threads or something else (like GSource)?
If threads, then I think that this was triggered by the recent changes in GTask where it keeps growing the size of its thread pool as more and more jobs are submitted. Earlier, it would be limited to 10 threads, which could deadlock due to starvation. See https://bugzilla.gnome.org/show_bug.cgi?id=687223
In Photos, we serialize all thumbnailing jobs in a single thread (ie. a GThreadPool of size 1). The idea is to keep the CPU and I/O bandwidth free for other bits of user interaction. Like opening an image, accessing the DB, etc.. Thumbnailing can also hit the network for online accounts, so didn’t want to hammer the servers with multiple concurrent downloads over a potentially slow connection, but I don’t know if Music has that concern. Seems to have worked ok so far.
Thanks for working on Music. 🙂
Mr Stavracas, you are the real mvp.
Thank you for all your work on Gnome!
How does the optimal limit change with available CPU power? Does it make sense to do 24 parallel lookups on an old, single-core machine? What about a 64-core Xeon/Epyc workstation monster? Does HDD latency influence the result?
When will it read files from a custom album, not only from ~/Music?
If you kindly add that, I will forget about Rhythmbox and start using GNOME Music. Thanks
A bit off topic – what song are you using in the screencast? It’s pretty cool
Wouldn’t it be better to show boxes for the albums that were already read, and let a spinner run while the images load asynchronously? That way it feels far more responsive, and you can interact with the application before all images are shown.
Which I/O scheduler do you use? It would be interesting to see if there is a difference when using BFQ.
The Solution: Use Rhythmbox!
Thanks! But does it correctly read the tags and sort correctly now? Because I have lots of examples where Music puts each song in its own album instead of together as it should, despite tags being correct.
@vonschutter, gnome-music combines album name, performer and year tags to distinguish between albums. If your albums are split, it is probably some song(s) missing one of these tags.
@feaneron, this is very cool stuff! It’s probably a stretch of the mediaart spec(ish), but I wonder if we could store raw pixel data in the mediaart cache, so on the gnome-music side it’s just a matter of loading/mmap()ing the files, and doing cairo_image_surface_create_for_data().
I always thought it was kind of a bad choice to use jpeg for coverart storage; probably about the only gain is in terms of storage size, but of course it means it takes time to decode a myriad of these at startup…
Hi,I log on to your blogs named “Even faster GNOME Music – Georges Stavracas” daily.Your humoristic style is awesome, keep doing what you’re doing! And you can look our website about proxy list.
You should profile GNOME Software as well. That is sooo slow.