Funny thing: I was on sway full time for over a year, nearly two actually. I went so far as to uninstall i3 (though not XWayland, nor X, for obvious reasons).
For a few months now I've noticed weird (complete) freezes/crashes on my PC while gaming. It's not the youngest machine, so I thought it might be reaching end of life.
Out of curiosity I reinstalled i3 a couple of days ago and used it (only) for gaming. No crash since.
I assume it's either a bug in Mesa (AMDGpu) or somewhere in the wayland stack... sway hasn't had an update since November, so... I dunno, I haven't taken the time to investigate.
While working I still use sway, because I've customized it to my needs, but for Gaming/Streaming I now switch to i3 again.
Nice side effect: I can finally play some games again that wouldn't even launch on Sway or were unplayable. Natural Selection 2, for example, turned to a black screen whenever I switched workspaces (say, from one monitor to another) to turn down the music or scroll down a page while being dead.
While I respect Drew's right to keep Nvidia hardware unsupported, doing so cuts out a huge chunk of Linux video game players. I therefore find it unsurprising that video games have a number of issues that day-to-day use cases don't. Even though this is AMD, lack of Nvidia support keeps Xorg as the default for video games, and this means worse AMD support too.
The main thing up to now keeping Xorg the default for video games was nVidia's lack of Xwayland acceleration support, which made gaming on Wayland largely impossible regardless of wlroots.
It’s not a decision based on him disliking Nvidia. They would pretty much have to rewrite the whole rendering pipeline for a completely separate model. It simply won’t happen unless the kernel guys can get the proprietary Nvidia drivers to behave well with the kernel abstraction that wlroots (and thus sway) sits on top of, or someone reimplements the whole thing yet again (and then maintains basically two codebases with twice the bugs).
Personally I don't see how supporting the proprietary Nvidia drivers would help my situation, as I am using Mesa (and Nouveau is supported, afaik), but I assume there is a kernel of truth in there: a lot of people cannot or won't even consider sway because they are on the proprietary Nvidia stack.
I thought of using another installation, with and without flatpak, but hadn't thought of launching outside of sway, since the crash issue after a while. I'll try lxqt, which I'm currently falling back to for VR (waiting for https://gitlab.freedesktop.org/wayland/wayland-protocols/-/m... ).
Symptoms are: a black screen, and nothing responds. No SysRq, no network. Nothing gets written to the logs. I tried leveraging pstore without success. My GPU is an AMD R9 Fury, which I thought was defective (bought refurbished).
OK, re-tested: it also happens on LXQt, so it's not Wayland-related. I still have to test that GPU in another computer to be sure, and also try AMDVLK, though that seems relatively complex with the Steam flatpak.
Another frustrating experience was OBS. It would crash quite often due to a bug in qtwayland that someone from the community fixed (can't find the ticket right now). It had to do with too many events queuing up while it was in the background and then re-focused.
If you read this, kind stranger: thank you so much for this fix!
One thing that still doesn't work for me to this day is custom browser docks, which means that if you stream and want to interact with your viewers, you need to have a browser open as well :/
I have similar issues; those seem to be related to the XWayland implementation in wlroots/sway. I do not have those issues with GNOME on Wayland.
Looking at the master branch of the project, it seems that it should get better with the next version.
That's great to hear. I wish they would release more often.
I tried to build sway-git twice during the last two weeks. Just from the AUR, because I was feeling lazy after work. It failed both times and I just reverted to sway 1.5.x because I couldn't be bothered to learn yet another build tool and dive into C again. (Yes, I am one of those developers.)
This is an opinionated guide; hopefully nobody gets the wrong idea that it's actually hard to start using Wayland. For me, switching to Wayland was one drop-down menu option that automatically appeared on my login screen.
It is highly dependent on the user's hardware and software combination.
For example, getting Wayland to work with Plasma on Nvidia graphics cards is still a nightmare, despite it being officially supported. Out of the box, starting a Wayland session just knocks the user into a frozen TTY. Once you get it to work, you quickly learn that hardware acceleration for X11 apps is nonexistent, rendering most video games unplayable.
That is why I mentioned it being dependent on the user's hardware and software. "Use a different desktop environment" isn't really an acceptable solution. I don't think it is fully ready until the overwhelming majority of users can move to Wayland without having to give up their current hardware and software.
There is no one Wayland. There are many implementations of the protocol, some ahead of the others. I don’t see why it wouldn’t be ready when only one of the three major ones is not there yet, especially when one of the two working ones (wlroots) is really extensible and anyone can create their own WM/compositor.
No Wayland implementation works 100% on an Nvidia GPU yet; they are just in varying states of brokenness. I mentioned Plasma because it is what I have the most familiarity with, but Mutter is only slightly better and Sway doesn't work at all yet.
Totally - the guide is a reflection of the steps that I personally went through to get it all set up. I use tiling window managers, so had to configure everything (the topbar, etc) myself.
I've been using sway for a few years already: first on the office workstation (AMD card, Arch), then on my home ThinkPad T25 on NixOS, and again on my hobby/travel ThinkPad X230 on FreeBSD, which works so smoothly and fast now. What's left on xorg/i3 is the workstation with an Nvidia GPU, which will be replaced with an AMD card when prices come down a bit.
I'm super happy with sway. It brings all the i3 patches together into one coherent setup. The community is friendly and helpful, and the desktop is super snappy and a pleasure to use, from old laptops to extremely fast workstations.
It works the same. What I like in sway is that it's basically i3 AND i3-gaps together in an active codebase. The biggest difference you notice comes when you get rid of picom, the compositor: you get a visibly snappier desktop. Without really needing any special xorg configuration or picom setup, you get compositing, tear-free graphics and video, all just working.
Be aware that to get there, you should make sure you run all your apps natively on Wayland too, not through XWayland. Firefox requires an env var; the same goes for GTK and Qt apps. Emacs you can get from the NixOS overlay; there's a branch that follows the pure GTK build, which doesn't need XWayland. Electron is still XWayland-only for most apps. I'm eagerly waiting for at least Slack and Element to upgrade to Electron 12, which will bring native Wayland support.
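For anyone wondering which env vars those are: the comment doesn't name them, so take the names below as my assumption of the commonly documented ones (MOZ_ENABLE_WAYLAND for Firefox, GDK_BACKEND for GTK 3, QT_QPA_PLATFORM for Qt 5). A minimal Python sketch that builds such an environment; `wayland_env` and `missing_wayland_optins` are hypothetical helper names, not part of any tool:

```python
import os

# Assumed opt-in variables (not named in the comment above).
WAYLAND_ENV = {
    "MOZ_ENABLE_WAYLAND": "1",     # Firefox
    "GDK_BACKEND": "wayland",      # GTK 3 apps
    "QT_QPA_PLATFORM": "wayland",  # Qt 5 apps (needs the qtwayland plugin installed)
}


def wayland_env(base=None):
    """Return a copy of `base` (default: os.environ) with the opt-ins added."""
    env = dict(os.environ if base is None else base)
    env.update(WAYLAND_ENV)
    return env


def missing_wayland_optins(env):
    """List the opt-in variables that are unset or set to something else."""
    return sorted(k for k, v in WAYLAND_ENV.items() if env.get(k) != v)
```

You would pass the result to something like `subprocess.Popen(["firefox"], env=wayland_env())`. Most people just export these from their shell profile or sway startup script instead.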
And there's a nice ecosystem building around wlroots and sway, great tools to choose from for your desktop needs, such as waybar, wofi, wf-recorder and greetd.
Of course you don't NEED any of this. But if I thought like that, I'd probably just use a macOS machine with default settings anyhow.
Sounds good! I just checked around a little bit and the only program I really would like to have wayland support is vscode and that seems to be around the corner. So I will wait for that and then try it out.
Wrt SDL2: You can actually replace SDL2 with your own, even if it is statically linked with the game. First, do `export SDL_DYNAMIC_API=/my/actual/libSDL-2.0.so.0` and then launch your game.
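If you'd rather not export that globally, here is a small Python sketch of a per-game launcher. It only sets SDL_DYNAMIC_API as described above; the optional SDL_VIDEODRIVER setting is my addition (it is a real SDL variable, but whether a given game behaves with it is not something I can promise). The function names are hypothetical:

```python
import os
import subprocess


def sdl_override_env(lib_path, video_driver=None, base=None):
    """Copy the environment and point SDL_DYNAMIC_API at a replacement libSDL2."""
    env = dict(os.environ if base is None else base)
    env["SDL_DYNAMIC_API"] = lib_path
    if video_driver is not None:
        # Optional and an assumption on my part, e.g. "wayland" or "x11".
        env["SDL_VIDEODRIVER"] = video_driver
    return env


def launch_with_sdl(game_argv, lib_path, video_driver=None):
    """Start the game with the override, refusing to run if the library is missing."""
    if not os.path.isfile(lib_path):
        raise FileNotFoundError(lib_path)
    return subprocess.Popen(game_argv, env=sdl_override_env(lib_path, video_driver))
```

Usage would be along the lines of `launch_with_sdl(["./game"], "/my/actual/libSDL-2.0.so.0")`, where `./game` is a placeholder.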
I went full wayland a few months ago, at first on Ubuntu.
Now, I'm using the early days of the Manjaro Sway Community edition, and it's basically everything that is described in this post packaged and pre-configured out of the box.
I find alacritty’s slogan funny, because it’s literally the only terminal emulator I’ve used which has been so full of bugs that I found it didn't work for me and I stopped using it.
Wayland fixes a number of long standing issues with X, including:
* No screen tearing.
* Better hidpi support including different scale factors on different monitors.
* Improved security, because clients do not have access to all state on the server and servers do not run as root.
Separately, I like how with Wayland the compositor is also the server, so from a TTY you can enter a desktop environment by running a single executable with a single configuration file.
Can you explain the screen tearing thing? I'm running kwin on X and I don't notice any screen tearing (including doing things like moving windows around rapidly with the "wobbly windows" effect on).
That is because you use a compositor, which composites the screen in a separate buffer before presenting it. But in my experience, there were some edge cases where something like videos would tear.
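To make the "separate buffer" idea concrete, here is a toy Python model of double buffering. It is not how kwin or any real compositor is written; `Screen` and `is_torn` are made-up names for illustration. The point is only that a half-drawn frame never reaches the buffer being scanned out:

```python
class Screen:
    """Toy display: `front` is scanned out, `back` is drawn into."""

    def __init__(self, width):
        self.front = [0] * width
        self.back = [0] * width

    def draw(self, value, upto=None):
        """Write `value` into the back buffer; `upto` models a half-finished frame."""
        end = len(self.back) if upto is None else upto
        for i in range(end):
            self.back[i] = value

    def present(self):
        """Swap buffers; a real compositor does this in sync with vblank."""
        self.front, self.back = self.back, self.front

    def scanout(self):
        """What the monitor would see right now."""
        return list(self.front)


def is_torn(frame):
    """In this model a frame is torn if it mixes pixels from two frames."""
    return len(set(frame)) > 1
```

Drawing straight into `front` would let the monitor see something like `[1, 1, 0, 0]` mid-frame, which is the tearing being discussed.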
What happens to your graphical programs when your compositor crashes? At least with X11, if your window manager crashes, your programs keep running (you'd just fire up the WM again).
Yes. That's the problem. Not only is the resulting stack less resilient to crashes, but also it takes away flexibility by needlessly welding together two orthogonal design concerns (window management vs rendering). Sway is like putting X11, the window manager, the global hotkey daemon, screenshot grabber, and so on, all into the same address space. What could possibly go wrong? /s
Regarding security, I'm honestly surprised no one has just tried to make it so you can "firewall" X11 programs from one another. Like, aren't keystrokes propagated as packets sent through an X11-owned UNIX domain socket in /tmp? Can't we just attach a policy to that socket to decide which PIDs (or process groups, session groups, containers, etc.) get to see which messages?
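For what it's worth, the "which PID is on the other end of this socket" part is easy on Linux. The sketch below uses SO_PEERCRED on a UNIX domain socket to get the peer's pid/uid/gid and checks it against an allow-list. To be clear, this is not something X.org does; `may_receive_keys` and the allow-list are hypothetical, and a real server would still have to apply such a policy at event dispatch time:

```python
import os
import socket
import struct


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process at the other end (Linux only)."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize("3i"))
    return struct.unpack("3i", raw)


def may_receive_keys(conn, allowed_pids):
    """Hypothetical policy check: only allow-listed PIDs get keystroke events."""
    pid, _uid, _gid = peer_credentials(conn)
    return pid in allowed_pids
```

The credentials are captured when the connection is made, so the client cannot spoof them afterwards, but PIDs can be reused, which is one reason a real policy would probably key on something sturdier.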
It's a theoretical problem. The window-management parts of sway are tiny and having them in the same process means you don't have to do IPC every time your windows do something. That simplicity means it's easier to write code that doesn't crash.
Most of the heavy-lifting is done in wlroots anyway. wlroots based compositors really do implement just their own flavour of compositing what you see on the screen on top.
That said, you still can use IPC, if you really want to; I have an external window manager that augments Sway's tiling system via its i3-compatible IPC mechanism to arrange my windows in a way that Sway doesn't do natively. If you really wanted to, there's nothing stopping you from writing a wayland compositor that uses an external window manager.
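In case anyone wants to try the IPC route: the i3 IPC wire format is the magic string "i3-ipc", a 32-bit payload length and a 32-bit message type in native byte order, then the payload; replies use the same framing with a JSON body. Sway exposes its socket path in SWAYSOCK. Below is a minimal Python sketch, not my actual window manager; the function names are made up:

```python
import json
import os
import socket
import struct

MAGIC = b"i3-ipc"
HEADER = struct.Struct("=6sII")  # magic, payload length, message type
RUN_COMMAND, GET_TREE = 0, 4     # message types from the i3 IPC docs


def pack_message(msg_type, payload=""):
    body = payload.encode("utf-8")
    return HEADER.pack(MAGIC, len(body), msg_type) + body


def unpack_header(data):
    magic, length, msg_type = HEADER.unpack(data[:HEADER.size])
    if magic != MAGIC:
        raise ValueError("not an i3-ipc message")
    return length, msg_type


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer hangs up."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf


def request(sock, msg_type, payload=""):
    """Send one message on a connected socket and return the decoded JSON reply."""
    sock.sendall(pack_message(msg_type, payload))
    length, _ = unpack_header(recv_exact(sock, HEADER.size))
    return json.loads(recv_exact(sock, length))


def connect():
    """Connect to the running compositor via $SWAYSOCK."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(os.environ["SWAYSOCK"])
    return sock
```

With that, `request(connect(), GET_TREE)` returns the layout tree and `request(connect(), RUN_COMMAND, "layout tabbed")` runs a command, which is all an external layout helper really needs.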
At any rate, I don't buy the reliability argument at all. I've used sway since 0.10 or something, and I only ever remember crashing it once, and I fixed that bug myself. :P
I'd say it's a very practical problem. Why put N different things that used to run in separate protection domains into the same protection domain? Have we gotten N times better at writing code that doesn't crash? Do we believe that we can put N different things into the same address space but somehow ensure that a security hole in one of them won't compromise all of them? Have computers gotten so slow in the last 30 years that doing IPC is no longer an option?
I'm glad that you have personally not encountered a crash in Sway -- I really, truly am. But let's not pretend that a data point of 1 indicates a trend.
In practice, placing everything in one process seems to reduce the total attack surface. There is quite a lot of code required to synchronize state between the X server, window manager and compositor. When you combine them, you can throw out most of those bits that are largely serialization/deserialization.
> In practice, placing everything in one process seems to reduce the total attack surface.
Surely you're joking. Privilege separation [1] is a thing for a reason.
If we believed that putting different things into the same address space made them more secure, then why stop there? Why not just put the kernel, the shell, X11, your HTTP server, and everything else into the same address space? Let's just do away with processes -- let all schedulable units be threads that can all read and write to each other's memory, because what could possibly go wrong? /s
There is no real security boundary or privilege separation in that case: the window manager and compositor both get full access to the screen, the input devices, and all the client windows. That's part of the reason why it doesn't make much sense to keep them separated. I know you were joking, but it's true: they might as well be threads, and it saves you the serialization/deserialization step.
In Wayland, a fail-stop bug in the window management logic will now bring down your compositor and every program that was connected to it. In X11, a fail-stop bug in the window management logic only crashes your window manager -- everything else keeps running. This is a really nice property to have -- in general, why make the "blast radius" of a fail-stop bug bigger when we don't need to?
Like, what's the upside of making it so a bug in the window management logic can crash the entire GUI? You claim latency due to no need for serialization/deserialization across process boundaries, and you claim potentially less-complex code. I'm very skeptical about the complexity reduction -- you're replacing the IPC with global state guarded by critical sections which your threads all need to respect. Getting rid of IPC isn't "free" -- you're replacing it with something that could be even worse. So, I'll need to see some actual case studies here.
I agree that there is measurable latency (context switches and all), but if it's a difference of only a few extra microseconds -- i.e. something the user won't notice because computers are insanely fast these days compared to when X11 and window managers were first written -- then I'm disinclined to give up my crash resilience. Do you have data to show that there is noticeable, irreducible performance lag in having a separate window manager process from a compositor?
I don't have any raw data for performance numbers and that wouldn't matter anyway because they may not be relevant to your setup; if you're concerned about that you should run a test comparing them yourself on your specific environment. I'm speaking in terms of code complexity here. If you want to follow the threaded approach, then a minimum of two threads will do the job (one for the scenegraph, one for the sockets), and an X compositor should arguably be doing this anyway to avoid lag caused by slow rendering. The difference is that the X compositor is just storing a copy of large amounts of state from the X server, whereas the Wayland compositor would store the canonical data and wouldn't need to worry about falling out of sync with the X server.
Also, the way that X does it is overly complicated, and it isn't necessary for protection against window manager crashes. A similar type of crash protection could be built into a Wayland implementation, and in a much simpler way than moving the entire window manager out into a separate process. You just need another process that can hold the client fds and cache the minimum amount of state needed to resume the clients; it wouldn't need to know as much as the X server does to accomplish that task. Prior art is in the Arcan Wayland bridge. Other Wayland implementations have not implemented this, but they eventually could: https://arcan-fe.com/2017/12/24/crash-resilient-wayland-comp...
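The mechanism that makes "another process holds the client fds" possible is ordinary fd passing over a UNIX socket (SCM_RIGHTS). Here is a rough Python sketch of just that step, assuming Python 3.9+ for `socket.send_fds`/`socket.recv_fds`. It is not how Arcan implements it, and `hand_over`/`take_over` are hypothetical names:

```python
import socket


def hand_over(keeper_sock, client_fds, state=b"resume"):
    """Keeper side: send the saved client fds plus a small state blob."""
    socket.send_fds(keeper_sock, [state], client_fds)


def take_over(compositor_sock, max_fds=16):
    """Restarted compositor side: receive the state blob and usable fds."""
    state, fds, _flags, _addr = socket.recv_fds(compositor_sock, 1024, max_fds)
    return state, fds
```

The received descriptors refer to the same open connections the clients are still holding, so a restarted compositor can keep talking to them. Resynchronizing protocol state with each client is the hard part and isn't shown here.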
> if you're concerned about that you should run a test comparing them yourself on your specific environment.
I mean, X.org runs well enough on my end? I'm not finding myself wanting something with lower input latency.
> The difference is that the X compositor is just storing a copy of large amounts of state from the X server, whereas the Wayland compositor would store the canonical data and wouldn't need to worry about falling out of sync with the X server.
Honestly, if I were to do a ground-up X11 implementation, I'd probably build it around wlroots or similar. Then Wayland clients could interact with it, and I'd be able to preserve all the X11 compatibility and X11-isms I cared about. Like having separate window managers, hotkey daemons, screenshot tools, the slew of X11 command-line clients I know and love today, and so on.
Someone could build a combined Wayland/X server in the same process like that, but why? The only reason you would need to do that would be to run Wayland clients natively on your X server. IMO if you want to use Wayland it is much easier and more valuable to just port those tools. The hotkey daemons, screenshot tools, and command-line clients are pretty small and not that hard for someone to rewrite as a weekend project, people have done a lot of that already. The harder part is the window managers, but if you're a hard-core window manager author used to doing things the X way then you won't see much reason to switch anyway.
If the Wayland/fd.o crowd insists on breaking things I relied on to the point where I have to work weekends to fix everything that used to work, I'm going to spend that time replacing whatever software they wrote that I use with software that I wrote, since at least I won't be breaking my workflow.
I don't understand what you mean. You can still continue using X if that's what you want, people deciding to spend their time developing Wayland doesn't somehow break X or make it worse.
Also just FYI, it seems x.org has merged with freedesktop.org so they are mostly equivalent at this point, being run by the same group of people.
I'm assuming that X.org will go unmaintained after a time, which is fine. But if that means I can't get a usable X server running later on, I wouldn't mind taking a crack at adding a Wayland extension that implemented the X11 protocol.
It's not clear what that would solve, XWayland mostly fulfills that role already. It can't run window managers, but you wouldn't really get that everywhere from adding another Wayland extension either -- doing that would require a lot of additional code and the GNOME and KDE implementations wouldn't want that anyway because they have their own built in window managers. So maybe you could get a hacked up version of Weston that can run X window managers, but why bother dealing with maintaining that instead of just maintaining X? It's unlikely the X server will stop working as long as you have a GPU that supports GL or Vulkan, the glamor/modesetting driver should continue to work there.
I don't know of any significant applications that are Wayland only. If that ever happened, someone could just make a really simple Wayland server that does the reverse of XWayland.
> I'm honestly surprised no one has just tried to make it so you can "firewall" X11 programs
This can be done via firejail[1] + xpra/xephyr but is a rather cumbersome endeavor. The X11 standard also contains access control hooks that allow you to "firewall" any aspect of your application. However it is used by no application I personally know of and is rendered useless by how the xinput mechanism is implemented at this point.
The reason nobody bothered to deal with this so far is that people almost never run untrusted software on FOSS systems which is what X11 primarily targets. There was no demand.
The demand there would be with products like Qubes and Subgraph, which are currently using Xephyr and Xpra. Eventually Wayland should be able to improve performance there, and bring some of the security benefits of those setups to other distributions.
Seems to me that firewalling X11 programs from one another would take a lot less work and be a lot less disruptive than requiring users to run multiple VMs with multiple X11 servers and/or replace the whole graphics stack.
Trying to shoehorn proper security into X11 would be a formidable effort and still be quite disruptive to client software.
Back when Wayland was just being proposed there were not a lot of developers working on X. They almost-unanimously agreed that it was time to break with backward compatibility and eject a lot of cruft that had built up over the years, such as the horrible font handling. Modern toolkits had already started moving away from using many of these X11 facilities and were doing much more client side anyway. So the argument was that a relatively clean slate design was called for which should dispense with the cruft and better handle client-side rendering.
It's not perfect and I know it is disruptive for some people, but at least here it has led to a much better experience for some years now.
Would addressing the "firewalling" issue be more disruptive than throwing out X11? Because, Wayland definitely firewalls programs (among many other things) -- surely just implementing firewalling in X11 is not nearly as difficult or disruptive? Implementing firewalling could even be done in an incremental way that's easily reverted or tailored to individual apps and users.
I can’t reply to your comment below this, but the Wayland guys are the X guys, and while they are definitely not infallible, don’t you think it is a bit egotistical to assume that they didn’t think of this one simple little thing that you did, without any knowledge of the inner workings of any display server?
Don't get me wrong, I'm not trying to suggest that the X devs are ignorant of this. I too have both deep respect and gratitude for the work they have done and continue to do.
I'm only frustrated that I can't get a straight answer as to what problems Wayland is solving that can't be solved with less difficulty and breakage by repairing X11. I'm sure the X.org developers have an answer, and I would love to know it, but I'm not getting it here in this comment tree.
Here you are:
https://youtu.be/cQoQE_HDG8g
It’s a great presentation by one of the guys behind wayland, who worked on X a lot before.
Basically, X has a fundamentally misaligned abstraction of the underlying hardware - which is expected based on its age. When used with a compositor, it is basically a middle-man with no function whatsoever. So Wayland decided to cut out the middle man and pass events directly between client and compositor. But please watch the video, I think it will answer all your questions.
Also, it’s not like X will be totally deprecated, XWayland is an API implementation of it that will be supported forever.
Right, and as I said elsewhere, this is a problem with the reference implementation being crusty, not X11. That seems to be the point the speaker is making as well.
Like, I'm fully supportive of having the X server simply manage DRI3 for a bunch of clients and composite the results. That's all well and good.
I'm less supportive of doing this while also removing all support for the other things that the X server provides the ecosystem:
* Unified input/event capturing and forwarding
* Unified screen capturing/recording
* Window management
* Clipboard
* Structured IPC (on top of which you get ICCCM and NetWM)
* Xrdb
* Xprops
* Notions of windows in general (everything's now client-side)
* Fonts
* Drawing APIs
These were all standardized things that users could count on always being available, regardless of which GUI programs they used. Wayland completely punts on these things and defers them to extensions.
Before you ask, I've already seen the "Wayland is a protocol; these are all extensions" song and dance routine. That's a cop-out. Dropping these things means that there will now be multiple incompatible implementations of the same concept, and no way to mix-and-match them because they now all have to be built into the same process that does your window management. Wayland implementations completely destroy this digital commons, for no apparent reason or gain for the users. The only people I see potentially benefiting from this are full-fledged DEs who can leverage their compositors' incompatible implementations to enact a form of lock-in (i.e. your GNOME programs are no longer guaranteed to run in KDE, and vice versa). So, why do this?
This is simply insecure. “It is easier to cut holes into a solid block than to patch something that looks like swiss cheese”.
What reason does a random app have to see each keypress when it doesn’t have focus? Do you trust e.g. the Teams app or the million other apps to be good citizens?
Screen capture is implemented with pipewire in a better way than before.
Fonts: no one uses the old font API of X, even under X. And third-party libs like cairo work on both Wayland and X, so nothing is lost here.
Drawing APIs: show me any app that uses it and was upgraded in the last two decades. Feel free to use a CPU only drawing API, I prefer not watching the line getting rendered.
Also, as I already mentioned XWayland is important for exactly this reason - it is a completely backward compatible X implementation, on top of a better display protocol. What’s the actual problem, because I still don’t see it.
There is no need to have incompatible implementations of each. Just look at the three main Wayland implementations: they share much of the work.
> This is simply insecure. “It is easier to cut holes into a solid block than to patch something that looks like swiss cheese”. What reason does a random app have to see each keypress when it doesn’t have focus? Do you trust e.g. the Teams app or the million other apps to be good citizens?
Up-thread I was asking why X.org doesn't simply firewall apps off from one another, and ship with an extension to control this firewall. Adding this capability to X.org (or any X11 implementation) could be done without throwing X11 away. Having a unified way to decide which programs get to see which events would be lost in a transition to Wayland, since each compositor would ship with its own incompatible way of doing this.
> Screen capture is implemented with pipewire in a better way than before.
It also requires that the given Wayland compositor works with it. So, you're SOL if the window manager you're using happens to be welded to a Wayland compositor that doesn't. This wasn't the case before with X11, where screen capture was handled by the X server.
> Fonts: no one uses the old font API of X, even under X. And third-party libs like cairo work on both Wayland and X, so nothing is lost here.
> Drawing APIs: show me any app that uses it and was upgraded in the last two decades. Feel free to use a CPU only drawing API, I prefer not watching the line getting rendered.
I wonder why the server still has them, then. Surely the X.org developers would have simply deleted old code without throwing the whole server away if they were as certain as you are that no one uses them?
Also, I see you haven't addressed the other points I raised (Xrdb, clipboard, ICCCM, NetWM, xprops, window management, etc.).
> Also, as I already mentioned XWayland is important for exactly this reason - it is a completely backward compatible X implementation, on top of a better display protocol. What’s the actual problem, because I still don’t see it.
It's not 100% compatible -- things still break. Distros are up-front about this (I have a sibling comment with sources).
> There is no need to have incompatible implementations of each. Just look at the three main Wayland implementations: they share much of the work.
So now if I want to go and build a window manager, I have to go and re-implement a whole crap-ton of extensions myself that the X server used to do for me? And I have to do it perfectly, so apps written for other DEs won't just break? Sounds like a walled garden to me -- it raises the barrier to entry for new players.
> ship with an extension to control this firewall. Adding this capability to X.org (or any X11 implementation) could be done without throwing X11 away
Nothing is thrown away -> xserver is there for exactly this reason. Adding the extension for a system with bad abstraction is not too wise, but if you wanted to understand it, you would have done so already based on the video.
> Having a unified way to decide which programs get to see which events would be lost in a transition to Wayland
Why would it be lost? There is a core protocol that absolutely specifies it.
> This wasn't the case before with X11, where screen capture was handled by the X server.
And when you had only one player in the whole game.. which is pretty contradictory to your last sentence.
> I wonder why the server still has them, then.
Backward compatibility. Show me any desktop app that uses e.g. Motif or something. And with XWayland even these 30-year-old apps can be run.
I didn’t address these things because basically everything has a solution under wayland nowadays. Please have a look at the wayland-protocol repo and see for yourself the state of it. Also, wayland is a display manager, just because the X server was a monolith, it had no place to eg. manage clipboard. Actually, Wayland is the one that fulfills the UNIX philosophy of do one thing (although I don’t find the UNIX philosophy a good thing in every case)
> It's not 100% compatible -- things still break. Distros are up-front about this
Such is life, I really can’t say anything else to this.
> So now if I want to go and build a window manager, I have to go and re-implement a whole crap-ton of extensions myself that the X server used to do for me?
No, you just use wlroots that implemented the “crap-ton” of extensions for you already, and be on your way.
> Nothing is thrown away -> xserver is there for exactly this reason. Adding the extension for a system with bad abstraction is not too wise, but if you wanted to understand it, you would have done so already based on the video.
I did watch the video, and while I was convinced that the X.org reference implementation was crusty, I was not convinced that there was anything inherently wrong with X11-the-protocol. Like, if there existed an X extension whose responsibility was just to get clients set up with their own video buffers that it could composite for them, then it sounds like it would address 90% of Wayland's value proposition. Is there a particular point in the video you want me to pay extra attention to that clarifies this?
> Why would it be lost? There is a core protocol that absolutely specifies it.
I read through the stable interface definitions in the wayland-protocols repo [1], and did not see anything related to controlling which programs get to see which events. Is this still in development (or unstable)? If so, is there an ETA at which point I can expect every correct Wayland compositor to faithfully implement it?
> And when you had only one player in the whole game.. which is pretty contradictory to your last sentence.
That's because the X server implements the mechanisms, not policies, for multiplexing the screen and input devices. In the service of this, it provides tools to enumerate, identify, query, modify, and extend properties of windows, as well as route messages between them. There was never a compelling need for multiple competing incompatible X servers because X is the narrow waist (i.e. an unopinionated digital commons) shared by software that competed on policy.
> I didn’t address these things because basically everything has a solution under wayland nowadays. Please have a look at the wayland-protocol repo and see for yourself the state of it. Also, wayland is a display manager, just because the X server was a monolith, it had no place to eg. manage clipboard. Actually, Wayland is the one that fulfills the UNIX philosophy of do one thing (although I don’t find the UNIX philosophy a good thing in every case)
I read through the unstable interface definitions, and see that Wayland is indeed trying to implement not only the same kinds of IPC facilities and input device multiplexing that X provided, but is also trying to impose stronger opinions on what types of windows exist and how they behave (e.g. Wayland has a notion of pop-ups, text inputs, and so on). So if Wayland's goal is to avoid being as "monolithic" as X, it appears to be failing.
Also, putting core functionality that everyone must implement the same way into extensions just so they can call Wayland "just a protocol" or "just a display manager" is disingenuous. They might as well just say that they're part of the core protocol.
> No, you just use wlroots that implemented the “crap-ton” of extensions for you already, and be on your way.
Does the wlroots project define what extensions are standard and required for a piece of software to call itself a Wayland compositor? No? Then "just use wlroots" isn't addressing the problem of making sure these compositors are compliant to a set of common, useful standards. Like, maybe wlroots should be the standard-definer, just as X was? What happens with window managers built with a compositor that is not wlroots?
Anyway, I don't want to waste your time. If you can't help me understand why Wayland could not have been implemented as an X extension (including why isolating client input could not also have been implemented as an X extension), then I don't think we're going to get anywhere in this thread.
There are X extensions for shared memory buffers. Client isolation for X could also have been implemented as some kind of extension. With both of those you could solve some issues but it still wouldn't be the same as redesigning the core protocol.
If you are expecting every Wayland server to implement things exactly the same way, that will probably not happen; the point of having different implementations is that they can choose which parts they want. It's currently not looking like there will be any one standard-definer: you can build a monolithic implementation if you want, but you don't have to. Yes this might cause some fragmentation but realistically, has X really helped there? The huge proliferation of clones and forks of various X window managers that are incompatible in various ways is another kind of fragmentation.
> There are X extensions for shared memory buffers. Client isolation for X could also have been implemented as some kind of extension. With both of those you could solve some issues but it still wouldn't be the same as redesigning the core protocol.
So why redesign the core protocol if the selling points of Wayland can be had without going through all that hassle? What are the true selling points of transitioning to Wayland, if they can be had for far less work?
> Yes this might cause some fragmentation but realistically, has X really helped there? The huge proliferation of clones and forks of various X window managers that are incompatible in various ways is another kind of fragmentation.
There was only ever one dominant X server implementation for the past 30 years (XFree86, then X.org), so yes, I'd say it helped a lot to keep the video/input multiplexing system out of the hands of window manager and desktop environment developers. This ensured that your graphical programs would always work, regardless of what desktop environment or window manager you used, because they all spoke the same protocol and relied on the same reference implementation. A proliferation of Wayland compositors would take all of that away.
The basic idea of Wayland is that it is a simplification and streamlining of a display server protocol. The work of designing that core protocol is already done and doesn't need to be done again; the original developer likely did it because they found it interesting or useful to work on in some way.
The situation isn't that much better in X: the window manager and desktop environment can break clients in other subtle ways that have nothing to do with the X server. There's no guarantee that graphical programs would always work if your setup does something strange.
I’m sorry, I may not have the time to answer every point you have made:
> I was not convinced that there was anything inherently wrong with X11-the-protocol
There is: the nonexistent security model, which can’t really be retrofitted without breaking every program - in which case they can just as well fix all the bad parts.
> Is there a particular point in the video you want me to pay extra attention to that clarifies this?
I found the graphics of the client-compositor-Xserver vs client-compositor under Wayland really informative. In modern usage, the Xserver actually acts more like a library and IPC bus, and is bad at the latter. Also, related to the API thing, there is no way to signal that a buffer is ready. You may not be interested in the “every frame is perfect” goal, but I like that I can watch a video in vlc without tearing.
Also, a wayland compositor can be much more lightweight than the whole xserver, because it is not as chatty (there is no useless communication to the xserver that communicates to the compositor for no reason)
It’s not without reason that wayland is/can be used in embedded systems.
> and did not see anything related to controlling which programs get to see which events
There is one-to-one communication between the compositor and each client. Keyboard events, window resizes and the like are sent only to a specific client. I may have worded it incorrectly when I said it is specified; I would rather say it has an inherent model for it, which can be changed with extension protocols when needed. But the default should not have been that everyone listens to everything and picks out what is interesting. (For example, a global hotkey now has to be registered, and the compositor reacts to it based on that registration. So there can’t be a clash, and it will work reliably.)
Also, in my opinion this flexibility (which clients should not have to worry about) lets you create novel ways to interact with windows that were not possible with X.
Also, you seem to think that there is all that much difference between compositor families; that is not the case. The core and many extension libraries, while implemented multiple times, work in the same way. Thus a traditional client with some windows will just work. Some compositors have custom extensions, e.g. for a specific status bar, which you may find bad since under X there could be cross-WM status bars etc. But realistically you could not have them under e.g. GNOME or KDE without tinkering, so the status quo doesn’t really change.
> Also, putting core functionality that everyone must implement the same way into extensions just so they can call Wayland "just a protocol" or "just a display manager" is disingenuous
How would you create that API of X you mentioned? Wayland is a protocol, and the core is mandatory. And it is in a repo, so that it can have versions; this is yet again an area where X is flawed. Even the core API can continue to evolve, and e.g. the compositor and client can both decide to support an older version, although in practice the core API is backward compatible. But a new feature can be used by a fresh client when available, with a proper way to fall back, thanks to the wl_registry.
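To make the fallback idea concrete, here is a toy sketch in Python. It is not the real libwayland API, only a model of the idea that the compositor advertises each global with a maximum version and the client binds at most the version it was built against. The `negotiate` helper, the version numbers, and the `hypothetical_fancy_v1` interface are all made up for illustration.

```python
# Toy model of registry-style version negotiation; not the real libwayland API.

def negotiate(advertised, wanted):
    """Pick a version to bind for each interface the client wants.

    advertised: {interface name: highest version the compositor offers}
    wanted:     {interface name: highest version the client understands}
    Interfaces the compositor does not advertise are skipped, so the
    client can fall back to doing without them.
    """
    bound = {}
    for interface, client_version in wanted.items():
        if interface in advertised:
            # Bind the newest version both sides understand.
            bound[interface] = min(client_version, advertised[interface])
    return bound


# Hypothetical compositor globals and client requirements.
compositor_globals = {"wl_compositor": 4, "wl_seat": 5, "xdg_wm_base": 2}
client_wants = {"wl_compositor": 6, "wl_seat": 7, "hypothetical_fancy_v1": 1}

bound = negotiate(compositor_globals, client_wants)
print(bound)
```

The newer client ends up speaking the older versions the compositor offers, and simply does not get the extension that is missing.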
> Does the wlroots project define what extensions are standard and required for a piece of software to call itself a Wayland compositor
That is the core protocol. You seem to have a misunderstanding around it. Otherwise, how would a wayland app work on every wayland compositor? Wlroots can have some custom extensions, and it does have some, but you seem to misunderstand the point and scope of them. They are simple things like “a specific window that can work as a widget, e.g. doesn’t lose focus etc”.
Everything buffer related is core, and for example full screen WAS not part of the core initially, but an implementation that all compositors agreed on was merged and everyone implemented it many years ago.
> If you can't help me understand why Wayland could not have been implemented as an X extension
I’m trying to but you seem to have some grudge against the project.
I am no X developer, so unfortunately I don’t have more knowledge on the topic than what I have already shared. But, for example, X developers tried to retrofit HiDPI to X, and things like mixed HiDPI over multiple monitors (hell, the whole multi-screen setup) simply can’t be done realistically, from what I gathered because the X API lacks semantic information such as scale. Wayland corrected the many many failings of the API in a future proof way that can avoid. Also, why do you think that basically every OS already changed to a compositor-based display server 2 decades ago? It is simply the better abstraction and this is a simple answer, but it is the fundamental one.
Hey, I appreciate you taking the time to reply as you did.
> There is: the nonexistent security model, which can’t really be retrofitted without breaking every program - in which case they can just as well fix all the bad parts.
Most X11 clients only care about receiving input events for their own windows, no? Making it so the X server only sends input events to the window(s) that are in-focus and all belong to the same app by default wouldn't be nearly as disruptive as ripping out the entire X11 protocol, would it? If the mechanism that does this is well-designed, you could restore the "see all input events" feature on an app-by-app basis.
> I found the graphics of the client-compositor-Xserver vs client-compositor under Wayland really informative. In modern usage, the Xserver actually acts more like a library and IPC bus, and is bad at the latter.
Is it, though? The X server is uniquely positioned in the graphics stack to (1) maintain a database of which windows (and associated metadata) exist and their parent/child relationships, (2) store global configuration state for applications with a graphical concern to query, and (3) route IPC data between processes on a window-by-window basis. This isn't something you can easily move into a separate process, since the state of all windows and input events mutates pretty quickly, and stale data is useless, or even dangerous for downstream apps to consume. I suppose the X server could delegate IPC responsibility to a trusted downstream process, but the X server would still need to be the authoritative source for all state-updates.
> Also, related to the API thing, there is no way to signal that a buffer is ready.
Can't there be an X extension that allows the X server to notify compatible clients when a buffer is ready? If we're not worried about old clients or infrequently-refreshed clients continuing to tear, then this would be no worse of a proposition than moving everything to Wayland.
> Also, a wayland compositor can be much more lightweight than the whole xserver, because it is not as chatty (there is no useless communication to the xserver that communicates to the compositor for no reason) It’s not without reason that wayland is/can be used in embedded systems.
Can't there be an X extension that allows clients to inform the X server that they don't care to receive certain kinds of messages (or, make it so I can configure the X server to not send messages to certain X clients, or maybe create a launch-wrapper for X clients that instructs the X server on this on their behalf)? Also, "embedded systems" these days are easily on-par with (if not vastly more powerful than) the computers for which X was designed.
> There is one-to-one communication between the compositor and each client. Keyboard events, window resizes and the like are sent only to a specific client. I may have worded it incorrectly when I said it is specified; I would rather say it has an inherent model for it, which can be changed with extension protocols when needed. But the default should not have been that everyone listens to everything and picks out what is interesting. (For example, a global hotkey now has to be registered, and the compositor reacts to it based on that registration. So there can’t be a clash, and it will work reliably.) Also, in my opinion this flexibility (which clients should not have to worry about) lets you create novel ways to interact with windows that were not possible with X.
I'm really not seeing how this precludes making it so X can just not send all X clients all messages. Clients that need to see events destined to other clients' windows (which is the uncommon case) would just need to get an exception granted from the X server.
> Also, you seem to think that there is all that much difference between compositor families; that is not the case. The core and many extension libraries, while implemented multiple times, work in the same way.
Even if all compositors were 99.9% compatible, that's still a ton of breakage -- one in one thousand interactions will behave incorrectly. Like, just take a look at Web browsers today to see what I mean about having multiple implementations making our lives worse -- they all ostensibly support the same standards, and yet they all behave in subtly different ways that Web developers have to test for. Why should I believe that it will be any different for Wayland compositors?
> Thus a traditional client with some windows will just work. Some compositors have custom extensions, e.g. for a specific status bar, which you may find bad since under X there could be cross-WM status bars etc. But realistically you could not have them under e.g. GNOME or KDE without tinkering, so the status quo doesn’t really change.
I don't use GNOME or KDE -- I rely on the flexibility X11 affords me to run the X clients I deem necessary to do my work. I know for a fact that I'm not alone on this. If Wayland is going to take this away, then I'm going to put effort to keeping an X11 implementation alive (even if it's implemented as a Wayland extension) in order to keep using my computer in the way I see fit.
> How would you create that API of X you mentioned? Wayland is a protocol, and the core is mandatory. And it is in a repo, so that it can have versions; this is yet again an area where X is flawed. Even the core API can continue to evolve, and e.g. the compositor and client can both decide to support an older version, although in practice the core API is backward compatible. But a new feature can be used by a fresh client when available, with a proper way to fall back, thanks to the wl_registry.
I don't even know how to parse what you're saying here. It sounds like you're saying that just because Wayland has protocol definitions that live in a github repository (as if that mattered), it's automagically better than X extensions? Because, if you swap "X" and "Wayland" in that above paragraph, the resulting paragraph would still be true. X11 is a protocol with a mandatory core; X protocols (and extensions) are most definitely versioned (we're using X version 11 revision 7.7 btw); X clients can decide which extensions (or versions of these extensions) they want to use. If the X server doesn't support what the X client wants, the X client can optionally fall back to an older, different extension.
> That is the core protocol. You seem to have a misunderstanding around it. Otherwise, how would a wayland app work on every wayland compositor? Wlroots can have some custom extensions, and it does have some, but you seem to misunderstand the point and scope of them. They are simple things like “a specific window that can work as a widget, e.g. doesn’t lose focus etc”. Everything buffer related is core, and for example full screen WAS not part of the core initially, but an implementation that all compositors agreed on was merged and everyone implemented it many years ago.
Wlroots is most definitely NOT the core protocol. It's a Wayland project maintained by Drew DeVault for building Wayland compositors. But Drew DeVault does not dictate what is and is not part of Wayland. I was asking rhetorically to prove this point. Also, if every app needs to make sure it works with every compositor (instead of just needing to check against a recent X.org release), then Wayland represents a regression in the way we build desktop software. With Wayland, developers need to test their app against a bunch of different compositors to make sure they all behave the same way, just like how Web developers need to test their Web apps against a bunch of different browsers. I'd rather not repeat the Web's mistakes in desktop software development.
> I’m trying to but you seem to have some grudge against the project.
I have a grudge against breaking everything for no reason, and I try not to depend on software written by people who develop a reputation for doing this. This isn't specific to Wayland. But so far, it looks like Wayland is an instance of breaking everything for no reason.
> Wayland corrected the many many failings of the API in a future proof way that can avoid.
The same thing was said about X -- that's why X has a forward-compatible extension model that Wayland largely copies. So let's not delude ourselves into thinking that Wayland is going to somehow magically avoid becoming the new X.org when all is said and done.
> Also, why do you think that basically every OS already changed to a compositor-based display server 2 decades ago? It is simply the better abstraction and this is a simple answer, but it is the fundamental one.
Why should I care what other OSes that I don't use do? First, I care about programs that I depend on not breaking. Second, I care that I can retain the power to mix and match different graphical UI tools to my liking, instead of having to take into consideration which compositors they may or may not work on (something I didn't have to do with X.org). I'm not convinced at all that Wayland actually fixes anything that couldn't have been fixed in an X extension for far less work and disruption. It's not like X.org doesn't have DRI3 support, which provides exactly the compositor-based display server you clamor for.
I'm no expert in the X11 codebase, but I have lots of respect for the guys who were working on it at the time the Wayland direction was undertaken. So I don't feel in a position to second guess their opinion or tremendous contributions, especially since it has led to a much better experience than I ever had with X.
That doesn't make them infallible. If anything, Wayland smells like yet another instance of CADT [1]. Like, why can no one explain why a world-breaking change like Wayland is justified, when the problems Wayland solves seem like they could be addressed by repairing X11?
I'm honestly interested in building a better X11, and am willing to contribute both time and money. But first, I'd like to understand why the X11 maintainers deemed Wayland necessary -- like, what am I missing here?
The main problems with X11 are within the core X11 protocol itself, things that are long considered deprecated/obsolete and can't be fixed or removed without doing a protocol break. I could go more into detail, but if you depend on some old X clients that use all those old protocol features, and you're already committed to putting money down on X11, it seems unlikely that those details would be relevant to you. Please let me know if I'm reading this wrong here. I support you working on X11, but just be warned, it is highly unlikely that the major desktops are going to want to continue on that path going forward.
> The main problems with X11 are within the core X11 protocol itself, things that are long considered deprecated/obsolete and can't be fixed or removed without doing a protocol break.
Which problems, exactly, would these be? Why is it impossible (or too difficult or disruptive) to solve these problems through a server extension, or some other backwards-compatible fix? I'm sure the X.org developers have an answer, but I'm not seeing it here.
> it is highly unlikely that the major desktops are going to want to continue on that path going forward.
I don't care what other desktops and toolkits choose to do -- I don't even use a fd.o desktop (hell, I don't even run dbus). If it weren't for needing a Web browser and Zoom, I wouldn't even need a GUI at all.
I'm doing this for myself. I like UNIX a lot, but I really dislike the direction modern Linux desktops are going in. But instead of whining about it online, I'm willing to put in the time and effort to keep things working the way I like.
I'm asking what problems Wayland solves in order to figure out why it's a bad idea for me to take a crack at implementing a simple X server of my own (assuming X.org is indeed going to be deprecated). Like, what is so wrong/broken about the X11 protocol that the X.org server developers are so enthusiastic about abandoning it? Clearly, I must be missing something. I would like to know what that something is.
You really should consider watching the youtube talk that was linked in some sibling comments, it explains it in more detail than I could in just one post. For me personally, the real bad issues are things like the core protocol being synchronous, the coordinates being limited to 16-bit, the inherent raciness and insecurity of various things like window properties and server grabs... There is a lot of legacy functionality there too like colormaps, window borders, bitmap fonts, all the core drawing primitives, all the core input stuff.... Newer applications are not using any of that, and often with that legacy stuff the only specified behavior for edge cases that clients expect is "do whatever Xorg does", which makes a rewrite pretty impractical. It would be interesting to see a secure rewrite of the X server in Rust or some newer language like that, but doing that would probably take many years for little benefit, so I'd advise against it.
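As a small illustration of the 16-bit coordinate point: core X11 requests encode window positions as signed 16-bit integers, so anything past 32767 pixels cannot be put on the wire at all. The sketch below only uses Python's `struct` module to show the range limit; `fits_int16` is a made-up helper, not part of any X library.

```python
import struct


def fits_int16(value):
    """Return True if value can be encoded as a signed 16-bit integer."""
    try:
        struct.pack("<h", value)  # "<h" = little-endian signed 16-bit
        return True
    except struct.error:
        return False


# A position at the edge of the range still encodes; one pixel further does not.
print(fits_int16(32767))
print(fits_int16(32768))
# e.g. an offset inside a very tall scrollable canvas
print(fits_int16(40000))
```

In practice toolkits work around this by never creating windows that large and doing their own scrolling, which is part of why newer clients lean so little on the core protocol.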
Side note, I don't get the hate for dbus, it's a rather simplistic message bus, orders of magnitude smaller than the X server. It would be much easier to implement your own dbus daemon for example.
From what I got from the video, the main complaint is that the X.org reference implementation has gotten really crusty and hard to maintain. If so, maybe the solution there would be to do some housekeeping and start deprecating features no one uses. Maybe that could include factoring legacy/obscure protocols and code-paths into separate code modules, off the server's "happy paths," which these protocols' downstream consumers can take over maintaining (if they really still need them).
This doesn't call for ditching X11 in my mind.
> For me personally, the real bad issues are things like the core protocol being synchronous, the coordinates being limited to 16-bit, the inherent raciness and insecurity of various things like window properties and server grabs...
Why can't an extension offer a way for clients to establish asynchronous communication channels to the X server? Why can't an extension offer 64-bit coordinates? Why can't an extension offer a way for an authorized program to take care of guarding and serializing access to window properties and orchestrating server grabs? Why do we need to break the world to have these things?!? These may not be trivial undertakings, but I doubt they would take anywhere close to the amount of work required to upgrade every graphical program and toolkit in UNIX-land to use a wholly-different _suite_ of input/video multiplexing systems (which on a given day will be only 95% compatible with one another in expectation).
Also, it's not like Wayland is destined to be less crufty than X11. I wouldn't be surprised at all if all of the complexity in X.org today returns to Wayland compositors by way of a bunch of all-but-required Wayland extensions that get shoe-horned in over the years. So if we're going to be shoe-horning new features into existing systems, we might as well do it on the devil we all know (or perhaps we should solve this once and for all by creating an "X12" protocol in a way that shoe-horning is painless and won't lead us to cruftiness again).
> Side note, I don't get the hate for dbus, it's a rather simplistic message bus, orders of magnitude smaller than the X server. It would be much easier to implement your own dbus daemon for example.
It's not simplistic for what it does, and its developers have a horribly-misguided "put-it-in-the-kernel-because-performance" development ideology that belies a profound lack of understanding of why or how dbus isn't fast enough for their purposes. My biggest turn-off is the fact that it doesn't do anything that I can't already do faster and cheaper with a RAM filesystem of named pipes and UNIX domain sockets.
* Want user/system namespaced paths? Create a directory for each user's endpoints that's separate from other users' endpoints, and have a distinct system/ directory that only authorized users can explore. Leverage filesystem hierarchies and permissions to communicate which endpoints belong to the same service, and to control who can access them.
* Want to register a service endpoint for suspending/shutting-down your laptop? Create a directory under system/ whose group ID is the group of users who are authorized to suspend/shut down, put two named pipes, "suspend" and "shut-down", in it, and have a suspend/shutdown daemon just do blocking reads from them. Once a byte arrives on the "suspend" pipe, execute suspend-to-RAM. Once a byte arrives on the "shut-down" pipe, execute shutdown.
* Want to register a service endpoint for sending desktop notifications? Make a "notifications" directory in the user's service endpoints directory, and put a UNIX domain socket in it. Have the notification daemon listen on this socket, and simply pop up a window whenever another program connects to it and sends a properly-structured message (note that that other program must have permission to traverse the service directory to access this UNIX domain socket to do so).
* Want introspection on how to form that message? Have the daemon that implements the endpoint write out a symlink to its documentation in its service directory, which you can just `cat` or `more` to figure out how to talk to the service.
* Want something really elaborate, like sending a video stream? Transfer a file descriptor to the service provider via the UDS and then pipe the audio/video data in that way.
So, yeah -- dbus doesn't need to exist in order for us to have the things it offers.
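For the suspend/shut-down bullet above, a minimal sketch of the named-pipe pattern might look like the following. It assumes a POSIX system, uses a temporary directory instead of a real system/ tree, and only records the requested action instead of actually suspending anything. The directory layout, `serve_once`, and `demo` are illustrative names, not an existing tool.

```python
import os
import tempfile
import threading


def serve_once(fifo_path, action, results):
    """Daemon side: block until a byte arrives on the pipe, then act."""
    # Opening a FIFO for reading blocks until some writer opens it.
    with open(fifo_path, "rb") as fifo:
        if fifo.read(1):
            # A real daemon would trigger suspend-to-RAM here.
            results.append(action)


def demo():
    base = tempfile.mkdtemp()
    power_dir = os.path.join(base, "system", "power")
    # Directory permissions decide who may reach the endpoint at all.
    os.makedirs(power_dir, mode=0o750)
    fifo_path = os.path.join(power_dir, "suspend")
    os.mkfifo(fifo_path, 0o620)

    results = []
    daemon = threading.Thread(target=serve_once,
                              args=(fifo_path, "suspend", results))
    daemon.start()

    # Client side: writing any byte is the whole "API call".
    with open(fifo_path, "wb") as fifo:
        fifo.write(b"1")

    daemon.join(timeout=5)
    return results


if __name__ == "__main__":
    print(demo())
```

From a shell, the client side would just be a redirect of one byte into the pipe, which is the point: no bindings or SDK needed.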
What you're saying is mostly what has happened already except the work has just been done outside the X server. The features in X that people don't use are already considered deprecated, and factoring the legacy parts off into a separate code module is essentially what XWayland is anyway.
If you added all those things as X extensions, it would essentially be the same thing as Wayland, because clients that wouldn't use them would still be broken, and every graphical program and toolkit would still need to be updated to use them.
Your suggestions for dbus would work for some applications but would not really work for other things that a message bus handles like multicast, global message ordering, and resource accounting. Plus GNOME and KDE adopted dbus specifically so they could get away from having to pass around random sockets in folders everywhere. I assume by "put-it-in-the-kernel-because-performance development ideology" you're referring to kdbus, which was an alternate implementation not made by the original dbus developers, and is now a dead project and is not really a thing anymore. Please don't get those things confused. Of course the reason they could do that is because dbus is also just another protocol with a reference implementation, and you could make another implementation that works closer to what you describe and maybe gets 80-90% of the way there depending on some changes in the kernel, for example I saw a hacky dbus fuse filesystem a while ago: https://github.com/sidorares/dbusfs
> If you added all those things as X extensions, it would essentially be the same thing as Wayland, because clients that wouldn't use them would still be broken, and every graphical program and toolkit would still need to be updated to use them.
There's a massive difference between extending the X server and having each window manager implement all the trappings of an X server as a library. Namely:
* X remains the "narrow waist" for video/input multiplexing. GUI "policy" infrastructure -- window managers, panels, notification services, and so on -- remain separate programs, with separate maintainers, to be mixed and matched downstream as needed. Moreover, all these programs keep working.
* By remaining a separate X server program, we keep mechanism and policy cleanly separated. GUI "policy" infrastructure can't impose itself systemically on other GUI "policy" infrastructure, which is a good thing because all these GUI "policy" infrastructure authors tend to think their way is the best way and how dare anyone question it or resist it (see also GNOME). X keeps me and mine safe from their idiocy.
* The barrier-to-entry for creating new "policy" infrastructure remains low, since you can run these programs without coupling them to a particular compositor.
* Changes to X's rendering infrastructure get incrementally deployed. No change in any workflow is required; toolkits and programs opt-in to the new rendering infrastructure as they need to. Programs that don't opt-in keep working until the old code paths get dropped.
* Non-display services of X get preserved, like xprops, xinput, etc. All xclients keep working. If desired, these can be policed through a separate opt-in extension. Existing IPC conventions like ICCCM and NetWM keep working, so all the downstream tools that use them keep working.
> Your suggestions for dbus would work for some applications but would not really work for other things that a message bus handles like multicast, global message ordering, and resource accounting.
Nonsense.
You can multicast messages from one process to many processes via a UNIX domain socket trivially -- just send the damn message to each recipient! It's not like you're going to have 10 million clients, so copying the data isn't going to be that bad (and, the service endpoint can always throttle clients). But, sure, let's suppose the message you're trying to send is gigantic, and you do need to send it to lots of clients. You can just store it as a file (you're doing this anyway if the message is truly that big) and send each client a read-only file descriptor to go and consume it at their own pace. If you're using a file at least, all your clients will hit the same cached pages in the kernel, so you're no longer making N different copies of the data (the kernel will take care of implementing the right caching strategy for you). If you're streaming data, you could simply buffer it to a file and treat the file as a ring-buffer, and still hand out read-only file descriptors to it to downstream clients.
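Here is a rough sketch of the "store it as a file and hand out read-only descriptors" approach. It assumes Python 3.9+ on a POSIX system (for `socket.send_fds` and `socket.recv_fds`), and for simplicity plays both the service and the clients in one process over `socketpair()` connections. `multicast_file` is a made-up name. Each client gets its own `os.open()` so that file offsets are not shared between readers.

```python
import os
import socket
import tempfile


def multicast_file(payload, n_clients):
    """Write payload once, then pass a read-only fd to each client."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        path = f.name

    received = []
    try:
        for _ in range(n_clients):
            service, client = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_STREAM)
            # Separate open() per client: independent read offsets,
            # but all readers hit the same cached pages in the kernel.
            fd = os.open(path, os.O_RDONLY)
            try:
                # Service side: a tiny message plus the descriptor.
                socket.send_fds(service, [b"new-message"], [fd])
                # Client side: receive the descriptor and read at leisure.
                _msg, fds, _flags, _addr = socket.recv_fds(client, 1024, 1)
                with os.fdopen(fds[0], "rb") as view:
                    received.append(view.read())
            finally:
                os.close(fd)
                service.close()
                client.close()
    finally:
        os.unlink(path)
    return received


if __name__ == "__main__":
    copies = multicast_file(b"a large message body", 3)
    print(len(copies), copies[0])
```

Only the short notification and the descriptor cross the socket; the payload itself is never copied per client by the sender.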
Global message ordering and message dependencies are also easily solved without dbus -- just implement an "ordering" service adapter. The adapter writes its own UDS to the place where its upstream services' UDSs live, and it takes care of marshaling requests and replies to and from the upstream services according to some ordering principle you require. For example, if you have a service for shutdown/suspend, and a service for logout, you could implement a small ordering adapter that prevents messages to shutdown/suspend from being delivered if the user is in the process of logging out. I'd imagine that for a DE, you could simply have a singleton ordering service adapter that determines what services get to be accessed under which circumstances (thereby cleanly separating the task of systems integration from the task of providing the individual service).
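The policy logic of such an adapter is small. The sketch below leaves out the sockets entirely and only models the decision: requests to a hypothetical "power" service are refused while a logout is in progress. `OrderingAdapter`, the service names, and the "begin"/"done" requests are assumptions made for the example.

```python
class OrderingAdapter:
    """Hypothetical adapter in front of the logout and power services."""

    def __init__(self):
        self.logging_out = False
        # Stands in for writes to the upstream services' sockets.
        self.delivered = []

    def handle(self, service, request):
        """Forward the request upstream, or refuse it. Returns True if forwarded."""
        if service == "logout" and request == "begin":
            self.logging_out = True
        elif service == "logout" and request == "done":
            self.logging_out = False
        elif service == "power" and self.logging_out:
            # Ordering rule: no suspend/shutdown in the middle of a logout.
            return False
        self.delivered.append((service, request))
        return True


if __name__ == "__main__":
    adapter = OrderingAdapter()
    print(adapter.handle("power", "suspend"))    # before logout starts
    print(adapter.handle("logout", "begin"))
    print(adapter.handle("power", "suspend"))    # during logout
    print(adapter.handle("logout", "done"))
    print(adapter.handle("power", "shut-down"))  # after logout finished
```

A real adapter would listen on its own UDS and relay bytes to the upstream sockets, but the ordering rule itself stays about this size.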
Resource accounting is similarly straightforward. Just like the "ordering" service adapter pattern, you can also create a "resource usage" service adapter pattern. For example, you can ensure that the volume increment or decrement requests to your sound daemon arrive at a fixed rate, no matter how many requests come in. As another example, if the service is streaming data, you can use a service adapter to monitor how quickly clients are consuming versus the service producing, and induce back-pressure on the service to hint that it should down-sample if clients are too slow.
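One way to read the volume example is as a gate that forwards at most one request per interval and drops the rest. That dropping policy is my assumption; queueing would be another option. `FixedRateGate` is a made-up name, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class FixedRateGate:
    """Forward at most one request per min_interval; drop the others."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_forwarded = None

    def allow(self):
        now = self.clock()
        if (self.last_forwarded is None
                or now - self.last_forwarded >= self.min_interval):
            self.last_forwarded = now
            return True
        return False


# Simulated arrival times in milliseconds for six volume-change requests.
fake_times = iter([0, 10, 20, 100, 110, 250])
gate = FixedRateGate(100, clock=lambda: next(fake_times))
decisions = [gate.allow() for _ in range(6)]
print(decisions)
```

The sound daemon behind the gate never sees more than one request per 100 ms, regardless of how many clients are hammering the endpoint.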
Because everything is represented as files, I can do those last two things trivially with shell scripts. No need to take over the init process (cough systemd-logind cough), no need to implement a whole wire format and marshaling library and stub-compiler, no need to create language bindings, etc. Files, directories, named pipes, UNIX domain sockets, and a humble script to set desktop-wide policies on inter-service interactions are more than adequate. But noooooo, we had to build dbus and all of dbus's infrastructure.
I honestly believe the authors of dbus simply lack imagination. Like, we have all this wonderful battle-tested POSIX IPC infrastructure sitting around waiting to be used that they don't even have to maintain, and the kernel makes a fine I/O multiplexer and request broker. Why not use it to its fullest potential? It'll save time and effort, and you won't need any specialized SDKs or tooling to interact with services.
I don't want to say that I think the dbus authors are, well, stupid. If there's something that dbus does that well and truly cannot be done as described above, I'd love to know what it is, and why it justifies all the complexity of re-implementing POSIX IPC analogues in a bespoke system. But I've been writing software for over 20 years, and I've been around the block plenty of times, and this entire project smells like something someone would have written if they simply were not familiar with what their runtime environment could already offer them.
> Plus GNOME and KDE adopted dbus specifically so they could get away from having to pass around random sockets in folders everywhere.
So instead we should just implement worse-performing analogues of most of the POSIX IPC primitives in userspace and pass around service endpoints instead? Come on now.
> you're referring to kdbus, which was an alternate implementation not made by the original dbus developers, and is now a dead project and is not really a thing anymore. Please don't get those things confused.
Thanks for correcting me. I wouldn't want to hate on people for the wrong reasons ;)
------
Anyway, we've been going back and forth for a while. I'm convinced now that Wayland is just an instance of CADT and doesn't solve anything that couldn't have been solved with a less-glamorous but less-effort X extension. But whatever -- the X.org and fd.o developers are free to do whatever they want, etc. etc.
I actually like the X11 model, and wouldn't mind taking a crack at writing a Wayland compositor that simply back-ported all the non-graphical aspects of X as a Wayland extension. Then everything I'm using today could, ostensibly, keep working (and I don't have to care nearly as much what the fd.o folks do going forward).
Look, you're a smart and accomplished person and you have some developed ideas of how things should be done, please don't hate on other open source developers or accuse them of being "idiots" or "CADT" when you yourself acknowledge that you don't fully understand their work. If you have an idea you think is better then you can just do it, you don't need to trash talk other people's work and use insults like "attention deficit teenager" to get your point across. If you want your X programs to continue working, you don't need to write a Wayland compositor, you can just keep using X. The only reason to write a Wayland compositor would be if you wanted to use Wayland clients, which would not have access to any of the X protocol features anyway.
If you don't care about policies then all those things in X can be a good thing, but if you do care about policies then Wayland could allow for a better design, at least it seems that's what GNOME and KDE are aiming for anyway since their policies are very well established at this point, and they don't really seem to care about breaking ICCCM and other such things.
As for dbus, your solutions would work for some things, but would not have exactly the same semantics as dbus and would come with their own set of issues, and requires building several more infrastructure pieces, some of which you just described. You could build those but it likely wouldn't fit the same use cases as dbus. If you're sending messages that you expect other clients to parse then you still need to agree on a wire format and marshaling library, you can't get around that. If you ask me dbus itself doesn't require much infrastructure at all, you should consider reading the source code for the dbus reference implementation at some point because it's actually pretty small and stable. And I don't understand what you mean by re-implement POSIX IPC analogues, dbus is essentially just a wire format for Unix domain sockets and a message bus that routes the messages, it doesn't re-implement anything. If you want to use dbus from shell scripts, you can use tools like dbus-send and busctl, or you can try to use something like that dbus fuse filesystem -- the nature of dbus makes it map pretty well to that, there's no reason you can't have both a message bus and an easy interface to access from shell scripts.
(Also just another nitpick here, the systemd developers are not the dbus developers, and systemd-logind doesn't take over the init process, that is its own smaller daemon)
If you want to read more, see some comments from the original dbus author:
Can you think of even a single thing dbus can do that my approach cannot do? Emphasis on cannot here -- if you do reply to this, I expect you to prove that the thing cannot be done by any simpler means. If not, then why does dbus need to exist? Better question -- why are people who insist on writing software that doesn't need to exist given decision-making powers in fd.o? Software is like a form of pollution -- more code means more bugs and more security holes (and dbus isn't immune [1]). Any greenhorn developer can write lots and lots of code; it takes wisdom and experience to avoid writing code. So if people who don't grok this are running fd.o, why should I trust anything fd.o produces?
Before you try and tone-police the above, you should know that it is fd.o that needs to convince me to venerate their software artifacts. People writing more code isn't by itself praiseworthy -- code is a goddamn liability, so it had better have a good reason to exist and (in dbus's case) have a very good reason to be widely depended-on. Just because you happen to like or use someone's code doesn't mean that it is any good.
> And I don't understand what you mean by re-implement POSIX IPC analogues, dbus is essentially just a wire format for Unix domain sockets and a message bus that routes the messages, it doesn't re-implement anything.
I guess if you didn't understand POSIX IPC, you wouldn't see how this sentence is an oxymoron. The kernel itself gives you all the trappings of a message bus for free. You don't need a wholly-separate daemon and wire format spec.
Also, the only people who seem to use dbus's wire format are dbus clients. Even when dbus was new, there were already widely-used and well-understood formats for representing structured data (e.g. ASN.1, typed netstrings, S-expressions) that could have been leveraged to make interacting with the service that much more straightforward. But then again, we're talking about people who wanted to re-invent POSIX IPC, so I guess I shouldn't be surprised they also wanted to impose their own wire format on the world.
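To give a sense of scale, plain netstring framing (the untyped base format, not the typed variant) fits in a few lines. This is a sketch of the `<length>:<bytes>,` framing only; the function names `encode` and `decode` are mine, and it says nothing about how dbus itself marshals data:

```python
def encode(payload: bytes) -> bytes:
    """Frame payload as a netstring: decimal length, colon, data, comma."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode(buf: bytes):
    """Return (payload, rest) for the first netstring found in buf."""
    head, sep, tail = buf.partition(b":")
    if not sep or not head.isdigit():
        raise ValueError("missing or malformed length prefix")
    n = int(head)
    # The payload must be followed by exactly one terminating comma.
    if len(tail) < n + 1 or tail[n:n + 1] != b",":
        raise ValueError("truncated or unterminated netstring")
    return tail[:n], tail[n + 1:]
```

Because the length comes first, a reader on a stream socket knows exactly how many bytes to wait for before handing the message on.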
> If you don't care about policies then all those things in X can be a good thing
I know better than anyone else on Earth what graphical policies are good for me, so I'm going to take this as your affirmation that X is indeed the right tool for the job for people like me who know what they want out of their computers. I stopped using DEs years ago because I got tired of having to fight them all the time to get them to do the things I needed.
I don't understand what you are saying about freedesktop.org. That is just another volunteer run open source organization that hosts projects that are loosely related to open source desktops, you can start contributing to that if you want, or you can not use any of it if you don't find it useful. I'm just here for an interesting conversation, I'm not trying to convince you of anything, and I would rather not continue this discussion if you're going to start throwing around insults and making it personal and accusing other developers of being ignorant or having bad intentions. Please don't do any more of that, it's not interesting conversation and it's against the rules here. You're better than that. If that's tone policing then I'm sorry but my point is we ultimately can't have a conversation if your goal is to attack other people who aren't even here and tear them down, that just isn't my goal.
I also still don't see what you mean about dbus, the Linux kernel itself doesn't specify a wire format for arbitrary messages, and doesn't specify all the things that you need to get the complete functionality of a message bus. Maybe you could get that with another operating system that is based around message passing but Linux is not that. The methods you describe could technically be done without a daemon, but they still require a lot of additional code to set up a bunch of files and sockets and enforce ordering, security, etc, which could also contain bugs. You could tell the applications to implement all that themselves or you could put it all in a daemon which is mostly what dbus does anyway, and by doing it in one daemon it totally eliminates a certain class of race conditions and synchronization issues. Again please refer to the comments by the dbus developer that I showed, this conversation is not new and already happened years ago. If you want to store ASN.1 or S-expressions in a d-bus message you can do that pretty easily. And if you really believe that your solution could work then I would encourage you to develop a dbus implementation that works like you describe and then test to see if it works exactly the same and doesn't break existing setups. But I don't think this would really work, you wouldn't really be saving many lines of code, and in particular multicast and service activation would be pretty hard to do in the way dbus does it without a central message bus.
If you don't agree with GNOME or KDE's policies and you want to implement your own IPC then that's great, I support you doing what you need to do, however they chose dbus a long time ago, and currently it's looking like X is not the right tool for them anymore, so you may just have to accept your differences and move on.
I tried to give a hint about the basic reason of trying to get out from under the weight of a huge code base that had become old and crufty while the very architecture it was designed around was becoming moot since things were shifting to client side rendering already.
Couldn't find a good article on short notice, but there's a decent video from back in 2013 about it.
> I tried to give a hint about the basic reason of trying to get out from under the weight of a huge code base that had become old and crufty while the very architecture it was designed around was becoming moot since things were shifting to client side rendering already.
I can totally get behind doing a clean re-write of X.org (possibly in a memory-safe language this time around) in order to get rid of legacy cruft that's truly no longer used. They could take the opportunity to refactor the super-popular X extensions like GLX to have better "happy paths" in order to make the overall implementation cleaner and easier to maintain. This could even be done incrementally in order to avoid breaking existing clients.
What I'm struggling to understand is what's so wrong with X11-the-protocol and the popular extensions that ditching everything was considered the best idea? Like, if the X11-to-Wayland transition were happening on the Web, it would be a lot like Google deciding to ditch HTML/CSS/Javascript in favor of something home-grown. Sure, that homegrown thing might actually be better, but it would really leave everyone else in a real lurch now, wouldn't it?
Well there were a lot of problems with the X protocol actually, having to do with latency and multi-threading support at the very least. So much so that Xcb was developed as a replacement protocol; so applications were already having to be refactored if they wanted to avoid such problems.
But really, most applications do not deal with Xlib or Xcb _anyway_, they are programmed at the Gui toolkit level. So all that has to be done is add another backend to the few popular toolkits in use. But guess what, Wayland supports both Xcb and Xlib protocols through a virtual X server that transparently translates to Wayland if that's what you have your heart set on.
But I have lost track of what the specific problem is you're actually trying to solve.
XCB and Xlib are libraries that implement the client side of the X11 protocol. They are not themselves protocols.
I'm trying to fix the problems with X11 that supposedly justify Wayland's existence, because I have reason to believe that fixing/extending X11 would be far less painful and far easier than throwing it all away.
However, no one seems to be able to explain what is unfixable about X11. I assume in good faith that Wayland exists because there is something truly unfixable. I'd like to know what that is.
Note that I'm talking about the protocol here. People here (yourself included) point out that X.org is old and hard to maintain. This may be true, but that is a problem with the reference implementation, not the X11 protocol. Thus it doesn't in my mind justify Wayland's existence (but it does justify writing a new reference implementation).
EDIT: Here's an example -- what if someone wrote an X server that only allowed clients to render via DRI3, and by default prevented programs from receiving keyboard or mouse events intended for other programs? There would be a new protocol extension for setting and querying these blocking policies, so integrators could set more-secure default access controls without breaking compatibility. Isn't that basically what Wayland is aiming for -- client input is isolated and everything graphical happens through off-screen rendering to client-controlled GEM buffers?
You could do that but such an X server would not really be any practically different from Wayland. You would still complain that it broke your old clients, and newer clients would still have to maintain two code paths for the newer server and for the X servers that didn't support DRI3. (DRI3 is not supported when running X clients over the network for example)
Are there widely-used X servers that don't support DRI3? Genuine question -- it's been out since 2013. I realize that DRI3 doesn't work over the network (I also never complained about losing X11's network transparency).
> You could do that but such an X server would not really be any practically different from Wayland.
Not quite -- the X server would still provide all the device-independent IPC, input, and screen multiplexing facilities and APIs. Dealing with input isolation could be addressed with an extension.
So I think this answers my question -- Wayland isn't anything special. It sounds like I'd get a lot of mileage out of taking wlroots and adding back in all the device-independent X11 protocols as a Wayland extension. This would basically be the "X server with only DRI3" I described.
AFAIK DRI3 is also Linux-only and is not supported on any X server outside of Linux.
In Wayland those tasks have been split out into libraries. The details of the protocol IPC are handled by libwayland, the input is handled by libinput. Screen multiplexing is specific to the compositor and not really something you can farm out to a library, which is the same as composited X where the compositor process takes over the entire screen and handles all the rendering.
It would be interesting if someone combined an X server with a Wayland server like you described, but I don't think it would be useful. A lot of your legacy applications would still be broken, for example no old clients or window managers are rendering using DRI3. If you want to design an extension for client isolation, the problem there isn't that X doesn't have that but that the existing methods don't really work well. My suggestion there would be to talk to any desktop environments to find out what their requirements are, if they haven't already committed to switching to Wayland already. (i.e. GNOME and KDE already have their solution for this in Wayland) It may be that an additional X extension is unnecessary for what the other desktops require.
> AFAIK DRI3 is also Linux-only and is not supported on any X server outside of Linux.
So? I never said I cared about the portability of low-level rendering software. It's not like anyone cares that Xenocara and the aperture driver only work on OpenBSD, for example.
> Screen multiplexing is specific to the compositor and not really something you can farm out to a library, which is the same as composited X where the compositor process takes over the entire screen and handles all the rendering.
Hold up. Isn't screen multiplexing and compositing exactly what libwayland gives a program the power to do? You'd build and run a compositor (like Sway, or like Kwin), and it fulfills compositing, screen multiplexing, and so on, as well as IPC, window management, hotkeys, screenshots, etc.
At least with X, these were separate programs you could mix and match.
> It would be interesting if someone combined an X server with a Wayland server like you described, but I don't think it would be useful.
I'd find it useful. I don't care if no one else does, since I'm writing this for myself.
> A lot of your legacy applications would still be broken, for example no old clients or window managers are rendering using DRI3.
I'd add the necessary compatibility code for the programs I need to run. I'd add them in a way that, if others wanted to fork my code, they could easily restore their own legacy code paths.
> If you want to design an extension for client isolation, the problem there isn't that X doesn't have that but that the existing methods don't really work well.
Sounds like a problem with the particular extension, not X11.
> My suggestion there would be to talk to any desktop environments to find out what their requirements are,
Don't care. I'm not doing this for them. I don't use any of them, and they're all dead to me at this point. I'm doing this to keep my minimalist X11 window manager and X11 clients, and to satisfy my intellectual curiosity.
You asked if other X servers are used, there are other X servers that are widely used outside Linux (Xquartz, Xwin, etc) which would break if the clients required DRI3.
Libwayland is a small library carrying the implementation of the wire protocol, and a few other bits like a simple event loop for servers and a library that can load X cursors. The point with that is that you're bringing your own compositing and multiplexing anyway. From there it's optional if implementations want to put in additional features for IPC, window management, hotkeys, screenshots, etc, and they can choose if they want to put that in the server or put it in a separate program. So you can still mix and match on some level anyway, it's not quite the same though.
I had exactly the same idea as you a few years ago to build something like that and I thought it would be useful too, and I thought about it for a while and realized that it doesn't really give you any of the benefits of Wayland or the benefits of X11. The point with wayland is already that it strips the unnecessary bits out and maintains a legacy code path with XWayland, and the point with X is that it's always going to keep the legacy code running anyway, so you don't gain much by combining them. If you have a minimalist window manager that's only a few thousand lines of code, and you want to get the benefits of Wayland, it's much easier to just port that using wlroots or something than it would be to rewrite the whole X server. That's just my experience.
>Sounds like a problem with the particular extension, not X11.
Since this is an issue with Xorg lacking the right extensions it's basically the same thing.
Maybe that's true if that's your only concern, but there are other reasons to replace X11 than just this.
Also, X11 is arguably not the whole graphics stack, at this point the DRM/Mesa piece is much larger and more significant, and Wayland doesn't replace it outright anyway -- it makes it optional if needed for backwards compatibility, in the same way that macOS has XQuartz.
The other two concerns in GP are no screen tearing, and better hidpi / multi-monitor support. Is it truly less work and less disruptive to address these two concerns within X11 than it is to throw X11 out (and also leave all nvidia users high and dry)? Also, keep in mind that throwing X11 out and replacing it will take more than just technical legwork -- it will also take ecosystem buy-in and standardization, and if we're being honest with ourselves, this is the harder problem. Recall that the X11 ecosystem has a 30-year head start on this, and there's a crap-ton of 3rd party software that assumes an X11 environment that Wayland is going to need to emulate. If X11 does indeed go the way of the dodo, I think we can reasonably expect another 30 years of bug reports in the form of "Fuck Wayland! I upgraded to Wayland and my $IMPORTANT_THING broke!". I very much doubt that at the end of the day the switch to Wayland is going to be overall easier than just fixing X11, but would love to be convinced otherwise.
The usual way to fix other concerns like that has been to add more WM atoms or add more X extensions, which is a similarly uphill battle requiring buy-in and standardization, and typically old X clients just won't be updated to support those new things. The way to get the most value out of such things would be to add support to the major toolkits, but those have already been ported to Wayland for some years now.
The backwards compatibility is done through XWayland which functions similarly to XQuartz, in that it is just the Xorg server running using Wayland as a backend driver.
What do you think is more of an uphill battle, in terms of time and energy sunk? Adding another X extension that can be incrementally deployed, or trying to phase X out by maintaining both an X11 and Wayland back-end for all apps trying to avoid breakage?
This doesn't even speak to X11 apps that aren't built with toolkits (for example, I use xterm, xpdf, xfig, Openbox, etc.).
XWayland is a nice idea, don't get me wrong. But it's not a 100% replacement either. Distros offering XWayland are even up-front about its shortcomings [1][2][3].
There isn't much difference there, but it would be more of an uphill battle if you tried to put everything different that Wayland does into X extensions. That still requires maintaining an extra code path for old X servers that don't support the new extensions, and creates additional risk of breaking things and causing regressions in the X server because of all the new code you're adding.
Clients like xterm, xpdf, and xfig should work fine in XWayland. Window managers won't work without getting ported, but someone has been working on a port of Openbox: https://github.com/johanmalm/labwc
> That still requires maintaining an extra code path for old X servers that don't support the new extensions, and creates additional risk of breaking things and causing regressions in the X server because of all the new code you're adding.
Yes, agreed! But why is it _more_ risky to do that than to throw the whole X server concept away and start from scratch? Rewriting such a widely-used piece of infrastructure from the ground up is a super-risky proposition.
That's mostly a misconception, Wayland implementations don't need to start from scratch. Weston and wlroots are minimal from-scratch implementations, but GNOME and KDE for example do their implementations by re-using most of the code from their X compositor.
Great! So instead of having one standard way to do video/input multiplexing, we have at least four -- GNOME's, KDE's, wlroots, and weston (and probably a smattering of others). If I want to write a program that works with "Wayland," I'm either going to have to test it on all of the widely-used compositors (because of course they're not all going to behave exactly the same way), or I'm going to have to just punt on them. The former option is 4x the work, and the latter option is me telling users "Hey everyone, remember that program that used to run everywhere in every window manager ever that you all know and love and depend on to do your jobs? Well, now it only works on GNOME, since that's all the time I have to support it. Good luck non-GNOME users!"
EDIT: Before you say "just use a toolkit, it'll take care of everything," I can already tell you that users don't care. They only care that the app that used to work in KDE no longer works in KDE. They're not going to complain to Qt or Kwin; they're going to complain to the app author. So the app author becomes responsible for the additional burden of testing their software in a bunch of different compositors, for zero gain.
Is that any different from normal? In my experience, if you're shipping a product on a Linux-based desktop, usually you target a specific set of distributions, i.e. the default configuration of the last few LTS versions of RHEL or Ubuntu or whatever, which at least for those examples all happen to be GNOME based. Customers who come with some weird hacked-up distribution would be on their own for support anyway, they can try it but there's no guarantee it will work. If KDE (or something else) really is doing something different here then you would have had to expend the same amount of effort as you did previously.
Before, a graphical program would run just fine under GNOME or KDE because it wasn't GNOME or KDE handling the video/input multiplexing facilities. But now with Wayland, a graphical program not only needs to target distros, but specific configurations of those distros (i.e. Debian/GNOME, OpenSUSE/KDE, etc.). This isn't helping fragmentation.
That's only if you're using functionality specific to the DE, which is handled mostly the same as it is under X. GNOME and KDE for example tend to provide their functionality as dbus services. If you just have a simple app that needs no special privileges or features then that will work just the same. If you use GTK or Qt, the transition will be mostly seamless and would only be a problem if you were circumventing that and calling Xlib or xcb directly.
No, this happens if the particular Wayland compositor you're running the program on happens to implement a Wayland protocol or extension you use in a "unique" way that causes your app to break. This wasn't a problem with X.org because all distros used the same X.org (or, if they used an older X.org, and if that led to breakage, the solution for users was always the same: upgrade X.org).
I already explained above why "just use a popular toolkit" isn't a viable solution. Users do not care whose fault it is; all they care about is that your app used to work in KDE and now it doesn't in GNOME (the problem is even worse in Wayland than I'm letting on, because with Wayland, the DE controls the compositor and renderer -- there are so many more ways for the DE-specific code to interfere with the graphical program than there was with X.org).
Sure but that's not any different if your application depended on some other GNOME or KDE specific API. If KDE decides an API is KDE only and GNOME doesn't want to make their own implementation then there's not much that you could ever do about that. The point with using a toolkit is that it's an abstraction layer that handles the differences between window systems and implementations for you. It would be better if you mentioned the specific reason why your app is breaking because that would likely be a bug in the toolkit.
> Regarding security, I'm honestly surprised no one has just tried to make it so you can "firewall" X11 programs from one another.
The response I've heard to this question is entirely nonsensical: it could be done with an X extension, but getting adoption from various parties to make this work would be difficult. As if building an entirely new display system doesn't require orders of magnitude more work and buy-in.
Basically everything would stop working with X, because X simply relies on being able to listen to everything. Actually there are nested xservers that do something like this, but now global hotkeys don’t work, it doesn’t have an api for screenshots, so those won’t work either and the like.
And you can probably add some X extension which can’t be queried properly, but then you can just as well create a new display protocol that actually knows about GPUs
I don't care if X sees everything (hell, even if X didn't, the kernel certainly still would). I only care that I can control which programs see which X11 events. Like, my hotkey program can see everything, but my Web browser can only see its own x-windows' events.
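To pin down what I mean, here is a toy model of that policy. It does not correspond to any existing X extension; `InputPolicy`, the window ids, and the client names are all invented for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class InputPolicy:
    """Toy model: which listening clients may see an input event."""

    owner_of: dict                                  # window id -> client name
    privileged: set = field(default_factory=set)    # clients that see everything

    def recipients(self, focused_window, listeners):
        """Filter listeners down to the focused window's owner plus
        any explicitly privileged clients (e.g. a hotkey daemon)."""
        owner = self.owner_of[focused_window]
        return [c for c in listeners if c == owner or c in self.privileged]
```

With `hotkeys` marked privileged, a keypress in an xterm window reaches xterm and the hotkey program, while the browser only ever sees events for its own windows.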
There is some rough support for that in the X server, but it is lacking a good API or user interface, and the desktops that would implement that are doing it in Wayland.
Doesn't that strike you as odd? Like, why is it that X11 was so close to fixing the problem, but everyone who would benefit from it (and who touts Wayland's ability to do it) decides to just throw it all away and rebuild everything from the ground up? I'd sure like to know what they know about this.
No, the hard part is building a good API and user interface that works for everybody. IMO that's mostly why there are a lot of half-finished and inconsistent things like that in X11.
Doesn't that undermine Wayland's selling point of isolating clients' input? No one was clamoring for this until Wayland announced it, and no one was willing to put in the effort to fix it in X11 all these years (even though it would have been easier than ripping out X11 entirely).
That's one of the things Wayland was designed to do, of course implementations can build other things around it that allow privileged clients to break client isolation. The effort could have been put into X11 but it seems the people interested in this would rather put that effort into Wayland.
That's literally the essence of the CADT[1] model of software development. Why do a comparatively-small amount of unglamorous work to solve a bug when you can just burn everything down and rewrite it from scratch?
Option 1: add an X extension that lets you configure which windows get to see which input events. Most clients don't actually need to see any events besides the ones they would see while in focus, so most clients don't notice.
Option 2: replace X with something entirely different -- different rendering, different input, different IPC, different organizing principles, different programming models -- and patch all downstream dependencies to use it.
If you can't see that Option 2 is clearly more work and more disruptive, I don't know what else to say to you.
Which is a mistake. On X11 the server, window manager and compositor are three separate programs. Both the window manager and the compositor can individually crash, or be started, stopped and replaced at runtime, without affecting any of the other running X11 client instances.
On the other hand on X11, Xorg cannot crash without the X11 client instances being affected - a much larger chunk of code. It's only because Xorg is older that that doesn't happen much.
A chunk of code that has been running in production for more than 30 years and should be considered battle tested. In my experience Wayland compositors crash much more often than X11 despite the supposed reduced complexity. The last time an X11 server crashed on me was in 2004 if I remember correctly.
Exactly, that reliability of Xorg is a function of its age and doesn't imply anything about the correct design of a Wayland compositor. What's the chance those Wayland crashes were in the window management code rather than the rendering, protocol, clipboard, and drag/drop handling code? dwm is 2000 SLOC to Xorg's 1 million or so. I don't think splitting out the WM code would have gained much.
You underestimate the inherent complexity of Wayland. As an exercise, I recommend implementing a "Hello World" native Wayland client. Then watch the complexity explode when you simply want to add screenshot functionality to that client.
I'm not sure what kind of "Hello World" clients you're comparing, but if you check the Wayland backends in Gtk/Qt, you will actually find them to be smaller than the respective X11/XCB backends there, for various reasons.
> you will actually find them to be smaller than the respective X11/XCB backends there, for various reasons.
It doesn't seem surprising to me. As X.org has gained extensions over the last 30 years, toolkits that speak X11 find themselves having to decide which extensions they'd like to use. Adding flexibility on this naturally leads to a bigger feature matrix. Of course, the toolkits are also free to drop support for X servers that don't have those extensions, which in turn would shrink the X11 backend.
I have no doubt that in 30 years, they'll have a similarly-sized feature matrix for all the Wayland extensions they'll want to support.
I think we would all love it if we could magically fix every bug or design flaw in existence with a wave of the hand, unfortunately in real life it takes time and experience to do things right.
Can you be more specific about what you are comparing this to? An X client with similar functionality would likely be longer and require several X extensions.
What is a wl_registry_listener and why do I need it? What is a simple XGetImage() equivalent on Wayland? On Xlib function names at least give you an idea about what they are supposed to be doing.
> Also, how often do you write gui apps without any framework?
As soon as you have a Toolkit you don't need Wayland anymore. Windows then are just additional nodes in the object tree. There was even a demonstration of GTK applications running in a framebuffer without X11 way before Wayland even existed. If Wayland can only be used sanely with toolkits it indeed is completely pointless.
It's been a while since I looked into it, but wl_registry_listener registers a callback for when the compositor “declares” what protocol extensions it supports. This is painfully missing from X, and it makes Wayland much more modular/extensible.
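A rough sketch of that registry pattern, as a toy model in Python rather than the real libwayland C API. `ToyRegistry` and `ToyClient` are hypothetical names, and the version numbers are invented. The interface names are real ones; zwlr_screencopy_manager_v1 is the wlroots screenshot protocol.

```python
class ToyRegistry:
    """Stand-in for the compositor side: announces each global to a listener."""

    def __init__(self, advertised):
        self._advertised = advertised  # list of (name, interface, version)

    def add_listener(self, on_global):
        # In libwayland these events arrive on a later roundtrip;
        # in this toy model they are delivered immediately.
        for name, interface, version in self._advertised:
            on_global(name, interface, version)


class ToyClient:
    # Interfaces this client knows how to use, with the highest version it speaks.
    WANTED = {"wl_compositor": 4, "wl_shm": 1, "zwlr_screencopy_manager_v1": 3}

    def __init__(self):
        self.bound = {}

    def on_global(self, name, interface, version):
        # Bind only what we understand, at a version both sides support.
        if interface in self.WANTED:
            self.bound[interface] = min(version, self.WANTED[interface])

    def supports(self, interface):
        return interface in self.bound


# Made-up compositor that does not offer the screencopy extension.
registry = ToyRegistry([(1, "wl_compositor", 5), (2, "wl_shm", 1)])
client = ToyClient()
registry.add_listener(client.on_global)
can_screenshot = client.supports("zwlr_screencopy_manager_v1")
```

The client finds out at connect time which extensions exist and can degrade gracefully, here by noticing that screenshots are unavailable on this compositor.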
Wayland’s abstraction is basically a buffer. The client simply creates a buffer either in shared memory, or directly on the GPU and then passes the compositor a handle to the buffer. That’s it.
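As an illustration of the "buffer plus handle" idea, here is a minimal sketch using Python's `multiprocessing.shared_memory`. It is an analogy only: a real Wayland client passes a file descriptor to the compositor via wl_shm, not a name, and `client_render` / `compositor_read` are hypothetical helpers. The byte order assumes an XRGB8888-style little-endian pixel layout.

```python
from multiprocessing import shared_memory

WIDTH, HEIGHT, BPP = 4, 2, 4  # tiny buffer, 4 bytes per pixel
SIZE = WIDTH * HEIGHT * BPP


def client_render() -> shared_memory.SharedMemory:
    """Client side: allocate a shared buffer and fill it with red pixels."""
    buf = shared_memory.SharedMemory(create=True, size=SIZE)
    red = bytes([0x00, 0x00, 0xFF, 0x00])  # B, G, R, X in memory order
    buf.buf[:SIZE] = red * (WIDTH * HEIGHT)
    return buf


def compositor_read(handle: str, x: int, y: int) -> bytes:
    """Compositor side: map the same memory via the handle and read one pixel."""
    view = shared_memory.SharedMemory(name=handle)
    try:
        offset = (y * WIDTH + x) * BPP
        return bytes(view.buf[offset:offset + BPP])
    finally:
        view.close()


buf = client_render()
try:
    # Only the handle crosses the "protocol"; the pixels are never copied over it.
    top_left = compositor_read(buf.name, 0, 0)
finally:
    buf.close()
    buf.unlink()
```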
Also, it is sort of disingenuous to compare the two: Xlib is a higher-level library than libwayland. There is absolutely no reason why someone could not create a wrapper for this, although I ask again: how often does one create an X-only or Wayland-only window without a framework?
I am not sure what you mean by Wayland only being usable sanely with toolkits. Any GUI usually needs a toolkit or some equivalent, even under X. If you are not using a pre-existing toolkit and are writing your own routines to draw buttons, text boxes and such, you are implementing your own toolkit.
How much of that 1 million lines of code actually gets executed? Also, everyone runs the X server, so its code gets a lot of testing. This isn't true for window managers -- there's a long tail of them. This is just one data point, but in my experience I've had window managers crash far more often than X servers (since they get less love).
Actually most of the crashes I had with Sway were in fact related to compositing and window management, including stupid stuff like crashing because sway couldn't decide which window to focus after hiding another.
Screen tearing under X11 is more or less not a problem anymore. HiDPI, yes, but only for multiple monitors (maybe even just programs that don't support it properly?). Security... eh, yes and no (Xorg can be run as a normal user).
The actual good thing about Wayland is that it simplifies things. The bad thing is that it needs some kind of extension for even the basic things a desktop needs, and that (AFAIK) freeGNOMEdesktop is in charge now.
> Screen tearing under X11 is more or less not a problem anymore.
For GLX/DRI clients where there's an actual concept of swapping buffers w/vsync, sure, but for classical X clients this is not true.
X got extensions for double buffering at one point, but practically nobody uses them.
There is no concept of a "completed frame ready for presentation" in core X, and there's no way to really fix this without ceasing to be X (hello, Wayland). X compositors literally just drain event queues of X requests and throw shit on-screen when the event loop gets around to it. If that presents a partially updated window, so be it. GTK+/GNOME folks added "frame clocks" to try to work around it, but not everything is a modern-ish GTK+ app, nor do all compositors implement it.
If there's anything Wayland fixes that really required such an upheaval to fix, it's flicker/tear-free compositing.
Well, it seems to work fine (xorg.conf tearfree option, that is). AFAIK wayland compositing also has the problem that clients don't know when they should be done with rendering, as in when the flip is going to happen. I don't know much about how it (wayland, DRI) actually works (as in, can the "client" just render where the compositor told DRI it should without involving the compositor, or does it have to tell the compositor when it rendered).
If Wayland is throwing partially constructed buffers on-screen, it's the client's fault for submitting them unfinished.
In X, there isn't really a concept of what a completion boundary is. The client asks stuff to be drawn, the display server gets around to it when it gets around to it, and makes the changes visible willy-nilly, eventually becoming consistent with the client state.
If you look at the source for xcompmgr, the event loop is pretty simple and clearly schedules repainting the root window with all newly received damage updates whenever its X socket is drained of new events [0]. This is a pretty arbitrary boundary to perform redrawing on; process scheduling, socket buffer sizes/limits, it's not well controlled at all. The way this is done it will make visible whatever damage events managed to get into this timeslice. If that results in only part of a window being updated, with the rest of the damage part of that "frame" arriving in the next timeslice, POOF, there's a tear.
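The difference can be sketched with a toy simulation. This is not xcompmgr or Wayland code; the function names, the two-region window, and the drain points are all assumptions made for illustration. Each request says "region R now shows frame F"; the X-style loop repaints whenever the queue happens to be drained, while the commit-style loop repaints only once the client marks a frame complete.

```python
def x_style(requests, drain_points):
    """Repaint whenever the queue is drained, an arbitrary boundary."""
    screen, shown = {}, []
    for count, (region, frame) in enumerate(requests, start=1):
        screen[region] = frame
        if count in drain_points:  # socket ran dry here, so repaint now
            shown.append(dict(screen))
    return shown


def commit_style(requests, regions_per_frame):
    """Repaint only after the client has submitted a whole frame."""
    pending, shown = {}, []
    for region, frame in requests:
        pending[region] = frame
        if len(pending) == regions_per_frame:  # stands in for a surface commit
            shown.append(dict(pending))
            pending = {}
    return shown


def torn(snapshot):
    # A snapshot is torn if its regions come from different frames.
    return len(set(snapshot.values())) > 1


# Two frames, each updating the top and bottom half of a window.
requests = [("top", 1), ("bottom", 1), ("top", 2), ("bottom", 2)]

# Assume the socket happens to drain after requests 2, 3 and 4.
x_torn = [torn(s) for s in x_style(requests, drain_points={2, 3, 4})]
commit_torn = [torn(s) for s in commit_style(requests, regions_per_frame=2)]
```

With these made-up drain points, the second X-style repaint shows the top half from frame 2 over the bottom half from frame 1, while the commit-style loop never presents a mixed frame.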
To a layman, there are no or almost no direct benefits, but there are indirect benefits like those listed in sibling comments. In the longer term it will simplify work for developers, while in the short term it is a complication during the transitional phase.
It is an evolutionary step for end users; there is no revolution really. What is the benefit to a layman of a program to change the bogies on all trains? He doesn't care about regenerative braking, so is it the smoother ride? Until almost all of his rides are on the newer platform he will not see a real benefit, and once it is finally there he will only notice its absence. Same with Wayland.
Not entirely. Besides the fact that no available Wayland compositor allows you to disable vsync at this point, Wayland also throws away all the work done by DDX drivers on X11 implementing 2D acceleration. That work allows X11 clients to de facto render directly to the front buffer when no compositor is in use. It is the lowest possible latency you can get.
In theory there could be a Wayland implementation that loads Xfree86 drivers, but why? Going forward, the implementations that want high performance and low latency (i.e. VR/XR interfaces) are likely going to target Vulkan, and you will likely never want to disable vsync there.
I really like that no application is in control of other windows besides my compositor. As a side effect of the security improvements (clipboard, keylogger, screenshotter), you get:
- Apps cannot arbitrarily modeset (change screen resolution and "mess my desktop up") anymore.
- Under sway, apps are the size I want. I can somewhat resize them over/under their limits
- No more windows that grab the focus and force their way to the foreground
- Apps cannot move my mouse cursor anymore. I hate it when they do, I know some Cadence DKs that do so (position the mouse cursor on the OK button: nice touch, I hate it).
- Some apps/games might have crashed when changing workspaces, but I have never once been unable to "alt-tab" or change workspace, change the screen the app was on, etc.
Granted, under sway those are just different APIs and are not hardened, but they could be in the future. And I mean the above, I've had to restart the X server (or switch to a TTY and kill the app) due to a misbehaving app numerous times, but this is mostly a thing of the past now.
Also, bonus:
- Easy multiseat under sway: in under 1 minute, given an extra mouse/keyboard, I can theoretically work with someone else on the same computer: I have a focused window that I can type in, and so do they.
- Easily create headless displays, or nested sessions (sway can run as a wayland, X or DRI client). You can use that to leverage another computer as an external display: https://news.ycombinator.com/item?id=25891464
- Under sway, the compositor handles configuring input and display. No more xorg.conf (cue https://xkcd.com/963/ though it's less true these days).
And on the technical side:
- From my understanding, apps should be able to pass GPU buffers around much more efficiently, even drawing directly on the final buffer thanks to dma-buf. This leads to lower-latency and higher performance, especially for high resolutions. In turn, it helps quite a bit with pipewire for screensharing and passing video around, as well as zero-copy hardware video acceleration.
Very thoughtful guide. Don't forget to launch dbus and policykit is all I can add.
I've been using sway daily for about two years now. Here are my current gripes:
- Sometimes, client applications do not receive input anymore (mouse/keyboard). This has been a known issue for a while, but I still experience it.
- `dpms off` started to crash my AMD-powered PC some time ago, annoying when I configured it to happen with `swayidle`
- Sharing screen in browsers worked extremely well... last month or so, when it finally got turned on in Firefox. I didn't change my config, but it isn't working anymore. The handshakes happen, but Firefox or OBS display nothing with xdg-desktop-portal
Minor gripes:
- Some Wine (Proton) games have trouble getting focus (Sins of a Solar Empire launcher, Evochron Mercenary, IIRC). Others behave oddly with screen resolution and mouse coordinates (Ashes of the Singularity, I think).
- I launch it with `sway`. After Sysrq+R, I often terminate sway by mistake by pressing Ctrl+C
- Very occasional (once every 200 hours or so) crashes. Probably because C.
If you go this "build it piecewise" route instead of a full-package DE, you'll probably also want notifications, screenshots, a GUI tool for external monitor management, a battery monitor, and a screen lock/screensaver. (I go as far as using nmtui instead of a widget for networking, but I absolutely detest dealing with sway commands to present on an external screen.) There are plenty of good options for all of the above, but sway is not a batteries-included DE, so you have to find them yourself.
Mostly to learn new things I didn't do any xwayland at all. It's definitely been interesting but certainly not what most would want at the moment IMO.
For screenshots there are currently two competing approaches. GNOME and KDE have opted for dbus interfaces (e.g. org.gnome.Shell.Screenshot) and Sway has opted for Wayland protocol extensions (zwlr_screencopy_manager_v1). I think the former is more maintainable because dbus interfaces are accessible to pure CLI tools, whereas grim and wl-clipboard have to create dummy wayland surfaces just to talk to the compositor.
At least everyone agrees on the notification dbus interface and the tooling is super mature.
Besides the tiling Sway there's also the stacking Wayfire[0] that is from the same family, but modeled after Compiz (blur, desktop cube and good old wobbly windows are all there) and highly configurable.
I use it with Waybar[1], wf-dock[2], Sirula[3] as a launcher and a bunch of other small tools like Gammastep[4] (fork of Redshift) for white balance adjustment, grim & slurp [5] for screenshots and mako[6] for notifications. [7]
It's a very DIY-y experience, but it's meant to be (if you want something pre-configured you can barely change try Gnome). The combination of getting it just right and the Ikea effect makes for a pretty rewarding result (I also maintain a list of the available desktop tools you can use when on a wlroots based compositor for your DIY needs [8]). The vision for the future is pre-configured DEs being offered on this base and it possibly even offering a lot of Sway's tiling features. [9]
It still feels like the early days (for non-Gnome), but with Nvidia driver 470 & accelerated XWayland coming up, the Vulkan efforts, Electron (finally) and Wine making the switch I feel fairly confident saying that 2021 is shaping up to be the year of the Wayland desktop.
Free of screen tearing and X-related worries since 2020 :-)
It is a common misunderstanding that all C programs are fast by default. In reality that's not the case. It takes a hell of a lot of time and resources to get that right. C programs just compile blazing fast. D compiles even faster.
I never use GNOME but I decided to try it to test pop shell awhile back. This was shortly before 1.0 of pop shell so it could have changed. Not trying to disparage it, just sharing my experience as an i3 user. It feels so so slow because of animations.
I googled how to get rid of animations because it wasn't in the GUI settings. Animations were removed, but a delay remained with any tiling movements where an animation would have been. Not sure if this is a limitation of GNOME or a bug in pop shell, but it wasn't going to work for me. It would be really hard to give up the snappiness.
Yes, as is Pop Shell, which is implemented on top of Gnome (mentioned in another comment).
You can tell they've made some questionable decisions when you try to use Gnome on very weak hardware.
On an old dual-core Celeron w/ 2GB memory I found it unusable. KDE—which, when I first started using Linux desktops on machines less than 1/4 that powerful, was noticeably heavier than Gnome—was a little slow but basically fine.
To my eyes gnome also drops frames like crazy (like, even for Linux, which is a pretty jittery environment to begin with) even on excellent hardware—not sure, but I think it's a reasonable guess that's also a symptom of sprinkling a scripting language all over the system without incredible levels of discipline to make sure it's never in the way of anything important.
Since a few months I noticed weird (complete) freezes/crashes on my pc, while gaming... it's not the youngest so I thought it may come to end of life.
Out of curiosity I reinstalled i3 a couple of days ago and used it (only) for gaming. No crash since.
I assume it's either a bug in Mesa (AMDGpu) or somewhere in the wayland stack... sway hasn't had an update since November, so... I dunno, I haven't taken the time to investigate.
While working I still use sway, because I've customized it to my needs, but for Gaming/Streaming I now switch to i3 again.
Nice sideeffect: I can finally play some games again that wouldn't even launch on Sway or were unplayable, like e.g. Natural Selection 2 which turned to a black screen when I switched workspaces (e.g. from one monitor to another) to e.g. tune down the music or scroll down a page while being dead.
Feels funny, but more annoying :/