Hacker News

Sway (the window manager) _is_ the compositor. There is no need for a separate program like compiz or picom.


Yes. That's the problem. Not only is the resulting stack less resilient to crashes, but also it takes away flexibility by needlessly welding together two orthogonal design concerns (window management vs rendering). Sway is like putting X11, the window manager, the global hotkey daemon, screenshot grabber, and so on, all into the same address space. What could possibly go wrong? /s

Regarding security, I'm honestly surprised no one has just tried to make it so you can "firewall" X11 programs from one another. Like, aren't keystrokes propagated as packets sent through an X11-owned UNIX domain socket in /tmp? Can't we just attach a policy to that socket to decide which PIDs (or process groups, session groups, containers, etc.) get to see which messages?
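
For what it's worth, the kernel already tells a UNIX-socket server who is on the other end, which is the hook a per-client policy like this would need. A minimal sketch, assuming Linux (SO_PEERCRED is Linux-specific); `may_receive_keys` is a made-up policy function, and none of this is how X.org actually does access control:

```python
import os
import socket
import struct

def peer_credentials(sock):
    """Return (pid, uid, gid) of the process on the other end of a UNIX socket."""
    # struct ucred on Linux is { pid_t pid; uid_t uid; gid_t gid; }
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize("iII"))
    return struct.unpack("iII", raw)

def may_receive_keys(sock, allowed_uids):
    """Hypothetical policy: only peers running as an allowed UID get key events."""
    _pid, uid, _gid = peer_credentials(sock)
    return uid in allowed_uids

# socketpair() stands in for a client connecting to the display socket.
server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pid, uid, gid = peer_credentials(server_end)
```

A real policy would more likely key on something sturdier than a UID or PID (a cgroup, a container ID, a sandbox label), but the lookup point would be the same.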


It's a theoretical problem. The window-management parts of sway are tiny and having them in the same process means you don't have to do IPC every time your windows do something. That simplicity means it's easier to write code that doesn't crash.

Most of the heavy lifting is done in wlroots anyway. wlroots-based compositors really do just implement their own flavour of compositing what you see on the screen on top of it.

That said, you still can use IPC, if you really want to; I have an external window manager that augments Sway's tiling system via its i3-compatible IPC mechanism to arrange my windows in a way that Sway doesn't do natively. If you really wanted to, there's nothing stopping you from writing a wayland compositor that uses an external window manager.

At any rate, I don't buy the reliability argument at all. I've used sway since 0.10 or something, and I only ever remember crashing it once, and I fixed that bug myself. :P


I'd say it's a very practical problem. Why put N different things that used to run in separate protection domains into the same protection domain? Have we gotten N times better at writing code that doesn't crash? Do we believe that we can put N different things into the same address space but somehow ensure that a security hole in one of them won't compromise all of them? Have computers gotten so slow in the last 30 years that doing IPC is no longer an option?

I'm glad that you have personally not encountered a crash in Sway -- I really, truly am. But let's not pretend that a data point of 1 indicates a trend.


In practice, placing everything in one process seems to reduce the total attack surface. There is quite a lot of code required to synchronize state between the X server, window manager and compositor. When you combine them, you can throw out most of those bits that are largely serialization/deserialization.


> In practice, placing everything in one process seems to reduce the total attack surface.

Surely you're joking. Privilege separation [1] is a thing for a reason.

If we believed that putting different things into the same address space made them more secure, then why stop there? Why not just put the kernel, the shell, X11, your HTTP server, and everything else into the same address space? Let's just do away with processes -- let all schedulable units be threads that can all read and write to each other's memory, because what could possibly go wrong? /s

[1] https://en.wikipedia.org/wiki/Privilege_separation


There is no real security boundary or privilege separation in that case: the window manager and compositor both get full access to the screen, the input devices, and all the client windows. That's part of the reason why it doesn't make much sense to keep them separated. I know you were joking, but it's true: they might as well be threads, since that saves you the serialization/deserialization step.


In Wayland, a fail-stop bug in the window management logic will now bring down your compositor and every program that was connected to it. In X11, a fail-stop bug in the window management logic only crashes your window manager -- everything else keeps running. This is a really nice property to have -- in general, why make the "blast radius" of a fail-stop bug bigger when we don't need to?

Like, what's the upside of making it so a bug in the window management logic can crash the entire GUI? You claim latency due to no need for serialization/deserialization across process boundaries, and you claim potentially less-complex code. I'm very skeptical about the complexity reduction -- you're replacing the IPC with global state guarded by critical sections which your threads all need to respect. Getting rid of IPC isn't "free" -- you're replacing it with something that could be even worse. So, I'll need to see some actual case studies here.

I agree that there is measurable latency (context switches and all), but if it's a difference of only a few extra microseconds -- i.e. something the user won't notice because computers are insanely fast these days compared to when X11 and window managers were first written -- then I'm disinclined to give up my crash resilience. Do you have data to show that there is noticeable, irreducible performance lag in having a separate window manager process from a compositor?


I don't have any raw performance numbers, and they wouldn't matter anyway because they may not be relevant to your setup; if you're concerned about that you should run a test comparing them yourself on your specific environment. I'm speaking in terms of code complexity here. If you want to follow the threaded approach, then a minimum of two threads will do the job (one for the scenegraph, one for the sockets), and an X compositor should potentially be doing this anyway to avoid lag caused by slow rendering. The difference is that the X compositor is just storing a copy of large amounts of state from the X server, whereas the Wayland compositor would store the canonical data and wouldn't need to worry about falling out of sync with the X server.

Also, the way that X does it is overly complicated, and it isn't necessary for protection against window manager crashes. A similar type of crash protection could be done with a Wayland implementation, and in a much simpler way than moving the entire window manager out into a separate process. You just need another process that can hold the client fds and cache the minimum amount of state needed to resume the clients; it wouldn't need to know as much as the X server does to accomplish that task. Prior art is in the Arcan Wayland bridge. Other Wayland implementations have not implemented this, but they could eventually: https://arcan-fe.com/2017/12/24/crash-resilient-wayland-comp...
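
The "another process holds the client fds" idea rests on the fact that open file descriptors can be handed between processes over a UNIX socket (SCM_RIGHTS). A rough sketch of just that mechanism, assuming Python 3.9+ on a Unix system; the "keeper" and "compositor" names are illustrative, both ends live in one process here for brevity, and this is not a description of how Arcan implements it:

```python
import os
import socket

# Two ends of a UNIX socket: imagine one end in a small "keeper" process
# and the other in a freshly restarted compositor.
keeper, compositor = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# A pipe stands in for a client connection the keeper has been holding.
client_read, client_write = os.pipe()

# Keeper hands the held descriptor over (sent as SCM_RIGHTS ancillary data).
socket.send_fds(keeper, [b"client-0"], [client_read])

# The restarted compositor receives a new descriptor for the same connection.
label, fds, _flags, _addr = socket.recv_fds(compositor, 1024, 1)
restored_fd = fds[0]

# Data the client wrote is readable through the restored descriptor.
os.write(client_write, b"hello")
received = os.read(restored_fd, 5)
```

The hard part in practice is the state that goes with each connection (surfaces, buffers, roles), not the descriptor hand-off itself.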


> if you're concerned about that you should run a test comparing them yourself on your specific environment.

I mean, X.org runs well enough on my end? I'm not finding myself wanting something with lower input latency.

> The difference is that the X compositor is just storing a copy of large amounts of state from the X server, whereas the Wayland compositor would store the canonical data and wouldn't need to worry about falling out of sync with the X server.

Honestly, if I were to do a ground-up X11 implementation, I'd probably build it around wlroots or similar. Then Wayland clients could interact with it, and I'd be able to preserve all the X11 compatibility and X11-isms I cared about. Like having separate window managers, hotkey daemons, screenshot tools, the slew of X11 command-line clients I know and love today, and so on.


Someone could build a combined Wayland/X server in the same process like that, but why? The only reason you would need to do that would be to run Wayland clients natively on your X server. IMO if you want to use Wayland it is much easier and more valuable to just port those tools. The hotkey daemons, screenshot tools, and command-line clients are pretty small and not that hard for someone to rewrite as a weekend project, people have done a lot of that already. The harder part is the window managers, but if you're a hard-core window manager author used to doing things the X way then you won't see much reason to switch anyway.


If the Wayland/fd.o crowd insists on breaking things I relied on to the point where I have to work weekends to fix everything that used to work, I'm going to spend that time replacing whatever software they wrote that I use with software that I wrote, since at least I won't be breaking my workflow.


I don't understand what you mean. You can still continue using X if that's what you want; people deciding to spend their time developing Wayland doesn't somehow break X or make it worse.

Also just FYI, it seems x.org has merged with freedesktop.org so they are mostly equivalent at this point, being run by the same group of people.


I'm assuming that X.org will go unmaintained after a time, which is fine. But if that means I can't get a usable X server running later on, I wouldn't mind taking a crack at adding a Wayland extension that implemented the X11 protocol.


It's not clear what that would solve; XWayland mostly fulfills that role already. It can't run window managers, but you wouldn't really get that everywhere from adding another Wayland extension either -- doing that would require a lot of additional code, and the GNOME and KDE implementations wouldn't want that anyway because they have their own built-in window managers. So maybe you could get a hacked-up version of Weston that can run X window managers, but why bother maintaining that instead of just maintaining X? It's unlikely the X server will stop working as long as you have a GPU that supports GL or Vulkan; the glamor/modesetting driver should continue to work there.


In the event that I do need to use something that is Wayland-only, I'd like to have the ability to run it without breaking everything else.


I don't know of any significant applications that are Wayland only. If that ever happened, someone could just make a really simple Wayland server that does the reverse of XWayland.


> I'm honestly surprised no one has just tried to make it so you can "firewall" X11 programs

This can be done via firejail[1] + xpra/xephyr but is a rather cumbersome endeavor. The X11 standard also contains access control hooks that allow you to "firewall" any aspect of your application. However it is used by no application I personally know of and is rendered useless by how the xinput mechanism is implemented at this point.

The reason nobody bothered to deal with this so far is that people almost never run untrusted software on FOSS systems which is what X11 primarily targets. There was no demand.

1.: https://firejail.wordpress.com/documentation-2/x11-guide/


The demand there would be with products like Qubes and Subgraph, which are currently using Xephyr and Xpra. Eventually Wayland should be able to improve performance there, and bring some of the security benefits of those setups to other distributions.


Seems to me that firewalling X11 programs from one another would take a lot less work and be a lot less disruptive than requiring users to run multiple VMs with multiple X11 servers and/or replace the whole graphics stack.


Trying to shoehorn proper security into X11 would be a formidable effort and still be quite disruptive to client software.

Back when Wayland was just being proposed there were not a lot of developers working on X. They almost-unanimously agreed that it was time to break with backward compatibility and eject a lot of cruft that had built up over the years, such as the horrible font handling. Modern toolkits had already started moving away from using many of these X11 facilities and were doing much more client side anyway. So the argument was that a relatively clean slate design was called for which should dispense with the cruft and better handle client-side rendering.

It's not perfect and I know it is disruptive for some people, but at least here it has led to a much better experience for some years now.


Would addressing the "firewalling" issue be more disruptive than throwing out X11? Because, Wayland definitely firewalls programs (among many other things) -- surely just implementing firewalling in X11 is not nearly as difficult or disruptive? Implementing firewalling could even be done in an incremental way that's easily reverted or tailored to individual apps and users.


I can’t reply to your comment below this, but the Wayland guys are the X guys, and while they are definitely not infallible, don’t you think it is a bit egoistic to think that they didn’t think of this one simple little thing that you did, without any knowledge of the inner workings of any display server?

I’m sorry if it sounds harsh, but honestly.


Don't get me wrong, I'm not trying to suggest that the X devs are ignorant of this. I too have both deep respect and gratitude for the work they have done and continue to do.

I'm only frustrated that I can't get a straight answer as to what problems Wayland is solving that can't be solved with less difficulty and breakage by repairing X11. I'm sure the X.org developers have an answer, and I would love to know it, but I'm not getting it here in this comment tree.


Here you are: https://youtu.be/cQoQE_HDG8g It’s a great presentation by one of the guys behind wayland, who worked on X a lot before.

Basically, X has a fundamentally misaligned abstraction of the underlying hardware - which is expected based on its age. When used with a compositor, it is basically a middle-man with no function whatsoever. So Wayland decided to cut out the middle man and pass events directly between client and compositor. But please watch the video, I think it will answer all your questions.

Also, it’s not like X will be totally deprecated, XWayland is an API implementation of it that will be supported forever.


Right, and as I said elsewhere, this is a problem with the reference implementation being crusty, not X11. That seems to be the point the speaker is making as well.

Like, I'm fully supportive of having the X server simply manage DRI3 for a bunch of clients and composite the results. That's all well and good.

I'm less supportive of doing this while also removing all support for the other things that the X server provides the ecosystem:

* Unified input/event capturing and forwarding

* Unified screen capturing/recording

* Window management

* Clipboard

* Structured IPC (on top of which you get ICCCM and NetWM)

* Xrdb

* Xprops

* Notions of windows in general (everything's now client-side)

* Fonts

* Drawing APIs

These were all standardized things that users could count on always being available, regardless of which GUI programs they used. Wayland completely punts on these things and defers them to extensions.

Before you ask, I've already seen the "Wayland is a protocol; these are all extensions" song and dance routine. That's a cop-out. Dropping these things means that there will now be multiple incompatible implementations of the same concept, and no way to mix-and-match them because they now all have to be built into the same process that does your window management. Wayland implementations completely destroy this digital commons, for no apparent reason or gain for the users. The only people I see potentially benefiting from this are full-fledged DEs who can leverage their compositors' incompatible implementations to enact a form of lock-in (i.e. your GNOME programs are no longer guaranteed to run in KDE, and vice versa). So, why do this?


> Unified input/event capturing and forwarding

This is simply insecure. “It is easier to cut holes into a solid block than to patch something that looks like swiss cheese”. What reason does a random app have to see each keypress when it doesn’t have focus? Do you trust e.g. the Teams app or the million other apps to be good citizens?

Screen capture is implemented with pipewire in a better way than before.

Fonts: no one uses the old font API of X, even under X. And third-party libs like cairo work on both Wayland and X, so nothing is lost here.

Drawing APIs: show me any app that uses it and was upgraded in the last two decades. Feel free to use a CPU only drawing API, I prefer not watching the line getting rendered.

Also, as I already mentioned XWayland is important for exactly this reason - it is a completely backward compatible X implementation, on top of a better display protocol. What’s the actual problem, because I still don’t see it.

There is no need to have incompatible implementations of each, and just look at the three main Wayland implementations: they share much of the work.


> This is simply insecure. “It is easier to cut holes into a solid block than to patch something that looks like swiss cheese”. What reason does a random app have to see each keypress when it doesn’t have focus? Do you trust e.g. the Teams app or the million other apps to be good citizens?

Up-thread I was asking why X.org doesn't simply firewall apps off from one another, and ship with an extension to control this firewall. Adding this capability to X.org (or any X11 implementation) could be done without throwing X11 away. Having a unified way to decide which programs get to see which events would be lost in a transition to Wayland, since each compositor would ship with its own incompatible way of doing this.

> Screen capture is implemented with pipewire in a better way than before.

It also requires that the given Wayland compositor works with it. So, you're SOL if the window manager you're using happens to be welded to a Wayland compositor that doesn't. This wasn't the case before with X11, where screen capture was handled by the X server.

> Fonts: no one uses the old font API of X, even under X. And third-party libs like cairo work on both Wayland and X, so nothing is lost here.

> Drawing APIs: show me any app that uses it and was upgraded in the last two decades. Feel free to use a CPU only drawing API, I prefer not watching the line getting rendered.

I wonder why the server still has them, then. Surely the X.org developers would have simply deleted old code without throwing the whole server away if they were as certain as you are that no one uses them?

Also, I see you haven't addressed the other points I raised (Xrdb, clipboard, ICCCM, NetWM, xprops, window management, etc.).

> Also, as I already mentioned XWayland is important for exactly this reason - it is a completely backward compatible X implementation, on top of a better display protocol. What’s the actual problem, because I still don’t see it.

It's not 100% compatible -- things still break. Distros are up-front about this (I have a sibling comment with sources).

> There is no need to have incompatible implementations of each, and just look at the three main Wayland implementations: they share much of the work.

So now if I want to go and build a window manager, I have to go and re-implement a whole crap-ton of extensions myself that the X server used to do for me? And I have to do it perfectly, so apps written for other DEs won't just break? Sounds like a walled garden to me -- it raises the barrier to entry for new players.


> ship with an extension to control this firewall. Adding this capability to X.org (or any X11 implementation) could be done without throwing X11 away

Nothing is thrown away -> xserver is there for exactly this reason. Adding the extension for a system with bad abstraction is not too wise, but if you wanted to understand it, you would have done so already based on the video.

> Having a unified way to decide which programs get to see which events would be lost in a transition to Wayland

Why would it be lost? There is a core protocol that absolutely specifies it.

> This wasn't the case before with X11, where screen capture was handled by the X server.

And when you had only one player in the whole game.. which is pretty contradictory to your last sentence.

> I wonder why the server still has them, then.

Backward compatibility. Show me any desktop app that uses e.g. xmotif or something. And with XWayland even these 30-year-old apps can be run.

I didn’t address these things because basically everything has a solution under wayland nowadays. Please have a look at the wayland-protocols repo and see for yourself the state of it. Also, wayland is a display manager; just because the X server was a monolith, it had no place to e.g. manage the clipboard. Actually, Wayland is the one that fulfills the UNIX philosophy of “do one thing” (although I don’t find the UNIX philosophy a good thing in every case).

> It's not 100% compatible -- things still break. Distros are up-front about this

Such is life, I really can’t say anything else to this.

> So now if I want to go and build a window manager, I have to go and re-implement a whole crap-ton of extensions myself that the X server used to do for me?

No, you just use wlroots that implemented the “crap-ton” of extensions for you already, and be on your way.


> Nothing is thrown away -> xserver is there for exactly this reason. Adding the extension for a system with bad abstraction is not too wise, but if you wanted to understand it, you would have done so already based on the video.

I did watch the video, and while I was convinced that the X.org reference implementation was crusty, I was not convinced that there was anything inherently wrong with X11-the-protocol. Like, if there existed an X extension whose responsibility was just to get clients set up with their own video buffers that it could composite for them, then it sounds like it would address 90% of Wayland's value proposition. Is there a particular point in the video you want me to pay extra attention to that clarifies this?

> Why would it be lost? There is a core protocol that absolutely specifies it.

I read through the stable interface definitions in the wayland-protocols repo [1], and did not see anything related to controlling which programs get to see which events. Is this still in development (or unstable)? If so, is there an ETA at which point I can expect every correct Wayland compositor to faithfully implement it?

> And when you had only one player in the whole game.. which is pretty contradictory to your last sentence.

That's because the X server implements the mechanisms, not policies, for multiplexing the screen and input devices. In the service of this, it provides tools to enumerate, identify, query, modify, and extend properties of windows, as well as route messages between them. There was never a compelling need for multiple competing incompatible X servers because X is the narrow waist (i.e. an unopinionated digital commons) shared by software that competed on policy.

> I didn’t address these things because basically everything has a solution under wayland nowadays. Please have a look at the wayland-protocols repo and see for yourself the state of it. Also, wayland is a display manager; just because the X server was a monolith, it had no place to e.g. manage the clipboard. Actually, Wayland is the one that fulfills the UNIX philosophy of “do one thing” (although I don’t find the UNIX philosophy a good thing in every case).

I read through the unstable interface definitions, and see that Wayland is indeed trying to implement not only the same kinds of IPC facilities and input device multiplexing that X provided, but also is trying to impose stronger opinions on what types of windows exist and how they behave (e.g. Wayland has a notion of pop-ups, text inputs, and so on). So if Wayland's goal is to avoid being as "monolithic" as X, it appears to be failing.

Also, putting core functionality that everyone must implement the same way into extensions just so they can call Wayland "just a protocol" or "just a display manager" is disingenuous. They might as well just say that they're part of the core protocol.

> No, you just use wlroots that implemented the “crap-ton” of extensions for you already, and be on your way.

Does the wlroots project define what extensions are standard and required for a piece of software to call itself a Wayland compositor? No? Then "just use wlroots" isn't addressing the problem of making sure these compositors are compliant to a set of common, useful standards. Like, maybe wlroots should be the standard-definer, just as X was? What happens with window managers built with a compositor that is not wlroots?

Anyway, I don't want to waste your time. If you can't help me understand why Wayland could not have been implemented as an X extension (including why isolating client input could not also have been implemented as an X extension), then I don't think we're going to get anywhere in this thread.

[1] I was looking here: https://github.com/wayland-project/wayland-protocols


There are X extensions for shared memory buffers. Client isolation for X could also have been implemented as some kind of extension. With both of those you could solve some issues but it still wouldn't be the same as redesigning the core protocol.

If you are expecting every Wayland server to implement things exactly the same way, that will probably not happen; the point of having different implementations is that they can choose which parts they want. It's currently not looking like there will be any one standard-definer. You can build a monolithic implementation if you want, but you don't have to. Yes this might cause some fragmentation but realistically, has X really helped there? The huge proliferation of clones and forks of various X window managers that are incompatible in various ways is another kind of fragmentation.


> There are X extensions for shared memory buffers. Client isolation for X could also have been implemented as some kind of extension. With both of those you could solve some issues but it still wouldn't be the same as redesigning the core protocol.

So why redesign the core protocol if the selling points of Wayland can be had without going through all that hassle? What are the true selling points of transitioning to Wayland, if they can be had for far less work?

> Yes this might cause some fragmentation but realistically, has X really helped there? The huge proliferation of clones and forks of various X window managers that are incompatible in various ways is another kind of fragmentation.

There was only ever one dominant X server implementation for the past 30 years (XFree86, then X.org), so yes, I'd say it helped a lot to keep the video/input multiplexing system out of the hands of window manager and desktop environment developers. This ensured that your graphical programs would always work, regardless of what desktop environment or window manager you used, because they all spoke the same protocol and relied on the same reference implementation. A proliferation of Wayland compositors would take all of that away.


The basic idea of Wayland is that it is a simplification and streamlining of a display server protocol. The work of designing that core protocol is already done and doesn't need to be done again, the original developer likely did it because they found it interesting or useful to work on in some way.

The situation isn't that much better in X, the window manager and desktop environment can break clients in other subtle ways that have nothing to do with the X server. There's no guarantee that graphical programs would always work if your setup does something strange.


> The basic idea of Wayland is that it is a simplification and streamlining of a display server protocol. The work of designing that core protocol is already done and doesn't need to be done again, the original developer likely did it because they found it interesting or useful to work on in some way.

The mere existence of Wayland does not justify removing X11. I'm not trying to say that the X.org and fd.o developers shouldn't do as they please; I'm saying that I'll keep X11 until I see a compelling reason to drop it for Wayland.

> The situation isn't that much better in X, the window manager and desktop environment can break clients in other subtle ways that have nothing to do with the X server. There's no guarantee that graphical programs would always work if your setup does something strange.

I'll have to take your word for it, since I have literally never seen this happen (been using Linux as my daily driver since 2006 and have used dozens of WMs and all the major DEs). I agree that something inconsequential like the visual appearance of widgets or some such might not be consistent with the overall WM or DE theme, but X11 programs in general can run on a bare X server.

The only kind of non-trivial breakage I could imagine happening is accessibility features malfunctioning due to the absence of a particular DE, but I don't know to what extent this is the DE's fault, the program's fault, or the toolkit's fault.


I’m sorry, I may not have the time to answer every point you have made:

> I was not convinced that there was anything inherently wrong with X11-the-protocol

There is: the nonexistent security model, which can’t really be retrofitted without breaking every program, in which case they can just as well fix all the bad parts.

> Is there a particular point in the video you want me to pay extra attention to that clarifies this?

I found the graphics of client-compositor-Xserver vs client-compositor under Wayland really informative. In modern usage, the Xserver actually acts more like a library and IPC bus, and is bad at the latter. Also, related to the API thing, there is no way to signal that a buffer is ready. You may not be interested in the “every frame is perfect” goal, but I like that I can watch a video in vlc without tearing. Also, a Wayland compositor can be much more lightweight than the whole xserver, because it is not as chatty (there is no useless communication to the xserver that communicates to the compositor for no reason). It’s not without reason that Wayland is/can be used in embedded systems.

> and did not see anything related to controlling which programs get to see which events

There is one-to-one communication between the compositor and each client. Keyboard events, window resizes and the like are sent only to a specific client. I may have worded it incorrectly when I said it is specified; I would rather say it has an inherent model for it, which can be changed with extension protocols when needed. But the default should not have been that everyone listens to everything and picks out what is interesting. (For example, it is now possible to require that a global hotkey be registered, and the compositor will react to it based on the registration. There can’t be a clash now and it will work reliably.) Also, in my opinion this flexibility (which clients should not have to worry about) lets you create novel ways to interact with windows that were not possible with X.
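
To make the routing model concrete, here is a toy sketch of "events go to one client, hotkeys must be registered". Everything in it (class, method names, the first-come-first-served rule) is invented for illustration; it is not a real Wayland API or any compositor's actual hotkey protocol:

```python
class ToyCompositor:
    """Toy model only; none of these names are real Wayland APIs."""

    def __init__(self):
        self.inboxes = {}      # client name -> list of delivered events
        self.focused = None    # client that currently has keyboard focus
        self.hotkeys = {}      # key combo -> client that registered it

    def connect(self, client):
        self.inboxes[client] = []

    def register_hotkey(self, client, combo):
        # First registration wins; a second client cannot claim the same combo.
        if combo in self.hotkeys:
            return False
        self.hotkeys[combo] = client
        return True

    def key_press(self, combo):
        # Registered hotkeys go to their owner; everything else goes to the
        # focused client only. Nobody else sees the event.
        target = self.hotkeys.get(combo, self.focused)
        if target is not None:
            self.inboxes[target].append(combo)


comp = ToyCompositor()
for name in ("editor", "chat", "recorder"):
    comp.connect(name)
comp.focused = "editor"

first = comp.register_hotkey("recorder", "Ctrl+Alt+R")
second = comp.register_hotkey("chat", "Ctrl+Alt+R")   # clash is rejected

comp.key_press("a")            # ordinary key: focused client only
comp.key_press("Ctrl+Alt+R")   # hotkey: registered owner only
```

The point is only that the compositor decides the single recipient of each event, so an unfocused client never sees keys it did not register for.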

Also, you seem to think that there is all that much difference between compositor families; it is not the case. The core and many extension libraries, while implemented multiple times, work in the same way. Thus a traditional client with some windows will just work. Some compositors have custom extensions for e.g. a specific status bar, which you may find bad since under X there could be cross-WM status bars etc. But realistically you could not have them e.g. under GNOME or KDE without tinkering, so the status quo doesn’t really change.

> Also, putting core functionality that everyone must implement the same way into extensions just so they can call Wayland "just a protocol" or "just a display manager" is disingenuous

How would you create that API of X you mentioned? Wayland is a protocol; the core is mandatory. And it is in a repo, so that it can have versions, which is yet again an area where X is flawed. Even the core API can continue to evolve, and e.g. the compositor and client can both decide to support an older version, although in practice the core API is backward compatible. But a new feature can be used by a fresh client when available, with a proper way to fall back, thanks to wl_registry.
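
Roughly, the registry works like this: the compositor advertises each global interface with a version, and the client binds at a version both sides support, or does without if the interface is missing. A toy sketch of that negotiation; the `negotiate` helper, the version numbers and the `example_fancy_feature_v1` interface are made up, and this is plain Python, not libwayland:

```python
def negotiate(advertised, wanted):
    """Return {interface: version to bind}, skipping globals the compositor lacks."""
    bound = {}
    for interface, client_max in wanted.items():
        server_version = advertised.get(interface)
        if server_version is None:
            continue  # not offered: the client must fall back or do without
        # Bind at the highest version that both sides understand.
        bound[interface] = min(client_max, server_version)
    return bound


# Illustrative numbers only.
advertised = {"wl_compositor": 4, "wl_seat": 5, "xdg_wm_base": 2}
wanted = {"wl_compositor": 6, "wl_seat": 3, "example_fancy_feature_v1": 1}
bound = negotiate(advertised, wanted)
```

Here a newer client talking to an older compositor ends up with `wl_compositor` at version 4, an older client code path uses `wl_seat` at version 3, and the unadvertised extension is simply absent from the result.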

> Does the wlroots project define what extensions are standard and required for a piece of software to call itself a Wayland compositor

That is the core protocol. You seem to have a misunderstanding around it. Otherwise, how would a Wayland app work on every Wayland compositor? Wlroots can have some custom extensions, and it does have some, but you seem to misunderstand their point and scope. They are simple things like “a specific window that can work as a widget, e.g. doesn’t lose focus”. Everything buffer-related is core. Full screen, for example, was NOT part of the core initially, but an implementation that all compositors agreed on was merged, and everyone implemented it many years ago.

> If you can't help me understand why Wayland could not have been implemented as an X extension

I’m trying to but you seem to have some grudge against the project. I am no X developer, so unfortunately I don’t have more knowledge on the topic than what I have already shared. But, for example, X developers tried to retrofit HiDPI onto X, and things like mixed HiDPI over multiple monitors (hell, the whole multi-screen setup) simply can’t be done realistically, from what I gathered due to the X API’s lack of semantic information like scale. Wayland corrected the many many failings of the API in a future proof way that can avoid. Also, why do you think that basically every OS already changed to a compositor-based display server 2 decades ago? It is simply the better abstraction and this is a simple answer, but it is the fundamental one.


Hey, I appreciate you taking the time to reply as you did.

> There is, the non-existent security model that can’t really be backfitted without breaking every program - in which case they can just as well fix all the bad parts.

Most X11 clients only care about receiving input events for their own windows, no? Making it so the X server only sends input events to the window(s) that are in-focus and all belong to the same app by default wouldn't be nearly as disruptive as ripping out the entire X11 protocol, would it? If the mechanism that does this is well-designed, you could restore the "see all input events" feature on an app-by-app basis.
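
As a rough sketch of the kind of policy I mean (purely hypothetical, not an existing X extension; `InputPolicy`, `owner_of` and `snoopers` are names I made up):

```python
from dataclasses import dataclass, field


@dataclass
class InputPolicy:
    """Hypothetical routing policy: focused client by default, plus exceptions."""
    owner_of: dict                                # window id -> owning client id
    snoopers: set = field(default_factory=set)    # clients granted "see all input"

    def recipients(self, focused_window):
        """Clients that should receive an input event aimed at focused_window."""
        return {self.owner_of[focused_window]} | self.snoopers


policy = InputPolicy(owner_of={1: "terminal", 2: "browser"}, snoopers={"hotkey-daemon"})
to_browser = policy.recipients(2)
```

With this, a keystroke typed into window 2 reaches the browser and the explicitly authorized hotkey daemon, and nothing else.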

> I found the graphics of the client-compositor-Xserver vs client-compositor under Wayland really informative. In modern usage, the Xserver actually acts more like a library and IPC bus, and is bad at the latter.

Is it, though? The X server is uniquely positioned in the graphics stack to (1) maintain a database of which windows (and associated metadata) exist and their parent/child relationships, (2) store global configuration state for applications with a graphical concern to query, and (3) route IPC data between processes on a window-by-window basis. This isn't something you can easily move into a separate process, since the state of all windows and input events mutates pretty quickly, and stale data is useless, or even dangerous for downstream apps to consume. I suppose the X server could delegate IPC responsibility to a trusted downstream process, but the X server would still need to be the authoritative source for all state-updates.

> Also, related to the API thing, there is no way to signal that a buffer is ready.

Can't there be an X extension that allows the X server to notify compatible clients when a buffer is ready? If we're not worried about old clients or infrequently-refreshed clients continuing to tear, then this would be no worse of a proposition than moving everything to Wayland.

> Also, a wayland compositor can be much more lightweight than the whole xserver, because it is not as chatty (there is no useless communication to the xserver that communicates to the compositor for no reason) It’s not without reason that wayland is/can be used in embedded systems.

Can't there be an X extension that allows clients to inform the X server that they don't care to receive certain kinds of messages (or, make it so I can configure the X server to not send messages to certain X clients, or maybe create a launch-wrapper for X clients that instructs the X server on this on their behalf)? Also, "embedded systems" these days are easily on-par with (if not vastly more powerful than) the computers for which X was designed.

> Communication between the compositor and each client is one-to-one. Keyboard events, window resizes and the like are sent only to a specific client. I may have worded it incorrectly when I said it is specified; I would rather say it has an inherent model for it, which can be changed with extension protocols when needed. But the default should not have been that everyone listens to everything and picks out what is interesting. (For example, a global hotkey now has to be registered, and the compositor reacts to it based on that registration, so there can’t be a clash and it works reliably.) Also, in my opinion this flexibility (which clients should not have to worry about) lets you create novel ways to interact with windows that were not possible with X.

I'm really not seeing how this precludes making it so X can just not send all X clients all messages. Clients that need to see events destined to other clients' windows (which is the uncommon case) would just need to get an exception granted from the X server.

> Also, you seem to think that there is a big difference between compositor families; that is not the case. The core and many extension libraries, while implemented multiple times, work in the same way.

Even if all compositors were 99.9% compatible, that's still a ton of breakage -- one in one thousand interactions will behave incorrectly. Like, just take a look at Web browsers today to see what I mean about having multiple implementations making our lives worse -- they all ostensibly support the same standards, and yet they all behave in subtly different ways that Web developers have to test for. Why should I believe that it will be any different for Wayland compositors?

> Thus a traditional client with some windows will just work. Some compositors have a custom extension for, e.g., a specific status bar, which you may find bad since under X there could be cross-WM status bars etc. But realistically you could not have them under, e.g., GNOME or KDE without tinkering, so the status quo doesn’t really change.

I don't use GNOME or KDE -- I rely on the flexibility X11 affords me to run the X clients I deem necessary to do my work. I know for a fact that I'm not alone on this. If Wayland is going to take this away, then I'm going to put effort to keeping an X11 implementation alive (even if it's implemented as a Wayland extension) in order to keep using my computer in the way I see fit.

> How would you create that API of X you mentioned? Wayland is a protocol, and the core is mandatory. It lives in a repo so that it can have versions, which is yet another area where X is flawed. Even the core API can continue to evolve, and the compositor and client can both decide to support, for example, an older version, although in practice the core API is backward compatible. A new feature can be used by a fresh client when available, with a proper way to fall back, thanks to the wl_registry.

I don't even know how to parse what you're saying here. It sounds like you're saying that just because Wayland has protocol definitions that live in a github repository (as if that mattered), it's automagically better than X extensions? Because, if you swap "X" and "Wayland" in that above paragraph, the resulting paragraph would still be true. X11 is a protocol with a mandatory core; X protocols (and extensions) are most definitely versioned (we're using X version 11 revision 7.7 btw); X clients can decide which extensions (or versions of these extensions) they want to use. If the X server doesn't support what the X client wants, the X client can optionally fall back to an older, different extension.

> That is the core protocol. You seem to have a misunderstanding around it. Otherwise, how would a Wayland app work on every Wayland compositor? Wlroots can have some custom extensions, and it does have some, but you seem to misunderstand their point and scope. They are simple things like “a specific window that can work as a widget, e.g. doesn’t lose focus”. Everything buffer-related is core. Full screen, for example, was NOT part of the core initially, but an implementation that all compositors agreed on was merged, and everyone implemented it many years ago.

Wlroots is most definitely NOT the core protocol. It's a Wayland project maintained by Drew DeVault for building Wayland compositors. But Drew DeVault does not dictate what is and is not part of Wayland. I was asking rhetorically to prove this point. Also, if every app needs to make sure it works with every compositor (instead of just needing to check against a recent X.org release), then Wayland represents a regression in the way we build desktop software. With Wayland, developers need to test their app against a bunch of different compositors to make sure they all behave the same way, just like how Web developers need to test their Web apps against a bunch of different browsers. I'd rather not repeat the Web's mistakes in desktop software development.

> I’m trying to but you seem to have some grudge against the project.

I have a grudge against breaking everything for no reason, and I try not to depend on software written by people who develop a reputation for doing this. This isn't specific to Wayland. But so far, it looks like Wayland is an instance of breaking everything for no reason.

> Wayland corrected the many many failings of the API in a future proof way that can avoid.

The same thing was said about X -- that's why X has a forward-compatible extension model that Wayland largely copies. So let's not delude ourselves into thinking that Wayland is going to somehow magically avoid becoming the new X.org when all is said and done.

> Also, why do you think that basically every OS already changed to a compositor-based display server 2 decades ago? It is simply the better abstraction and this is a simple answer, but it is the fundamental one.

Why should I care what other OS's that I don't use do? First, I care about programs that I depend on not breaking. Second, I care that I can retain the power to mix and match different graphical UI tools to my liking, instead of having to take into consideration which compositors they may or may not work on (something I didn't have to do with X.org). I'm not convinced at all that Wayland actually fixes anything that couldn't have been fixed in an X extension for far less work and disruption. It's not like X.org doesn't have DRI3 support, which provides exactly the compositor-based display server you clamor for.


I'm no expert in the X11 codebase, but I have lots of respect for the guys who were working on it at the time the Wayland direction was undertaken. So I don't feel in a position to second guess their opinion or tremendous contributions, especially since it has led to a much better experience than I ever had with X.


That doesn't make them infallible. If anything, Wayland smells like yet another instance of CADT [1]. Like, why can no one explain why a world-breaking change like Wayland is justified, when the problems Wayland solves seem like they could be addressed by repairing X11?

I'm honestly interested in building a better X11, and am willing to contribute both time and money. But first, I'd like to understand why the X11 maintainers deemed Wayland necessary -- like, what am I missing here?

[1] https://www.jwz.org/doc/cadt.html


The main problems with X11 are within the core X11 protocol itself, things that are long considered deprecated/obsolete and can't be fixed or removed without doing a protocol break. I could go more into detail, but if you depend on some old X clients that use all those old protocol features, and you're already committed to putting money down on X11, it seems unlikely that those details would be relevant to you. Please let me know if I'm reading this wrong here. I support you working on X11, but just be warned, it is highly unlikely that the major desktops are going to want to continue on that path going forward.


> The main problems with X11 are within the core X11 protocol itself, things that are long considered deprecated/obsolete and can't be fixed or removed without doing a protocol break.

Which problems, exactly, would these be? Why is it impossible (or too difficult or disruptive) to solve these problems through a server extension, or some other backwards-compatible fix? I'm sure the X.org developers have an answer, but I'm not seeing it here.

> it is highly unlikely that the major desktops are going to want to continue on that path going forward.

I don't care what other desktops and toolkits choose to do -- I don't even use a fd.o desktop (hell, I don't even run dbus). If it weren't for needing a Web browser and Zoom, I wouldn't even need a GUI at all.

I'm doing this for myself. I like UNIX a lot, but I really dislike the direction modern Linux desktops are going in. But instead of whining about it online, I'm willing to put in the time and effort to keep things working the way I like.

I'm asking what problems Wayland solves in order to figure out why it's a bad idea for me to take a crack at implementing a simple X server of my own (assuming X.org is indeed going to be deprecated). Like, what is so wrong/broken about the X11 protocol that the X.org server developers are so enthusiastic about abandoning it? Clearly, I must be missing something. I would like to know what that something is.


You really should consider watching the YouTube talk that was linked in some sibling comments; it explains it in more detail than I could in just one post. For me personally, the real bad issues are things like the core protocol being synchronous, the coordinates being limited to 16-bit, the inherent raciness and insecurity of various things like window properties and server grabs... There is a lot of legacy functionality there too, like colormaps, window borders, bitmap fonts, all the core drawing primitives, all the core input stuff.... Newer applications are not using any of that, and often with that legacy stuff the only specified behavior for edge cases that clients expect is "do whatever Xorg does", which makes a rewrite pretty impractical. It would be interesting to see a secure rewrite of the X server in Rust or some newer language like that, but doing that would probably take many years for little benefit, so I'd advise against it.
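
To show what the 16-bit limit means in practice, here is a tiny sketch. It assumes positions travel as signed 16-bit fields; `pack_position` is a made-up helper and the little-endian byte order is an arbitrary choice for the demo.

```python
import struct


def pack_position(x, y):
    """Pack a window position as two signed 16-bit integers (assumed field width)."""
    return struct.pack("<hh", x, y)


ok = pack_position(32767, 0)      # largest value that still fits
try:
    pack_position(32768, 0)       # one pixel further no longer fits
    overflowed = False
except struct.error:
    overflowed = True
```

Anything laid out past 32767 in either axis simply has no representation in a field of that size.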

Side note, I don't get the hate for dbus, it's a rather simplistic message bus, orders of magnitude smaller than the X server. It would be much easier to implement your own dbus daemon for example.


From what I got from the video, the main complaint is that the X.org reference implementation has gotten really crusty and hard to maintain. If so, maybe the solution there would be to do some housekeeping and start deprecating features no one uses. Maybe that could include factoring legacy/obscure protocols and code-paths into separate code modules, off the server's "happy paths," which these protocols' downstream consumers can take over maintaining (if they really still need them).

This doesn't call for ditching X11 in my mind.

> For me personally, the real bad issues are things like the core protocol being synchronous, the coordinates being limited to 16-bit, the inherent raciness and insecurity of various things like window properties and server grabs...

Why can't an extension offer a way for clients to establish asynchronous communication channels to the X server? Why can't an extension offer 64-bit coordinates? Why can't an extension offer a way for an authorized program to take care of guarding and serializing access to window properties and orchestrating server grabs? Why do we need to break the world to have these things?!? These may not be trivial undertakings, but I doubt they would take anywhere close to the amount of work required to upgrade every graphical program and toolkit in UNIX-land to use a wholly-different _suite_ of input/video multiplexing systems (which on a given day will be only 95% compatible with one another in expectation).

Also, it's not like Wayland is destined to be less crufty than X11. I wouldn't be surprised at all if all of the complexity in X.org today returns to Wayland compositors by way of a bunch of all-but-required Wayland extensions that get shoe-horned in over the years. So if we're going to be shoe-horning new features into existing systems, we might as well do it on the devil we all know (or perhaps we should solve this once and for all by creating an "X12" protocol in a way that shoe-horning is painless and won't lead us to cruftiness again).

> Side note, I don't get the hate for dbus, it's a rather simplistic message bus, orders of magnitude smaller than the X server. It would be much easier to implement your own dbus daemon for example.

It's not simplistic for what it does, and its developers have a horribly-misguided "put-it-in-the-kernel-because-performance" development ideology that belies a profound lack of understanding of why or how dbus isn't fast enough for their purposes. My biggest turn-off is the fact that it doesn't do anything that I can't already do faster and cheaper with a RAM filesystem of named pipes and UNIX domain sockets.

* Want user/system namespaced paths? Create a directory for each user's endpoints that's separate from other users' endpoints, and have a distinct system/ directory that only authorized users can explore. Leverage filesystem hierarchies and permissions to communicate which endpoints belong to the same service, and to control who can access them.

* Want to register a service endpoint for suspending/shutting-down your laptop? Create a directory under system/ whose group ID is the group of users who are authorized to suspend/shut down, put two named pipes, "suspend" and "shut-down", in it, and have a suspend/shutdown daemon just do blocking reads from them. Once a byte arrives on the "suspend" pipe, execute suspend-to-RAM. Once a byte arrives on the "shut-down" pipe, execute shutdown.

* Want to register a service endpoint for sending desktop notifications? Make a "notifications" directory in the user's service endpoints directory, and put a UNIX domain socket in it. Have the notification daemon listen on this socket, and simply pop up a window whenever another program connects to it and sends a properly-structured message (note that that other program must have permission to traverse the service directory to access this UNIX domain socket to do so).

* Want introspection on how to form that message? Have the daemon that implements the endpoint write out a symlink to its documentation in its service directory, which you can just `cat` or `more` to figure out how to talk to the service.

* Want something really elaborate, like sending a video stream? Transfer a file descriptor to the service provider via the UDS and then pipe the audio/video data in that way.

So, yeah -- dbus doesn't need to exist in order for us to have the things it offers.
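
Here is a minimal sketch of the named-pipe endpoint from the suspend/shutdown bullet. It is a toy, not a real power daemon: the directory is a temp dir standing in for something under system/, the "suspend" action only appends to a list, and `serve_fifo` and its `max_requests` parameter are names I made up so that the demo terminates.

```python
import os
import tempfile
import threading


def serve_fifo(path, action, max_requests=1):
    """Block on the named pipe and run action() once per byte-carrying request.

    A real daemon would loop forever; max_requests exists only for the demo.
    """
    handled = 0
    while handled < max_requests:
        # open() blocks here until some writer opens the pipe.
        with open(path, "rb", buffering=0) as fifo:
            if fifo.read(1):          # a single byte is the whole request
                action()
                handled += 1


service_dir = tempfile.mkdtemp()      # stand-in for a system/power/ directory
os.chmod(service_dir, 0o750)          # only owner and group may traverse it
suspend_pipe = os.path.join(service_dir, "suspend")
os.mkfifo(suspend_pipe, mode=0o620)   # group members may write, not read

events = []                           # records what the "daemon" did
daemon = threading.Thread(
    target=serve_fifo, args=(suspend_pipe, lambda: events.append("suspend"))
)
daemon.start()

# Any process allowed to write to the pipe can now request a suspend.
with open(suspend_pipe, "wb") as request:
    request.write(b"1")
daemon.join(timeout=5)
```

From a shell, the client side is just a redirect of one byte into the pipe, and access control is whatever the directory and pipe permissions say.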


What you're saying is mostly what has happened already except the work has just been done outside the X server. The features in X that people don't use are already considered deprecated, and factoring the legacy parts off into a separate code module is essentially what XWayland is anyway.

If you added all those things as X extensions, it would essentially be the same thing as Wayland, because clients that wouldn't use them would still be broken, and every graphical program and toolkit would still need to be updated to use them.

Your suggestions for dbus would work for some applications but would not really work for other things that a message bus handles like multicast, global message ordering, and resource accounting. Plus GNOME and KDE adopted dbus specifically so they could get away from having to pass around random sockets in folders everywhere. I assume by "put-it-in-the-kernel-because-performance development ideology" you're referring to kdbus, which was an alternate implementation not made by the original dbus developers, and is now a dead project and is not really a thing anymore. Please don't get those things confused. Of course the reason they could do that is because dbus is also just another protocol with a reference implementation, and you could make another implementation that works closer to what you describe and maybe gets 80-90% of the way there depending on some changes in the kernel, for example I saw a hacky dbus fuse filesystem a while ago: https://github.com/sidorares/dbusfs


> If you added all those things as X extensions, it would essentially be the same thing as Wayland, because clients that wouldn't use them would still be broken, and every graphical program and toolkit would still need to be updated to use them.

There's a massive difference between extending the X server and having each window manager implement all the trappings of an X server as a library. Namely:

* X remains the "narrow waist" for video/input multiplexing. GUI "policy" infrastructure -- window managers, panels, notification services, and so on -- remain separate programs, with separate maintainers, to be mixed and matched downstream as needed. Moreover, all these programs keep working.

* By remaining a separate X server program, we keep mechanism and policy cleanly separated. GUI "policy" infrastructure can't impose itself systemically on other GUI "policy" infrastructure, which is a good thing because all these GUI "policy" infrastructure authors tend to think their way is the best way and how dare anyone question it or resist it (see also GNOME). X keeps me and mine safe from their idiocy.

* The barrier-to-entry for creating new "policy" infrastructure remains low, since you can run these programs without coupling them to a particular compositor.

* Changes to X's rendering infrastructure get incrementally deployed. No change in any workflow is required; toolkits and programs opt-in to the new rendering infrastructure as they need to. Programs that don't opt-in keep working until the old code paths get dropped.

* Non-display services of X get preserved, like xprops, xinput, etc. All xclients keep working. If desired, these can be policed through a separate opt-in extension. Existing IPC conventions like ICCCM and NetWM keep working, so all the downstream tools that use them keep working.

> Your suggestions for dbus would work for some applications but would not really work for other things that a message bus handles like multicast, global message ordering, and resource accounting.

Nonsense.

You can multicast messages from one process to many processes via a UNIX domain socket trivially -- just send the damn message to each recipient! It's not like you're going to have 10 million clients, so copying the data isn't going to be that bad (and, the service endpoint can always throttle clients). But, sure, let's suppose the message you're trying to send is gigantic, and you do need to send it to lots of clients. You can just store it as a file (you're doing this anyway if the message is truly that big) and send each client a read-only file descriptor to go and consume it at their own pace. If you're using a file at least, all your clients will hit the same cached pages in the kernel, so you're no longer making N different copies of the data (the kernel will take care of implementing the right caching strategy for you). If you're streaming data, you could simply buffer it to a file and treat the file as a ring-buffer, and still hand out read-only file descriptors to it to downstream clients.
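
A sketch of the "hand every client a read-only file descriptor" variant, assuming Python 3.9+ on a UNIX-like system (for `socket.send_fds` and `socket.recv_fds`). `publish` and `receive` are made-up helpers, and socket pairs stand in for clients that connected to the service's UNIX domain socket.

```python
import os
import socket
import tempfile


def publish(path, client_conns):
    """Send each client its own read-only descriptor for the file at path."""
    for conn in client_conns:
        # A fresh open() per client gives each one an independent file offset.
        fd = os.open(path, os.O_RDONLY)
        try:
            socket.send_fds(conn, [b"msg"], [fd])
        finally:
            os.close(fd)              # the receiver now holds its own copy


def receive(conn):
    """Receive one descriptor and read the whole message through it."""
    _data, fds, _flags, _addr = socket.recv_fds(conn, 16, 1)
    with os.fdopen(fds[0], "rb") as message:
        return message.read()


with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"a large message body")
    message_path = f.name

# Three (service end, client end) pairs stand in for three connected clients.
pairs = [socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM) for _ in range(3)]
publish(message_path, [service_end for service_end, _ in pairs])
received = [receive(client_end) for _, client_end in pairs]
```

Only a few bytes cross each socket; the payload itself is read from the one file, at whatever pace each client likes.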

Global message ordering and message dependencies is also easily solved without dbus -- just implement an "ordering" service adapter. The adapter writes its own UDS to the place where its upstream services' UDSs live, and it takes care of marshaling requests and replies to and from the upstream services according to some ordering principle you require. For example, if you have a service for shutdown/suspend, and a service for logout, you could implement a small ordering adapter that prevents messages to shutdown/suspend from being delivered if the user is in the process of logging out. I'd imagine that for a DE, you could simply have a singleton ordering service adapter that determines what services get to be accessed under which circumstances (thereby cleanly separating the task of systems integration from the task of providing the individual service).
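
Roughly, such an adapter could look like the sketch below. Everything here is made up for illustration: the `OrderingAdapter` class, the one-word message vocabulary, and the socket names. It uses UNIX datagram sockets in a temp directory, and `step()` handles one message at a time so the demo stays deterministic.

```python
import os
import socket
import tempfile


class OrderingAdapter:
    """Sits in front of a power service and withholds requests during logout."""

    def __init__(self, listen_path, power_path):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        self.sock.bind(listen_path)       # clients talk to this socket
        self.power_path = power_path      # the real service lives here
        self.logging_out = False

    def step(self):
        """Process exactly one incoming message."""
        msg = self.sock.recv(64)
        if msg == b"logout-begin":
            self.logging_out = True
        elif msg == b"logout-end":
            self.logging_out = False
        elif msg in (b"suspend", b"shutdown") and not self.logging_out:
            self.sock.sendto(msg, self.power_path)   # forward upstream
        # Anything else, or a power request during logout, is dropped.


d = tempfile.mkdtemp()
power_path = os.path.join(d, "power")
adapter_path = os.path.join(d, "power-ordered")

power = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)   # upstream service
power.bind(power_path)
power.setblocking(False)
adapter = OrderingAdapter(adapter_path, power_path)
client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)

delivered = []
for msg in (b"suspend", b"logout-begin", b"shutdown", b"logout-end", b"suspend"):
    client.sendto(msg, adapter_path)
    adapter.step()
    try:
        delivered.append(power.recv(64))
    except BlockingIOError:
        pass                              # nothing was forwarded for this one
```

The shutdown request sent in the middle of the logout never reaches the power service; the two suspends on either side of it do.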

Resource accounting is similarly straightforward. Just like the "ordering" service adapter pattern, you can also create a "resource usage" service adapter pattern. For example, you can ensure that the volume increment or decrement requests to your sound daemon arrive at a fixed rate, no matter how many requests come in. As another example, if the service is streaming data, you can use a service adapter to monitor how quickly clients are consuming versus the service producing, and induce back-pressure on the service to hint that it should down-sample if clients are too slow.
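
For the fixed-rate case, the core of such an adapter is tiny. A sketch, with `FixedRateGate` as a made-up name and a fake clock injected so the behaviour is reproducible (a real adapter would use the default `time.monotonic` and sit between the client socket and the service):

```python
import time


class FixedRateGate:
    """Admit at most one request per min_interval seconds; refuse the rest."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self._next_allowed = float("-inf")

    def admit(self):
        now = self.clock()
        if now < self._next_allowed:
            return False                  # too soon: drop (or queue) the request
        self._next_allowed = now + self.min_interval
        return True


fake_now = [0.0]                          # mutable cell read by the fake clock
gate = FixedRateGate(0.1, clock=lambda: fake_now[0])
decisions = []
for t in (0.0, 0.01, 0.05, 0.1, 0.15, 0.25):
    fake_now[0] = t
    decisions.append(gate.admit())
```

However many volume requests pile up within an interval, only the first one in each interval gets through to the sound daemon.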

Because everything is represented as files, I can do those last two things trivially with shell scripts. No need to take over the init process (cough systemd-logind cough), no need to implement a whole wire format and marshaling library and stub-compiler, no need to create language bindings, etc. Files, directories, named pipes, UNIX domain sockets, and a humble script to set desktop-wide policies on inter-service interactions are more than adequate. But noooooo, we had to build dbus and all of dbus's infrastructure.

I honestly believe the authors of dbus simply lack imagination. Like, we have all this wonderful battle-tested POSIX IPC infrastructure sitting around waiting to be used that they don't even have to maintain, and the kernel makes a fine I/O multiplexer and request broker. Why not use it to its fullest potential? It'll save time and effort, and you won't need any specialized SDKs or tooling to interact with services.

I don't want to say that I think the dbus authors are, well, stupid. If there's something that dbus does that well and truly cannot be done as described above, I'd love to know what it is, and why it justifies all the complexity of re-implementing POSIX IPC analogues in a bespoke system. But I've been writing software for over 20 years, and I've been around the block plenty of times, and this entire project smells like something someone would have written if they simply were not familiar with what their runtime environment could already offer them.

> Plus GNOME and KDE adopted dbus specifically so they could get away from having to pass around random sockets in folders everywhere.

So instead we should just implement worse-performing analogues of most of the POSIX IPC primitives in userspace and pass around service endpoints instead? Come on now.

> you're referring to kdbus, which was an alternate implementation not made by the original dbus developers, and is now a dead project and is not really a thing anymore. Please don't get those things confused.

Thanks for correcting me. I wouldn't want to hate on people for the wrong reasons ;)

------

Anyway, we've been going back and forth for a while. I'm convinced now that Wayland is just an instance of CADT and doesn't solve anything that couldn't have been solved with a less-glamorous but less-effort X extension. But whatever -- the X.org and fd.o developers are free to do whatever they want, etc. etc.

I actually like the X11 model, and wouldn't mind taking a crack at writing a Wayland compositor that simply back-ported all the non-graphical aspects of X as a Wayland extension. Then everything I'm using today could, ostensibly, keep working (and I don't have to care nearly as much what the fd.o folks do going forward).


Look, you're a smart and accomplished person and you have some developed ideas of how things should be done. Please don't hate on other open source developers or accuse them of being "idiots" or "CADT" when you yourself acknowledge that you don't fully understand their work. If you have an idea you think is better then you can just do it; you don't need to trash talk other people's work and use insults like "attention deficit teenager" to get your point across. If you want your X programs to continue working, you don't need to write a Wayland compositor, you can just keep using X. The only reason to write a Wayland compositor would be if you wanted to use Wayland clients, which would not have access to any of the X protocol features anyway.

If you don't care about policies then all those things in X can be a good thing, but if you do care about policies then Wayland could allow for a better design, at least it seems that's what GNOME and KDE are aiming for anyway since their policies are very well established at this point, and they don't really seem to care about breaking ICCCM and other such things.

As for dbus, your solutions would work for some things, but would not have exactly the same semantics as dbus, would come with their own set of issues, and would require building several more infrastructure pieces, some of which you just described. You could build those but it likely wouldn't fit the same use cases as dbus. If you're sending messages that you expect other clients to parse then you still need to agree on a wire format and marshaling library; you can't get around that. If you ask me, dbus itself doesn't require much infrastructure at all. You should consider reading the source code for the dbus reference implementation at some point, because it's actually pretty small and stable. And I don't understand what you mean by re-implementing POSIX IPC analogues: dbus is essentially just a wire format for Unix domain sockets and a message bus that routes the messages; it doesn't re-implement anything. If you want to use dbus from shell scripts, you can use tools like dbus-send and busctl, or you can try to use something like that dbus fuse filesystem -- the nature of dbus makes it map pretty well to that, so there's no reason you can't have both a message bus and an easy interface to access from shell scripts.

(Also just another nitpick here, the systemd developers are not the dbus developers, and systemd-logind doesn't take over the init process, that is its own smaller daemon)

If you want to read more, see some comments from the original dbus author:

https://news.ycombinator.com/item?id=8649459

https://news.ycombinator.com/item?id=8648995


Can you think of even a single thing dbus can do that my approach cannot do? Emphasis on cannot here -- if you do reply to this, I expect you to prove that the thing cannot be done by any simpler means. If not, then why does dbus need to exist? Better question -- why are people who insist on writing software that doesn't need to exist given decision-making powers in fd.o? Software is like a form of pollution -- more code means more bugs and more security holes (and dbus isn't immune [1]). Any greenhorn developer can write lots and lots of code; it takes wisdom and experience to avoid writing code. So if people who don't grok this are running fd.o, why should I trust anything fd.o produces?

Before you try and tone-police the above, you should know that it is fd.o that needs to convince me to venerate their software artifacts. People writing more code isn't by itself praiseworthy -- code is a goddamn liability, so it had better have a good reason to exist and (in dbus's case) have a very good reason to be widely depended-on. Just because you happen to like or use someone's code doesn't mean that it is any good.

> And I don't understand what you mean by re-implement POSIX IPC analogues, dbus is essentially just a wire format for Unix domain sockets and a message bus that routes the messages, it doesn't re-implement anything.

I guess if you didn't understand POSIX IPC, you wouldn't see how this sentence is an oxymoron. The kernel itself gives you all the trappings of a message bus for free. You don't need a wholly-separate daemon and wire format spec.

Also, the only people who seem to use dbus's wire format are dbus clients. Even when dbus was new, there were already widely-used and well-understood formats for representing structured data (e.g. ASN.1, typed netstrings, S-expressions) that could have been leveraged to make interacting with the service that much more straightforward. But then again, we're talking about people who wanted to re-invent POSIX IPC, so I guess I shouldn't be surprised they also wanted to impose their own wire format on the world.

> If you don't care about policies then all those things in X can be a good thing

I know better than anyone else on Earth what graphical policies are good for me, so I'm going to take this as your affirmation that X is indeed the right tool for the job for people like me who know what they want out of their computers. I stopped using DEs years ago because I got tired of having to fight them all the time to get them to do the things I needed.

[1] https://security-tracker.debian.org/tracker/source-package/d...


I don't understand what you are saying about freedesktop.org. That is just another volunteer run open source organization that hosts projects that are loosely related to open source desktops, you can start contributing to that if you want, or you can not use any of it if you don't find it useful. I'm just here for an interesting conversation, I'm not trying to convince you of anything, and I would rather not continue this discussion if you're going to start throwing around insults and making it personal and accusing other developers of being ignorant or having bad intentions. Please don't do any more of that, it's not interesting conversation and it's against the rules here. You're better than that. If that's tone policing then I'm sorry but my point is we ultimately can't have a conversation if your goal is to attack other people who aren't even here and tear them down, that just isn't my goal.

I also still don't see what you mean about dbus, the Linux kernel itself doesn't specify a wire format for arbitrary messages, and doesn't specify all the things that you need to get the complete functionality of a message bus. Maybe you could get that with another operating system that is based around message passing but Linux is not that. The methods you describe could technically be done without a daemon, but they still require a lot of additional code to set up a bunch of files and sockets and enforce ordering, security, etc, which could also contain bugs. You could tell the applications to implement all that themselves or you could put it all in a daemon which is mostly what dbus does anyway, and by doing it in one daemon it totally eliminates a certain class of race conditions and synchronization issues. Again please refer to the comments by the dbus developer that I showed, this conversation is not new and already happened years ago. If you want to store ASN.1 or S-expressions in a d-bus message you can do that pretty easily. And if you really believe that your solution could work then I would encourage you to develop a dbus implementation that works like you describe and then test to see if it works exactly the same and doesn't break existing setups. But I don't think this would really work, you wouldn't really be saving many lines of code, and in particular multicast and service activation would be pretty hard to do in the way dbus does it without a central message bus.

If you don't agree with GNOME or KDE's policies and you want to implement your own IPC then that's great, I support you doing what you need to do, however they chose dbus a long time ago, and currently it's looking like X is not the right tool for them anymore, so you may just have to accept your differences and move on.


I see that you didn't take me up on my challenge to prove that dbus does anything we couldn't easily do with bread-and-butter POSIX IPC. Just regurgitating something I could have just read on the dbus homepage isn't fooling anyone.

Look, I have very strong opinions on what software I consider worth running on my computer, because I've been doing this for a long, long time. I also have very strong opinions on how to go about solving the problems that dbus, udev, systemd, logind, PulseAudio, and the rest of the fd.o middleware purport to solve. I also happen to strongly disagree with how they go about doing it. But, please don't misconstrue this as me believing that their authors don't have a right to create whatever software they see fit. I put my money where my mouth is on this and write code to do the things I want if I can't bear to use the code they publish -- in fact, I have my own binary-compatible udev replacement waiting on stand-by but ready-to-go in case udev comes to hard-depend on systemd [1].

I normally don't share my opinions publicly like this because it causes people like yourself to crawl out of the woodwork and throw well-meaning but unsolicited advice and github links at me to wrappers and adapters for projects that depend on the very software and the very architectural paradigms I'm trying to avoid. When I try to explain this is not what I asked for, it falls on deaf ears. It's a supremely annoying, frustrating experience.

I don't know if you remember, but the only thing I was trying to find out in this entire comment thread is whether or not Wayland solves a problem that was truly impossible to solve with an X extension. I've already got my answer: no, it does not. So, I think I'm done here.

[1] https://github.com/jcnelson/vdev


I tried to give a hint about the basic reason of trying to get out from under the weight of a huge code base that had become old and crufty while the very architecture it was designed around was becoming moot since things were shifting to client side rendering already.

Couldn't find a good article on short notice, but there's a decent video from back in 2013 about it.

https://youtu.be/cQoQE_HDG8g


> I tried to give a hint about the basic reason of trying to get out from under the weight of a huge code base that had become old and crufty while the very architecture it was designed around was becoming moot since things were shifting to client side rendering already.

I can totally get behind doing a clean re-write of X.org (possibly in a memory-safe language this time around) in order to get rid of legacy cruft that's truly no longer used. They could take the opportunity to refactor the super-popular X extensions like GLX to have better "happy paths" in order to make the overall implementation cleaner and easier to maintain. This could even be done incrementally in order to avoid breaking existing clients.

What I'm struggling to understand is what's so wrong with X11-the-protocol and the popular extensions that ditching everything was considered the best idea? Like, if the X11-to-Wayland transition were happening on the Web, it would be a lot like Google deciding to ditch HTML/CSS/Javascript in favor of something home-grown. Sure, that homegrown thing might actually be better, but it would really leave everyone else in a real lurch now, wouldn't it?


Well there were a lot of problems with the X protocol actually, having to do with latency and multi-threading support at the very least. So much so that Xcb was developed as a replacement protocol; so applications were already having to be refactored if they wanted to avoid such problems.

But really, most applications do not deal with Xlib or Xcb _anyway_, they are programmed at the Gui toolkit level. So all that has to be done is add another backend to the few popular toolkits in use. But guess what, Wayland supports both Xcb and Xlib protocols through a virtual X server that transparently translates to Wayland if that's what you have your heart set on.

But I have lost track of what the specific problem is you're actually trying to solve.


XCB and Xlib are libraries that implement the client side of the X11 protocol. They are not themselves protocols.

I'm trying to fix the problems with X11 that supposedly justify Wayland's existence, because I have reason to believe that fixing/extending X11 would be far less painful and far easier than throwing it all away.

However, no one seems to be able to explain what is unfixable about X11. I assume in good faith that Wayland exists because there is something truly unfixable. I'd like to know what that is.

Note that I'm talking about the protocol here. People here (yourself included) point out that X.org is old and hard to maintain. This may be true, but that is a problem with the reference implementation, not the X11 protocol. Thus it doesn't in my mind justify Wayland's existence (but it does justify writing a new reference implementation).

EDIT: Here's an example -- what if someone wrote an X server that only allowed clients to render via DRI3, and by default prevented programs from receiving keyboard or mouse events intended for other programs? There would be a new protocol extension for setting and querying these blocking policies, so integrators could set more-secure default access controls without breaking compatibility. Isn't that basically what Wayland is aiming for -- client input is isolated and everything graphical happens through off-screen rendering to client-controlled GEM buffers?


You could do that but such an X server would not really be any practically different from Wayland. You would still complain that it broke your old clients, and newer clients would still have to maintain two code paths for the newer server and for the X servers that didn't support DRI3. (DRI3 is not supported when running X clients over the network for example)


Are there widely-used X servers that don't support DRI3? Genuine question -- it's been out since 2013. I realize that DRI3 doesn't work over the network (I also never complained about losing X11's network transparency).

> You could do that but such an X server would not really be any practically different from Wayland.

Not quite -- the X server would still provide all the device-independent IPC, input, and screen multiplexing facilities and APIs. Dealing with input isolation could be addressed with an extension.

So I think this answers my question -- Wayland isn't anything special. It sounds like I'd get a lot of mileage out of taking wlroots and adding back in all the device-independent X11 protocols as a Wayland extension. This would basically be the "X server with only DRI3" I described.


AFAIK DRI3 is also Linux-only and is not supported on any X server outside of Linux.

In Wayland those tasks have been split out into libraries. The details of the protocol IPC are handled by libwayland, and input is handled by libinput. Screen multiplexing is specific to the compositor and not really something you can farm out to a library, which is the same as composited X where the compositor process takes over the entire screen and handles all the rendering.

It would be interesting if someone combined an X server with a Wayland server like you described, but I don't think it would be useful. A lot of your legacy applications would still be broken, for example no old clients or window managers are rendering using DRI3. If you want to design an extension for client isolation, the problem there isn't that X doesn't have that but that the existing methods don't really work well. My suggestion there would be to talk to any desktop environments to find out what their requirements are, if they haven't already committed to switching to Wayland already. (i.e. GNOME and KDE already have their solution for this in Wayland) It may be that an additional X extension is unnecessary for what the other desktops require.


> AFAIK DRI3 is also Linux-only and is not supported on any X server outside of Linux.

So? I never said I cared about the portability of low-level rendering software. It's not like anyone cares that Xenocara and the aperture driver only work on OpenBSD, for example.

> Screen multiplexing is specific to the compositor and not really something you can farm out to a library, which is the same as composited X where the compositor process takes over the entire screen and handles all the rendering.

Hold up. Isn't screen multiplexing and compositing exactly what libwayland gives a program the power to do? You'd build and run a compositor (like Sway, or like Kwin), and it fulfills compositing, screen multiplexing, and so on, as well as IPC, window management, hotkeys, screenshots, etc.

At least with X, these were separate programs you could mix and match.

> It would be interesting if someone combined an X server with a Wayland server like you described, but I don't think it would be useful.

I'd find it useful. I don't care if no one else does, since I'm writing this for myself.

> A lot of your legacy applications would still be broken, for example no old clients or window managers are rendering using DRI3.

I'd add the necessary compatibility code for the programs I need to run. I'd add them in a way that, if others wanted to fork my code, they could easily restore their own legacy code paths.

> If you want to design an extension for client isolation, the problem there isn't that X doesn't have that but that the existing methods don't really work well.

Sounds like a problem with the particular extension, not X11.

> My suggestion there would be to talk to any desktop environments to find out what their requirements are,

Don't care. I'm not doing this for them. I don't use any of them, and they're all dead to me at this point. I'm doing this to keep my minimalist X11 window manager and X11 clients, and to satisfy my intellectual curiosity.


You asked if other X servers are used, there are other X servers that are widely used outside Linux (Xquartz, Xwin, etc) which would break if the clients required DRI3.

Libwayland is a small library carrying the implementation of the wire protocol, and a few other bits like a simple event loop for servers and a library that can load X cursors. The point with that is that you're bringing your own compositing and multiplexing anyway. From there it's optional if implementations want to put in additional features for IPC, window management, hotkeys, screenshots, etc, and they can choose if they want to put that in the server or put it in a separate program. So you can still mix and match on some level anyway, it's not quite the same though.
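For a sense of how small the wire-protocol part is, here is a rough Python sketch of the Wayland message header as described in the protocol documentation: two 32-bit words in host byte order, the first holding the object ID, the second holding the total message size in its upper 16 bits and the opcode in its lower 16 bits. The helper names `pack_message` and `unpack_header` are made up for this illustration and are not part of libwayland's API.

```python
import struct

# Two 32-bit unsigned words, host byte order, no padding.
HEADER = struct.Struct("=II")


def pack_message(object_id: int, opcode: int, args: bytes = b"") -> bytes:
    """Build one message: header followed by already-encoded arguments."""
    if len(args) % 4 != 0:
        raise ValueError("arguments must be padded to 32-bit words")
    size = HEADER.size + len(args)  # size counts the 8-byte header too
    if size > 0xFFFF or not 0 <= opcode <= 0xFFFF:
        raise ValueError("size and opcode must each fit in 16 bits")
    return HEADER.pack(object_id, (size << 16) | opcode) + args


def unpack_header(data: bytes) -> tuple:
    """Return (object_id, size, opcode) from the start of a message."""
    object_id, word = HEADER.unpack_from(data)
    return object_id, word >> 16, word & 0xFFFF
```

As an illustration, `pack_message(1, 1, struct.pack("=I", 2))` would correspond to a `get_registry` request on the display object (ID 1, opcode 1) asking for the new registry to be bound to ID 2. Everything beyond this framing (what gets composited where, who receives input) is left to the compositor, which is the point being made above.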

I had exactly the same idea as you a few years ago to build something like that and I thought it would be useful too, and I thought about it for a while and realized that it doesn't really give you any of the benefits of Wayland or the benefits of X11. The point with wayland is already that it strips the unnecessary bits out and maintains a legacy code path with XWayland, and the point with X is that it's always going to keep the legacy code running anyway, so you don't gain much by combining them. If you have a minimalist window manager that's only a few thousand lines of code, and you want to get the benefits of Wayland, it's much easier to just port that using wlroots or something than it would be to rewrite the whole X server. That's just my experience.

>Sounds like a problem with the particular extension, not X11.

Since this is an issue with Xorg lacking the right extensions it's basically the same thing.


> You asked if other X servers are used,

Allow me to qualify: ON LINUX! Sorry if that wasn't blindingly obvious when I said I was totally on-board with a DRI3-only X server.

> The point with that is that you're bringing your own compositing and multiplexing anyway.

I did not have to do this in the X server world. I did not have to worry about inter-compositor compatibility, because there was only one compositor implementation. Wayland has made me have to do extra work for absolutely no gain. This has got to be the fifth time I've explained this on this comment thread. I don't know how much clearer I can be.

> From there it's optional if implementations want to put in additional features for IPC, window management, hotkeys, screenshots, etc, and they can choose if they want to put that in the server or put it in a separate program. So you can still mix and match on some level anyway, it's not quite the same though.

I'm glad you've understood that N different window managers will now have N different ways of doing this, whereas before, N different window managers only had 1 way of doing this. It's nowhere near close to the same -- now in order to do one thing reliably, I have to implement it N different ways.

It's infuriating that none of the Wayland fans seem to see this as a problem. It's almost as if they're the ones who won't be suffering the consequences of their bad architectural decisions!


If you're talking about making something that is Linux only, that also would be a no-go for X clients that expect to run on other operating systems. They would still want to maintain the old code path, so it wouldn't help much.

You do have to bring your own multiplexing in the X world if you were implementing an X compositor. Maybe that's unfortunate to you that the focus changed to X compositors, but Wayland didn't change the fact that this work has to be done by some willing party to get that to work.

I wouldn't call myself a "Wayland fan" but what you are saying isn't really a problem, the different window managers can choose to implement it just one way. They don't have to do it N different ways, of course they will do it differently if they have a valid reason to. From an application developer perspective you shouldn't have to deal with this problem, I'm sorry if you are a toolkit developer and this has caused you pain, but in my experience nearly all of the bits that you would need to have to do a native port to Wayland aren't specific to any window manager.


> the different window managers can choose to implement it just one way. They don't have to do it N different ways,

Not holding my breath. Just because you have a standard doesn't mean all implementations behave the same. In fact, they usually don't, which is the problem.

> but what you are saying isn't really a problem

Thank you for proving my point about being in a position where you don't have to suffer the consequences of your bad architectural decisions.


Maybe that's true if that's your only concern, but there are other reasons to replace X11 than just this.

Also, X11 is arguably not the whole graphics stack, at this point the DRM/Mesa piece is much larger and more significant, and Wayland doesn't replace it outright anyway -- it makes it optional if needed for backwards compatibility, in the same way that macOS has XQuartz.


The other two concerns in GP are no screen tearing, and better hidpi / multi-monitor support. Is it truly less work and less disruptive to address these two concerns within X11 than it is to throw X11 out (and also leave all nvidia users high and dry)? Also, keep in mind that throwing X11 out and replacing it will take more than just technical legwork -- it will also take ecosystem buy-in and standardization, and if we're being honest with ourselves, this is the harder problem. Recall that the X11 ecosystem has a 30-year head start on this, and there's a crap-ton of 3rd party software that assumes an X11 environment that Wayland is going to need to emulate. If X11 does indeed go the way of the dodo, I think we can reasonably expect another 30 years of bug reports in the form of "Fuck Wayland! I upgraded to Wayland and my $IMPORTANT_THING broke!". I very much doubt that at the end of the day the switch to Wayland is going to be overall easier than just fixing X11, but would love to be convinced otherwise.


The usual way to fix other concerns like that has been to add more WM atoms or add more X extensions, which is a similarly uphill battle requiring buy-in and standardization, and typically old X clients just won't be updated to support those new things. The way to get the most value out of such things would be to add support to the major toolkits, but those have already been ported to Wayland for some years now.

The backwards compatibility is done through XWayland which functions similarly to XQuartz, in that it is just the Xorg server running using Wayland as a backend driver.


What do you think is more of an uphill battle, in terms of time and energy sunk? Adding another X extension that can be incrementally deployed, or trying to phase X out by maintaining both an X11 and Wayland back-end for all apps trying to avoid breakage?

This doesn't even speak to X11 apps that aren't built with toolkits (for example, I use xterm, xpdf, xfig, Openbox, etc.).

XWayland is a nice idea, don't get me wrong. But it's not a 100% replacement either. Distros offering XWayland are even up-front about its shortcomings [1][2][3].

[1] https://wiki.debian.org/Wayland

[2] https://docs.fedoraproject.org/en-US/quick-docs/debug-waylan...

[3] https://wiki.archlinux.org/index.php/wayland#XWayland


There isn't much difference there, but it would be more of an uphill battle if you tried to put everything different that Wayland does into X extensions. That still requires maintaining an extra code path for old X servers that don't support the new extensions, and creates additional risk of breaking things and causing regressions in the X server because of all the new code you're adding.

Clients like xterm, xpdf, and xfig should work fine in XWayland. Window managers won't work without getting ported, but someone has been working on a port of Openbox: https://github.com/johanmalm/labwc


> That still requires maintaining an extra code path for old X servers that don't support the new extensions, and creates additional risk of breaking things and causing regressions in the X server because of all the new code you're adding.

Yes, agreed! But why is it _more_ risky to do that than to throw the whole X server concept away and start from scratch? Rewriting such a widely-used piece of infrastructure from the ground up is a super-risky proposition.


That's mostly a misconception, Wayland implementations don't need to start from scratch. Weston and wlroots are minimal from-scratch implementations, but GNOME and KDE for example do their implementations by re-using most of the code from their X compositor.


Great! So instead of having one standard way to do video/input multiplexing, we have at least four -- GNOME's, KDE's, wlroots, and weston (and probably a smattering of others). If I want to write a program that works with "Wayland," I'm either going to have to test it on all of the widely-used compositors (because of course they're not all going to behave exactly the same way), or I'm going to have to just punt on them. The former option is 4x the work, and the latter option is me telling users "Hey everyone, remember that program that used to run everywhere in every window manager ever that you all know and love and depend on to do your jobs? Well, now it only works on GNOME, since that's all the time I have to support it. Good luck non-GNOME users!"

EDIT: Before you say "just use a toolkit, it'll take care of everything," I can already tell you that users don't care. They only care that the app that used to work in KDE no longer works in KDE. They're not going to complain to Qt or Kwin; they're going to complain to the app author. So the app author becomes responsible for the additional burden of testing their software in a bunch of different compositors, for zero gain.


Is that any different from normal? In my experience, if you're shipping a product on a Linux-based desktop, usually you target a specific set of distributions, i.e. the default configuration of the last few LTS versions of RHEL or Ubuntu or whatever, which at least for those examples all happen to be GNOME based. Customers who come with some weird hacked-up distribution would be on their own for support anyway, they can try it but there's no guarantee it will work. If KDE (or something else) really is doing something different here then you would have had to extend the same amount of effort as you did previously.


Before, a graphical program would run just fine under GNOME or KDE because it wasn't GNOME or KDE handling the video/input multiplexing facilities. But now with Wayland, a graphical program not only needs to target distros, but specific configurations of those distros (i.e. Debian/GNOME, OpenSUSE/KDE, etc.). This isn't helping fragmentation.


That's only if you're using functionality specific to the DE, which is handled mostly the same as it is under X. GNOME and KDE for example tend to provide their functionality as dbus services. If you just have a simple app that needs no special privileges or features then that will work just the same. If you use GTK or Qt, the transition will be mostly seamless and would only be a problem if you were circumventing that and calling Xlib or xcb directly.


No, this happens if the particular Wayland compositor you're running the program on happens to implement a Wayland protocol or extension you use in a "unique" way that causes your app to break. This wasn't a problem with X.org because all distros used the same X.org (or, if they used an older X.org, and if that led to breakage, the solution for users was always the same: upgrade X.org).

I already explained above why "just use a popular toolkit" isn't a viable solution. Users do not care whose fault it is; all they care about is that your app used to work in KDE and now it doesn't in GNOME (the problem is even worse in Wayland than I'm letting on, because with Wayland, the DE controls the compositor and renderer -- there are so many more ways for the DE-specific code to interfere with the graphical program than there was with X.org).


Sure but that's not any different if your application depended on some other GNOME or KDE specific API. If KDE decides an API is KDE only and GNOME doesn't want to make their own implementation then there's not much that you could ever do about that. The point with using a toolkit is that it's an abstraction layer that handles the differences between window systems and implementations for you. It would be better if you mentioned the specific reason why your app is breaking because that would likely be a bug in the toolkit.


Except now the whole X server is part of a GNOME-specific or KDE-specific API.

The bug is "Wayland makes it so programs don't reliably run anymore because they are no longer guaranteed to work with all window managers and desktop environments." It's an architectural flaw.


Not really, shared functionality there is in the core wayland protocol and in the standard set of wayland extensions, or in some other standard dbus interface. (generic ones usually go in the org.freedesktop namespace) If you depend on GNOME-specific or KDE-specific APIs then yes, you would have to deal with those being unsupported outside. Not much has changed there.


sigh In one ear, out the other.


Nvidia cards with proprietary drivers are supported under gnome’s and plasma’s wayland implementations.


Ah, I didn't know that. Kudos to them!


> Regarding security, I'm honestly surprised no one has just tried to make it so you can "firewall" X11 programs from one another.

The response I've heard to this question is entirely nonsensical: it could be done with an X extension, but getting adoption from various parties to make this work would be difficult. As if building an entirely new display system doesn't require orders of magnitude more work and buy-in.


Basically everything would stop working with X, because X simply relies on being able to listen to everything. Actually there are nested xservers that do something like this, but now global hotkeys don’t work, it doesn’t have an api for screenshots, so those won’t work either and the like.

And you can probably add some X extension which can’t be queried properly, but then you can just as well create a new display protocol that actually knows about GPUs


I don't care if X sees everything (hell, even if X didn't, the kernel certainly still would). I only care that I can control which programs see which X11 events. Like, my hotkey program can see everything, but my Web browser can only see its own x-windows' events.
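A toy model of that policy, sketched in Python. Nothing here corresponds to an existing X extension; `InputPolicy`, `register_window`, `grant_global`, and `recipients` are hypothetical names, and the sketch only shows the decision rule (a client sees an event if it owns the target window or has been explicitly granted global visibility), not how a server would enforce it.

```python
from dataclasses import dataclass, field


@dataclass
class InputPolicy:
    """Hypothetical per-client input visibility policy."""
    owners: dict = field(default_factory=dict)    # window id -> owning client
    privileged: set = field(default_factory=set)  # clients that see everything

    def register_window(self, window_id: int, client: str) -> None:
        self.owners[window_id] = client

    def grant_global(self, client: str) -> None:
        """Allow a client (e.g. a hotkey daemon) to see all input events."""
        self.privileged.add(client)

    def recipients(self, window_id: int, clients: list) -> list:
        """Clients allowed to see an input event aimed at window_id."""
        owner = self.owners.get(window_id)
        return [c for c in clients if c == owner or c in self.privileged]


policy = InputPolicy()
policy.register_window(10, "browser")
policy.register_window(20, "terminal")
policy.grant_global("hotkeyd")

clients = ["browser", "terminal", "hotkeyd"]
# A key press aimed at the terminal's window is hidden from the browser.
visible = policy.recipients(20, clients)
```

With this setup `visible` holds the terminal and the hotkey daemon but not the browser. The open questions are the ones raised elsewhere in this thread: who is allowed to call something like `grant_global`, and what happens to existing clients that assume they can listen to everything.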


There is some rough support for that in the X server, but it is lacking a good API or user interface, and the desktops that would implement that are doing it in Wayland.


Doesn't that strike you as odd? Like, why is it that X11 was so close to fixing the problem, but everyone who would benefit from it (and who touts Wayland's ability to do it) decides to just throw it all away and rebuild everything from the ground up? I'd sure like to know what they know about this.


No, the hard part is building a good API and user interface that works for everybody. IMO that's mostly why there are a lot of half-finished and inconsistent things like that in X11.


Doesn't that undermine Wayland's selling point of isolating clients' input? No one was clamoring for this until Wayland announced it, and no one was willing to put in the effort to fix it in X11 all these years (even though it would have been easier than ripping out X11 entirely).


That's one of the things Wayland was designed to do, of course implementations can build other things around it that allow privileged clients to break client isolation. The effort could have been put into X11 but it seems the people interested in this would rather put that effort into Wayland.


That's literally the essence of the CADT[1] model of software development. Why do a comparatively-small amount of unglamorous work to solve a bug when you can just burn everything down and rewrite it from scratch?

[1] https://www.jwz.org/doc/cadt.html


Not really, it's the same amount of work either way considering this doesn't exist yet.


Option 1: add an X extension that lets you configure which windows get to see which input events. Most clients don't actually need to see any events besides the ones they would see while in focus, so most clients don't notice.

Option 2: replace X with something entirely different -- different rendering, different input, different IPC, different organizing principles, different programming models -- and patch all downstream dependencies to use it.

If you can't see that Option 2 is clearly more work and more disruptive, I don't know what else to say to you.


That's not really a reasonable comparison here, because option 2 has already mostly been done, for other reasons.


I hear you, but a compositor could be multi-process. It has not been done yet, but in time it will be.


Which is a mistake. On X11 the server, window manager and compositor are three separate programs. Both the window manager and the compositor can individually crash, or be started, stopped, and replaced at runtime, without affecting any of the other running X11 clients.


On the other hand on X11, Xorg cannot crash without the X11 client instances being affected - a much larger chunk of code. It's only because Xorg is older that that doesn't happen much.


A chunk of code that has been running in production for more than 30 years and should be considered battle-tested. In my experience Wayland compositors crash much more often than X11 despite the supposed reduced complexity. The last time an X11 server crashed on me was in 2004, if I remember correctly.


Exactly, that reliability of Xorg is a function of its age and doesn't imply anything about the correct design of a Wayland compositor. What's the chance those Wayland crashes were in the window management code rather than the rendering, protocol, clipboard, and drag/drop handling code? dwm is 2000 SLOC to Xorg's 1 million or so. I don't think splitting out the WM code would have gained much.


You underestimate the inherent complexity of Wayland. As an exercise, I recommend implementing a "Hello World" native Wayland client. Watch the complexity explode when you simply want to add the ability to take screenshots to that client.


I'm not sure what kind of "Hello World" clients you're comparing, but if you check the Wayland backends in Gtk/Qt, you will actually find them to be smaller than the respective X11/XCB backends there, for various reasons.


> you will actually find them to be smaller than the respective X11/XCB backends there, for various reasons.

It doesn't seem surprising to me. As X.org has gained extensions over the last 30 years, toolkits that speak X11 find themselves having to decide which extensions they'd like to use. Adding flexibility on this naturally leads to a bigger feature matrix. Of course, the toolkits are also free to drop support for X servers that don't have those extensions, which in turn would shrink the X11 backend.

I have no doubt that in 30 years, they'll have a similarly-sized feature matrix for all the Wayland extensions they'll want to support.


Or perhaps by then we will have moved onto another protocol that's even simpler.


Better idea: fix the bad life choices that got us to this point, so we don't have to keep re-living them. What a concept!


I think we would all love it if we could magically fix every bug or design flaw in existence with a wave of the hand. Unfortunately, in real life it takes time and experience to do things right.


I'm specifically not talking about toolkits. I'm talking about a "simple" native Wayland client. Try to write one and witness utter insanity.


Can you be more specific about what you are comparing this to? An X client with similar functionality would likely be longer and require several X extensions.


What complexity? I have written both a simple wayland client and server — both are simple lib calls.

Also, how often do you write gui apps without any framework? It is absolutely hidden away in both gtk and qt apps.


> What complexity?

What is a wl_registry_listener and why do I need it? What is a simple XGetImage() equivalent on Wayland? On Xlib function names at least give you an idea about what they are supposed to be doing.

> Also, how often do you write gui apps without any framework?

As soon as you have a Toolkit you don't need Wayland anymore. Windows then are just additional nodes in the object tree. There was even a demonstration of GTK applications running in a framebuffer without X11 way before Wayland even existed. If Wayland can only be used sanely with toolkits it indeed is completely pointless.


It was some time ago that I looked into it, but wl_registry_listener registers a callback for when the compositor “declares” what protocol extensions it supports. This is painfully missing from X, but it makes Wayland much more modular and extensible.

Wayland’s abstraction is basically a buffer. The client simply creates a buffer either in shared memory, or directly on the GPU and then passes the compositor a handle to the buffer. That’s it.

Also, it is somewhat disingenuous to compare the two: Xlib is a higher-level library than libwayland. There is absolutely no reason why someone could not create a wrapper for this, although I again ask: how often does one create an X-only or Wayland-only window without a framework?
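
To make the registry idea concrete, here is a toy Python model of the pattern: the client hands over a callback and is told about each global the compositor offers. This is not the libwayland API. The `Registry` class, the version numbers, and `hypothetical_screenshot_v1` are assumptions made up for the sketch; only the interface names `wl_compositor`, `wl_shm`, and `xdg_wm_base` are real.

```python
class Registry:
    """Toy stand-in for a compositor's registry of globals (not libwayland)."""

    def __init__(self, globals_):
        # Maps a numeric name to (interface, version); values are illustrative.
        self._globals = dict(globals_)

    def add_listener(self, on_global):
        # Announce every currently known global to the new listener.
        for name, (interface, version) in self._globals.items():
            on_global(name, interface, version)


supported = {}


def on_global(name, interface, version):
    """Client-side callback: remember what the compositor advertised."""
    supported[interface] = (name, version)


registry = Registry({
    1: ("wl_compositor", 4),
    2: ("wl_shm", 1),
    3: ("xdg_wm_base", 2),
})
registry.add_listener(on_global)

# The client can now decide at runtime which features to use.
can_use_shm_buffers = "wl_shm" in supported
can_take_screenshots = "hypothetical_screenshot_v1" in supported
```

The point of the pattern is that a client only binds to what was advertised, so a missing extension shows up as an absent entry rather than as a failed request later on.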


Wayland doesn't have the equivalent of XGetImage for various reasons, but screen capture applications can use the ScreenCast flatpak portal to select window sources: https://flatpak.github.io/xdg-desktop-portal/portal-docs.htm...

I am not sure what you mean by "Wayland can only be used sanely with toolkits". Any GUI usually needs a toolkit or some equivalent, even under X. If you are not using a pre-existing toolkit and are writing your own routines to draw buttons, text boxes, and such, you are implementing your own toolkit.


How much of that 1 million lines of code actually gets executed? Also, everyone runs the X server, so its code gets a lot of testing. This isn't true for window managers -- there's a long tail of them. This is just one data point, but in my experience I've had window managers crash far more often than X servers (since they get less love).


Actually, most of the crashes I had with Sway were in fact related to compositing and window management, including stupid stuff like crashing because Sway couldn't decide which window to focus after hiding another.



