sshd is started by systemd. systemd has several ways of starting programs and wa...

xorcist · on April 5, 2024

This seems like a clear case of premature optimization. During the three decades sshd has existed I have never seen a real-world situation where the equivalent of Type=exec was not enough.

The time window where sshd is started and not yet ready to receive connections is short, and clients will have a connection timeout orders of magnitude larger. The notify functionality is more relevant for things like Java middleware processes and clients that lack the functionality to poll and wait. In most situations none of this is relevant for sshd. These patches solve a problem very few people have.

If you really have this problem, the systemd readiness is far from enough to solve it. The readiness is sent too early, and there could still be permission problems that leave sshd ready but cause the connection to fail. Even more relevant are local firewall rules, which are completely out of scope for a readiness check!

Polling for readiness is the only robust way.

zokier · on April 5, 2024

> The readiness is sent too early

What makes you say that? The readiness notification is sent after sshd has opened the listen socket; it is literally accepting connections at that point.
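
To make the ordering concrete, here is a minimal sketch of the sd_notify protocol in Python. It is not sshd's actual code. `notify_ready` and `serve` are hypothetical names, and the only assumptions are the documented convention that systemd passes a datagram socket path in `NOTIFY_SOCKET` and that the service writes `READY=1` to it.

```python
import os
import socket


def notify_ready(env=os.environ):
    """Send READY=1 to the socket named in NOTIFY_SOCKET, if there is one."""
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under a Type=notify unit
    if path.startswith("@"):
        path = "\0" + path[1:]  # leading '@' means an abstract namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", path)
    return True


def serve(port=0):
    """Hypothetical service startup: listen first, then report readiness."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", port))
    listener.listen()
    # From here on the kernel queues incoming connections, so this is the
    # earliest point at which READY=1 is truthful.
    notify_ready()
    return listener
```

The point of the ordering is that anything started after the unit reports ready can connect without hitting a refused connection.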

xorcist · on April 6, 2024

Accepting connections, yes, but for a client that depends on being able to establish ssh connections, that is not enough. Such a client would want to be notified of the ability to make successful connections.

Keep in mind that this is only a problem in special situations. Almost no regular Linux servers carry services that depend on establishing ssh connections and that cannot reconnect or use appropriate socket options. For services other than ssh, such as databases, this is much more common. And for those, it is not enough that the server has opened a listening socket.

When building distributed systems, this is something you need to think about. Not so much with local systems, but the same principles apply, and the only robust way is to poll for readiness. Signalling readiness is both complex, when the dependency chains are non-trivial, and prone to error, when readiness signals arrive out of order, are dropped, or fail for some other reason. The cause may be an operational failure, but the result is hard-to-debug cases. Dependency chains that mysteriously stop because of permission problems with out-of-band traffic are a classic and unnecessary problem.

All of this complexity goes away when polling. This is why you should adopt this design in the somewhat unlikely case you have clients that depend on being able to make ssh connections.
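
As a rough sketch of what polling could look like on the client side, in Python: `wait_for_ssh` is a hypothetical helper, and the timeout and interval values are arbitrary assumptions. It only checks that the path to the server works and that something answers with an SSH identification string. A stricter check would go on to authenticate, which is left out here.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=30.0, interval=0.5):
    """Poll until host:port answers with an SSH banner or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval) as conn:
                conn.settimeout(interval)
                # Assumption: the server sends its "SSH-2.0-..." identification
                # line first, as OpenSSH does, and the first read contains at
                # least the "SSH-" prefix.
                banner = conn.recv(255)
                if banner.startswith(b"SSH-"):
                    return True
        except OSError:
            # Refused, filtered by a firewall, timed out: all just "not yet".
            pass
        time.sleep(interval)
    return False
```

A dependent client would call something like `wait_for_ssh("198.51.100.7")` (a placeholder address) before its first real connection. Firewall rules and late-starting listeners are covered by the same loop, with no ordering between units to get wrong.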