Interesting -- how did you handle redeploys? Given that game servers are statefu...

zemo · on March 22, 2022

> it seems like redeploying a server without machinery to do things like dynamically allocate ports/service discovery for an upstream load balancer would be tricky

like most things with running servers, it's not that hard; there's just an industry dedicated to making people think it's hard (the tech industry).

Every game has a room code to identify the game, and a websocket open handshake has a URL in it. Every room listens on a unix domain socket and nginx just routes the websocket to the correct domain socket based on the URL.
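To make the routing idea concrete, here is a minimal sketch of the mapping from a websocket handshake URL to a per-room socket path. In the setup described above nginx does this itself; the Python below only illustrates the lookup. The directory `/run/game-rooms`, the `/room/<CODE>` URL shape, and the room code format are all assumptions for illustration, not details from the thread.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical layout: one socket file per room under this directory.
SOCKET_DIR = Path("/run/game-rooms")

# Assumed URL shape: /room/<CODE>, where CODE is a short alphanumeric string.
ROOM_URL = re.compile(r"/room/([A-Za-z0-9]{4,12})")


def socket_for_url(url_path: str) -> Optional[Path]:
    """Return the unix domain socket path for a handshake URL, or None."""
    # fullmatch rejects anything that is not exactly a room URL, so a path
    # such as "/room/../../etc/passwd" can never escape SOCKET_DIR.
    match = ROOM_URL.fullmatch(url_path)
    if match is None:
        return None
    return SOCKET_DIR / f"{match.group(1)}.sock"
```

Restricting the room code to a fixed character set matters more than the exact paths: the URL comes from the client, and it ends up inside a filesystem path.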

pipe_connector · on March 22, 2022

How do you handle deploying a new version of nginx without forcibly closing connections?

zemo · on March 22, 2022

you don't. That requires new nodes, so you have to drain nodes and replace them with new nodes that have the upgrade installed before adding them to the pool that hosts games. That process sucked but how often are you upgrading nginx?

pipe_connector · on March 22, 2022

Very often if your load balancer is custom. For example, we have an edge service that fulfils the role you have nginx for, but it handles both websockets and raw tcp traffic. Our edge service is the gateway for authentication and authorization -- from that service we can connect users to chat rooms, matchmaking, or actual game instances. For internal traffic we could possibly get away with just nginx + room ids and manual upgrades to the pool of game/matchmaking/chat services, but we ship updates to our edge service all the time and that process needs to be painless, so I was curious how others have done this.

We're currently using systemd sockets with Accept=no, so multiple edge services can accept() from the same socket, which is always open and bound to a known port. Once a new service has started, we can signal the old service to shut down for however long it needs by no longer listening on the socket and letting connections drain naturally. We're thinking about changing to dynamically allocated ports/sockets, which is pretty natural in the container orchestration world.
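A rough sketch of that handoff, assuming the service is started by systemd with a single inherited listening socket. The `LISTEN_PID` and `LISTEN_FDS` environment variables and the starting descriptor 3 follow the systemd socket activation convention; the `echo` handler, the thread-per-connection model, and the function names are placeholders, not the poster's actual edge service.

```python
import os
import signal
import socket
import threading

SD_LISTEN_FDS_START = 3  # first inherited descriptor under socket activation


def inherited_listener() -> socket.socket:
    """Wrap the listening socket that systemd passed to this process."""
    if os.environ.get("LISTEN_PID") != str(os.getpid()):
        raise RuntimeError("no sockets were passed to this process")
    if int(os.environ.get("LISTEN_FDS", "0")) < 1:
        raise RuntimeError("expected at least one inherited socket")
    # Family and type are detected from the descriptor itself.
    return socket.socket(fileno=SD_LISTEN_FDS_START)


def echo(conn: socket.socket) -> None:
    """Placeholder handler: echo bytes until the client disconnects."""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def serve_until_stopped(listener, handle, stopping: threading.Event) -> None:
    """Accept until `stopping` is set, then let open connections drain."""
    # Wake up regularly to check the flag. Note that the timeout puts the
    # shared descriptor in non-blocking mode, which every process holding
    # the same socket will also see.
    listener.settimeout(0.5)
    workers = []
    while not stopping.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue
        worker = threading.Thread(target=handle, args=(conn,))
        worker.start()
        workers.append(worker)
    # Closes only this process's copy; the socket itself stays open and
    # bound, so a newer service can keep accepting from it.
    listener.close()
    for worker in workers:
        worker.join()  # drain: wait for in-flight connections to finish


def main() -> None:
    stopping = threading.Event()
    # SIGTERM means "stop accepting, finish what you have".
    signal.signal(signal.SIGTERM, lambda signum, frame: stopping.set())
    serve_until_stopped(inherited_listener(), echo, stopping)
```

Calling `main()` under a socket-activated unit gives the behaviour described: the old process stops accepting on SIGTERM and exits once its connections close, while the new process accepts from the same bound socket.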

zemo · on March 23, 2022

ah so you're very in the weeds.

I considered getting rid of nginx and writing the proxy layer myself, but nginx was good enough for my needs, so I didn't. I might have been better served by HAProxy, which to my understanding has better semantics around connection draining; nginx reserves that for its commercial edition (NGINX Plus).

My project didn't have auth, all clients used websockets, and all clients could retry. Haven't messed with systemd sockets myself. Using unix domain sockets and the filesystem to do dynamic sockets worked pretty well for me.
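For the room side of this, a minimal sketch of using the filesystem as the registry of live rooms: each room binds a unix domain socket whose filename is its room code, and removes the file when the room ends. The function names, the `.sock` suffix, and the assumption that room codes are unique among live rooms are illustrative, not taken from the project described.

```python
import os
import socket
from pathlib import Path


def listen_for_room(socket_dir: Path, room_code: str) -> socket.socket:
    """Bind and listen on a per-room unix domain socket."""
    path = socket_dir / f"{room_code}.sock"
    # A leftover file from a crashed process would make bind() fail.
    # Assumes no live room currently owns this code.
    try:
        path.unlink()
    except FileNotFoundError:
        pass
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(str(path))  # creates the socket file
    listener.listen()
    return listener


def close_room(listener: socket.socket) -> None:
    """Stop listening and remove the socket file so the room disappears."""
    path = listener.getsockname()
    listener.close()
    os.unlink(path)
```

With this layout there is no port allocation to manage: a room exists for the proxy exactly as long as its socket file does.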

> Once a new service has started, we can signal the old service to shut down for however long it needs by no longer listening on the socket and letting connections drain naturally. We're thinking about changing to dynamically allocated ports/sockets, which is pretty natural in the container orchestration world.

this sounds pretty reasonable to me, what sort of problems are you having that make you want to change?