Hacker News

I don't know what kind of programming you're doing, but in network apps, if you have a thread per client and lots of clients (like a web server), you end up with lots of threads waiting on responses from slow clients, and that takes up memory. The time blocked on the syscall has nothing to do with your own machine's performance.
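The thread-per-client model being described can be sketched in a few lines of Python (the echo behavior and all names below are illustrative, not from the thread):

```python
import socket
import threading

def handle_client(conn: socket.socket) -> None:
    """One OS thread per client: it blocks on recv/send for the client's lifetime.

    With thousands of slow clients, each blocked thread sits here holding
    its stack while waiting on the network, regardless of how fast the
    server machine is.
    """
    with conn:
        while True:
            data = conn.recv(4096)  # blocks until the client sends something
            if not data:
                break
            conn.sendall(data)      # blocks again if the client reads slowly

def serve(server: socket.socket) -> None:
    while True:
        conn, _addr = server.accept()
        # A new thread for every client: cheap at 10 clients, expensive at 10,000.
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()
```

The per-thread cost is mostly stack memory and scheduler overhead, which is exactly what piles up when many threads are parked on slow clients.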

But on the other hand, if your server is behind a buffering proxy so it's not streaming directly over the Internet, it might not be a problem.



> But on the other hand, if your server is behind a buffering proxy so it's not streaming directly over the Internet, it might not be a problem.

This is one instance of a larger pattern I've been noticing. When using some languages (like Python and Ruby) in the natural, blocking way, a back-end web application typically needs multiple processes per machine, because it doesn't handle many concurrent requests per process. Combine this with the fact that each thread has to block while waiting on the client, and you have to add more complexity around the application server processes to regain efficiency. The proxy in front of those servers is one example. Another is an external database connection pool like PgBouncer. Speaking of the database, to avoid wasting memory while waiting on it, you may end up introducing caching sooner than you otherwise would. And when you do, the cache will be an external component like Redis, so all of your many processes can use it. Or you might use a background job queue just to avoid tying up one of your precious blocking threads, even for something that has to happen right away (e.g. sending email). And so on.
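The cost of blocking workers is easy to demonstrate: with N workers and requests that mostly wait on I/O, throughput is capped at N requests per unit of I/O latency no matter how idle the CPU is. A minimal sketch, where the sleep stands in for a slow client or database call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(_req: int) -> None:
    time.sleep(0.05)  # stand-in for waiting on a slow client or database

# 4 workers, 12 requests that each just wait 50 ms: the pool needs
# ceil(12 / 4) = 3 "waves", so roughly 150 ms total wall time,
# even though the CPU is idle almost the whole time.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(handle_request, range(12)))
elapsed = time.monotonic() - start
```

Scaling past that cap means more workers or more processes, which is where the extra operational machinery (proxies, external pools, shared caches) comes from.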

Contrast that with something like Go or Erlang (and by extension Elixir), where the runtime offers cheap concurrency that can fully use all of your cores in a single process, built on lightweight userland threads and asynchronous I/O, while the language lets you write straightforward, sequential code. In such an environment, a lot of the operational complexity that I described above can just go away. Simple code and simple ops -- seems like a winning combination to me.
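Python's asyncio can approximate the single-process side of this contrast, though unlike Go or Erlang it requires explicit async/await and runs the event loop on one core. A sketch showing thousands of concurrent waits costing about one wait's worth of wall time:

```python
import asyncio
import time

async def handle_request(_req: int) -> None:
    await asyncio.sleep(0.05)  # non-blocking wait: the loop runs other tasks

async def main() -> float:
    start = time.monotonic()
    # 2,000 concurrent "requests" in one process, no OS thread per request.
    await asyncio.gather(*(handle_request(i) for i in range(2000)))
    return time.monotonic() - start

# elapsed is ~0.05 s plus scheduling overhead, not 2000 * 0.05 = 100 s.
elapsed = asyncio.run(main())
```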


Cooperative multitasking is much easier to implement and administer than preemptive multitasking, and always has been. But there are cases where it isn't good enough, and if you hit those then you need a system that can do preemptive multitasking gracefully - which often means you end up with just as much complexity as if you'd used preemptive multitasking from the start, but with the complex parts being less well-tested.
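The cooperative/preemptive distinction can be made concrete with a toy round-robin scheduler over Python generators (every name here is illustrative): each task runs until it hands control back with `yield`, so a task that never yields would starve all the others, which is exactly the failure mode preemption exists to prevent.

```python
from collections import deque

def run_cooperatively(tasks):
    """Toy cooperative scheduler: round-robin over generators.

    Each generator runs until its next `yield`. The scheduler never
    interrupts a task; a task stuck in a loop with no `yield` point
    would hang every other task in the system.
    """
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))   # run the task until it yields
            queue.append(task)         # well-behaved: reschedule it
        except StopIteration:
            pass                       # task finished
    return trace

def worker(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"  # cooperative yield point

# Round-robin interleaving: ['a:0', 'b:0', 'a:1', 'b:1']
trace = run_cooperatively([worker("a", 2), worker("b", 2)])
```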


> But there are cases where it isn't good enough, and if you hit those then you need a system that can do preemptive multitasking gracefully ...

What are some of those use cases where userland threads are no longer good enough? In what areas do they fall short?


Essentially any time you have to run something that can't be trusted not to block a thread - which could be user-supplied code (or "code": matching a regex is unsafe in most backtracking implementations, and rendering PostScript is famously Turing-complete) or just a third-party dependency.
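The regex point is easy to demonstrate with Python's backtracking `re` engine; the pattern below is a standard catastrophic-backtracking example, not from the comment:

```python
import re

# Nested quantifiers like (a+)+ force a backtracking engine to try
# exponentially many ways to split the run of 'a's once the match fails.
pattern = re.compile(r"(a+)+$")

# 20 'a's followed by a 'b': no match, but the engine explores on the
# order of 2**20 paths before giving up. Add ten more 'a's and this one
# call can pin a thread for minutes -- "just matching a regex" is enough
# to stall a cooperative scheduler.
result = pattern.match("a" * 20 + "b")
```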

At my first job we had a prototype that performed 2x faster (on average) using Go-style async, but we couldn't trust our libraries not to block the dispatcher threads. So we stuck with traditional multithreading.


It's all true, and yet most webservers were like that 20 years ago - and they still managed to run even fairly high-traffic websites on hardware much less powerful than what we have today. I would argue that >90% of the web doesn't really need the extra throughput that async gives you at the cost of extra complexity.



