
I find "distributed systems" to be a huge source of imposter syndrome. Despite having worked almost exclusively with distributed applications for several years now, it is difficult to consider myself experienced. When I'm asked if I've worked with distributed systems, I don't think they are asking me if I've managed a Hadoop cluster. They are interested in building new applications using some of the primitives discussed in this post. All of these links are great, but the fact is that building and operating tools like this is hard. In addition to consensus primitives, your system may need very precise error handling, structured logging, distributed tracing, resource monitoring, schema evolution, etc. In the end, I probably pause for a second too long when answering that question, but I don't think it's because of a lack of experience, quite the opposite!


Do you know what the CAP theorem is, and can you explain it to me like I'm 5? Can you tell me how a SQL DB fits into it, and where something like DynamoDB fits into it?

Congratulations, you are better than 95% of the people that I've interviewed out there saying they are experienced building distributed systems. Including system/solution architects.
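For concreteness, here's a toy two-replica sketch of the choice CAP forces during a network partition. All class and method names here are invented for illustration; this is not any real database's API.

```python
# Toy illustration of the CAP trade-off during a partition.
class Replica:
    def __init__(self):
        self.data = {}

class CPStore:
    """Consistency over availability: refuse writes it cannot replicate."""
    def __init__(self, a, b):
        self.a, self.b = a, b
        self.partitioned = False

    def write(self, key, value):
        if self.partitioned:
            # Gives up availability rather than let replicas diverge.
            raise RuntimeError("unavailable: cannot reach other replica")
        self.a.data[key] = value
        self.b.data[key] = value

class APStore:
    """Availability over consistency: accept writes locally, replicas may diverge."""
    def __init__(self, a, b):
        self.a, self.b = a, b
        self.partitioned = False

    def write(self, key, value):
        self.a.data[key] = value      # always succeeds locally
        if not self.partitioned:
            self.b.data[key] = value  # replication is skipped during a partition

cp = CPStore(Replica(), Replica())
cp.partitioned = True
try:
    cp.write("k", 1)
except RuntimeError as e:
    print(e)  # CP side: the write is rejected outright

ap = APStore(Replica(), Replica())
ap.partitioned = True
ap.write("k", 1)
print(ap.a.data.get("k"), ap.b.data.get("k"))  # -> 1 None (stale replica)
```

A traditional single-primary SQL DB leans toward the CP shape (unreachable means unavailable), while DynamoDB's eventually consistent reads lean toward the AP shape.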


Sad, but true. That said... the hiring pool tends to be biased towards people that others have passed on.


Sure, but the OP was talking about their difficulty during interviews; I'm just saying, they're almost assuredly better than the majority of the rest of the hiring pool. The company has an opening they need filled; they can only fill it with people from the hiring pool, not those who aren't in the hiring pool.


I'm realizing I've only worked in distributed systems as well, but I'd never feel comfortable telling potential employers I'm an expert. Being an expert in distributed systems seems almost too broad. At a high level couldn't it be expertise at integration, accessible logging, and configuration?


Distributed systems research has been going on since the '70s, and Unix Neckbeards have probably forgotten more about it than we have learned, so actually I think impostor syndrome is a bit warranted with them.

The actual hard stuff is not even these papers, it's the implementations that are way more complex than some algorithm or architectural pattern. Anyone who says "X is better than Y" is fooling themselves because it's only the implementation context that matters.

The only thing you can say for certain is that reducing the amount of components and complexity in the system often results in better outcomes.


> The only thing you can say for certain is that reducing the amount of components and complexity in the system often results in better outcomes.

No, there are a few other things that you can say for certain:

Watch out for positive-only feedback loops; you absolutely need negative feedback as well - or only negative. E.g. exponential back-off.
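As a minimal sketch of that negative feedback, here's exponential back-off with full jitter (the function name and parameters are made up for this example; real clients usually also sleep and retry):

```python
import random

def backoff_delays(base=0.1, cap=10.0, attempts=6, rng=random.random):
    """Exponential back-off with full jitter.

    The retry window doubles each attempt (base * 2^n) up to a cap, and the
    actual delay is a random fraction of that window, so a crowd of failing
    clients doesn't retry in lockstep and re-stampede the server.
    """
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng() * ceiling)
    return delays

# With jitter pinned to 1.0, you can see the windows themselves:
print([round(d, 3) for d in backoff_delays(rng=lambda: 1.0)])
# -> [0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
```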

Sometimes, you just need a decentralized solution, rather than a distributed one, and you don't have to have the same answer at every scale (eg. distributed intra-datacenter, decentralized inter-datacenter, or vice-versa).

Loose coupling is your friend.

Sure, add an extra layer of indirection, but you probably need to pay more attention to cache invalidation than you think.
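To illustrate how that extra layer goes stale, here is a toy read-through cache in front of a plain dict standing in for a database (all names invented for this sketch):

```python
class Cache:
    """Read-through cache over a 'database' dict. Writes that bypass the
    cache leave readers seeing the old value until someone invalidates."""
    def __init__(self, db):
        self.db = db
        self.store = {}

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.db[key]  # fill on miss
        return self.store[key]

    def invalidate(self, key):
        self.store.pop(key, None)

db = {"price": 10}
cache = Cache(db)
print(cache.get("price"))  # -> 10 (now cached)
db["price"] = 12           # write path bypasses the cache
print(cache.get("price"))  # -> 10 (stale!)
cache.invalidate("price")
print(cache.get("price"))  # -> 12
```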

Throughput probably matters more than latency.

Reducing the size/number of writes will probably help more than trying to speed them up.

Multi-tenancy is a PITA for systems in general, and distributed ones are no exception (aside: there is probably a huge business for multi-tenancy-as-a-service, if anyone manages to solve it in a general-purpose way), but a series of per-customer single-tenant deployments may be worse, especially if they are all on different versions of the code. Here be dragons.

Don't overthink it. Start with a naive implementation and go from there (see loose coupling above).


Maybe there are some other things you can say for certain. But as to some of your points:

> Watch out for positive-only feedback loops, you absolutely need negative feedback as well - or only. Eg. exponential back-off.

Agreed it may need negative feedback, but I'm not sure about always.

If your service has a latency SLA, exponential back-off might kill your SLA (depending on the wording and where the back-off sits). The fix is to soft-reject requests (send an RST rather than silently dropping packets) when you can't meet the demand. This change may allow you to meet your SLA if it's written to prioritize low latency over availability.

This is its own negative feedback loop, but change from sending RSTs to silently dropping and you no longer have the feedback.
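The soft-reject idea above can be sketched as admission control that fails fast instead of queueing (names and the concurrency limit are invented for this example):

```python
import threading

class LoadShedder:
    """Fast-fail admission control: once in_flight hits the limit, new
    requests are rejected immediately instead of queueing, so latency for
    the requests we do admit stays bounded."""
    def __init__(self, limit):
        self.limit = limit
        self.in_flight = 0
        self.lock = threading.Lock()

    def try_admit(self):
        with self.lock:
            if self.in_flight >= self.limit:
                return False  # reject now; caller gets an error, not a timeout
            self.in_flight += 1
            return True

    def release(self):
        with self.lock:
            self.in_flight -= 1

shedder = LoadShedder(limit=2)
results = [shedder.try_admit() for _ in range(3)]
print(results)  # -> [True, True, False]
```

The caller turns that `False` into an explicit rejection (the RST analogue); silently swallowing it would destroy the feedback, as noted above.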

> Sometimes, you just need a decentralized solution, rather than a distributed one

Agreed

> Loose coupling is your friend.

Until it isn't? :)

> add an extra layer of indirection, but you probably need to pay more attention to cache invalidation

Fixes for additional layers tend to increase system complexity compared to fixes for fewer layers.

> Throughput probably matters more than latency.

Until it doesn't :)

> Reducing the size/number of writes will probably help more than trying to speed them up.

Depending on 20 different things... You really have to account for all the system's limits (and business use cases) and find the solution that matches the implementation needs.

> there is probably a huge business for multi-tenancy-as-a-service

Sure, it's called EKS :-) Just build more clusters... Don't worry, we'll bill you...

> Don't overthink it

Yes and no; Yes, in that there will always be unknowns. But no, in that often improvements in communication will provide better solutions without extra work. Think smarter, not harder!


Every maxim has caveats and exceptions. Including this one.


Yes. The area is fairly broad. In my opinion, the question to ask is "Do you have the distributed systems mindset?", not "Are you an expert on distributed systems?"


It is definitely a broad term, and I think that disciplined implementation of the things that you mentioned is the real key. It just isn't as exciting to talk about.


I don't trust anyone who comes back with quick answers without any hesitation or pause. I remember a lecture from a while ago, I can't remember who was giving it but they were someone well known, where they asked people to implement bubble sort. The results that came back ranged from something like 20 lines to 2000 lines and every single one had bugs. I don't know how anyone who works in this industry for any length of time doesn't follow up every answer with, "...but there's probably some problem that I'm not thinking of at the moment".


I feel interviews are to blame. The interview is a place where hedging like that is often not OK, which signals to folks that quick, confident answers are what's expected day to day.


I wonder if this source of imposter syndrome applies to any field where the job roles and success criteria are not clearly defined and vary from company to company (e.g. data analyst, product owner, distributed systems developer).


Explain concurrency like I am five and give a concurrency example problem and solution.
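One stock answer to that prompt, sketched in Python: two kids both "add one" to the same whiteboard number; each reads it, adds one in their head, and writes it back, so simultaneous reads lose an update. The lock is "only one kid holds the marker at a time." (A toy example, not a production pattern.)

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def increment_unsafely(self):
        v = self.value      # read...
        self.value = v + 1  # ...write: another thread may have written in between

    def increment_safely(self):
        with self.lock:     # mutual exclusion: read+write become one step
            self.value += 1

c = Counter()
threads = [
    threading.Thread(target=lambda: [c.increment_safely() for _ in range(10_000)])
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(c.value)  # -> 40000 every time with the lock; the unsafe version can lose updates
```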


Distributed systems, in my experience, most often fall prey to "not invented here" syndrome.


great time to bring up the Dunning-Kruger phenomenon in the interview lol


hahaha

for the uninitiated,

The Dunning–Kruger effect is a cognitive bias in which people with low ability at a task overestimate their ability.



