alixanderwang 5 days ago

> I’m often alone on this. Engineers look at complex systems with many interesting parts and think “wow, a lot of system design is happening here!” In fact, a complex system usually reflects an absence of good design.

For any job-hunters, it's important you forget this during interviews.

In the past I've made the mistake of trying to convey this in system design interviews.

Some hypothetical startup app

> Interviewer: "Well what about backpressure?"

> "That's not really worth considering for this amount of QPS"

> Interviewer: "Why wouldn't you use a queue here instead of a cron job?"

> "I don't think it's necessary for what this app is, but here's the tradeoffs."

> Interviewer: "How would you choose between sql and nosql db?"

> "Doesn't matter much. Whatever the team has most expertise in"

These are not the answers they're looking for. You want to fill the whiteboard with boxes and arrows until it looks like you've got Kubernetes managing your Kubernetes.

mkozlows 5 days ago | parent | next [-]

(For context, I've conducted hundreds of system design interviews and trained a dozen other people on how to do them at my company. Other interviewers may do things differently or care about other things, but I think what I'm saying here isn't too far off normal.)

I think three things about what you're saying:

1. The answers you're giving don't provide a lot of signal (the queue one being the exception). The question that's implicitly being asked is not just what you would choose, but why you would choose it. What factors would drive you to a particular decision? What are you thinking about when you provide an answer? You're not really verbalizing your considerations here.

A good interviewer will pry at you to get the signal they need to make a decision. So if you say that back-pressure isn't worth worrying about here, they'll ask you when it would be, and what you'd do in that situation. But not all interviewers are good interviewers, and sometimes they'll just say "I wasn't able to get much information out of the candidate" and the absence of a yes is a no. As an interviewee, you want to make the interviewer's job easy, not hard.

2. Even if the interviewer is good and does pry the information out of you, they're probably going to write down something like "the candidate was able to explain sensibly why they'd choose a particular technology, but it took a lot of prodding and prying to get the information out of them -- communications are a negative." As an interviewee, you want to communicate all the information your interviewer is looking for proactively, not grudgingly and reluctantly. (This is also true when you're not interviewing.)

3. I pretty much just disagree on that SQL/NoSQL answer. Team expertise is one factor, but those technologies have significant differences; depending on what you need to do, one of them might be way better than the other for a particular scenario. Your answer there is just going to get dinged for indicating that you don't have experience in enough scenarios to recognize this.

philjohn 5 days ago | parent | next [-]

+1 on the signal. A great candidate won't need you to pry further than asking about backpressure, they'll explain WHY it's not necessary for the qps, what qps it would start becoming necessary, and how they would build it into their design down the line if the service takes off.

One of the things I tell people preparing for system design interviews is the more senior you are, the more you need to drive the interview yourself, knowing when to go deep, what to go deep on, and how to give the most signal to the interviewer.

cryptonector 5 days ago | parent [-]

As the interviewee I'd use such a question to demonstrate my knowledge of the topic. For this question I'd point out that thread-per-CPU w/ async I/O designs w/ maximum live connections and clever connection pool acceptance & eviction policies, and limited buffering, together intrinsically limit oversubscription, but I would still talk extensively about health monitoring and external circuit breaking, as well as the use of 429/whatever and/or flow control (where available) at the protocol level to express backpressure. I would then use this to harp on the evils of thread-per-client designs, and also why they happen, as well as the various alternatives.

Make the interviewer tell you when they've had enough and change topics.

iamcreasy 5 days ago | parent | next [-]

I beg you to write an article expanding on these points. I'll pay to read this article.

cryptonector 5 days ago | parent [-]

Really!

I've written about this in my comments here...

Here's a brief summary:

- typical thread-per-client programming is terribly wasteful because it needs large stacks that must be able to grow (within reason), and this leads programmers to smear client state all over the stack, which then means that the memory and _cache_ footprint of per-client state is huge even though the state is highly compressible, and this is what reduces efficiency when you want to C10K (serve 10,000 clients), and this is what led to C10K techniques in the 90s

- you can most highly compress said per-client program state by using continuation passing style (CPS) async I/O programming, either hand-coded or using modern async functions in languages that have them -- this approach tends to incentivize the programmer to compress program state into a much smaller structure than an execution stack, which therefore greatly reduces the per-client memory and _cache_ footprint of the program
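
A minimal sketch of what "compressing per-client state" can look like, assuming a toy line-oriented protocol that replies with each line upper-cased. `ClientState` and `on_readable` are hypothetical names, not from any framework. The point is only that everything remembered between events lives in one small object rather than on a thread's stack.

```python
class ClientState:
    """Everything the server remembers about one client between events."""
    __slots__ = ("pending",)  # no per-instance dict: the state stays tiny

    def __init__(self) -> None:
        self.pending = b""  # bytes received that do not yet form a full line


def on_readable(state: ClientState, data: bytes) -> list[bytes]:
    """Continuation run by the event loop when a client's socket has data.

    Returns the replies to write. Nothing is kept on the call stack once this
    returns; whatever is needed next time lives in `state`.
    """
    state.pending += data
    # Everything before the last newline is complete; the tail is kept for later.
    *lines, state.pending = state.pending.split(b"\n")
    return [line.upper() + b"\n" for line in lines]
```

An event loop would keep one `ClientState` per connection and call `on_readable` whenever bytes arrive, so 10,000 idle clients cost 10,000 small objects instead of 10,000 stacks.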

Note that reducing the memory and cache footprint of a program also reduces its memory bandwidth footprint, and increases cache locality and reduces latency. This means you can serve more clients with the same hardware. THIS is the point of C10K techniques.

All other techniques like fibers and green threads sit on the spectrum from hand-coded CPS async I/O programming to thread-per-client sequential programming. You get to pick how efficient your code is going to be.

Now, when you apply C10K techniques you get to have one thread-per-CPU -- look ma'! no context switches -- which also improves latency and efficiency, naturally. But there's another neat thing to thread-per-CPU: it becomes easier to manage the overall health of the service, at least per-CPU, because you can now manage all your connected clients, and so you can have eviction policies for connections, and admittance policies too. In particular you can set a maximum number of connections, which means that you can set maxima that only slightly oversubscribe the hardware's capabilities.
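
A rough sketch of per-CPU admittance and eviction policies, assuming the per-core capacity has already been measured somehow. `ConnectionTable`, the oversubscription factor, and the idle-timeout policy are illustrative choices, not something prescribed above.

```python
from collections import OrderedDict


class ConnectionTable:
    """Per-CPU table of live connections with admittance and eviction policies."""

    def __init__(self, capacity: int, oversubscription: float = 1.2) -> None:
        # capacity: connections this core can comfortably serve (assumed measured).
        # The cap only slightly oversubscribes that capacity.
        self.max_connections = int(capacity * oversubscription)
        self._last_active = OrderedDict()  # conn_id -> last activity time, oldest first

    def touch(self, conn_id: str, now: float) -> None:
        """Record activity so the connection moves to the most-recent end."""
        self._last_active[conn_id] = now
        self._last_active.move_to_end(conn_id)

    def evict_idle(self, now: float, idle_timeout: float) -> list[str]:
        """Eviction policy: drop connections idle for longer than idle_timeout."""
        evicted = []
        while self._last_active:
            conn_id, seen = next(iter(self._last_active.items()))
            if now - seen <= idle_timeout:
                break  # everything after this one is more recent
            del self._last_active[conn_id]
            evicted.append(conn_id)
        return evicted

    def admit(self, conn_id: str, now: float) -> bool:
        """Admittance policy: refuse new connections once the cap is reached."""
        if len(self._last_active) >= self.max_connections:
            return False
        self.touch(conn_id, now)
        return True
```

With one table per thread-per-CPU worker there is no locking to think about, which is part of why this kind of bookkeeping gets easier in that design.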

Otherwise [and even if you do the thread-per-CPU thing, though it's less important to have circuit breakers if you do] you must have some way to measure the health of your service, and you need to monitor it, and you need your monitor to be able to "break the circuit" by telling your service to start rejecting new work. This is where HTTP status 429 and similar come into play -- it's just a way to express backpressure to clients, though flow control will also do, if you have that available to exercise. You'll still need to be able to monitor load, latencies, and throughput for thread-per-CPU services, naturally, so you know when you need to add HW. And of course you'll want to build services you can scale horizontally as much as possible so that adding hardware is easy, though too you need to be able to find and stop pathological clients (dealing with DoS and DDoS almost always requires components external to your services).
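
One way the monitor-plus-circuit-breaker idea could look, as a sketch only. The health signal here is assumed to be a windowed mean latency, and `CircuitBreaker` and `handle` are hypothetical names. A real service would likely combine several signals (load, queue depth, error rate).

```python
from collections import deque


class CircuitBreaker:
    """Trips when recent latency shows the service is unhealthy."""

    def __init__(self, latency_limit_s: float, window: int = 100) -> None:
        self.latency_limit_s = latency_limit_s
        self._samples = deque(maxlen=window)  # most recent request latencies

    def record(self, latency_s: float) -> None:
        """Called by the monitor for every completed request."""
        self._samples.append(latency_s)

    @property
    def open(self) -> bool:
        """Open (rejecting) when the windowed mean latency exceeds the limit."""
        if not self._samples:
            return False
        return sum(self._samples) / len(self._samples) > self.latency_limit_s


def handle(breaker: CircuitBreaker) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a new request; no real work is done here."""
    if breaker.open:
        # 429 Too Many Requests plus Retry-After tells clients to back off.
        return 429, {"Retry-After": "1"}
    return 200, {}
```

Rejecting at the front door is cheap, which is what leaves CPU available for health and diagnostics endpoints while the service is under pressure.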

Make sure all middleware can respond appropriately to backpressure, including having their own circuit breakers, and you have a pretty resilient system -- one that under extreme pressure will be able to shed load and continue to function to some degree.

You'll need to be able to express client priorities (so that you can keep servicing part of your load) and quotas (so that pathological high-priority clients don't DoS you).
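
A small sketch of how priorities and quotas might combine when shedding load. The integer priority scale, the fixed counting window, and the `Shedder` name are all assumptions for illustration.

```python
from collections import Counter


class Shedder:
    """Decides which requests to serve under load, by priority and quota."""

    def __init__(self, quota_per_window: int) -> None:
        self.quota_per_window = quota_per_window
        self._used = Counter()  # requests served per client in this window

    def reset_window(self) -> None:
        """Start a new quota window (called periodically by the service)."""
        self._used.clear()

    def allow(self, client: str, priority: int, min_priority: int) -> bool:
        """min_priority rises as load rises; lower-priority work is shed first."""
        if priority < min_priority:
            return False  # shed by priority
        if self._used[client] >= self.quota_per_window:
            return False  # even high-priority clients cannot exceed their quota
        self._used[client] += 1
        return True
```

Under pressure the service raises `min_priority` so it keeps serving part of its load, while the quota stops a single high-priority client from consuming everything that remains.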

There's much more to say, naturally.

BTW, I checked today and LLMs seem to know all this stuff, and more than that they'll be able to point you to frameworks, blogs, and lots of other docs. That said, if you don't prompt them to tell you about the thread-per-CPU stuff, they won't.

Keep in mind that C10K techniques are expensive in terms of developer time, especially for junior developers.

belZaah 5 days ago | parent | next [-]

There’s also a systems level rationale to this. Without good isolation, you’ll get a feedback loop: threads start to step on each other’s toes. This leads to slower response times. Which, at a given request pressure, leads to more parallel threads. Which slows them down even more. If a brief peak in pressure degrades the response time past a critical point, such a system will never recover on its own, and you’ll get a server furiously computing for no apparent reason, only to behave normally after a restart.
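
A toy model of that feedback loop, as a sketch under made-up assumptions: each request is assumed to take `base_s * (1 + contention * n**2)` seconds when `n` requests are in flight. The quadratic contention term and every parameter value are invented for illustration, and `simulate` is a hypothetical helper.

```python
def simulate(arrival_rate: float, in_flight: float, steps: int = 2000,
             base_s: float = 0.01, contention: float = 1e-4,
             dt: float = 0.001) -> float:
    """Euler-integrate the number of in-flight requests; return the final value.

    Assumed model: with n requests in flight each one takes
    base_s * (1 + contention * n**2) seconds, so the completion rate is
    n divided by that service time.
    """
    n = in_flight
    for _ in range(steps):
        service_time = base_s * (1 + contention * n * n)
        completions_per_s = n / service_time
        n = max(0.0, n + (arrival_rate - completions_per_s) * dt)
    return n


# Same arrival rate, different starting points.
calm = simulate(arrival_rate=3000, in_flight=0)           # settles at a low level
after_spike = simulate(arrival_rate=3000, in_flight=400)  # keeps growing
```

With these parameters the model has a stable operating point near 33 in-flight requests and a tipping point near 300. Once a spike pushes concurrency past the tipping point, completions fall below arrivals and the backlog grows even though the arrival rate is back to normal, which matches the "only a restart fixes it" behavior described above.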

cryptonector 4 days ago | parent | next [-]

Yes, and thus circuit breakers. By sizing offered capacity to some factor of actual capacity you can limit the effects of too much demand to causing backpressure naturally (rejecting requests) instead of timeouts and retries. This then allows you some level of access -- such as to your health and diagnostics end-points, because CPU usage doesn't become so high that you can't even run those.

jiggawatts 4 days ago | parent | prev [-]

Most systems cap the size of their thread pool and put excess requests into a queue.
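
A sketch of a capped pool with a queue in front of it, using only the Python standard library. Bounding the queue and rejecting when it is full is an added assumption here (the comment above only says excess requests are queued), and `BoundedPool` is a hypothetical name.

```python
import queue
import threading


class BoundedPool:
    """Fixed number of worker threads fed by a bounded queue."""

    def __init__(self, workers: int, queue_size: int) -> None:
        self._tasks = queue.Queue(maxsize=queue_size)
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def _run(self) -> None:
        while True:
            task = self._tasks.get()
            try:
                task()
            except Exception:
                pass  # a real pool would log or report this
            finally:
                self._tasks.task_done()

    def submit(self, task) -> bool:
        """Queue the task, or return False when the queue is full."""
        try:
            self._tasks.put_nowait(task)
            return True
        except queue.Full:
            return False  # the caller can turn this into a 429 or similar

    def join(self) -> None:
        """Block until every accepted task has finished."""
        self._tasks.join()
```

An unbounded queue would cap the thread count but let waiting time grow without limit, so the bound is what turns overload into an explicit rejection.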

jiggawatts 4 days ago | parent | prev [-]

Good summary of the theory, but the weird thing is that every time I’ve rewritten code to use async the total throughput went down by about 10%… which is what I estimate is the overhead introduced by the compiler-generated async state machinery.

I’m yet to see a convincing set of A/B comparisons from a modern language. My experiences don’t line up with the conventional wisdom!

cryptonector 3 days ago | parent [-]

That could be because you're still smearing state on the stack? With async functions one can do that, and so you still have stacks/fibers/threads, and so you've not gained much.

With a CPS approach you really don't have multiple stacks.

Oh, and relatedly the functional core, imperative shell (FCIS) concept comes in here. The imperative shell is the async I/O event loop / executor. Everything else is functional state transitions that possibly request I/O, and if you represent I/O requests as return values to be executed by the executor, then you can have those state transitions be functional. The functional state transition can use as much stack as it wants, but when it's done the stack is gone -- no stack use between state transitions.

Now naturally you don't want state transitions to have unbounded CPU time, but for some applications it might have to be the case that you have to allow it, in which case you have problems (gaaah, thread cancellation is such a pain!).

The point of FCIS is to make it so it's trivial to test the state transitions because there is nothing to mock except one input, one state of the world, and check the output against what's expected. The "imperative shell" can also be tested with a very simple "application" and setup to show that it works w/o having to test the whole enchilada with complex mockup setups.
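
A minimal sketch of the functional core, imperative shell split, assuming a toy greet-then-echo protocol. `Session`, `Write`, `Close`, `step`, and `run` are hypothetical names. The core returns I/O requests as plain values and the shell is the only code that performs them.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Session:
    """Per-client state between events."""
    greeted: bool = False


@dataclass(frozen=True)
class Write:
    """An I/O request returned as data for the shell to execute."""
    payload: bytes


@dataclass(frozen=True)
class Close:
    """Ask the shell to close the connection."""


def step(state: Session, line: bytes) -> tuple[Session, list]:
    """Functional core: (state, input) -> (new state, I/O requests). No side effects."""
    if line == b"QUIT":
        return state, [Write(b"bye\n"), Close()]
    if not state.greeted:
        return replace(state, greeted=True), [Write(b"hello\n")]
    return state, [Write(line + b"\n")]


def run(lines, send, close) -> None:
    """Imperative shell: the only place where I/O is actually performed."""
    state = Session()
    for line in lines:
        state, requests = step(state, line)
        for req in requests:
            if isinstance(req, Write):
                send(req.payload)
            elif isinstance(req, Close):
                close()
                return
```

Testing `step` needs no mocks: construct a `Session`, pass one input, and compare the returned pair with the expected value. The shell can be exercised separately with a list of lines and two plain callables.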

dennis_jeeves2 3 days ago | parent | prev [-]

Lol

Juliate 4 days ago | parent | prev | next [-]

What you're describing is how the interview process _is_ disconnected from the actual needs, and how it's good to literally "play the game" to get in.

But on the other side, that kind of interview process is itself a signal that candidates might use to avoid playing the game: for most (not all, of course) companies, probing for the wrong signals during the interview process is indicative of how they function as a whole.

(been in both types of companies, and in both sides of the table)

elliotto 5 days ago | parent | prev | next [-]

OP has stated that the system design interview exists as a way to performatively answer a set of questions that don't relate to actually being a good system designer. In your response, you have claimed that their proposed truthful answers don't give a lot of 'signal' and that you would prefer candidates to engage with the performative nature of the process. This is true - but is not an argument against the OP's claim, which is that reciting a set of facts about system design in an interview != being a good system designer.

jackblemming 5 days ago | parent | prev [-]

All that “signal” nonsense can be parroted by both an LLM and someone who read “how to pass system interviews”. Yea, great “signal”.

cryptonector 4 days ago | parent [-]

Not really, not in live, oral interviews.

Though I once had a case of the person we thought we were hiring and the person we got being different people. The fix for that is to always have one final in-person interview.

_fat_santa 5 days ago | parent | prev | next [-]

This goes back to "interviews go both ways". All those answers you gave are very reasonable and if I was your interviewer I'd pass you with flying colors. On the other hand if you're interviewing at a place that doesn't pass you with flying colors for those responses, that really says more about them than it does about you and may not be a great place to work.

But to your point, many times when one interviews for a job, they don't really have the luxury of getting rejections and need to land somewhere fast so they can keep paying the mortgage. So while yes, interviewing is a two-way street, there's still quite a bit of calibration to make sure you land on the other person's side of the street, so to speak.

atomicnumber3 5 days ago | parent | next [-]

If I was your interviewer, I would: respect your answers a lot, not be able to check off anything on my rubric, try to explain this in the debrief, get told we have to stick to the rubric to counter bias, and then watch while they pass on you for someone who decided to play architecture jenga instead. I would potentially even consider emailing you to apologize later, then not do it because I'd probably get in trouble for exposing us to liability or something because apologizing can be construed as admission of guilt.

yojo 5 days ago | parent | next [-]

If a candidate doesn’t ask clarifying questions that lead them to an understanding of QPS, storage requirements, and throughput considerations, that’s a mark against.

At that point, if you want to see them design a distributed system with all the bells and whistles, you should stop them, tell them the kind of traffic they need to handle, then let them go again.

If they persist in designing a system that cannot handle the specified load, they have probably failed the interview.

ndriscoll 5 days ago | parent | next [-]

The problem with this is people seem to have mismatched understandings of what a single system can handle. e.g. my 8 year old quad core i5 desktop with a bit of batching optimization can handle 5 digit requests per second with 15 ms p99 with some nontrivial application logic doing several joins. I don't think I've tried that same benchmark on a modern minipc, but I expect it should be similar. That's well above what most companies will ever need to handle. Visa advertises they can process ~70k tps worldwide.

Last time I interviewed I was asked about designing a system to handle 10s of thousands of events per minute, and if you thought about the problem a little you'd realize most of them didn't require real work to be done. I answered something along the lines of "you don't need to do anything special. Just normal postgres/mysql usage can handle more than that on a laptop". After I got hired I learned the rubric had some expected answers about queues (e.g. Kafka) in it. No idea why still.
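
The back-of-envelope conversion behind that answer, as a quick sketch. Reading "10s of thousands of events per minute" as 10,000 to 90,000 is an assumption, and `per_second` is just a helper for the arithmetic.

```python
def per_second(events_per_minute: float) -> float:
    """Convert a per-minute rate to a per-second rate."""
    return events_per_minute / 60


# "10s of thousands of events per minute": 10,000 to 90,000 is an assumed reading.
low, high = per_second(10_000), per_second(90_000)
print(f"{low:.0f} to {high:.0f} events per second")  # prints "167 to 1500 events per second"
```

Even the top of that range is well below the five-digit requests per second quoted above for a single old desktop.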

sgarland 5 days ago | parent | next [-]

Because web devs are so used to terrible design, poorly-optimized DB schemas, and networked storage latency that they have no idea what a single server (or indeed, a humdrum desktop) is capable of.

Like when I inform teams complaining of “slow queries” that the DB is executing them in sub-msec time. No idea what the rest of your stack is doing, but good luck with figuring that out - it ain’t me.

jiggawatts 4 days ago | parent | prev [-]

“Prove that you can apply solutions to yesterday’s problems today.” is a good strategy except in industries where today is exponentially different to yesterday.

gopher_space 5 days ago | parent | prev [-]

I’m also going to need a dollar value on your data and a list of consequences. We will spend our allotted time together in Excel.

willio58 5 days ago | parent | prev | next [-]

I’ve interviewed dozens of people and while I rarely do system design questions and our process isn’t nearly as check-all-the-boxes, it’s funny how accurate your comment still is. Near the later stages especially, politics starts coming in.

belinder 5 days ago | parent | prev [-]

Exactly, it would only work if you have enough sway with your boss and the willingness to take responsibility for the hire

nostrademons 5 days ago | parent | prev | next [-]

If I were the interviewer, I'd try to adjust the problem statement with some hypotheticals to tease out their depth of knowledge:

> "That's not really worth considering for this amount of QPS"

"What if Michael Jackson dies and your (search|news|celebrity gossip) service gets a spike in traffic way beyond the design parameters? How would you anticipate and mitigate such an event?"

(Extra points if the answer is not necessarily backpressure but they start talking about DDoS mitigation, outlier detection, caching or serving static results from extremely-common queries, spinning up new capacity to adjust to traffic spikes, blackholing traffic to protect the overall service, etc.)

> Interviewer: "Why wouldn't you use a queue here instead of a cron job?" "I don't think it's necessary for what this app is, but here's the tradeoffs."

"What if you have a subset of customers that demand faster responses than a cron job can provide?"

(And then that can become a discussion about splitting off traffic based on requirements, whether it's even worth adding the logic to split traffic vs. just using a queue for everyone, perhaps making direct API requests without either a queue or cron job for requests from just those customers, relying on the fact that they are not numerous or these requests are infrequent to trade capacity for latency, etc.)

> "How would you choose between sql and nosql db?"

I would've expected the candidate to at least be able to talk about indexing, tradeoffs of joining in the DB vs. in the application, schema migrations and upgrades, creating separation between data-at-rest vs. data-in-flight, etc. If they can't do that and just handwave away as "whatever the team is most comfortable with", that's a legit hole in their knowledge. Usually you ask system design interviews of senior candidates that will be deciding on architecture and, if not hiring out the team directly, providing input to senior managers who will be hiring, so you can swap out the team nearly as easily as swapping out the architecture.

tacitusarc 5 days ago | parent | next [-]

Exactly this. I don’t want someone who will design complex, bloated systems, but I DO want them to be able to articulate tradeoffs and reasons why various components might be useful.

throwawaythekey 5 days ago | parent | prev [-]

>I would've expected the candidate to at least be able to talk about indexing, tradeoffs of joining in the DB vs. in the application, schema migrations and upgrades, creating separation between data-at-rest vs. data-in-flight, etc.

The problem is that many of these trade-offs only applied to older databases. The more relevant axis is about how distributed the db is, the replication type etc.

neilv 5 days ago | parent | prev [-]

> that really says more about them than it does about you and may not be a great place to work.

If a really good "tech" engineer ruled out all the places that are bad at interviewing, they would probably be unemployed.

You have to look past bad interviewing practice, to some degree.

> there's still quite a bit of calibration to make sure you land on the other person's side of the street so to speak.

Exactly. But if they try to Leetcode you, you have to decide whether you have any self-respect at all, or you're all just playing house together.

uberduper 5 days ago | parent | prev | next [-]

This is awful advice. Simple and elegant design does not start with dismissing potential problems.

Those questions are all prompts to have a discussion in lieu of tech trivia hour. Those responses do not demonstrate wisdom, they reveal a lack of maturity. It's not the interviewer's fault you refuse to be interviewed.

titanomachy 5 days ago | parent [-]

I agree, the responses give the vibe of "your questions are dumb and I'm too smart to waste the effort to engage with them." If you don't want the job, then don't interview!

dondraper36 5 days ago | parent | prev | next [-]

Yes, and this is exactly why LinkedIn-driven development exists in the first place. Listing a million technologies looks much more impressive on paper to recruiters than describing how you managed to only use a modular monolith and a single Postgres instance to make everything work.

corytheboyd 5 days ago | parent | prev | next [-]

As well as the “two-way street” point made in a sibling comment, I feel like a good interviewer would say “this is great, I would keep it simple too, but I am testing your knowledge of $thing right now.” If the person won’t stop talking about the wrong thing, that’s a bad sign of course.

ramraj07 5 days ago | parent | prev | next [-]

Do you _want_ to work in these places? In my experience, if they expect you to run kube using kube in the interview, that's exactly what they do in their systems as well.

UK-AL 5 days ago | parent [-]

These are the places that actually pay well.

dondraper36 5 days ago | parent | next [-]

There's another reason for that. Deep in my heart, I would love to be part of a team that works on truly data-intensive applications (as Martin Kleppmann would call them) where all the complexity is justified.

For example, I am more of the "All you need is Postgres" kind of software engineer. But reading all those fancy blog posts on how some team at Discord works with 1 trillion messages with Cassandra and ScyllaDB makes me envious.

Also, it seems that to be hired by such employers you need to prove that you already have such experience, which is a bit of a catch-22 situation.

stavros 5 days ago | parent | next [-]

I feel like the phrase "all you need is Postgres" has the (often unspoken) continuation of "until you actually get to a trillion messages".

In other words, the developers you're envious of didn't start with Cassandra and ScyllaDB, they started with the problem of too many messages. That's not an architectural choice, that's product success.

dondraper36 5 days ago | parent [-]

Absolutely. To put it differently, unfortunately not everyone has a chance to be part of a product's organic evolution from "all we need is Postgres" to "holy crap, we're a success, what is Cassandra by the way?"

SatvikBeri 5 days ago | parent [-]

As a data point, I've been at two data-intensive startups where they eventually needed to pull (some) of their table-like data out of postgres, and for both that was past a $100MM valuation.

This varies by domain of course, but non-postgres solutions are generally built for very specific problems – they're worse than postgres at everything except one or two cases.

DanielHB 5 days ago | parent | prev [-]

Only places that are making good money can afford to have overengineering.

Overengineering is more prevalent the more money a company makes, and companies that overengineer will pay good money to keep the overengineering working.

no_wizard 5 days ago | parent [-]

Something about my old CTO and VP of Eng I respected is they were still technical enough to call out this kind of thing. For as big as that company was they really held down complexity and overengineering to a real minimum.

Unfortunately the rest of the executive team has leaned on them so hard about AI boosting productivity that they aren’t able to avoid that becoming a mess.

DanielHB 4 days ago | parent | next [-]

It is a shame that so many companies try to scale by just hiring a lot of people, the more people you have in a single project the more overengineering you will end up with.

Some of it is a consequence of managing so many individual contributors; I still believe a lot of companies use microservices more as a way to scale to more teams than to gain scalability/reliability/observability.

Some of it is just people coming up with clever solutions (and leaving after the fact) and a lot from resume-driven development.

jrs235 4 days ago | parent | prev [-]

In other words they believed in principles other than increasing personal power

paulddraper 5 days ago | parent | prev [-]

Usually out of necessity

Swizec 5 days ago | parent | prev | next [-]

> These are not the answers they're looking for.

These ARE the answers we are looking for. As the system design interviewer (I’ve done hundreds) I want you to start with these answers, then we can layer on complexity if you’ve solved the problem and there’s time left to go into navel gazing mode.

Seeing the panic slowly build in mid-level engineers’ eyes as it dawns on them that not every problem can be solved by caching is pretty fun too. “Ok cool you’ve cached it there, now how do you fill the cache without running into the same performance issue?”

Aurornis 5 days ago | parent | next [-]

> I want you to start with these answers then we can layer on complexity if you’ve solved the problem and there’s time left to go into navel gazing mode

Exactly. Part of the interview is explaining when and why these techniques are necessary as part of demonstrating your understanding.

If the candidate gives non-answers like “I don’t think it matters because you’re a startup” or “I’d just use whatever database I’m comfortable with” that’s not demonstrating knowledge at all. That’s dismissing the question in a way that leaves the interviewer thinking you don’t have that knowledge, or you don’t take their problems seriously enough to put thought into them. There is a type of candidate who applies to startups because they think nothing matters and they can YOLO anything together for a few years before moving on to the next job, and those are just as bad as the super over-engineering candidates.

The interview is your chance to show you know the topics and when to apply them, not the time to argue that the startup shouldn’t care about such matters.

Swizec 5 days ago | parent | next [-]

> The interview is your chance to show you know the topics and when to apply them, not the time to argue that the startup shouldn’t care about such matters.

A good way to answer these, I think, is some version of ”We probably won’t run into these issues at the scale we’re talking about, but when we run into A, B, C problems, we can try X, Y, Z solutions.”

This shows that you’re making a conscious tradeoff and know when the more complex solutions apply. Extra points if you can explain specifically how you’ll put measures in place to know when A, B, C happened and how you would engineer the system such that adding X, Y, Z is easy.

Also it looks amazing if you’re aware that vertical scaling can buy you a lot of time for comparably little money these days. Servers get up to 128 CPUs with 64TB of RAM on one machine :)

paulddraper 5 days ago | parent [-]

Right, and you might be small in $year but presumably you expect to grow and they don’t want to replace the team because they can’t think how to operate in any other circumstances.

throwawaythekey 5 days ago | parent | prev [-]

> Part of the interview is explaining when and why these techniques are necessary as part of demonstrating your understanding.

The slightly altered "explain when and why these techniques are *not* necessary" is much less appreciated.

nlawalker 5 days ago | parent | prev | next [-]

> I want you to start with these answers then we can layer on complexity if you’ve solved the problem and there’s time left to go into navel gazing mode.

Do you tell people this explicitly? If so, good on you; if not, please start! I think one of the biggest problems with interviews these days is misaligned expectations, particularly interviewees coming in assuming that what's desired is immediate evidence that they're so experienced in solving FAANG-scale problems that it's their default mode.

dondraper36 5 days ago | parent | next [-]

I believe even at FAANG-like companies, only a lucky minority is involved at that level of scale. Most developers just use the available infrastructure and tools without working on the creation of S3 or BigTable.

dmurray 5 days ago | parent [-]

This famous blog post [0] suggests that the default behaviour at Google at least is for everything to deal with massive scale. Doesn't mean everyone is involved in creating massive-scale infrastructure like S3 or BigTable, but it does mean using that kind of infrastructure from the start

[0] https://www.lesswrong.com/posts/koGbEwgbfst2wCbzG/i-don-t-kn...

Swizec 5 days ago | parent | prev | next [-]

> Do you tell people this explicitly?

Yes and no. I give them rough scale numbers to design for. Part of the interview is knowing why I’m telling you this.

no_wizard 5 days ago | parent [-]

Or asking to get there, I find that to also be acceptable

renewiltord 5 days ago | parent | prev [-]

At the level where this matters, the skill to figure it out from context is important. You aren’t the guy converting spec to code. You’re the spec maker.

nlawalker 5 days ago | parent [-]

I agree, but I think my point is that the interview context and expectations can differ radically from the role context, depending on the interviewer. If the expectation of the interviewer is that the interviewee should be asking questions to determine scale needs, then they should be explicit about that. For all the interviewee knows, you're going to ding them and ultimately fail them for asking too many questions and not exhibiting knowledge and experience.

Swizec 5 days ago | parent [-]

> For all the interviewee knows, you're going to ding them and ultimately fail them for asking too many questions and not exhibiting knowledge and experience.

I start the interview with “I am here in the role of PM and co-engineer so you can bounce ideas off of me and ask any questions”

Stakeholders won’t start their asks with “Please ask me questions to make sure you’re building the right thing”. Asking clarifying questions is a baseline expectation of the role

dondraper36 5 days ago | parent | prev [-]

This also happens because plenty of candidates learn the buzzwords and patterns without understanding the trade-offs and nuances. With a competent enough interviewer, the shallowness of knowledge can be revealed immediately.

Aurornis 5 days ago | parent [-]

Identifying candidates who repeat buzzwords without understanding tradeoffs is easy. It’s part of the questioning process to understand the tradeoffs.

The problem with the comment above is that it’s not discussing tradeoffs at all. It’s just jumping to conclusions and dodging any discussion of tradeoffs.

If you answer questions like that, it’s impossible to tell if the candidate is being wise or if they’re simply BSing their way around the topic and pretending to be smart about it, because both types of candidates sound the same.

It’s easy to avoid this problem by answering questions as asked and mentioning tradeoffs. Trying to dismiss questions never works in your favor.

dondraper36 5 days ago | parent [-]

Yes, I would probably phrase it like this. "Under the current load, I would go super simple and use X, which can work fine long enough until it doesn't. And then we can think about horizontal scaling and use Y and Z". Then proceed with a deeper discussion of Y and Z, probably.

After all, interviewing and understanding what your interviewer expects to hear is also a valuable skill (same as with your boss or client).

no_wizard 5 days ago | parent [-]

Even better would be to clarify: "Under the current load, and if the reasonably expected future load is similar, I would use X for Y reasons."

Sometimes the “trick” is that today's load is not tomorrow's.

didibus 5 days ago | parent | prev | next [-]

You're equating simplicity of the design with simplicity of the problem.

It's good not to over-engineer; over-engineering can be a cause of unneeded complexity. But when complexity is warranted, the ability to solve for it simply is also needed.

More importantly though, you haven't explained or rationalized why.

It's not needed for this QPS? Oh ya? Why not? What's your magic threshold? When would it be needed? How do you plan for the team to know that time is approaching? If it's needed later how would you retrofit it? Is that going to be a simple addition? How do you know the max QPS won't be too high and that traffic won't be spiky? What if a surprise incident occurred that caused the system to overload, how would your design, without backpressure, handle that, how would you mitigate and recover?

In system design there's no real right answer, as an interviewer you're looking for the candidate to demonstrate their ability to identify the point of concerns, reason through the possibilities, explain their decisions and trade offs, and so on.

zellyn 5 days ago | parent | prev | next [-]

I recently had an interview like this. Felt like half the answers I gave were of the form, “You can do scaling/sharding/partitioning thing X here, but once again, for an internal app I’d try really hard to avoid doing any of that”. If you’re interviewing with capable, experienced developers, they’ll appreciate that answer (at least, I got the offer on this one!)

reactordev 5 days ago | parent | prev | next [-]

Louder for the back.

It’s like people crave complexity because it makes them, indispensable? Like if you’re the only one who knows how the billing reconciliation service works, they couldn’t possibly fire you?

They will.

Being pragmatic is something I look for in engineers. So long as they understand where to draw the line (and use a queue instead of cron). However that’s usually several years away at this point and them being able to say “You don’t need that, all you need is…” is welcome. Then again, that’s probably why I got fired. :shrug:

9dev 5 days ago | parent [-]

I believe the reason is far more mundane: Complex systems are more interesting, with all the shiny knobs and levers and mysterious thingamabobs. Developers have a tendency to get nerd-sniped by interesting problems, and picking overly complex solutions to solve them at an abstract level scratches that itch very succinctly. In my experience, senior engineers learn to control this urge, and staff engineers can accurately decide when to break the rule and the complexity is warranted.

santiagobasulto 5 days ago | parent | prev | next [-]

I’ve been in software for 20 years and it’s the first time I hear “back pressure”. Am I too old already?

danhite 4 days ago | parent | next [-]

> I’ve been in software for 20 years and it’s the first time I hear “back pressure”. Am I too old already?

I first wrote code 50 years ago (I am 63yo) so yes, imo we are too old, but ...

It is worth noting that systems concepts/techniques often have analogues aka different names and histories in different fields and subfields.

If I were to "explain" back pressure to an ordinary person I might model my analogy to the logic of this ~classic joke:

Bob: Let's go to Trendio(TM) for dinner tonight!? Carol: Oh, nobody goes there anymore, it's too crowded!

Also, often a modern take-this-for-granted concept may be seen as an outgrowth of previous problems or solutions.

For example back pressure is conceptually adjacent to the clever~hack/design of random backoff in Ethernet.

Or if talking to a math geek or traffic planner you might relate it to ~modern understanding of congestion including oddities like possibly removing roads/routes to ~paradoxically improve traffic flow.

We are deep in the Information Age barreling towards Singularities, so none of us, young or old, see and understand but a tiny fraction of where we've been, are, or might be going.

Cue Calvin & Hobbes cartoon of us racing downhill in a fragile box.

Perhaps, as others have essentially suggested, merging your mind with an ~AI will help (albeit temporarily, imo). I prefer to think of us/greybeards as potentially Wise, yet, paradoxically, clueless.

Beginner's Mind, with likely no time/future for Mastery, is still potentially pleasant, and I would argue useful for Debugging.

Obviously this modern AI tsunami is phase shifting us all into debug~mode anyway, eh?

Aurornis 5 days ago | parent | prev | next [-]

Backpressure occurs at many levels, even down to a single machine doing something. If you ever have a producer and a consumer interacting and the consumer can’t consume as fast as the producer can produce, you need some way to have the producer pause or slow down until the consumer catches up. That’s back pressure.
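
A minimal single-machine sketch of that producer/consumer case, in Python. The function name `run_pipeline`, the buffer capacity, and the 1 ms sleep are illustrative assumptions, not anything from a real system. The bounded `queue.Queue` makes `put()` block once the buffer is full, which is what forces the producer to slow down to the consumer's pace.

```python
import queue
import threading
import time


def run_pipeline(items: int, capacity: int) -> tuple[int, int]:
    """Push `items` through a bounded buffer; return (consumed, max_depth)."""
    buffer: queue.Queue = queue.Queue(maxsize=capacity)
    consumed = 0
    max_depth = 0

    def consumer() -> None:
        nonlocal consumed
        while True:
            item = buffer.get()
            if item is None:       # sentinel: the producer is done
                return
            time.sleep(0.001)      # simulate a consumer slower than the producer
            consumed += 1

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(items):
        # Blocks while the buffer is full: this pause is the backpressure.
        buffer.put(i)
        max_depth = max(max_depth, buffer.qsize())
    buffer.put(None)
    worker.join()
    return consumed, max_depth
```

With an unbounded queue the producer would never pause and the backlog would simply grow; the `maxsize` is what turns a slow consumer into a slow producer.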

RaftPeople 5 days ago | parent | prev | next [-]

> it’s the first time I hear “back pressure”. Am I too old already?

It's the opposite, as you get older you will feel this more and more.

marcosdumay 5 days ago | parent | prev | next [-]

It's a sign that you didn't get into the "let's distribute every problem" rabbit hole. I don't think it correlates with age.

But keep the concept in your mind in case you have to distribute some problem. It's a central one.

cygn 5 days ago | parent | prev | next [-]

https://medium.com/@jayphelps/backpressure-explained-the-flo...

stavros 5 days ago | parent | prev | next [-]

You've just never played Factorio.

santiagobasulto 5 days ago | parent [-]

I have never played Factorio nor knew about it. It seems to be a very good game, thanks for the recommendation!

stavros 5 days ago | parent [-]

Unfortunately, it's too good. At least you'll learn all about backpressure in the days you spend lost to the world!

dennis_jeeves2 3 days ago | parent | prev | next [-]

Yes

(but worry ye not: just as someone said of another term, "Dependency Injection" is a 25-dollar term for a 5-cent concept, and something similar applies to this term.)

stefanfisk 5 days ago | parent | prev | next [-]

Here’s a basic example https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/....

bcrosby95 5 days ago | parent | prev | next [-]

Services, systems, and/or databases eventually provide back pressure when they fail or get overloaded. The idea is to design in back pressure to let the system degrade gracefully rather than fail chaotically.

no_wizard 5 days ago | parent | prev [-]

Somewhat surprising, but if you never dealt with scaling issues of a certain nature it may never have come up.

Though you might be familiar with other terms that effectively mean the same thing, like counter pressure

Aurornis 5 days ago | parent | prev | next [-]

> > Interviewer: "Well what about backpressure?"

> > "That's not really worth considering for this amount of QPS"

There is a good way and a bad way to communicate this in interviews.

If an interviewer is asking about back pressure, they’re prompting you to demonstrate your knowledge of back pressure and how and when it would be applied. Treating it as an opening to debate the validity of the question feels like dodging the question or attempting to be contrarian. Explaining when and where you would choose to add back pressure would be good, but then you should go on to answer the question.

This question hits close to home for me because I was once working at a small startup that was dealing with a unique problem where back pressure really was the correct way to manage one of our problems, but we had a number of candidates do exactly what you did: Scoff at the idea that such a topic would be relevant at a startup.

If we’ve been dealing with a problem for months and a candidate comes in and confidently tells us that problem isn’t something we would experience and dismisses our question, that’s not a positive signal.

> > Interviewer: "How would you choose between sql and nosql db?"

> > "Doesn't matter much. Whatever the team has most expertise in"

This is basically a softball question. Again, if you provide a non-answer or try to dismiss the question it feels like you’re either dodging the topic or trying to be contrarian. It’s also a warning sign to the interviewer that you might gravitate toward what’s easy for you instead of right for the project.

This one also resonates with me because I spent years of my life making MongoDB do things that would have been trivial if earlier developers had used something like SQLite instead. The reason they chose MongoDB? Because the team was familiar with it. It was hell to be locked into years of legacy code built around the wrong tool for the job because some early employees thought it didn’t matter “because startup”

As an interviewer, let me give some advice: If an interviewer asks a question, you should answer the question. Anything that feels like changing the subject, dodging the question, or arguing the merits of the question feels like the candidate either doesn’t understand the topic or wants to waste time by debating the question.

It can be very valuable to explain when and why a topic would become necessary, right before you explain it. Instead of “this application has low QPS and therefore I will not answer your question” (not literally what you said, but how it comes across) you could instead explain how the need for back pressure could be avoided first by scaling servers appropriately and then go on to answer the question that was asked.

cryptonector 5 days ago | parent [-]

Re: SQL vs NoSQL my take is that one should always start with SQL and get good at SQL, then if and when you ever find yourself with a need to scale that you can't meet in any way other than to use a NoSQL, then switch to NoSQL. Nine times out of ten you'll never need to switch.

zeroq 5 days ago | parent | prev | next [-]

Not only theory crafting during interviews but a lot of real life design is driven by what's known as resume driven development. The worst part - some of that is later presented at large conferences as successful and go-to solutions.

One time I was working in a body leasing company and our team was hired by bigco for an internal project. Two months earlier an internal employee was tasked to research the project and develop a prototype. When we started, all major set pieces were written in stone. A month later said employee left. When we later checked the job listing he likely applied to, our tech stack mirrored it to the letter. He got free training, a resume and a new job. We were stuck with these decisions for 3 years.

Another time a local branch of another bigco was trying to carve out a major piece of internal cake. A head-of was hired, a team was quickly ramped up, and they started cooking their foothold. Then a series of major power shifts happened a couple of levels above our pay grade and another branch came out with a competing strategy. We had a two-day internal brainstorm involving 50 people to come up with arguments and strategies for how to defend our approach. We bet on blue, they were selling red. Lives were at stake. And many truly believed that blue was the way to go, and red was a recipe for disaster. Two days later we had a rock solid presentation that was trashing the red approach. But of course most of these decisions are not made by nerds and middle mgmt, so eventually the company placed its bet on red and the whole dept became redundant. No one likes to lose their job, so our blue head-of quickly turned his cloak and the team became an outsourcing provider for the winning team. What makes this story particularly funny is the fact that the head-of immediately started a campaign of conference presentations where he swore that all his life he believed that red was the future that would eventually trump blue, and that any competition still using blue was destined to fail in the near future.

bccdee 4 days ago | parent | prev | next [-]

I think the right tack, if you're going to dismiss something as needlessly complex, is to call out the circumstances that would make it necessary and then describe what you'd do under those conditions.

"Backpressure? I don't think you'll have enough traffic to make backpressure necessary. The mode of failure here is that you run out of queue space and start dropping messages, and it's not a big deal if some messages get dropped here. But if we do decide that dropped messages are causing problems, and if it starts becoming a regular occurrence (we'll set up observability), here's how the producer can poll the queue size and return an error to the user under heavy load."
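
For illustration, one way the "poll the queue size and return an error" idea could look in Python. Everything here is a hypothetical sketch: the `Producer` class, the `high_water_mark` threshold, and the choice of HTTP-style 202/503 status codes are assumptions made for the example, not part of any particular system.

```python
from collections import deque


class Overloaded(Exception):
    """Raised when the queue is too full to accept more work."""


class Producer:
    def __init__(self, high_water_mark: int) -> None:
        self.pending: deque[str] = deque()
        self.high_water_mark = high_water_mark  # hypothetical threshold

    def submit(self, message: str) -> None:
        # Check queue depth first and reject loudly instead of dropping silently.
        if len(self.pending) >= self.high_water_mark:
            raise Overloaded("queue is full, retry later")
        self.pending.append(message)


def handle_request(producer: Producer, message: str) -> int:
    """Map the outcome to an HTTP-style status code (assumed convention)."""
    try:
        producer.submit(message)
    except Overloaded:
        return 503  # tell the caller to back off and retry
    return 202      # accepted for later processing
```

The design choice is that the caller learns about the overload and can retry, rather than messages disappearing once the queue runs out of space.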

abound 5 days ago | parent | prev | next [-]

You don't need to entirely forget this. I've made a habit of regularly seeking out job opportunities and interviewing even when I'm entirely happy with my job, which is to say I've done a ton of these kinds of interviews (on both sides of the table).

Unless the initial question requirements are insane (build Twitter at Twitter scale), I start with the smallest, dumbest thing that will work. Usually that's a single machine/VM talking to a database (or even just SQLite!). Compute and storage are so fast these days that you could comfortably run your fledgling service on a Raspberry Pi, even serving three or four-digit QPS depending on the workload.

Of course, you still have to "play the game" in the interview, so make sure to be clear about how you'd change this as requirements changed (higher QPS, more features, etc)

AbstractH24 4 days ago | parent | prev | next [-]

Tools that reduce the barrier to entry to creating things make it easier to solve problems with less scale to pay for the overhead. Generative AI is among these tools, but so are low code platforms, so is React, so is AWS, heck, so is the power grid. But in recent times generative AI is a big leap forward.

We’re at the start of another cycle of a lot of niche products followed by the rise of big Acme megacorps who conquer them all with economies of scale and compete on margin. It comes just as we’re at the tail end of this cycle with tech as we knew it for the last 50 or so years.

pragmatic 5 days ago | parent | prev | next [-]

Yes, and then you get the job there and regret it bc they’ll either have an over-engineered Rube Goldberg contraption or they have system envy bc they’ve read about this architecture in blogs and THINK they need K8s and that it will fix all their problems.

LoganDark 5 days ago | parent | prev | next [-]

I don't think they're completely not looking for that entire type of answer, but those examples are pretty dry and don't really go into the reasoning for your opinion, which is probably what they're worried about. Whenever you say something isn't worth considering or doesn't seem necessary, you should be explaining exactly why you think that, and exactly where it would be worth considering or seem necessary, because otherwise you just look like someone who simply doesn't care about whatever kind of scalability they're asking about.

renewiltord 5 days ago | parent | prev | next [-]

You can always say “Since we’ve got only x QPS, I’m going to do A. If we had say y QPS, I’d do B but that would impact the rest of the design. Let me know if you anticipate growth to y and I can show you how I’d do it”

The point of an interview is to lay bare one’s thought process entirely so that the interviewer has full awareness of the person you are. And to likewise extract that from the interviewer. Getting or transmitting less information is just underutilizing the time. Interviewers are also flawed and may not be good enough at extracting the information from you.

If you’re an ideal decision maker, you will likely out-skill the majority of interviewers. You’re being hired to make their org succeed. So just do that.

I think people who describe system designs frequently fail to demarcate the space they’re operating in, so subsequent engineers cannot determine whether the original designer failed to consider something or whether the original designer considered and dismissed something. The point is to be able to express this concisely.

IMHO, doing it well means that not only do you get it right but you send the information down through time so that subsequent observers understand why and also get it right consequently.

torginus 5 days ago | parent | prev | next [-]

Who does this? Why make something 10x as complicated as it needs to be, when you could just use the simple thing and get 10x as far? It's not like there's not enough work to do.

cryptonector 5 days ago | parent [-]

Who accumulates decades of legacy code?!

Real companies do. The moment you deploy one line of code, it's legacy. It goes from there. Soon you have to build systems that interface with other systems you'd rather were better architected and designed, except you have to deal with them as they are. Then your product becomes one of these, and with no need to maintain or expand it for a long time, it rots a bit, and now someone has to pick it up or interface with it, and your product made things more complex, and the complexity can't be magic wand waved away.

ajmurmann 5 days ago | parent | prev | next [-]

In some ways it's worse. There are also project review interviews. "We had a Rails/Django/whatever monolith that was backed by Postgres and we didn't need a SPA" makes for a less impressive session with many companies. This creates a lot of incentive to overcomplicate/"future proof" things for resume building.

throwaway7783 5 days ago | parent | prev | next [-]

This. There is also really no easy way of telling how an interviewer is thinking. One interviewer thought not having a warehouse in the design was a mistake, and the other one thought having a caching solution made things too complex. Interviews are completely hit or miss.

jstummbillig 5 days ago | parent | prev | next [-]

If you know that those are not the answers they are looking for, you can reasonably pass by modifying the answer only slightly, while still getting your point across.

If you can't, you might be getting interviewed by people you do not want to work with and you should want to know that.

flashgordon 5 days ago | parent [-]

Except these are the people in your way of getting that job that could be potentially life/career changing for you financially or otherwise. In this market or depending on your situation that would be hard to ignore.

jstummbillig 5 days ago | parent [-]

I think that's a red herring. You are a knowledge worker. You are paid to disagree when necessary. Yes, people will probably take offense when you say "that's just a dumb question" but if they can't at least be approached when you offer your opinion in a palatable(sic) way, that's simply not going to work.

Understand what is being asked. Your insight on a topic is being tested. Offer an answer that does not read like a dodge or a coin flip.

kenny239 a day ago | parent | prev | next [-]

lol that's sad and real. modern software engineering has a lot of bloatware, costing security, etc.

master_crab 5 days ago | parent | prev | next [-]

Nothing against your content…but Kubernetes does manage Kubernetes.

That becomes obvious when you start bootstrapping an HA cluster with multiple control plane nodes.

K8s is not for the faint of heart…or rational system designers ;)

ajuc 5 days ago | parent | prev | next [-]

Answer what they want and finish with "but in practice it doesn't matter for this much traffic and would be wasted effort".

People ask for fizzbuzz in parallel not because it's practical.

ozgrakkurt 5 days ago | parent | prev | next [-]

You don’t want to work at a company like this anyway.

0manrho 5 days ago | parent | next [-]

True, but people generally don't want to get evicted or have their utilities turned off either. If you need a job you need a job, and the numbers out of Cali's job market put a lot of tech people in a position where they might not have the luxury of waiting for the "right fit". As always, YMMV and the world is a big place, everyone's different, yadda yadda.

__turbobrew__ 5 days ago | parent | prev [-]

If you want to get top dollar at a FAANG you will need to go through these types of system design interviews. You could say you shouldn’t work for a FAANG which is fair, but FAANG pays top dollar.

try_the_bass 5 days ago | parent [-]

So if you're after the FAANG money, you have to play the FAANG interview game. If you're unwilling to play the FAANG interview game, then maybe you shouldn't be pursuing FAANG money.

UK-AL 5 days ago | parent | next [-]

It's not just FAANG that do these interviews anymore.

Tiny startups do them now as well.

__turbobrew__ 4 days ago | parent | prev [-]

Doing FAANG interviews is a necessary, but not sufficient condition to getting a FAANG salary. As others have said, non-FAANGs are trying to do FAANG interviews, and I would tell them to pound sand.

mupuff1234 5 days ago | parent | prev | next [-]

I think you might be missing the point.

Your answers are completely valid but you have to communicate to the interviewer that you considered the possibilities and the tradeoffs.

If the interviewer needs to "forcefully" extract from you the logic behind your design choices, then a lot of times that's enough to fail you.

cavisne 5 days ago | parent | prev | next [-]

The interviewer is just another engineer trying to understand if you are someone who they can have a design discussion with.

Dismissive answers that assume they are needlessly overcomplicating things tell them exactly what they need to know.

paulddraper 5 days ago | parent | prev [-]

Why would someone ask about low QPS? Seems it would evoke the “whatever” answers you gave.

> SQL and NoSQL don’t matter much

Database is literally the most important architectural decision possible, next to the application programming language.

(Prove me wrong)