Remix.run Logo
hatthew 4 days ago

My work involves petabyte scale data, and the algorithms are very straightforward:

- What you want to do is probably trivially O(kn).

- There isn't a <O(kn) algorithm, so try to reduce overhead in k.

- Cache results when they are O(1), don't when they are O(n).

- If you want to do something >O(kn), don't.

- If you really need to so something >O(kn), do it in SQL and then go do something else while it's running.

None of that requires any DSA knowledge beyond what you learn in the first weeks of CS101. Instead, what's useful is knowing how to profile to optimize k, knowing how SQL works, and being able to write high quality maintainable code. Any smart algorithms that have a large time complexity improvement will probably be practically difficult to create and test even if you are very comfortable with the underlying theoretical algorithm. And the storage required for an <O(n) algorithm is probably at least as expensive as the compute required for the naive O(n) algorithm.

My general impression is that for small-scale problems, a trustworthy and easy algorithm is fine, even if it's inefficient ($100 of compute < $1000 of labor). For large-scale problems, domain knowledge and data engineering trumps clever DSA skills. The space between small- and large-scale problems is generally either nonexistent or already has premade solutions. The only people who make those "premade solutions" obviously need to feel it in their bones the way you describe, but they're a very very small portion of the total software population, and are not the target audience of this article.

fuzztester 4 days ago | parent [-]

>My work involves

As the GP said:

>>What someone says on this topic says more about what things they have worked on in their life than anything else.

hatthew 4 days ago | parent [-]

TFA isn't saying "DSA is useless", it's saying "intermediate/advanced DSA is not useful for most people". It's obvious that it's useful for some people, but I think even most people working on "large scale systems" should probably value general software engineering skills over DSA skills. The very few people who actually need DSA skills already know that the advice "you don't need DSA" doesn't apply to them.

danielmarkbruce 4 days ago | parent [-]

> The very few people who actually need DSA skills already know that the advice "you don't need DSA" doesn't apply to them.

This is right. And most of those people know a lot of their job is very far removed from many other software engineers. But the prevalence of the idea "you don't really use DSA in practice" does suggest many people building applications where DSA isn't as applicable seem to misunderstand the situation. It matters in some sense - it explains why interviews at google are the way they are, why universities teach what they teach, what one should do if they really like such things.