A distributed queue in a single JSON file on object storage (turbopuffer.com)
28 points by Sirupsen 3 days ago | 10 comments
pjc50 an hour ago
Several things going on here:

- Concurrency is very hard...
- ...but object storage "solves" most of that for you, handing you a set of semantics which work reliably.
- Single-file throughput sucks hilariously badly...
- ...because 1 GB is ridiculously large for an atomic unit.
- (This whole thing resembles a project I did a decade ago for transactional consistency on TFAT on Flash, except that somehow managed faster commit times despite running on a 400 MHz MIPS CPU. Edit: maybe I should try to remember how that worked and write it up for HN.)
- Therefore, all of the actual work is shifted to the broker. The broker is just periodically committing its state in case it crashes.
- It's not clear whether the broker ACKs requests before they're in durable storage? Is it possible to lose requests in flight anyway?
- There's a great design for a message queue system between multiple nodes that aims for at-least-once delivery, has existed for decades, and maintains high throughput: SMTP. Actually, there's a whole bunch of message queue systems?
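The "semantics which work reliably" that object storage hands you are essentially conditional writes: a read-modify-write of the single JSON file is only safe if the write can fail when someone else committed first (S3's If-Match ETag precondition, GCS generation preconditions). A minimal sketch of that compare-and-swap loop, using an in-memory stand-in for the store (FakeObjectStore, commit_queue_state, and the etag scheme here are all hypothetical illustrations, not the article's implementation):

```python
import json
import uuid


class FakeObjectStore:
    """In-memory stand-in for an object store that supports
    conditional writes (S3 If-Match / GCS generation preconditions)."""

    def __init__(self):
        self._blobs = {}  # key -> (etag, bytes)

    def get(self, key):
        etag, data = self._blobs[key]
        return etag, data

    def put_if_match(self, key, data, expected_etag):
        # Atomic compare-and-swap: write only if the object is
        # unchanged since it was read; otherwise reject.
        current = self._blobs.get(key)
        if current is not None and current[0] != expected_etag:
            raise ValueError("precondition failed: object changed")
        new_etag = uuid.uuid4().hex
        self._blobs[key] = (new_etag, data)
        return new_etag


def commit_queue_state(store, key, mutate):
    """Read-modify-write the single JSON file; retry on CAS failure."""
    while True:
        etag, raw = store.get(key)
        state = json.loads(raw)
        mutate(state)
        try:
            return store.put_if_match(key, json.dumps(state).encode(), etag)
        except ValueError:
            continue  # another writer committed first; re-read and retry
```

Under this model, losing a lost-race write is fine (the loser retries from fresh state), which is why a single broker that only "periodically commits its state" can still be correct, just slow.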
Normal_gaussian an hour ago
The original graph appears to simply show the blocking issue of their previous synchronisation mechanism; 10 min to process an item down to 6 min. Any central system would seem to resolve this for them.

In any organisation it's good to make choices for simplicity rather than small optimisations - you're optimising maintenance, incident resolution, and development. Typically I have a small pg server for these things. It'll work out slightly more expensive than this setup for one action, yet it will cope with so much more - extending to all kinds of other queues and config management - with simple management, off-the-shelf diagnostics, etc.

While the object store is neat, there is a confluence of factors which make it great and simple for this workload that may not extend to others. 200 ms latency is a lot for other workloads, 5 GB/s doesn't leave a lot of headroom, etc. And I don't want to be asked to diagnose transient issues with this.

So I'm torn. It's simple to deploy and configure from a fresh-deployment PoV, yet it wouldn't be accepted into any deployment I have worked on.
soletta 2 hours ago
The usual path an engineer takes is to take a complex and slow system and reengineer it into something simple, fast, and wrong. But as far as I can tell from the description in the blog, this one actually works at scale! It feels like a free lunch, and I'm wondering what the tradeoff is.
| ||||||||||||||
jamescun an hour ago
This post touches on a realisation I made a while ago, just how far you can get with the guarantees and trade-offs of object storage. What actually _needs_ to be in the database? I've never gone as far as building a job queue on top of object storage, but have been involved in building surprisingly consistent and reliable systems with object storage. | ||||||||||||||
dewey an hour ago
Depending on who hosts your object storage, this seems like it could get much more expensive than using a queue table in your database? But I'm also aware that this is a blog post from an object storage company.
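The cost question is mostly about per-request pricing, since each queue commit is at least one read plus one write of the JSON file. A back-of-envelope sketch, using assumed prices modeled on published S3 Standard list prices (USD, subject to change and region; treat both figures as assumptions):

```python
# Assumed S3-Standard-like request pricing (USD, may be stale):
PUT_PER_1K = 0.005    # $0.005 per 1,000 PUT requests
GET_PER_1K = 0.0004   # $0.0004 per 1,000 GET requests


def monthly_request_cost(commits_per_second, seconds=86_400 * 30):
    """Request cost of a queue doing one GET + one PUT per commit."""
    commits = commits_per_second * seconds
    return commits / 1000 * (PUT_PER_1K + GET_PER_1K)
```

At one commit per second that works out to roughly $14/month in requests alone, so whether this beats a queue table depends heavily on commit rate and on whose object storage (and egress) you're paying for.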
isoprophlex an hour ago
Is this reinventing a few redis features, with object storage for persistence?
| ||||||||||||||
jstrong 13 minutes ago
that's A choice. | ||||||||||||||