ttfvjktesd 3 days ago

Digital Ocean is also using Ceph[1]. I think cloud providers of this size could easily be running clusters in the hundreds of PB, but that's not public information.

Even smaller companies (< 500 employees) in today's age of big data collection often have more than 1 PB of total data in their enterprise pool. Hosters like Digital Ocean host thousands of these companies.

I do think that Ceph will hit performance issues at that size, and going into the EB range will likely require code changes.

My best guess would be that Hetzner, Digital Ocean and similar providers maintain their own internal forks of Ceph, with customizations that tightly address their particular needs.

[1]: https://www.digitalocean.com/blog/why-we-chose-ceph-to-build...