Flow: Actor-based language for C++, used by FoundationDB (github.com)
151 points by SchwKatze 9 hours ago | 39 comments
SoKamil 7 hours ago | parent | next [-]

FoundationDB is awesome testing-wise, as they have deterministic simulation testing [1] that can simulate distributed-system and operating-system failures.

> We wanted FoundationDB to survive failures of machines, networks, disks, clocks, racks, data centers, file systems, etc., so we created a simulation framework closely tied to Flow. By replacing physical interfaces with shims, replacing the main epoll-based run loop with a time-based simulation, and running multiple logical processes as concurrent Flow Actors, Simulation is able to conduct a deterministic simulation of an entire FoundationDB cluster within a single-thread! Even better, we are able to execute this simulation in a deterministic way, enabling us to reproduce problems and add instrumentation ex post facto. This incredible capability enabled us to build FoundationDB exclusively in simulation for the first 18 months and ensure exceptional fault tolerance long before it sent its first real network packet. For a database with as strong a contract as the FoundationDB, testing is crucial, and over the years we have run the equivalent of a trillion CPU-hours of simulated stress testing.

[1]https://pierrezemb.fr/posts/notes-about-foundationdb/#simula...
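For readers who want a concrete picture of the quoted idea, here is a minimal sketch in Python (not Flow, and not FoundationDB's actual simulator). It assumes a toy setup: a virtual clock replaces the real run loop, a seeded random generator is the only source of nondeterminism, and a hypothetical `send` shim injects delays and drops. All names (`Simulation`, `run_cluster`, the 20% drop rate) are made up for illustration.

```python
import heapq
import random


class Simulation:
    """Toy single-threaded scheduler driven by virtual time, not wall-clock time."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # the only source of randomness
        self.now = 0.0                  # virtual clock
        self.queue = []                 # min-heap of (time, sequence, callback)
        self.seq = 0                    # tie-breaker so ordering stays deterministic
        self.trace = []                 # record of everything that happened

    def call_later(self, delay, callback):
        heapq.heappush(self.queue, (self.now + delay, self.seq, callback))
        self.seq += 1

    def send(self, name, payload, handler):
        # Hypothetical network shim: randomly drop or delay the message.
        if self.rng.random() < 0.2:
            self.trace.append((self.now, name, payload, "dropped"))
            return
        delay = self.rng.uniform(0.001, 0.05)
        self.call_later(delay, lambda: handler(payload))

    def run(self):
        # Jump the virtual clock straight to the next scheduled event.
        while self.queue:
            self.now, _, callback = heapq.heappop(self.queue)
            callback()


def run_cluster(seed, messages=20):
    """Run a two-process 'cluster' (client and server) inside one thread."""
    sim = Simulation(seed)

    def server(payload):
        sim.trace.append((sim.now, "server", payload, "received"))

    def client(i):
        sim.send("client", i, server)

    for i in range(messages):
        sim.call_later(i * 0.01, lambda i=i: client(i))
    sim.run()
    return sim.trace
```

Calling `run_cluster(42)` twice yields identical traces, which is the property that lets a failing run be reproduced from its seed alone and re-run with extra instrumentation.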

gioazzi 5 hours ago | parent [-]

And they went on to build Antithesis to deliver the same capabilities to other systems [1], pretty cool stuff!

[1]: https://antithesis.com/company/backstory/

menaerus 4 hours ago | parent [-]

Pretty cool. For it to scale they are building their own deterministic hypervisor too [0], but also a new distributed database to support their workloads more efficiently [1].

[0] https://antithesis.com/blog/deterministic_hypervisor

[1] https://antithesis.com/blog/2025/testing_pangolin

ttul 8 hours ago | parent | prev | next [-]

Type-safe message-passing is such a wonderful programming paradigm, and not just for distributed applications. I remember using QNX back in the 1990s. One of its fabulous features was a C message-passing library allowing you to send arbitrary binary structs from one process to another. In the context of realtime software development, you often find yourself having one process that watches for events from a certain device, modifies the information somehow, and then passes it on to another process that ends up doing something else. The message-passing idiom was far superior to what was available in Linux at the time (pipes and whatnot) because you were able to work with C structs. It was not strictly type safe (as is the case with FoundationDB’s library), but for the 1990s it was pretty great.
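As a rough illustration of passing fixed-layout binary structs between two endpoints, here is a sketch using Python's `struct` and `socket.socketpair`. This is not the QNX API; the `SENSOR_EVENT` layout and the function names are assumptions invented for the example, and both ends live in one process to keep it short.

```python
import socket
import struct

# Assumed message layout, little-endian with no padding:
# device_id (uint32), channel (uint16), reading (float64).
SENSOR_EVENT = struct.Struct("<IHd")


def send_event(sock, device_id, channel, reading):
    """Pack the fields into a fixed-size binary record and send it."""
    sock.sendall(SENSOR_EVENT.pack(device_id, channel, reading))


def recv_event(sock):
    """Read exactly one record and unpack it back into a tuple of fields."""
    data = b""
    while len(data) < SENSOR_EVENT.size:
        chunk = sock.recv(SENSOR_EVENT.size - len(data))
        if not chunk:
            raise ConnectionError("peer closed before a full message arrived")
        data += chunk
    return SENSOR_EVENT.unpack(data)


def demo():
    a, b = socket.socketpair()  # two connected endpoints
    try:
        send_event(a, 7, 2, 21.5)
        return recv_event(b)
    finally:
        a.close()
        b.close()
```

As in the comment above, nothing here is type safe: the receiver simply trusts that the bytes match the layout it expects.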

mrbnprck 7 hours ago | parent [-]

I remember that ASN.1 does something similar. You'd give an ASN.1 notation to a code generator (e.g. one producing C) and not have to worry about parsing the actual structure anymore!

IshKebab 5 hours ago | parent [-]

Literally every schema-based serialisation format does this. ASN.1 is a pretty terrible option.

The best system for this I've ever used was Thrift, which properly abstracts data formats, transports and so on.

https://thrift.apache.org/docs/Languages.html

Unfortunately Thrift is a dead (AKA "Apache") project and it doesn't seem like anyone since has tried to do this. It probably didn't help that there are so many gaps in that support matrix. I think "Google have made a thing! Let's blindly use it!" also helped contribute to its downfall, despite Thrift being better than Protobuf (it even supports required fields!).

Actually I just took a look at the Thrift repo and there are a surprising number of commits from a couple of people consistently, so maybe it's not quite as dead as I thought. You never hear about people picking it for new projects though.

computably 3 hours ago | parent | next [-]

FB maintains a distinct version of Thrift from the one they gave to Apache. fbthrift is far from dead as it's actively used across FB. However in typical FB fashion it's not supported for external use, making it open source in name (license) only.

As an interesting historical note, Thrift was inspired by Protobuf.

mrbnprck 2 hours ago | parent | prev [-]

Very true. ASN.1 is mostly not a great fit, yet it has been the choice for everything to do with certificates and telecommunication protocols (even newer ones like 5G, for things like RRC and NGAP), mostly for its bit-level support and especially its long-term stability. Looking back in time, ASN.1 has definitely proven itself on that front.

I had actually never heard of Thrift until today, thanks for the insight :)

websiteapi 8 hours ago | parent | prev | next [-]

I'm always hearing about FoundationDB but not much about who uses it. I know Deno and, obviously, Apple are using it. Who else? I'd love to hear some stories about it.

CharlesW 7 hours ago | parent | next [-]

Snowflake uses it: https://www.snowflake.com/en/blog/how-foundationdb-powers-sn...

Tigris uses it: https://www.tigrisdata.com/blog/building-a-database-using-fo...

A good collection of papers, blog posts, talks, etc.: https://github.com/FoundationDB/awesome-foundationdb

preetamjinka 3 hours ago | parent | next [-]

https://discovery.hgdata.com/product/foundationdb

This "Who is hiring" post for Tesla mentions FoundationDB [0].

Firebolt [1] uses it.

FoundationDB is used at Datadog [2].

[0] https://news.ycombinator.com/item?id=26306170

[1] https://www.firebolt.io/blog/decomposing-firebolt-transactio...

[2] https://news.ycombinator.com/item?id=36576775

adammarples 3 hours ago | parent | prev [-]

The Snowflake article is from 2018; I wonder if it's still true.

throwawaydbb 3 hours ago | parent [-]

Yes. They hire engineers specifically to work on it.

dpedu 8 hours ago | parent | prev | next [-]

My company (Matterport, YC Winter '12) uses it to store metadata about 3D models. I really don't have that much to say about it because it's not my primary area of focus and, besides that, it has been extremely reliable and hands-off, administration-wise. I particularly love that you can change redundancy modes on the fly, for example those listed here [1], and FDB will automatically rearrange data to your liking, all without downtime. It handles offline or missing nodes and node replacement quite well, and I credit my coworker's great efforts to make it work on top of Kubernetes for making our lives so much easier.

1: https://apple.github.io/foundationdb/configuration.html#choo...

quettabit 7 hours ago | parent | prev | next [-]

At s2.dev (a serverless datastore for real-time streaming data), we started with DynamoDB for our metadata store, but our access patterns kept running into per-partition throughput limits. We switched to FoundationDB, and it’s been great so far.

ghc 6 hours ago | parent | prev | next [-]

There might be a good reason for the lack of stories. FoundationDB runs critical infrastructure I work on, but I never actually have to think about it.

I've never spent less time thinking about a data store that I use daily.

adobrawy 8 hours ago | parent | prev | next [-]

Snowflake uses it as primary database for their metadata. https://www.snowflake.com/en/blog/how-foundationdb-powers-sn...

arohner 5 hours ago | parent | prev | next [-]

Griffin Bank UK uses it for our entire system (https://griffin.com)

mannyv 6 hours ago | parent | prev | next [-]

From what I've heard El Toro uses it to keep track of the billions of data points it harvests from the world every minute.

nish__ 8 hours ago | parent | prev | next [-]

Apple uses it for iMessage I believe.

otabdeveloper4 5 hours ago | parent | prev [-]

It's legacy technology. MongoDB is basically the same thing under the hood, and more "standard".

boris 7 hours ago | parent | prev | next [-]

The strangest thing about Flow is that its compiler is implemented in C#. So if you decide to use it in your C++ codebase, you now have a C#/.Net dependency, at least at build time.

boxfire 6 hours ago | parent | next [-]

It’s also funny because it’s a small, incomplete, incompatible subset of C++… It seems like a perfect LLVM/Clang rewriter case too; it would be easy to convert and be pure C++. Hell, even a Clang plugin to put the compile time into one process wouldn’t be awful. But looking at the rewrites, I wonder if there isn’t a terribly janky way to not need a compiler at all, if at some runtime cost of contextual control flow info.

jermaustin1 7 hours ago | parent | prev [-]

I wonder why that decision was made. I know why I, a C# developer, would make that decision, but why Apple?

atn34 7 hours ago | parent | next [-]

The original developers (before Apple bought the company) used Visual Studio on Windows

rdtsc 4 hours ago | parent | prev | next [-]

Someone knew C# and was good at parsers, would be my guess. It could have just as easily been Scala or something else.

jeffbee 7 hours ago | parent | prev [-]

This entire codebase was acquired by Apple in a state of substantial completion, and since then relatively little has changed.

culebron21 7 hours ago | parent | prev | next [-]

At first glance, it looks like Rust's channels with a polymorphic type -- when you receive from a channel, you do match and write branches for each variant of the type.

But I wonder if this can be a better abstraction than async. (And whether I can build something like this in existing Rust.)
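The pattern described above (one inbox carrying several message variants, with a branch per variant on the receiving side) can be sketched as follows. This is a Python `asyncio` approximation rather than Rust or Flow, and the message types `Deposit`, `Withdraw`, and `Stop` are hypothetical names chosen for the example.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Deposit:
    amount: int


@dataclass
class Withdraw:
    amount: int


@dataclass
class Stop:
    pass


async def account(inbox):
    """Actor-like loop: owns private state and handles one message at a time."""
    balance = 0
    while True:
        msg = await inbox.get()
        # One branch per message variant, similar to matching on an enum.
        if isinstance(msg, Deposit):
            balance += msg.amount
        elif isinstance(msg, Withdraw):
            balance -= msg.amount
        elif isinstance(msg, Stop):
            return balance
        else:
            raise TypeError(f"unexpected message: {msg!r}")


async def demo():
    inbox = asyncio.Queue()
    task = asyncio.create_task(account(inbox))
    for msg in (Deposit(100), Withdraw(30), Stop()):
        await inbox.put(msg)
    return await task
```

The balance is never shared: the only way to change it is to put a message in the inbox, which is the part of the design that resembles an actor.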

srinikhilr 4 hours ago | parent | prev | next [-]

iirc there was a ticket/doc about FoundationDB deprecating usage of this and moving to C++ coroutines.

maxmcd 4 hours ago | parent [-]

Maybe this? https://forums.foundationdb.org/t/swift-or-c-20-coroutine-wh...

pmarreck 8 hours ago | parent | prev | next [-]

how does this compare to the inbox and supervisor model of erlang/elixir?

yetihehe 8 hours ago | parent | next [-]

It doesn't. It's "promise" based, not "communicating sequential processes". Erlang has more preemptive scheduling: a "thread" can be preempted at any time, whereas here you can only be synchronized when you wait for a result. It is called "actor-based" because only functions tagged as "actor" can call waiting functions.

This is more node.js-like communication than erlang.

jacquesm 8 hours ago | parent | next [-]

By the looks of it, they changed the word 'async' to 'actor' because they thought it was cool, not because it actually uses the actor pattern. Which to me seems to be namespace pollution.

voidmain 7 hours ago | parent | next [-]

If I were designing it today rather than in... 2008?, I would use the terms 'async' and 'await' because they are a lingua franca now. And for a modern audience that already knows what promises are it probably makes sense to start the explanation with that part. But the thing as a whole was intended to build lightweight asynchronously communicating sequential processes with private state that can be run locally or in a distributed way transparently, restarted on failure, etc. I don't think the choice of terms was obviously a crime at the time.

junon 8 hours ago | parent | prev [-]

Unfounded guess, they probably didn't want to bump into the new C++ keywords for async/await.

thesz 2 hours ago | parent | prev [-]

They build channels on top of these "promises" and "futures", and this puts them squarely in the communicating sequential processes category. Also, you can look at a promise-future pair as a single-element channel; again, it's CSP.

BTW, Erlang does not implement CSP fully. Its interprocess communication is TCP-based in the general case and, because of this, is faulty.
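To make the "promise-future pair as a single-element channel" reading concrete, here is a small sketch using Python's `asyncio.Future` rather than Flow's own types. The `OneShot` class and its `send`/`recv` method names are assumptions made up for this example.

```python
import asyncio


class OneShot:
    """A channel that carries exactly one value, backed by a single future."""

    def __init__(self):
        # Must be constructed while an event loop is running.
        self._future = asyncio.get_running_loop().create_future()

    def send(self, value):
        # "Promise" side: can be fulfilled only once.
        self._future.set_result(value)

    async def recv(self):
        # "Future" side: suspends the caller until the value arrives.
        return await self._future


async def producer(ch):
    await asyncio.sleep(0)  # yield to the scheduler once before sending
    ch.send("hello")


async def demo():
    ch = OneShot()
    task = asyncio.create_task(producer(ch))
    value = await ch.recv()
    await task
    return value
```

A second `send` on the same `OneShot` raises `asyncio.InvalidStateError`, which is what makes it a one-element channel; a multi-element stream needs a queue or a chain of such pairs on top.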

hawk_ 8 hours ago | parent | prev [-]

On a related note, how does it compare to Seastar?

thisisauserid 8 hours ago | parent | prev [-]

How did they come up with such an original and unique name? Apple does it again.

Hayvok 7 hours ago | parent [-]

FoundationDB was originally a startup, purchased by Apple in 2015.