kushal2048 a day ago

Let's unpack these one by one.

> What protects recent/hot in-memory data if a node dies?

There is a WAL implementation in Okapi: https://github.com/okapi-core/okapi/tree/main/okapi-wal . However, it hasn't been battle-tested yet, so it isn't integrated. For now, durability is provided by periodic snapshots, and yes, in the event of a catastrophic failure Okapi will lose data in flight. Durability is being actively worked on and will be part of the next release.
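
For anyone curious what the WAL approach buys you, here is a minimal sketch of the usual append-then-apply pattern (class and method names are mine, not Okapi's): every write is appended and fsynced to a log before it touches the in-memory store, so a crash can be recovered by replaying the log.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    // Minimal write-ahead log: append + force before the in-memory apply.
    final class SimpleWal implements AutoCloseable {
        private final FileChannel channel;

        SimpleWal(Path file) throws IOException {
            this.channel = FileChannel.open(file,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }

        // Durably record a sample before it is applied to the hot store.
        void append(String series, long timestampMillis, double value) throws IOException {
            String record = series + "," + timestampMillis + "," + value + "\n";
            channel.write(ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8)));
            channel.force(false); // fsync: this is what bounds data loss on a crash
        }

        @Override
        public void close() throws IOException {
            channel.close();
        }

        public static void main(String[] args) throws IOException {
            try (SimpleWal wal = new SimpleWal(Path.of("okapi-wal.log"))) {
                wal.append("cpu.user{host=a}", System.currentTimeMillis(), 0.42);
            }
        }
    }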

> How does sharding and failover work? If a shard is down, can reads fan out to replicas, and how are writes handled?

In sharded mode, the replication factor is 1. This is done to prevent write fan-out. Replication and failover will be picked up after the single-node durability issues are solved, but they won't be part of the next release.
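
With a replication factor of 1, a write only ever lands on one node, so routing reduces to picking a shard deterministically from the series key. A rough illustration of that kind of routing (not Okapi's actual code):

    import java.util.List;

    // Deterministic shard routing by series key; with replication factor 1
    // each series lives on exactly one node, so there is no write fan-out.
    final class ShardRouter {
        private final List<String> nodes;

        ShardRouter(List<String> nodes) {
            this.nodes = nodes;
        }

        String nodeFor(String seriesKey) {
            int bucket = Math.floorMod(seriesKey.hashCode(), nodes.size());
            return nodes.get(bucket);
        }

        public static void main(String[] args) {
            ShardRouter router = new ShardRouter(List.of("node-a", "node-b", "node-c"));
            System.out.println(router.nodeFor("cpu.user{host=web-1}"));
        }
    }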

> When memory gets tight, what’s the backpressure plan?

The plan is a fixed-size in-memory buffer with pages swapped in and out. This should usually suffice because Okapi rolls data up as soon as it arrives, which reduces memory consumption. If page swapping saturates, we'll look at some production tests and decide whether other optimizations are necessary.
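
As a rough sketch of the fixed-size buffer idea (my own illustration, not the planned implementation): keep a bounded number of pages in memory and spill the least recently used one when the cap is hit.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Fixed-size page cache: when the cap is exceeded, the least recently
    // used page is evicted (a real engine would spill it to disk).
    final class PageBuffer<K, V> extends LinkedHashMap<K, V> {
        private final int maxPages;

        PageBuffer(int maxPages) {
            super(16, 0.75f, true); // access-order iteration gives LRU behaviour
            this.maxPages = maxPages;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            boolean evict = size() > maxPages;
            if (evict) {
                // Placeholder for "swap page out": write eldest.getValue() to disk.
                System.out.println("evicting page " + eldest.getKey());
            }
            return evict;
        }

        public static void main(String[] args) {
            PageBuffer<String, double[]> buffer = new PageBuffer<>(2);
            buffer.put("page-1", new double[1024]);
            buffer.put("page-2", new double[1024]);
            buffer.put("page-3", new double[1024]); // triggers eviction of page-1
        }
    }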

> How do you handle late or out-of-order samples after a rollup/export—can you backfill/compact Parquet to fix history?

Yup, part of this is available today via the concept of an admission window. All data within the admission window is held in memory and can be written to. The default admission window is 24 hours, so data up to 24 hours old can be ingested. As for backfilling, Okapi follows a simple schema for partitioning data, so backfilling can be done externally by writing a Parquet file. We'll document our schema to ensure compatibility.
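
The admission-window rule is easy to picture: a sample is accepted only if its timestamp is within the window (24 h by default) of now; anything older has to be backfilled externally. A minimal sketch, with names of my own choosing:

    import java.time.Duration;
    import java.time.Instant;

    // Admission window check: only samples newer than (now - window) are
    // accepted into the hot in-memory store; older data must be backfilled
    // externally, e.g. by writing a Parquet file in the documented layout.
    final class AdmissionWindow {
        private final Duration window;

        AdmissionWindow(Duration window) {
            this.window = window;
        }

        boolean admits(Instant sampleTime, Instant now) {
            return !sampleTime.isBefore(now.minus(window));
        }

        public static void main(String[] args) {
            AdmissionWindow w = new AdmissionWindow(Duration.ofHours(24));
            Instant now = Instant.now();
            System.out.println(w.admits(now.minusSeconds(3600), now));          // true
            System.out.println(w.admits(now.minus(Duration.ofHours(30)), now)); // false
        }
    }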

> Will there be any plans for data models and different metric types for the hot in-memory store, like gauge, counter, etc.?

Everything is a gauge in Okapi, and count operations will be implemented as reductions. Count is already present as a secondly, minutely, and hourly statistic.
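
Since everything is a gauge, counter-style metrics fall out as reductions over the raw samples in a rollup bucket; for example, count, sum, and average are just folds over the same data:

    import java.util.DoubleSummaryStatistics;
    import java.util.List;
    import java.util.stream.Collectors;

    // With a gauge-only data model, "count", "sum", "min", "max" and "avg"
    // are all reductions over the samples that fall into a rollup bucket.
    final class Rollup {
        public static void main(String[] args) {
            List<Double> samplesInBucket = List.of(0.42, 0.57, 0.61, 0.40);

            DoubleSummaryStatistics stats = samplesInBucket.stream()
                    .collect(Collectors.summarizingDouble(Double::doubleValue));

            System.out.println("count=" + stats.getCount()
                    + " sum=" + stats.getSum()
                    + " avg=" + stats.getAverage());
        }
    }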

> The sub-ms reads are great, is there a Linux version for the performance reports so it's easier to compare with other products?

Working on it. We'll publish a new detailed benchmark.

> Are you able to share the memory/CPU overhead, GC details etc. for the benchmarks?

I allocated 6 GB of JVM heap. GC is the default collector on OpenJDK 22 without any tuning.

> Considering most people use some other solution like Prometheus etc.

Support for the Prometheus ecosystem is actively being worked on. Prometheus- and OTel-style ingestion with PromQL queries will be part of the next release.

> Will Okapi be able to serve a single query across hot (memory) + cold (Parquet) seamlessly?

Yes! We have a metrics-proxy that already does this, but it is not yet interoperable with Parquet. It uses Okapi's internal format, which is optimized for range-style scans. Only scan queries are supported right now.
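
Conceptually, serving one range query across both tiers just means fanning the scan out to the cold and hot stores and stitching the results back into one ordered stream. A simplified sketch with hypothetical interfaces (not the metrics-proxy's real API):

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // A single point in a time series.
    record Point(long timestampMillis, double value) {}

    // Hypothetical scan interface: both the in-memory store and a Parquet
    // reader would implement this for a given series and time range.
    interface SeriesScanner {
        List<Point> scan(String series, long fromMillis, long toMillis);
    }

    // Simplified "proxy" that queries cold (Parquet) and hot (memory) tiers
    // and merges the results into one time-ordered answer.
    final class TieredScanner implements SeriesScanner {
        private final SeriesScanner cold;
        private final SeriesScanner hot;

        TieredScanner(SeriesScanner cold, SeriesScanner hot) {
            this.cold = cold;
            this.hot = hot;
        }

        @Override
        public List<Point> scan(String series, long fromMillis, long toMillis) {
            List<Point> merged = new ArrayList<>(cold.scan(series, fromMillis, toMillis));
            merged.addAll(hot.scan(series, fromMillis, toMillis));
            merged.sort(Comparator.comparingLong(Point::timestampMillis));
            return merged;
        }
    }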

> Snapshots can slow ingest—are those pauses tunable and bounded? Any metrics/alerts for export lag, memory pressure, or cardinality spikes?

Yes, the snapshot pause is tunable via the --checkpoint.gapMillis parameter; the default is 1 hour. Right now memory pressure and CPU can be measured via JMX; Okapi doesn't publish these metrics itself. Okapi intends to be a high-cardinality engine, so we don't frown on cardinality spikes.
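
Until Okapi publishes its own metrics, heap pressure and CPU load can be sampled from the JVM's standard MXBeans, for example:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.MemoryUsage;
    import java.lang.management.OperatingSystemMXBean;

    // Reading memory pressure and CPU load from the JVM's built-in MXBeans.
    public final class JmxProbe {
        public static void main(String[] args) {
            MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
            MemoryUsage heap = memory.getHeapMemoryUsage();
            // getMax() is well defined when -Xmx is set (e.g. 6 GB in the benchmarks).
            System.out.printf("heap used: %d / %d MiB%n",
                    heap.getUsed() >> 20, heap.getMax() >> 20);

            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            System.out.printf("system load average: %.2f%n", os.getSystemLoadAverage());
        }
    }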

> A couple of end-to-end examples (for queries) and a Helm chart/Terraform module would make trials much easier.

Coming soon!

> Are there any additional monitoring and observability features implemented, or plans for Okapi itself?

Yes, we will emit fleet metrics for Okapi itself via OTel. This isn't being done right now but will be soon. We won't emit Okapi metrics into Okapi, though, as that might cause a feedback spiral.
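
For the fleet metrics, the OpenTelemetry Java API keeps this simple; here is a hedged sketch of what emitting an ingestion counter might look like (metric and scope names are illustrative, and it needs the opentelemetry-api dependency plus a separately configured SDK/exporter so the data never loops back into Okapi):

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.api.metrics.LongCounter;
    import io.opentelemetry.api.metrics.Meter;

    // Emitting Okapi fleet metrics through the OpenTelemetry API; the exporter
    // is configured elsewhere and points at an external backend, not Okapi.
    final class FleetMetrics {
        private final LongCounter ingestedPoints;

        FleetMetrics() {
            Meter meter = GlobalOpenTelemetry.getMeter("okapi-fleet"); // scope name is illustrative
            this.ingestedPoints = meter.counterBuilder("okapi.ingested.points")
                    .setUnit("{points}")
                    .build();
        }

        void recordIngest(long points) {
            ingestedPoints.add(points);
        }
    }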

> Overall: promising approach with a strong cost/flexibility angle

Thank you!

TL;DR of what's coming soon: a fixed-memory buffer, durability improvements, deployment improvements, one very long and detailed benchmark report, OTel/Prometheus-style support, and PromQL queries.