latch a day ago

Author of ccache here.

I've barely touched Go in over a decade, but if I did, I'd probably still use ccache when I didn't need anything cutting edge (because I think the API is simple), though not for something at huge scale.

When I wrote ccache, there were two specific features that we wanted that weren't readily available:

- Having both a key and a subkey, so that you can delete either by key or key+subkey (what ccache calls LayeredCache).

- Having items cached that other parts of the system also have a long-living reference to, so there's not much point in evicting them (what ccache calls Tracking and is just a separate ARC mechanism that overrides the eviction logic).
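To illustrate the key + subkey idea from the first point, here is a minimal sketch. It is not ccache's implementation: the `layered` type and its method names are hypothetical, and it leaves out eviction, expiry, and locking entirely.

```go
package main

import "fmt"

// layered is a hypothetical two-level map: primary key -> subkey -> value.
// It only shows the delete-by-key / delete-by-key+subkey behaviour.
type layered struct {
	buckets map[string]map[string]string
}

func newLayered() *layered {
	return &layered{buckets: make(map[string]map[string]string)}
}

// Set stores value under key+sub, creating the inner map on first use.
func (l *layered) Set(key, sub, value string) {
	b, ok := l.buckets[key]
	if !ok {
		b = make(map[string]string)
		l.buckets[key] = b
	}
	b[sub] = value
}

// Get looks up a single key+sub entry.
func (l *layered) Get(key, sub string) (string, bool) {
	v, ok := l.buckets[key][sub] // reading a missing inner map is safe
	return v, ok
}

// Delete removes one key+sub entry and drops the inner map once it is empty.
func (l *layered) Delete(key, sub string) {
	b, ok := l.buckets[key]
	if !ok {
		return
	}
	delete(b, sub)
	if len(b) == 0 {
		delete(l.buckets, key)
	}
}

// DeleteAll removes every subkey stored under key.
func (l *layered) DeleteAll(key string) {
	delete(l.buckets, key)
}

func main() {
	c := newLayered()
	c.Set("/users/1", "json", "{}")
	c.Set("/users/1", "xml", "<user/>")

	c.Delete("/users/1", "xml") // drop one representation
	_, okJSON := c.Get("/users/1", "json")
	_, okXML := c.Get("/users/1", "xml")
	fmt.Println(okJSON, okXML)

	c.DeleteAll("/users/1") // drop everything cached for the resource
	_, okJSON = c.Get("/users/1", "json")
	fmt.Println(okJSON)
}
```

The typical use is caching several representations of one resource and invalidating them all at once when the resource changes.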

It also supports caching based on arbitrary item size (rather than just a count of items), but I don't remember if that was common back then.

I've always thought that this, and a few other smaller features, make it a little bloated. Each cached item carries a lot of information (1). I'm surprised that, in the linked benchmark, the memory usage isn't embarrassing.

I'm not sure that having a single goroutine do a lot of the heavy lifting, to minimize locks, is a great idea. It has a lot of drawbacks, and if I were to start over again, I'd really want to benchmark it to see if it's worth it (I suspect that, under heavy write loads, it might perform worse).

The one feature that I do like, that I think most LRUs should implement, is to have a [configurable] number of gets before an item is promoted. This not only reduces the need for locking, it also adds some frequency bias to evictions.
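A rough sketch of what "N gets before promotion" can look like, assuming a plain single-threaded LRU built from a map plus `container/list`. The `lru` type, its fields, and `getsPerPromote` here are illustrative names, not ccache's code; in a concurrent cache the point is that most reads never touch the shared list at all.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key   string
	value int
	gets  int // reads since the last promotion
}

// lru is a hypothetical LRU: a map for lookup plus a doubly linked list
// for recency order (front = most recently promoted).
type lru struct {
	capacity       int
	getsPerPromote int
	items          map[string]*list.Element
	order          *list.List
}

func newLRU(capacity, getsPerPromote int) *lru {
	return &lru{
		capacity:       capacity,
		getsPerPromote: getsPerPromote,
		items:          make(map[string]*list.Element),
		order:          list.New(),
	}
}

// Get returns the value and only moves the entry to the front of the
// list once it has been read getsPerPromote times since its last promotion.
func (c *lru) Get(key string) (int, bool) {
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	e := el.Value.(*entry)
	e.gets++
	if e.gets >= c.getsPerPromote {
		e.gets = 0
		c.order.MoveToFront(el)
	}
	return e.value, true
}

// Set inserts or updates an entry, evicting from the back when full.
func (c *lru) Set(key string, value int) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		return
	}
	if c.order.Len() >= c.capacity {
		if oldest := c.order.Back(); oldest != nil {
			c.order.Remove(oldest)
			delete(c.items, oldest.Value.(*entry).key)
		}
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
}

func main() {
	c := newLRU(2, 3)
	c.Set("a", 1)
	c.Set("b", 2)
	c.Get("a")    // one read: below the threshold, "a" stays at the back
	c.Set("c", 3) // evicts "a" even though it was read more recently than "b"
	_, ok := c.Get("a")
	fmt.Println(ok)
}
```

With a threshold of 1 this degenerates to a regular LRU; higher thresholds mean an item has to be read repeatedly to earn its place, which is where the frequency bias comes from.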

Fun Fact: My go-to interview question was to implement a cache. It was always rewarding to see people make the leap from using a single data structure (a dictionary) to using two (dictionary + linked list) to achieve a goal. It's not a way most of us are trained to think of data structures, which I think is a shame.

(1) https://github.com/karlseguin/ccache/blob/master/item.go#L22

maypok86 a day ago

Putting aside performance metrics (latency, throughput, hit rate, memory usage), here's what I don't like:

1. I don't really see how the API is simpler. ccache has tons of methods like `GetsPerPromote`, `PercentToPrune`, `Buckets`, `PromoteBuffer`, `DeleteBuffer`. How is a user supposed to know what values to set here? Honestly, even with all the time I've spent digging through cache implementations, I don't fully understand what should be configured there. Otter simply doesn't need any of these - you just specify the maximum size and the cache works.

2. Numerous methods like `tracking` and `promote` are again unnecessary for otter. Just `getIfPresent` and `set`/`setIfAbsent` and you're good to go.

3. The lack of loading and refreshing features seems like a significant drawback, as they typically provide major benefits for slow data sources.

latch a day ago

I don't disagree. It's like 13 years old. `GetWithoutPromote` was added in 2022; I assume someone asked for it, so I added it. That kind of stuff happens, especially when you stop building it for your own needs.

For the most part, you use a default config and use Get/Fetch/Set. Besides the excuse of its age, and not being seriously worked on for a long time (a decade?), I do think we both have a bias towards what's more familiar. What are the `ExpiryCalculator`, `Weigher`, etc... configuration options of Otter? (or `GetEntryQuietly`, `SetRefreshableAfter` ...)

maypok86 a day ago

I believe `ExpiryCalculator` is fairly self-explanatory. For example, `ExpiryWriting` returns an `ExpiryCalculator` that specifies the entry should be automatically evicted from the cache after the given duration from either its creation or value update. The expiration time isn't refreshed on reads.

`Weigher` is also likely clear from its doc. Many developers are at least familiar with this concept from other languages or libraries like ristretto and ttlcache.

`GetEntryQuietly` retrieves the cache entry for a key without any side effects - it doesn't update statistics or influence eviction policies (unlike `GetEntry`). I genuinely think this is reasonably clear.

I'm completely baffled why `SetRefreshableAfter` made this list. If you understand refreshing, it's obviously just `SetTTL` but for the refresh policy.

Honestly, I mostly disagree about the options being unclear. I suspect `Executor` is the only one that might confuse users after reading the docs, and it's mainly for testing anyway. My core complaint is the first point in my comment - tuning the cache requires deep understanding of its internals. Take ristretto's `NumCounters` parameter: users don't understand it and often just set it to `maxCost * 10` like the README example. But this completely breaks when using custom per-entry costs (like byte sizes).
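To put rough numbers on the `NumCounters` problem, here is a back-of-the-envelope sketch. It calls no ristretto API. The 1 GiB budget and 1 KiB average entry size are made-up figures, `countersFor` is a hypothetical helper, and it assumes the 10x rule of thumb is meant to apply to the number of entries rather than to the total cost.

```go
package main

import "fmt"

// countersFor applies the "10x" rule of thumb to whatever count it is given.
func countersFor(n int64) int64 {
	return n * 10
}

func main() {
	const maxCost int64 = 1 << 30       // 1 GiB budget, cost measured in bytes (assumed)
	const avgEntryBytes int64 = 1 << 10 // 1 KiB average entry size (assumed)

	expectedItems := maxCost / avgEntryBytes

	naive := countersFor(maxCost)       // treats bytes as if they were entries
	sized := countersFor(expectedItems) // based on the expected number of entries

	fmt.Println(naive, sized, naive/sized)
	// prints "10737418240 10485760 1024"
}
```

Under these assumptions the copied formula asks for about a thousand times more counters than the entry count justifies, and the overshoot factor is simply the average entry size in cost units.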

But as I mentioned when reviewing sturdyc, it probably comes down to personal preference.