vidarh 5 hours ago

I was about to say they have a self-hosting guide, but I see they use third-party services that seem absolutely pointless for such a tiny dataset. For comparison, I have a project that happily analyzes 150 million tokens' worth of Claude session data with some basic caching in plain text files on a $300 mini PC in seconds... If/when I reach billions, I might throw SQLite into the stack. Maybe once I reach tens of billions, something bigger will be worthwhile.
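
For illustration only, a minimal sketch of what "basic caching in plain text files" could look like. This is not the project's actual code: the `*.jsonl` layout, the tab-separated cache format, the function names, and the whitespace-based `count_tokens` placeholder are all assumptions made for the example. The idea is to store each file's token count next to its mtime and size, and re-read only files whose mtime or size changed.

```python
from pathlib import Path


def count_tokens(text: str) -> int:
    # Placeholder (assumption): whitespace split, not a real tokenizer.
    return len(text.split())


def load_cache(cache_path: Path) -> dict:
    """Read tab-separated lines: path, mtime_ns, size, token_count."""
    cache = {}
    if cache_path.exists():
        for line in cache_path.read_text(encoding="utf-8").splitlines():
            # rsplit keeps paths containing tabs intact (newlines are not handled).
            path, mtime_ns, size, tokens = line.rsplit("\t", 3)
            cache[path] = (int(mtime_ns), int(size), int(tokens))
    return cache


def save_cache(cache_path: Path, cache: dict) -> None:
    lines = [f"{p}\t{m}\t{s}\t{t}" for p, (m, s, t) in sorted(cache.items())]
    cache_path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")


def total_tokens(session_dir: Path, cache_path: Path) -> tuple[int, int]:
    """Return (total token count, number of files that had to be re-read)."""
    old = load_cache(cache_path)
    new = {}
    reread = 0
    for f in sorted(session_dir.glob("*.jsonl")):
        st = f.stat()
        key = str(f)
        hit = old.get(key)
        if hit is not None and hit[0] == st.st_mtime_ns and hit[1] == st.st_size:
            new[key] = hit  # unchanged file: reuse the cached count
        else:
            tokens = count_tokens(f.read_text(encoding="utf-8"))
            new[key] = (st.st_mtime_ns, st.st_size, tokens)
            reread += 1
    save_cache(cache_path, new)
    return sum(t for _, _, t in new.values()), reread
```

Calling `total_tokens(Path("sessions"), Path("cache.tsv"))` twice in a row should re-read files only on the first call. Everything stays in one process and the cache remains greppable, which is the point of the comparison above.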

keks0r 5 hours ago

There is also a Docker setup in there to run everything locally.

vidarh 4 hours ago

That's great. It's still over-engineered, given that processing this data in-process is more than fast enough at a scale far greater than theirs.
