neomantra 7 hours ago

> MCP tools don't really work for financial data at scale. One tool call for five years of daily prices dumps tens of thousands of tokens into the context window.

I maintain an OSS SDK for Databento market data. A year ago, I naively wrapped the API and certainly felt this pain. Having an API call drop a firehose of structured data into the context window was not very helpful. The tool there was get_range and the data was lost to the context.
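To put a rough number on the quoted claim, here is a minimal sketch that serializes five years of made-up daily bars to JSON and estimates the token count. The 252 trading days per year, the four-characters-per-token rule of thumb, and the row shape are all assumptions for illustration; a real tokenizer and a real API payload will give different numbers.

```python
import json
from datetime import date, timedelta

TRADING_DAYS_PER_YEAR = 252  # assumption: typical US equity calendar
CHARS_PER_TOKEN = 4          # assumption: crude rule of thumb, not a real tokenizer


def synthetic_daily_bars(years: int) -> list[dict]:
    """Build placeholder OHLCV rows; the values are made up, only the shape matters."""
    start = date(2020, 1, 2)
    rows = []
    for i in range(years * TRADING_DAYS_PER_YEAR):
        base = round(100.0 + i * 0.01, 2)
        rows.append({
            "date": (start + timedelta(days=i)).isoformat(),  # calendar days, for simplicity
            "open": base,
            "high": round(base + 1.0, 2),
            "low": round(base - 1.0, 2),
            "close": round(base + 0.5, 2),
            "volume": 1_000_000 + i,
        })
    return rows


def estimate_tokens(payload: str) -> int:
    """Estimate tokens from character count using the rule of thumb above."""
    return len(payload) // CHARS_PER_TOKEN


bars = synthetic_daily_bars(5)
payload = json.dumps(bars)
print(len(bars), "rows,", len(payload), "chars,", estimate_tokens(payload), "estimated tokens")
```

Under these assumptions a single symbol already lands in the tens of thousands of tokens, before any intraday data or additional symbols.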

Recently I updated the MCP server [1] to download the Databento market data into Parquet files on the local filesystem and track those with DuckDB. The MCP tool calls are now fetch_range to fill the cache, plus list_cache and query_cache to inspect it and run SQL queries on it.
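The shape of that pattern can be sketched as follows. This is not the dbn-go implementation: it swaps in Python's built-in sqlite3 as a stand-in for the Parquet + DuckDB cache, generates made-up rows instead of calling Databento, and the function signatures are hypothetical (only the tool names come from the comment above). The point it illustrates is that fetch_range returns a small summary while the rows stay out of the context window.

```python
import sqlite3
from datetime import date, timedelta

# Stand-in for the Parquet + DuckDB cache: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bars (symbol TEXT, day TEXT, close REAL)")


def fetch_range(symbol: str, start: date, days: int) -> dict:
    """Fill the cache and return only a small summary, never the rows themselves."""
    # Hypothetical placeholder for the real market-data download; values are made up.
    rows = [
        (symbol, (start + timedelta(days=i)).isoformat(), 100.0 + i)
        for i in range(days)
    ]
    conn.executemany("INSERT INTO bars VALUES (?, ?, ?)", rows)
    conn.commit()
    return {"symbol": symbol, "rows_cached": len(rows)}


def list_cache() -> list[tuple]:
    """Describe what is cached: symbol, row count, first day, last day."""
    return conn.execute(
        "SELECT symbol, COUNT(*), MIN(day), MAX(day) FROM bars "
        "GROUP BY symbol ORDER BY symbol"
    ).fetchall()


def query_cache(sql: str, max_rows: int = 20) -> list[tuple]:
    """Run SQL against the cache, capping how many rows go back to the model."""
    return conn.execute(sql).fetchmany(max_rows)


summary = fetch_range("TEST", date(2020, 1, 1), 1260)  # "TEST" is a made-up symbol
print(summary)
print(list_cache())
print(query_cache("SELECT MIN(close), MAX(close), AVG(close) FROM bars WHERE symbol = 'TEST'"))
```

The model sees a few dozen tokens per call and asks for aggregates in SQL, rather than receiving 1,260 rows it then has to summarize itself.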

I haven't promoted it at all, but it would probably pair well with a platform like this. I'd be interested in how people might use this and I'm trying to understand how this approach might generally work with LLMs and DuckLake.

[1] https://github.com/NimbleMarkets/dbn-go/blob/main/cmd/dbn-go...

TacticalCoder 29 minutes ago

> A year ago, I naively wrapped the API and certainly felt this pain.

Most people, before being confronted with it, have no idea how big market data feeds really are: I certainly had no idea what I was getting into. There's a reason all these subscriptions are that pricey.

Here's an example of the pricing for the OPRA feed for Databento you mentioned:

https://databento.com/pricing#opra

We're talking about feeds that sustain 25+ Gb/s and can have spikes at twice or even three times that. And that's only for options market data.

I mean: even people with 25 Gb/s fiber at home (which we can all agree ain't the most common, and that's an understatement) still can't dream of getting the entire feed.

Having enough bandwidth, storing and analyzing that amount of data: everything becomes problematic at such scales.
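For a sense of scale, a quick back-of-the-envelope conversion of the 25 Gb/s figure (gigabits, not gigabytes) into storage terms. The 6.5-hour session length and the assumption that the rate holds for the whole session are illustrative simplifications, not measurements of any actual feed.

```python
FEED_GBIT_PER_S = 25          # figure quoted above, in gigabits per second
SESSION_SECONDS = 6.5 * 3600  # assumption: regular US session, 09:30 to 16:00

# 8 bits per byte: Gb/s (bits) is an eighth of GB/s (bytes).
bytes_per_s = FEED_GBIT_PER_S * 1e9 / 8
gbyte_per_s = bytes_per_s / 1e9

# Assumption: the rate is held for the entire session.
tbyte_per_session = bytes_per_s * SESSION_SECONDS / 1e12

print(f"{gbyte_per_s:.3f} GB/s, {tbyte_per_session:.3f} TB per session")
```

Under those assumptions that is about 3.1 GB every second and roughly 73 TB per trading day, before accounting for any spikes.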

As for me, I'm reusing my brokers' feeds (as I already pay for them): it's not a panacea, but I get the info I need (plus balances/orders/etc. tied to my accounts).