TacticalCoder 9 hours ago

> Next I'm going to set it loose on 263 GB database of every stock quote and options trade in the past 4 years.

Options quotes alone for US equities (or things that trade as such, like ADS/ADR) represent 40 Gbit per second during options trading hours. There are more than 60 million trades (not quotes, only trades) per day. As the stock market is open approx. 250 days per year (a bit more), that's more than 60 billion actual options trades in 4 years. If we're talking about quotes for options, you can add several orders of magnitude to these numbers.

And I only mentioned options. How do you store "every stock quote and options trade in the past 4 years" in 263 GB!?
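
As a rough back-of-the-envelope check of the figures above, here is a minimal sketch in Python. It only uses the numbers quoted in this comment; the constant and function names are made up for illustration, and it assumes that "263 GB" means decimal gigabytes and that the year has exactly 250 trading days.

```python
# Back-of-the-envelope check: how many bytes per options trade would a
# 263 GB database leave, if it held nothing but those trades?

TRADES_PER_DAY = 60_000_000    # "more than 60 million" options trades per day (from the comment)
TRADING_DAYS_PER_YEAR = 250    # approximate figure from the comment
YEARS = 4
DATABASE_BYTES = 263 * 10**9   # assumption: 263 GB in decimal gigabytes


def total_trades(per_day=TRADES_PER_DAY, days=TRADING_DAYS_PER_YEAR, years=YEARS):
    """Total number of trades over the whole period."""
    return per_day * days * years


def bytes_per_trade(database_bytes, trades):
    """Average storage budget per trade, in bytes."""
    return database_bytes / trades


if __name__ == "__main__":
    trades = total_trades()
    print(f"{trades:,} trades")
    print(f"{bytes_per_trade(DATABASE_BYTES, trades):.2f} bytes per trade")
```

Under those assumptions the budget comes out at roughly 4.4 bytes per options trade, before any stock data is counted.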

jtbaker 9 hours ago

> And I only mentioned options. How do you store "every stock quote and options trade in the past 4 years" in 263 GB!?

I think this would be pretty straightforward for Parquet with ZSTD compression and some smart ordering/partitioning strategies.
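
To make the idea concrete, here is a small standard-library sketch of why column layout plus ordering can help a general-purpose compressor. It is not Parquet and not ZSTD: `zlib` stands in for ZSTD, a hand-rolled byte layout stands in for Parquet column chunks, and the delta-encoded timestamps are my own addition (similar in spirit to Parquet's delta encoding). The trade schema and the synthetic data are hypothetical, so the printed sizes say nothing about real market data.

```python
import random
import struct
import zlib


def make_trades(n=5000, n_symbols=20, seed=42):
    """Synthetic (symbol_id, timestamp_us, price_cents, size) tuples in time order."""
    rng = random.Random(seed)
    trades, ts = [], 0
    for _ in range(n):
        ts += rng.randint(1, 1000)  # strictly increasing timestamps
        sym = rng.randrange(n_symbols)
        price = 10_000 + 100 * sym + rng.randint(-50, 50)
        trades.append((sym, ts, price, rng.choice([1, 1, 2, 5, 10])))
    return trades


def pack_rows(trades):
    """Row-wise layout: one fixed-width 16-byte record per trade, in time order."""
    return b"".join(struct.pack("<HQIH", *t) for t in trades)


def pack_columns(trades):
    """Column-wise layout, sorted by (symbol, timestamp), timestamps delta-encoded."""
    ordered = sorted(trades)
    syms, stamps, prices, sizes = zip(*ordered)
    # Deltas go negative when the symbol changes, hence the signed 'q' format.
    deltas = [stamps[0]] + [b - a for a, b in zip(stamps, stamps[1:])]
    n = len(ordered)
    return (struct.pack(f"<{n}H", *syms) + struct.pack(f"<{n}q", *deltas)
            + struct.pack(f"<{n}I", *prices) + struct.pack(f"<{n}H", *sizes))


def unpack_columns(blob, n):
    """Inverse of pack_columns: returns the trades sorted by (symbol, timestamp)."""
    syms = struct.unpack_from(f"<{n}H", blob, 0)
    deltas = struct.unpack_from(f"<{n}q", blob, 2 * n)
    prices = struct.unpack_from(f"<{n}I", blob, 10 * n)
    sizes = struct.unpack_from(f"<{n}H", blob, 14 * n)
    stamps, running = [], 0
    for d in deltas:
        running += d
        stamps.append(running)
    return list(zip(syms, stamps, prices, sizes))


if __name__ == "__main__":
    trades = make_trades()
    rows, cols = pack_rows(trades), pack_columns(trades)
    print("raw bytes:          ", len(rows))
    print("row-wise compressed:", len(zlib.compress(rows, 9)))
    print("columnar compressed:", len(zlib.compress(cols, 9)))
```

Both layouts hold the same 16 bytes per trade before compression; the only difference is how the bytes are arranged. Sorting by symbol turns the symbol column into long runs of identical values, which is the kind of redundancy a compressor can exploit. How much that buys on a real data set depends on the data and the chosen partitioning.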

dataviz1000 8 hours ago

I see, I said "stock quote" instead of "minute aggregates". You are correct that the quotes data set is much larger; at ~1.5 TB a year [0], I did not download 6 TB of data onto my laptop. Every settled trade, options or stocks, isn't that big.

[0] https://massive.com/docs/flat-files/stocks/quotes