| ▲ | kqr 3 hours ago | |
Super interesting! You can click the "live" link in the header to see how they performed over time. The (geometric) average result at the end seems to be that the LLMs are down 35 % from their initial capital – and they got there in just 96 model-days. That's a daily return of about -0.45 %, or a yearly return of -81 %, i.e. practically wiping out the starting capital. Although I lack the maths to determine it numerically (depends on volatility etc.), it looks to me as though all six are overbetting and would be ruined in the long run. It would have been interesting to compare against a constant fraction portfolio that maintains 1/6 in each asset, as closely as possible while optimising for fees. (Or even better, Cover's universal portfolio, seeded with joint returns from the recent past.) I couldn't resist starting to look into it. With no costs and no leverage, the hourly rebalanced portfolio just barely outperforms four of the six coins in the period: https://i.xkqr.org/cfportfolio-vs-6.png. I suspect costs would eat up many of the benefits of rebalancing at this timescale. This is not too surprising, given the similarity of coin returns. The mean pairwise correlation is 0.8, the lowest is 0.68. Not particularly good for diversification returns. https://i.xkqr.org/coinscatter.png > difficulty executing against self-authored plans as state evolves This is indeed also what I've found trying to make LLMs play text adventures. Even when given a fair bit of help in the prompt, they lose track of the overall goal and find some niche corner to explore very patiently, but ultimately fruitlessly. | ||
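For anyone who wants to redo the arithmetic, here is a rough Python sketch of the two calculations in the comment above: turning a total return over 96 model-days into a compounded daily and yearly figure, and growing a constant-fraction portfolio that is reset to equal weights every period with no fees and no leverage. The function names are my own, a 365-day year is assumed, and the two price series at the end are made-up toy numbers chosen to show the rebalancing effect, not the actual coin data.

```python
def per_period_return(total_return, periods):
    """Constant per-period return that compounds to `total_return` over `periods`."""
    return (1.0 + total_return) ** (1.0 / periods) - 1.0


def compound(per_period, periods):
    """Total return from compounding `per_period` for `periods` periods."""
    return (1.0 + per_period) ** periods - 1.0


def rebalanced_growth(price_paths):
    """Growth factor of a portfolio reset to equal weights every period.

    Assumes no trading costs and no leverage. `price_paths` is a list of
    equal-length price lists, one per asset.
    """
    n_assets = len(price_paths)
    n_steps = len(price_paths[0])
    wealth = 1.0
    for t in range(1, n_steps):
        # With equal weights, the portfolio's gross return for the period
        # is the plain average of the assets' gross returns.
        gross = sum(p[t] / p[t - 1] for p in price_paths) / n_assets
        wealth *= gross
    return wealth


# Down 35 % over 96 model-days, annualised over an assumed 365-day year.
daily = per_period_return(-0.35, 96)
yearly = compound(daily, 365)
print(f"daily {daily:.2%}, yearly {yearly:.0%}")  # daily -0.45%, yearly -81%

# Toy (hypothetical) prices: both assets end where they started,
# but they move in opposite directions each period.
asset_a = [100, 200, 100, 200, 100]
asset_b = [100, 50, 100, 50, 100]
print(rebalanced_growth([asset_a, asset_b]))  # 2.44140625
```

The toy case is the extreme one: two perfectly opposed assets each return nothing on their own, yet the rebalanced mix grows by 25 % per period. With pairwise correlations around 0.8 the assets mostly move together, so there is far less of this effect to harvest, and fees at an hourly timescale would cut into whatever is left.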
| ▲ | falcor84 2 hours ago | parent | next [-] | |
Agreed, and I'd also love to see a baseline of human performance here, both from experienced quant traders and from fresh grads who know the theory but have never done this sort of trading and aren't familiar with the crypto futures market. | ||
| ▲ | fragmede an hour ago | parent | prev [-] | |
> find some niche corner to explore very patiently, but ultimately fruitlessly. What, so they're better at my hobbies than me? Someone give Claude a 3D printer! | ||