WarmWash 4 hours ago

I recently did my taxes using all three models (My return is ~50 pages, much more than a standard 1040).

GPT (codex) was accurate on the first run and took 12 minutes

Gemini (antigravity) missed 1 value because it didn't load the full 1099 PDF (the laziness), but corrected it when prompted. However, it only spent 2 minutes on the task.

Claude (CC) made all manner of mistakes, and I had to wait overnight for it to finish because it hit my usage limit partway through. However, Claude did the best on the next step of actually filling out the PDF forms, but it ended up not mattering.

Ultimately I used Gemini in Chrome to fill out the forms (freefillableforms.com), but frankly it would have been faster to do it manually, copying from the spreadsheets that GPT and Gemini output.

I also use antigravity a lot for small greenfield projects (<5k LOC). I don't notice a difference between Gemini and Claude, outside of usage limits. Besides that, I mostly use Gemini for its math and engineering capabilities.