gallerdude 4 days ago

Very interesting - have you tried using `o1` yet? I made a program that has LLMs solve Wordle puzzles, and the difference between `4o` and `o1` is absolutely astonishing.
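
For anyone wanting to try something similar: a minimal sketch of the feedback step such a program would need, assuming standard Wordle rules. This is not the actual program mentioned above; the name `score_guess` and the `G`/`Y`/`-` encoding are made up for illustration. The returned string is what you would send back to the model after each guess.

```python
from collections import Counter


def score_guess(guess: str, answer: str) -> str:
    """Return Wordle feedback: G = right spot, Y = wrong spot, - = absent.

    Hypothetical helper; the encoding is an assumption for illustration.
    """
    guess, answer = guess.lower(), answer.lower()
    if len(guess) != len(answer):
        raise ValueError("guess and answer must have the same length")

    feedback = ["-"] * len(guess)
    remaining = Counter()  # answer letters not matched exactly

    # First pass: mark exact matches and count the leftover answer letters.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "G"
        else:
            remaining[a] += 1

    # Second pass: mark misplaced letters, consuming leftovers so that
    # repeated letters in the guess are not over-credited.
    for i, g in enumerate(guess):
        if feedback[i] != "G" and remaining[g] > 0:
            feedback[i] = "Y"
            remaining[g] -= 1

    return "".join(feedback)


if __name__ == "__main__":
    print(score_guess("speed", "abide"))
```

The two-pass approach matters for repeated letters: in the example, only one of the two `e`s in "speed" gets marked, because "abide" has only one `e`.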

gallerdude 4 days ago | parent | next [-]

4o-mini: 16%, 4o: 50%, o1-mini: 97%, o1: 100%

* Disclaimer: only n=7 on o1. The others are around 100-300 each.

simonw 4 days ago | parent | prev [-]

OK, that was fun. I just tried o1-preview on today's Wordle and it got it on the third guess: https://chatgpt.com/share/673f9169-3654-8006-8c0b-07c53a2c58...

gallerdude 3 days ago | parent [-]

With some transcribing (using another LLM instance) I’ve even gotten it to solve NYT mini crosswords.