| ▲ | maxwells-daemon a day ago | ||||||||||||||||
Aristotle's output is formally verified in Lean, so you can run it for days on a hard problem and be assured that the answer, no matter how complex, is right without needing to manually check it. Claude Code can write lean, but we do a heck of a lot of RL on theorem proving, so Aristotle winds up being much better at writing Lean than other coding agents are. | |||||||||||||||||
| ▲ | jjmarr 21 hours ago | parent | next [-] | ||||||||||||||||
Seeing a task-specific model be consistently better at anything is extremely surprising given rapid innovation in foundation models. Have you tried Aristotle on other, non-Lean tasks? Is it better at logical reasoning in general? | |||||||||||||||||
| |||||||||||||||||
| ▲ | Davidzheng 20 hours ago | parent | prev [-] | ||||||||||||||||
how strong is your internal informal LLM at theorem-proving before the formalization stage? or it's combined in a way so that is not measurable? | |||||||||||||||||