| ▲ | culi a day ago | |||||||
Kimi K2 is the model that most consistently passes the clock test. I agree it's definitely got something unique going on. | ||||||||
| ▲ | davej a day ago | |||||||
Nice! I'm curious, what does this service cost to run? I notice that you don't have more expensive models like Opus, but querying the models every minute must add up over time (excuse the pun)? | ||||||||
| ▲ | eunos a day ago | |||||||
Lol, why's GPT-5 broken on that test? DeepSeek is surprisingly crisp and robust. | ||||||||