| ▲ | syndacks 6 hours ago | |
How do people evaluate creative writing and emotional intelligence in LLMs? Most benchmarks seem to focus on reasoning or correctness, which feels orthogonal. I’ve been playing with Kimi K2.5 and it feels much stronger on voice and emotional grounding, but I don’t know how to measure that beyond human judgment. | ||
| ▲ | mohsen1 2 hours ago | parent [-] | |
I am trying! https://mafia-arena.com I just don't have enough funding to run a lot of tests. | ||