creddit 5 hours ago
Playing with this some more, and it's actively not good: basic mathematical errors riddle its responses. I did some basic adversarial testing in which its responses are analyzed by Gemini, and Gemini is finding basic math errors in every relatively simple ask I make (simple relative to what Opus, Gemini, or GPT can handle). Yikes.
smlacy an hour ago
Post actual results, make a blog post. Don't just say "this sucks" without tangible evidence. Otherwise you're doomed to "sample size of one" level of relevance.