raw_anon_1111 14 hours ago
For the most part, I don't do chatbots, except for a couple of RAG-based ones. It's more behind-the-scenes stuff like image understanding, categorization, nuanced sentiment analysis, semantic alignment, etc. I've created a framework that lets me test quality in an automated way across prompt changes and models, and I compare cost, speed, and quality. Out of all of those, the only thing that requires humans to judge the quality is the RAG results.
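A rough sketch of what a comparison harness like that could look like, not the actual framework. Everything here is assumed for illustration: the `Candidate` and `Result` types, the stub predictors standing in for real model calls, the flat per-call prices, and the three toy sentiment cases.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    name: str                      # hypothetical label for a model + prompt pair
    predict: Callable[[str], str]  # stand-in for a real model call
    cost_per_call: float           # assumed flat price per request, in USD


@dataclass
class Result:
    name: str
    accuracy: float    # fraction of cases answered correctly
    total_cost: float  # cost_per_call * number of cases
    seconds: float     # wall-clock time for the whole run


def evaluate(candidate: Candidate, cases: list[tuple[str, str]]) -> Result:
    """Run one candidate over labelled cases and record quality, cost, speed."""
    start = time.perf_counter()
    correct = sum(candidate.predict(text) == expected for text, expected in cases)
    elapsed = time.perf_counter() - start
    return Result(
        name=candidate.name,
        accuracy=correct / len(cases),
        total_cost=candidate.cost_per_call * len(cases),
        seconds=elapsed,
    )


# Toy labelled cases; the last one is sarcastic on purpose.
CASES = [
    ("I love this", "positive"),
    ("This is terrible", "negative"),
    ("Great, another outage", "negative"),
]


def keyword_stub(text: str) -> str:
    """Naive keyword matcher, used here in place of a real model."""
    hits = ("love", "great")
    return "positive" if any(word in text.lower() for word in hits) else "negative"


def always_negative_stub(text: str) -> str:
    """Trivial baseline that ignores its input."""
    return "negative"


CANDIDATES = [
    Candidate("keyword_stub", keyword_stub, cost_per_call=0.002),
    Candidate("always_negative_stub", always_negative_stub, cost_per_call=0.001),
]

RESULTS = [evaluate(candidate, CASES) for candidate in CANDIDATES]

for result in RESULTS:
    print(
        f"{result.name}: accuracy={result.accuracy:.2f} "
        f"cost=${result.total_cost:.4f} time={result.seconds:.6f}s"
    )
```

Swapping a stub for a real API call and rerunning the same cases after each prompt change gives a like-for-like comparison. Exact-match scoring only works for tasks with a single right label, which fits with RAG answers still needing a human judge.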
biophysboy 14 hours ago | parent
So who is the winner using the framework you created?