raviisoccupied 5 hours ago
I have been working on a web app called Beval - Simple evaluations for your AI product. In my day-to-day as a Product Manager on a team that ships AI products, I often found myself wanting to do 'quick and dirty' LLM-based evaluation on conversation transcripts and traces. I found myself blocked by 'Gemini in Google Sheets': it was too slow and cumbersome, and it didn't handle eval changes well. And because I was exploring, it wasn't helpful to try to set up something more robust with the team. To fix the problem I eventually learned to call the OpenAI API in Python, but I really felt that I wanted a 'product' to help me and potentially help others. So this weekend I built https://beval.space