raviisoccupied 5 hours ago

I have been working on a web app called Beval: simple evaluations for your AI product.

In my day-to-day work as a Product Manager on a team that ships AI products, I often wanted to do 'quick and dirty' LLM-based evaluation on conversation transcripts and traces. I kept getting blocked by 'Gemini in Google Sheets': it was too slow and cumbersome, and it didn't handle eval changes well. And because I was still exploring, it wasn't worth trying to set up something more robust with the team.

To fix the problem I eventually learned to call the OpenAI API in Python, but I really wanted a 'product' that could help me and potentially help others.

So this weekend I built https://beval.space