girvo 18 hours ago

> It is still not great at debugging things.

It's so fascinating to me that the thread above this one on this page says the opposite, and the funniest thing is I'm sure you're both right. What a wild world we live in, I'm not sure how one is supposed to objectively analyse the performance of these things

ssl-3 12 hours ago | parent | next [-]

It's great at some things, and it's awful at other things. And this varies tremendously based on context.

This is very similar to how humans behave. Most people are great at a small number of things, and there's always a larger set of things that we may individually be pretty terrible at.

The bots are the same way, except: instead of billions of people who each have their own skillsets and personalities, we've got a small handful of distinct bots from different companies.

And of course: Lies.

When we ask Bob or Lisa to help with a thing that they don't understand very well, they usually will try to set reasonable expectations. ("Sorry, ssl-3, I don't really understand ZFS very well. I can try to get the SLOG -- whatever that is -- to work better with this workload, but I can't promise anything.")

Bob or Lisa may figure it out. They'll gather up some background and work on it, bring in outside help if that's useful, and probably tread lightly. This will take time. But they probably won't deliberately lie [much] about what they expect from themselves.

But when the bot is asked to do a thing that it doesn't understand very well, it's chipper as fuck about it. ("Oh yeah! Why sure I can do that! I'm well-versed in -everything-! [Just hold my beer and watch this!]")

The bot will then set forth to do the thing. It might fuck it all up with wild abandon, but it doesn't care: It doesn't feel. It doesn't understand expectations. Or cost. Or art. Or unintended consequences.

Or, it might get it right. Sometimes, amazingly-right.

But it's impossible to tell going in whether the result will be good or bad: unlike Bob or Lisa, the bot always heads into a problem as an overly-ambitious pack of lies.

(But the bot is very inexpensive to employ compared to Bob or Lisa, so we use the bot sometimes.)

AstroBen 17 hours ago | parent | prev [-]

Give them real-world problems you're encountering and see which can solve them the best, if at all.

A full week of that should give you a pretty good idea

Maybe some models just suit particular styles of prompting that do or don't match what you're doing