dylan604 5 hours ago

Isn't this precisely how AI started? It was a bunch of humans under the hood doing the logic when the companies said it was AI. Then we removed the humans and the quality took a hit. To fix that hit, 3rd party companies are putting humans back in the loop? Isn't that kind of like putting a band-aid on the spot where your arm was just blown off?

mschulkind 2 hours ago | parent | next [-]

No, not really.

If you have an AI that can answer 90% of queries correctly AND, now this is the key, it knows which 90% it can answer correctly, then a human in the loop can be incredibly valuable for answering that other 10%.

dhorthy 2 hours ago | parent [-]

hah yeah, I don't know how soon we'll get great accuracy on the latter. For things like "send an email", people tend to just block everything for approval, because clicking approve 90 times and hand-editing 10 times is a lot better than copying 90 things from one app to another and then, for the other 10, copying, hand-editing, and sending.

although I do have some ideas on how you could use vector similarity against past executions to get a 1-100 score on how likely a given action is to be approved or rejected. You could set a dial to "anything below 60 just auto-reject it and provide the past feedback to the model preemptively". This would need a lot of experimentation, might even be a research angle (if it hasn't been tried already).

(thinking something like cosine * {1 if approved, -1 if rejected}, then normalize the score to 1-100. You could maybe even weight rejections somewhere between 0 and -1 based on sentiment)
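
To make that concrete, here's a rough sketch of the scoring idea in plain Python. Everything in it is an assumption for illustration: the `PastExecution` record, the `rejection_weight` field (standing in for a sentiment-based severity between 0 and 1), the `top_k` neighbour cutoff, and the choice to ignore negatively correlated neighbours. It is untested as an approach and would need the experimentation mentioned above.

```python
import math
from dataclasses import dataclass


@dataclass
class PastExecution:
    """A previously reviewed action (hypothetical record shape)."""
    embedding: list[float]
    approved: bool
    # Assumed: 0..1 severity of a rejection, e.g. derived from feedback sentiment.
    rejection_weight: float = 1.0


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def approval_score(candidate: list[float],
                   history: list[PastExecution],
                   top_k: int = 5) -> float:
    """Score from 1 (looks like past rejections) to 100 (looks like past approvals)."""
    # Rank past executions by similarity and keep the nearest top_k.
    ranked = sorted(
        ((cosine(candidate, h.embedding), h) for h in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_k]

    # cosine * (+1 if approved, -weight if rejected); negative similarities
    # are clamped to 0 so dissimilar neighbours contribute nothing.
    signed = [
        max(sim, 0.0) * (1.0 if h.approved else -h.rejection_weight)
        for sim, h in ranked
    ]
    raw = sum(signed) / len(signed) if signed else 0.0  # in [-1, 1]

    # Linearly map [-1, 1] onto [1, 100].
    return 1 + (raw + 1) / 2 * 99


def decide(score: float, threshold: float = 60.0) -> str:
    """The 'dial': below the threshold auto-reject, otherwise ask a human."""
    return "auto_reject" if score < threshold else "ask_human"
```

With no history the score sits at the midpoint (50.5), which falls below a dial of 60, so a real setup would probably want a separate "not enough data, ask a human" path.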

dhorthy 3 hours ago | parent | prev [-]

yeah it's an interesting point. I can only guess that we didn't do a good enough job of learning from the humans while they were doing their jobs...seems like traditional ML or even LLM tech might be good enough that we can take another pass? Overall the thesis of humanlayer is that you should do all this super gradually, move the needle from 1% AI to 99%+, and have strong SLOs/SLAs around when you pause that needle moving because quality took a hit.