dhorthy 4 hours ago

hah yeah, I don't know how soon we'll have great accuracy on the latter. For things like "send an email", people tend to just gate everything behind approval, because clicking approve 90 times and hand-editing 10 times is a lot better than copying 90 things from one app to another, and then copying, hand-editing, and sending the other 10

although I do have some ideas on how you could use vector similarity against past executions to get a 1-100 score for how likely a given action is to be approved or rejected. You could set a dial to "anything below 60, just auto-reject it and provide the past feedback to the model preemptively". This would need a lot of experimentation, might even be a research angle (if it hasn't been tried already)

(thinking like cosine * {1 if approved, -1 if rejected}, then normalize the score to 1-100. You could maybe even weight rejections anywhere from 0 to -1 based on sentiment)
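
Rough sketch of what I mean, untested against real data. Everything here is made up for illustration: `PastExecution`, `rejection_strength` (the 0 to 1 sentiment weight), averaging over the `k` most similar past executions, and the linear map from [-1, 1] to [1, 100]. Embeddings are assumed to come from whatever model you already use.

```python
import math
from dataclasses import dataclass


@dataclass
class PastExecution:
    embedding: list[float]            # embedding of the past action (assumed precomputed)
    approved: bool                    # True if the human approved it
    rejection_strength: float = 1.0   # hypothetical 0..1 weight, e.g. from feedback sentiment


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def approval_score(candidate: list[float], history: list[PastExecution], k: int = 5) -> float:
    """Score from 1 (looks like past rejections) to 100 (looks like past approvals)."""
    # Keep only the k most similar past executions.
    nearest = sorted(
        ((cosine(candidate, p.embedding), p) for p in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # cosine * (+1 if approved, -rejection_strength if rejected)
    signed = [
        sim * (1.0 if p.approved else -p.rejection_strength)
        for sim, p in nearest
    ]
    # No history means no evidence either way, so raw stays neutral at 0.
    raw = sum(signed) / len(signed) if signed else 0.0
    # Map raw in [-1, 1] linearly onto [1, 100].
    return 1 + 99 * (raw + 1) / 2


def should_auto_reject(score: float, threshold: float = 60) -> bool:
    """The 'dial': anything scoring below the threshold is auto-rejected."""
    return score < threshold
```

In practice you'd also pull the feedback text from the nearest rejected executions and hand it to the model along with the auto-reject, and the threshold and `k` are exactly the things that would need experimentation.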