daxfohl a day ago

Seems like it would be easy for foundation model companies to add dedicated input and output filters (a mix of AI-based and deterministic checks) if they saw this as a problem. The input filter would rate the likelihood that a prompt is a bypass attempt, and the output filter would scan the response for disallowed content, irrespective of the input, before sending it.
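
The two-stage guard described above could be sketched roughly like this. Everything here is hypothetical: the pattern lists, the `bypass_score` threshold, and the stub model are stand-ins for the trained classifiers a real deployment would use.

```python
import re

# Hypothetical pattern lists standing in for real classifiers.
BYPASS_PATTERNS = [
    r"ignore (all |your )?previous instructions",
    r"pretend (you are|to be) (unrestricted|jailbroken)",
]
DISALLOWED_PATTERNS = [
    r"step-by-step instructions for making explosives",
]
REFUSAL = "Sorry, I can't help with that."

def bypass_score(prompt: str) -> float:
    """Input filter: fraction of known bypass patterns the prompt matches."""
    hits = sum(bool(re.search(p, prompt, re.IGNORECASE)) for p in BYPASS_PATTERNS)
    return hits / len(BYPASS_PATTERNS)

def contains_disallowed(text: str) -> bool:
    """Output filter: scans the response itself, irrespective of the input."""
    return any(re.search(p, text, re.IGNORECASE) for p in DISALLOWED_PATTERNS)

def guarded_respond(prompt: str, model, threshold: float = 0.4) -> str:
    """Filter the input before the model sees it, and the output before sending."""
    if bypass_score(prompt) >= threshold:
        return REFUSAL
    response = model(prompt)
    if contains_disallowed(response):
        return REFUSAL
    return response
```

The key property is that the output filter runs unconditionally, so even a prompt that slips past the input check can't leak disallowed content in the response.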

I guess this shows that they don't care about the problem?

jamiejones1 21 hours ago

They're focused on making their models better at answering questions accurately, and they still have a long way to go. Until they reach some magical plateau of accuracy and efficiency, they won't have time to focus on security and safety. Security is, as always, an afterthought.