OJFord 8 hours ago

Is that even possible while still training on 'things written by humans' (and not expressly for training purposes) though?

wredcoll 8 hours ago

It doesn't have to be perfect. A hypothetical law could be phrased something like "not allowed to intentionally influence the user into thinking the LLM is a human", which, sure, is up to judges in the end, but it also gives a clear indication of things to avoid doing intentionally.

basisword 8 hours ago

I feel like you could do it via the system prompt quite easily (but maybe that's my lack of knowledge showing).