traverseda a day ago

I don't understand why problems like this aren't solved by vector similarity search. Indiana Jones lives in a particular part of vector space.

Too close to one of the licensed properties you care to censor the generation of? Push that vector around. Honestly, detecting whether a given sentence is a thinly veiled reference to Indiana Jones seems to be exactly the kind of thing AI vector search is going to be good at.
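A minimal sketch of the similarity check described above, assuming prompts and protected properties have already been embedded as plain lists of floats. The function names, the toy vectors, and the 0.9 threshold are all hypothetical; a real system would get its vectors from an embedding model and tune the threshold empirically.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def flag_protected(prompt_vec, protected, threshold=0.9):
    """Return (name, score) pairs for protected embeddings the prompt is close to.

    `protected` maps a property name to its reference embedding.
    The threshold is an assumption, not a recommended value.
    """
    hits = []
    for name, ref in protected.items():
        score = cosine_similarity(prompt_vec, ref)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Toy 3-dimensional "embeddings", made up for illustration only.
protected = {
    "indiana_jones": [1.0, 0.0, 0.0],
    "other_property": [0.0, 1.0, 0.0],
}
print(flag_protected([1.0, 0.0, 0.0], protected))
```

Anything returned by `flag_protected` would be a candidate for "pushing the vector around"; what to do with a hit is a separate design decision.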

genericone a day ago | parent | next [-]

Thinking of it in terms of vector similarity does seem appropriate, but then the definition of similarity suddenly comes into debate: if you don't get Harrison Ford, but a different well-known actor along with everything else Indiana Jones, what is that? Do you flatten the vector similarity matrix to a single infringement scale?
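To make the flattening question concrete, here is one possible sketch: a weighted mean over per-attribute similarities. The attribute names, weights, and scores are hypothetical, and a weighted mean is only one of many ways to collapse the scores; nothing in the thread says any system does this.

```python
def infringement_score(attr_sims, weights):
    """Collapse per-attribute similarities (0..1) into one number via a weighted mean.

    Both dicts are keyed by attribute name. The choice of attributes and
    weights is an assumption, and it is exactly where the debate lives.
    """
    total = sum(weights[k] for k in attr_sims)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(attr_sims[k] * weights[k] for k in attr_sims) / total

# A different actor, but everything else matches (made-up numbers).
sims = {"actor": 0.1, "costume": 0.95, "props": 0.95, "setting": 0.9}
equal = {k: 1.0 for k in sims}
actor_heavy = {"actor": 5.0, "costume": 1.0, "props": 1.0, "setting": 1.0}

print(infringement_score(sims, equal))        # one answer
print(infringement_score(sims, actor_heavy))  # a very different answer
```

The same image scores high or low depending on how much the actor's likeness is weighted, which is the problem with a single scale.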

htrp a day ago | parent | prev [-]

Not worth it to compute the embedding for Indy and a "bull-whip archaeologist"? Most guardrails operate at the input level, it seems.

gavmor a day ago | parent [-]

> Not worth it to compute the embedding for Indy

If IP holders submit embeddings for their IP, how can image generators "warp" the latent space around a set of embeddings so that future inferences slide around and avoid them--not perfectly, or literally, but as a function of distance, say, following a power curve?

Maybe by "Finding non-linear RBF paths in GAN latent space"[0] to create smooth detours around protected regions.

0. https://openaccess.thecvf.com/content/ICCV2021/papers/Tzelep...
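The "function of distance, following a power curve" idea can be sketched as a simple repulsion step on a latent vector. This is not the RBF-path method from the linked paper, just a toy illustration; `repel`, `radius`, `exponent`, and `strength` are hypothetical names and values.

```python
import math

def repel(z, protected, radius=1.0, exponent=2.0, strength=0.5):
    """Nudge latent point z away from each protected embedding within `radius`.

    Push magnitude is strength * (1 - dist/radius) ** exponent, so it is
    largest near the protected point and falls to zero at the radius.
    All parameter values are assumptions for illustration.
    """
    out = list(z)
    for p in protected:
        diff = [zi - pi for zi, pi in zip(z, p)]
        dist = math.sqrt(sum(d * d for d in diff))
        if dist == 0.0 or dist >= radius:
            # Outside the region, or exactly on the point (no defined direction).
            continue
        push = strength * (1.0 - dist / radius) ** exponent
        out = [oi + push * d / dist for oi, d in zip(out, diff)]
    return out

protected = [[0.0, 0.0]]
print(repel([0.5, 0.0], protected))  # inside the radius: moved outward
print(repel([2.0, 0.0], protected))  # outside the radius: unchanged
```

A single step like this only displaces the starting latent; making every downstream inference "slide around" the region smoothly is the harder part the question is asking about.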