vincnetas 2 hours ago

We could call this "generative adversarial network" (GAN) :)

https://en.wikipedia.org/wiki/Generative_adversarial_network

wwind123 an hour ago | parent [-]

This kind of approach would generally still need human guidance; otherwise, these models might get stuck in weird niche corners of the problem space that aren't relevant to any real-world project.

ben_w 13 minutes ago | parent [-]

We could call this "reinforcement learning from human feedback" (RLHF) :)

https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...