awakeasleep 2 days ago

That's not a real rebuttal.

First, in the pre-training stage, humans curate and filter the data that's actually used for training.

Then, in the fine-tuning stage, people write ideal examples to teach task performance.

Then there is reinforcement learning from human feedback (RLHF), where people rank multiple variations of an answer the AI gives, and those rankings feed the reinforcement loop.

So there is really quite a bit of human effort and direction that goes into preventing the garbage-in, garbage-out situation you're referring to.
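
To make the ranking step a bit more concrete, here is a rough sketch of how a human ranking could be turned into a training signal. This is one common formulation (a pairwise, Bradley-Terry style loss for a reward model), not a description of any particular lab's pipeline. The function names `ranking_to_pairs` and `pairwise_loss` are made up for illustration.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_answers):
    """Turn one human ranking into (preferred, rejected) pairs.

    Assumes ranked_answers is ordered best to worst by the labeler.
    combinations() keeps input order, so the first item of each pair
    is always the one the human ranked higher.
    """
    return list(combinations(ranked_answers, 2))


def pairwise_loss(score_preferred, score_rejected):
    """Bradley-Terry style loss on two reward-model scores.

    Small when the preferred answer scores higher than the rejected
    one, large when the reward model gets the order backwards.
    """
    margin = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))


# A labeler ranked three candidate answers, best first.
pairs = ranking_to_pairs(["answer_a", "answer_b", "answer_c"])
for preferred, rejected in pairs:
    print(preferred, "is preferred over", rejected)
```

The point is just that every pair in that list comes from a person's judgment, so the human preference is baked directly into what the model is optimized toward.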