hausrat 8 hours ago

This has very little to do with someone making the LLM too human; it's a core limitation of the transformer architecture itself. Fundamentally, the model has no notion of what is normal and what is exceptional: its only window into reality is its training data and your prompt. From the model's perspective, your prompt's token vector is tiny compared to the semantic vectors it has built up over the course of training on billions of data points. How should it decide whether your prompt is an interesting, novel exploration of an unknown concept or just complete bogus? It can't, and that is why it falls back on the output that is most likely (and therefore closest to average) with respect to its training data.
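To make the "falls back on the most likely output" claim concrete, here's a deliberately tiny sketch: a bigram counter standing in for the trained model. The corpus and all names here are made up for illustration; the point is that a continuation is chosen by frequency in the training data, and anything never seen has no footing at all.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for "billions of data points".
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat ate the fish . "
).split()

# "Training": count which token follows which.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def most_likely_next(token):
    """Return the highest-frequency continuation seen in training,
    or None for a token the model has never seen (a truly novel prompt)."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]

print(most_likely_next("the"))   # "cat" -- the most frequent follower
print(most_likely_next("quux"))  # None -- no training signal at all
```

A real transformer generalizes far better than a lookup table, but the objective is the same shape: prefer continuations that were common in training, i.e. the average.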

winddude an hour ago | parent | next [-]

> This has very little to do with someone making the LLM too human but rather a core limitation of the transformer architecture itself.

It has almost everything to do with it. Models have been fine-tuned to generate outputs that humans prefer.

anuramat 8 hours ago | parent | prev | next [-]

wdym by "prompt and vector is small"? small as in "fewer tokens"? that should be a positive thing for any kind of estimation

in any case, how is this specific to transformers?

chrisjj 8 hours ago | parent | prev [-]

> How should it decide whether your prompt is actually interesting novel exploration of an unknown concept or just complete bogus?

It shouldn't. It should just do what it is told.

philipwhiuk 2 hours ago | parent [-]

Remember that all it's actually 'doing' is predicting more text.
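A minimal sketch of what "predicting more text" means: a greedy decoding loop that repeatedly asks a model for scores over a vocabulary and appends the highest-probability token. Everything here (`fake_logits`, the four-word vocabulary) is invented for illustration; a real transformer would compute the logits with attention layers, but the outer loop looks much like this.

```python
import math

VOCAB = ["hello", "world", "!", "<eos>"]

def fake_logits(context):
    # Hypothetical hard-coded scores: favour "world" after "hello",
    # then "!", then the end-of-sequence token.
    table = {
        ("hello",): [0.1, 2.0, 0.5, 0.2],
        ("hello", "world"): [0.1, 0.2, 2.0, 0.5],
        ("hello", "world", "!"): [0.1, 0.1, 0.2, 2.0],
    }
    return table.get(tuple(context), [0.0, 0.0, 0.0, 2.0])

def softmax(xs):
    # Turn raw scores into a probability distribution.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def generate(prompt, max_steps=10):
    tokens = list(prompt)
    for _ in range(max_steps):
        probs = softmax(fake_logits(tokens))
        # Greedy decoding: always take the single most probable token.
        nxt = VOCAB[max(range(len(VOCAB)), key=probs.__getitem__)]
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return tokens

print(generate(["hello"]))  # ['hello', 'world', '!']
```

There is no separate "judge the prompt" step anywhere in this loop: the model's only move, at every step, is to emit whichever token its training made most probable.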