thorum 4 days ago

> Six of the eleven picked the same movie

This is surely the greatest weakness of current LLMs for any task needing a spark of creativity.

torginus 4 days ago | parent | next [-]

I have noticed this too. Often when one model volunteered a wrong answer, such as making up a nonexistent API, I asked another, and it gave me the exact same thing! It's highly unlikely that two totally independent models would make up the same fictional thing.

There must be something strange going on (most likely training on each other's wrong outputs, but I dunno).

joseda-hg 4 days ago | parent [-]

I've been burned by getting a deprecated version of an API, or by the model hallucinating that a method from library X should exist in library Y because they're similar.

Timwi 4 days ago | parent | prev [-]

This is definitely something very early LLMs could do that has kind of gotten beat out of them. I used to ask ChatGPT to simulate a text adventure game, but now if you try that you always get exactly the same one.

sireat 4 days ago | parent [-]

Curious, what kind of prompt gives you the same text adventure game?

Surely it is a question of prompting with some context (in UI mode), or with the additional kicker of temperature (if using the API)?
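To make the temperature point concrete, here is a rough sketch of temperature-scaled sampling over a handful of candidate answers. The logits are made up for illustration and the function names are hypothetical; this is not any provider's actual implementation, only the standard softmax-with-temperature idea.

```python
import math
import random
from collections import Counter


def sample_with_temperature(logits, temperature, rng):
    """Draw one index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def pick_counts(logits, temperature, n=1000, seed=0):
    """Count how often each option is picked over n draws."""
    rng = random.Random(seed)
    return Counter(
        sample_with_temperature(logits, temperature, rng) for _ in range(n)
    )


# Hypothetical scores for four candidate "text adventure" openings.
# Option 0 is strongly preferred by the imaginary model.
logits = [5.0, 1.0, 0.5, 0.0]

print("low temperature :", dict(pick_counts(logits, 0.1)))
print("high temperature:", dict(pick_counts(logits, 5.0)))
```

With these assumed scores, a low temperature collapses almost every draw onto the top option, while a high temperature spreads the picks across all four. That would fit the "always the same game" experience if the UI runs at a fairly low effective temperature, though that part is a guess.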

At the very least, some setup prompt such as "Give me 5 scenarios for a text adventure game" would break the sameness?

There have always been theories that OpenAI and other LLM providers cache some responses - this could be one hypothesis.

karmakaze 4 days ago | parent [-]

I'm now imagining 5 hipster AIs writing those stories--different in predictable ways.