|
| ▲ | Tarq0n 7 hours ago | parent | next [-] |
| Not definitively. LLM output is stochastic and depends on the input, the temperature and the exact prompt. It's possible that the model was already capable of it but never received the exact right conditions to produce this output. |
| |
| ▲ | teiferer 7 hours ago | parent [-] | | Every model is able to solve each problem, given the right prompt. (Worst case, the prompt contains the solution.) | | |
| ▲ | pontifier 2 hours ago | parent | prev [-] | | Interesting... Exhaustive brute force prompting might expose previously unknown capabilities in existing models. Seems like a whole can of worms. |
|
|
|
| ▲ | imiric 7 hours ago | parent | prev | next [-] |
| > So this is proof of the models actually getting stronger (previous generations of LLMs were unable to solve this one). No, it's not. While I don't dispute that new models may perform better at certain tasks, the fact that someone was able to use them to solve a novel problem is not proof of this. LLM output is nondeterministic. Given the same prompt, the same LLM will generate different output, especially when it involves a large number of output tokens, as in this case. One of those attempts might produce a correct output, but this is not certain, and it is difficult, if not impossible, for a human who is not an expert in the domain to determine whether it did, as shown in this thread. |
| |
| ▲ | notahacker an hour ago | parent [-] | | As others have pointed out, a key part of the prompt used here may have been "don't search the internet" as it would most likely have defaulted to starting off with existing approaches to that problem... |
|
|
| ▲ | jb1991 7 hours ago | parent | prev [-] |
| Minor aside, these models do not return the same answer every time you prompt them. That makes it harder to reason about their effectiveness. |
| |
| ▲ | rjh29 7 hours ago | parent [-] | | You don't need to say "Minor aside" either. Thankfully language is a creative endeavour not a scientific one. | | |
| ▲ | rjh29 an hour ago | parent | next [-] | | Context: parent originally said "you should not say 'worth mentioning', if it's worth mentioning you can just say it". That sentence has now been edited out so my comment looks weird. | |
|
|