Moosdijk 6 hours ago
Interesting. Instead of running the model once (flash) or multiple times (thinking/pro) in its entirety, this approach seems to apply the same principle within a single run, looping back internally. Rather than a big model that "brute forces" the right answer by knowing a lot of possible outcomes, this model seems to arrive at results with less knowledge but more wisdom. Kind of like having a database of most possible frames in a video game and blending between them, instead of actually rendering the scene.
nl 2 hours ago
> Instead of running the model once (flash) or multiple times (thinking/pro) in its entirety

I'm not sure what you mean here, but there isn't a difference in the number of times a model runs during inference.
omneity 4 hours ago
Isn't this, in a sense, an RNN built out of a slice of an LLM? If true, it might share the same drawbacks, namely slowness to train, but also the benefits, such as an endless context window (in theory).
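
To make that concrete, here's a minimal sketch of what such a depth-recurrent block could look like in PyTorch, assuming the idea is a single weight-tied transformer layer applied in a loop. The class name, hyperparameters, and loop count are all hypothetical illustrations, not taken from the model under discussion:

    import torch
    import torch.nn as nn

    class DepthRecurrentBlock(nn.Module):
        # One weight-tied transformer layer applied repeatedly:
        # effectively an RNN over depth rather than over the sequence.
        def __init__(self, d_model=256, n_heads=4, n_loops=8):
            super().__init__()
            # A single layer's weights are reused each iteration, so the
            # parameter count stays small while effective depth grows.
            self.layer = nn.TransformerEncoderLayer(
                d_model, n_heads, batch_first=True)
            self.n_loops = n_loops

        def forward(self, x):
            h = x
            for _ in range(self.n_loops):
                # "Looping back internally": the same weights process
                # the evolving hidden state again and again.
                h = self.layer(h)
            return h

    x = torch.randn(1, 16, 256)     # (batch, tokens, d_model)
    out = DepthRecurrentBlock()(x)  # more loops = more internal "thinking"

Unrolled over the loop index, this is exactly an RNN whose hidden state is the full token representation, which is where the train-time slowness would come from (gradients flow through every iteration).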
| ||||||||