root_axis 3 days ago

The suggested requirements are not engineering problems. Conceiving of a model architecture that can represent all the systems described in the blog is a monumental task of computer science research.

crazylogger 3 days ago | parent | next [-]

I think the OP's point is that all those requirements are to be implemented outside the LLM layer, i.e. we don't need to conceive of any new model architecture. Even if LLMs don't progress beyond GPT-5 & Claude 4, we'll still get there.

Take memory for example: give the LLM a persistent computer and ask it to jot down its long-term memory as hierarchical directories of markdown documents. Recalling a piece of memory then means running a bunch of `tree` and `grep` commands. It's very, very rudimentary, but it kinda works, today. We just have to think of incrementally smarter ways to query & maintain this type of memory repo, which is a pure engineering problem.

root_axis 3 days ago | parent [-]

The answer can't be as simple as more sophisticated RAG. At the end of the day, stuffing the context full of retrieved material can only take you so far, because context is an extremely limited resource. We also know that quality degrades at large context sizes, because the model has a harder time tracking what the user wants it to pay attention to.

jibal 3 days ago | parent | prev [-]

It's software engineering.