basch 5 hours ago
I have a dumb performance question. Why, when asking a model to change text in a minor way, are we not asking it to generate the operational transformations (OT) needed to modify the text, and then just executing the OT on the existing text instead of reproducing every token? Maybe tools are doing that more than I realize?
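A minimal sketch of what that could look like, assuming a simple retain/insert/delete operation encoding (the function and op names here are illustrative, not any particular OT library's format):

```python
# Minimal sketch of applying OT-style edit operations to a string.
# Op encoding (an assumption for illustration):
#   ("retain", n)  -> keep the next n characters unchanged
#   ("delete", n)  -> drop the next n characters
#   ("insert", s)  -> emit the string s

def apply_ops(text, ops):
    out = []
    pos = 0
    for kind, arg in ops:
        if kind == "retain":
            out.append(text[pos:pos + arg])
            pos += arg
        elif kind == "delete":
            pos += arg
        elif kind == "insert":
            out.append(arg)
    out.append(text[pos:])  # keep any trailing text
    return "".join(out)

# The model would only need to emit the short op list, not the full text.
doc = "The quick brown fox"
ops = [("retain", 4), ("delete", 5), ("insert", "slow")]
print(apply_ops(doc, ops))  # -> The slow brown fox
```

The point of the question is visible in the op list: for a small edit, the ops are far shorter than the document, so far fewer output tokens are needed.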
XYen0n 4 hours ago
The only thing a model can output is tokens; to achieve this, a tool that converts tokens into operational transformations is required. For example, I have an ast-grep skill: it instructs the model to generate ast-grep rules and then runs ast-grep to perform the file modifications.
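As an illustration, an ast-grep rule is declarative YAML the model can emit as plain tokens; the actual rewrite is then performed deterministically by the tool. This specific rule is a made-up example following ast-grep's rule schema (`id`, `language`, `rule.pattern`, `fix`):

```yaml
# Hypothetical ast-grep rule: swap console.log calls for a logger call.
id: console-log-to-logger
language: TypeScript
rule:
  pattern: console.log($MSG)
fix: logger.debug($MSG)
```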
sigmoid10 3 hours ago
The simple answer is: because it is not necessary in order to achieve the same final output. Most LLMs today are trained as autoregressive token predictors and fundamentally can't work any other way, but we know how to train them really well, and they have many applications beyond editing text. Diffusion LLMs, which work a bit closer to what you describe, exist too, but they are not yet at the same level of intelligence, since their training methods are less mature, and they are generally less flexible as well.
jfim 2 hours ago
I've seen Claude use sed to edit files on other hosts instead of copying the file back and forth to edit it. Not quite full-blown OT, but it's going in that direction.
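For example, a single sed substitution can express a small edit without the model reproducing the file; a sketch with made-up file and identifier names:

```shell
# Create a throwaway file, then rewrite one identifier with sed
# instead of regenerating the whole file token by token.
printf 'def old_name():\n    return 1\n' > demo.py
sed 's/old_name/new_name/' demo.py > demo_edited.py
cat demo_edited.py
```

The model only has to emit the short `s/old_name/new_name/` expression; sed applies it to the full file.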
cryptoz 4 hours ago
This is the approach I take with code edits to existing files at Code+=AI; I wrote a blog post with a simple example of AST modification to illustrate: https://codeplusequalsai.com/static/blog/prompting_llms_to_m...
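The general idea can be sketched with Python's stdlib `ast` module; this toy rename is an illustration of AST modification in general, not the example from the linked post (requires Python 3.9+ for `ast.unparse`):

```python
import ast

# Toy AST modification: rename a function and its call sites,
# then unparse the modified tree back to source code.
source = '''
def greet(name):
    return "hi " + name

print(greet("world"))
'''

class Renamer(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        if node.name == "greet":
            node.name = "say_hi"
        self.generic_visit(node)  # also rewrite calls inside the body
        return node

    def visit_Name(self, node):
        if node.id == "greet":  # call sites refer to the name node
            node.id = "say_hi"
        return node

tree = Renamer().visit(ast.parse(source))
print(ast.unparse(tree))
```

An LLM asked for a structured edit like this only needs to describe the transformation; applying it to the tree and re-emitting the source is mechanical.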