machiaweliczny a day ago
I feel like diffusion would be much more useful for code if it could only mark tokens as "valid" when they pass code checks. You could think of it as adding "semantic chunks" instead of just words. I'm not sure how to validate this, since some additions will always result in invalid code along the way. You could argue that running tests and linters is the same thing, but I think generations could be validated much more often with diffusion models. Example: if you remove a function, you also remove all uses of it; you can't use a nonexistent variable, etc. This could be trained on git repos with well-structured commits, or by stalking/stealing the work of developers via their editor.
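To make the "valid chunk" idea a bit more concrete, here is a rough sketch of the kind of check I have in mind, using only Python's standard `ast` module. The names `undefined_names` and `is_valid_edit` are made up for illustration, and this is not tied to any actual diffusion model: it only shows a cheap validity test that could in principle be run on each candidate edit. It ignores scoping entirely, so treat it as a toy.

```python
import ast
import builtins


def undefined_names(source: str) -> set:
    """Return names that are read but never bound anywhere in the module.

    Deliberately crude: scoping is ignored, so this only catches names
    that are not defined at all (e.g. a call to a removed function).
    """
    tree = ast.parse(source)  # raises SyntaxError on invalid code
    bound = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.alias):
            # "import a.b" binds "a"; "import a as c" binds "c"
            bound.add((node.asname or node.name).split(".")[0])
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                bound.add(node.id)
    return used - bound


def is_valid_edit(candidate: str) -> bool:
    """Accept a candidate only if it parses and references no unknown names."""
    try:
        return not undefined_names(candidate)
    except SyntaxError:
        return False


# Hypothetical edits: removing a function without removing its call site
# is rejected, while removing both together is accepted.
before = "def helper(x):\n    return x + 1\n\nprint(helper(2))\n"
partial = "print(helper(2))\n"
complete = "print(3)\n"

print(is_valid_edit(before))
print(is_valid_edit(partial))
print(is_valid_edit(complete))
```

A real version would need proper scope analysis, type checks, and probably some tolerance for intermediate states that are invalid on purpose, which is exactly the part I'm unsure about.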