sunshowers 7 hours ago

I'm not sure about research, but I've used LLMs for a few things here at Oxide with (what I hope is) appropriate judgment.

I'm currently trying out using Opus 4.5 to take care of a gnarly code reorganization that would take a human most of a week to do -- I spent a day writing a spec (by hand, with some editing advice from Claude Code), having it reviewed as a document for humans by humans, and feeding it into Opus 4.5 on some test cases. It seems to work well. The spec is, of course, in the form of an RFD, which I hope to make public soon.

I like to think of the spec as basically an extremely advanced sed script described in ~1000 English words.

AlexCoventry 5 hours ago

Maybe it's not as necessary with a codebase as well-organized as Oxide's, but I recently found Gemini 3 useful for refactoring some completely test-free ML research code. I got it to generate a test case that would exercise all the code subject to refactoring, got it to do the refactoring and verify that the result reaches exactly the same state, then finally got it to randomize the test inputs and keep repeating the comparison.
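
For anyone who wants to try the same workflow, here is a minimal sketch of the "compare old vs. refactored on randomized inputs" loop. Everything in it is an assumption for illustration: `legacy_update`, `refactored_update`, and the dict-shaped state are hypothetical stand-ins, not the actual research code or anything the model produced. The inputs are integers so that exact equality is a fair check; with floating-point ML code you would probably need to pin seeds and either keep the operation order identical or compare within a tolerance.

```python
import copy
import random


def legacy_update(state, xs):
    """Hypothetical stand-in for the original, untested code path."""
    for x in xs:
        state["total"] += x
        state["count"] += 1
        if x > state["max"]:
            state["max"] = x
    return state


def refactored_update(state, xs):
    """Hypothetical stand-in for the refactored version of the same logic."""
    if xs:
        state["total"] += sum(xs)
        state["count"] += len(xs)
        state["max"] = max(state["max"], max(xs))
    return state


def check_equivalence(trials=200, seed=0):
    """Run both versions on the same random inputs and compare final state."""
    rng = random.Random(seed)  # seeded so any failure can be reproduced
    for trial in range(trials):
        # Random-length list of random integers, including the empty list.
        xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
        initial = {"total": 0, "count": 0, "max": -10**9}
        # Deep-copy so neither implementation sees the other's mutations.
        old = legacy_update(copy.deepcopy(initial), xs)
        new = refactored_update(copy.deepcopy(initial), xs)
        assert old == new, f"trial {trial}: {old} != {new} for input {xs}"
    return trials


if __name__ == "__main__":
    print(f"{check_equivalence()} randomized trials matched")
```

The point of the design is that the old code acts as its own oracle, so no one has to know what the "right" output is, only that the refactor did not change it.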