MarkMarine 2 days ago
Works great until it gets stuck and starts refactoring the tests to say true == true and calling it a day. I want the inverse of black box testing: the model sits inside the box with the code, and it's not allowed to reach outside the box and change the grades. Then I can just run the Ralph Wiggum as a software engineer loop to get past the reward hacking tendencies.
8n4vidtmkvmk 2 days ago
Don't let it touch the test file then? I usually give the LLM context about what it's allowed to touch. I don't do big sweeping changes though; I don't trust LLMs for that. For small, focused changes it's great.
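
One way to enforce "don't touch the test file" mechanically rather than by prompt alone is to check the list of changed paths before accepting a change. This is only a rough sketch: the `PROTECTED` patterns and the `protected_changes` helper are hypothetical names made up for illustration, and the patterns would need to match your own repo layout.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files the model must not modify (assumption:
# tests live under tests/ or follow the test_*.py / *_test.py naming).
PROTECTED = ["tests/*", "test_*.py", "*_test.py"]


def protected_changes(changed_paths, patterns=PROTECTED):
    """Return the changed paths that match any protected pattern.

    Each path is checked both in full and by its file name, so a pattern
    like "test_*.py" also catches "pkg/test_thing.py".
    """
    hits = []
    for path in changed_paths:
        name = path.rsplit("/", 1)[-1]
        if any(fnmatchcase(path, pat) or fnmatchcase(name, pat) for pat in patterns):
            hits.append(path)
    return hits


if __name__ == "__main__":
    changed = ["src/app.py", "tests/test_app.py", "pkg/util_test.py"]
    print(protected_changes(changed))
```

You could feed it the output of `git diff --name-only` and reject the change whenever the returned list is non-empty, so the model never gets to edit the grades.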