The “change capture”/straight jacket style tests LLMs like to output drive me nuts. But humans write those all the time too so I shouldn’t be that surprised either!
What do these look like?