Nothing about this is moving goalposts - you and/or the person(s) conducting this study are the ones being misleading!
If you want to measure time to complete a complex task, then measure that. LOC is an intermediate measure. How much more productive is "55% more lines of code"?
I can very quickly write a bunch of buggy garbage code that doesn't work, or I can take longer and write a better program that works properly. Under your framework, the former must be classified as 'better' - but why?
I read the study you reference and there is literally nothing in the study that talks about whether or not tasks were accomplished successfully.
The study:
* Reports that junior devs benefited more than senior devs, then presents a disingenuous argument as to why that's the senior devs' fault (more experienced employees are worse than less experienced employees, who knew?!)
* Attributes 11% of the 55% increase in LOC directly to LLM output
* Makes absolutely no attempt to measure whether or not the extra code was beneficial