bunderbunder 6 days ago
From what I've experienced, this depends very much on the programming language, platform, and business domain. I haven't tried it with Rails myself (haven't touched Ruby in years, to be honest), but it doesn't surprise me that it would work well there. Ruby on Rails programming culture is remarkably consistent about how to do things. I would guess that means the LLM is able to derive a somewhat (for lack of a better word) saner model from its training data.

By contrast, what it does with Python can get pretty messy pretty quickly. One of the biggest problems I've had with it is that it tends to use a random hodgepodge of different Python coding idioms. That makes TDD particularly challenging, because you'll get tests that are well designed for code engineered to follow one pattern of changes, written against a SUT whose conventions lead to a completely different pattern of changes. The result is horribly brittle tests that repeatedly break for spurious reasons.

And then iterating on it gets pretty wild, too. My favorite behavior is when the real defect is "oops I forgot to sort the results of the query" and the suggested solution is "rip out SQLAlchemy and replace it with Django."

R code is even worse; even getting it to produce code that follows a spec in the first place can be a challenge.
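
To make the sorting example concrete, here's a rough sketch of how small the real fix usually is. It uses the standard-library sqlite3 module rather than SQLAlchemy so that it runs anywhere, and the `users` table and `fetch_names` helper are made up for illustration, not taken from any real project.

```python
import sqlite3


def fetch_names(conn, ordered):
    # Hypothetical query helper; `ordered` toggles the one-clause fix.
    sql = "SELECT name FROM users"
    if ordered:
        sql += " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("carol",), ("alice",), ("bob",)],
)

# Without ORDER BY, SQL makes no promise about row order, so a test that
# compares this against a sorted list passes or fails by accident.
unordered = fetch_names(conn, ordered=False)

# The fix is one clause in the query, not a different ORM.
ordered = fetch_names(conn, ordered=True)
```

With SQLAlchemy the equivalent change would be adding an `order_by` to the existing query, which is about as far from a framework swap as you can get.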