I have no doubt that the writer is better at translating than AI, but I have to say that AI translation has gotten so good that I'm not sure how much longer translation work will be there, or rather it might end up being more about auditing.

For example, I just read the Lawrence Ellsworth translation of The Three Musketeers, which I very thoroughly enjoyed. I don't speak or read French, but from my understanding Ellsworth's translation is considered one of the more accurate translations of the work.

Out of curiosity, I sic'd Claude Fable on the original French version of The Three Musketeers and told it to translate accurately, but also try and keep the same jovial tone as the original and do not censor anything. After it was done, I didn't read the entire output, but I did compare a few individual chapters between the Ellsworth translation and the Fable translation.

They were honestly remarkably similar. As far as I could tell, nothing was substantially different between the Ellsworth translation and the Fable translation. I do think that the prose for the Ellsworth translation was a bit better, but the prose for the Fable one was actually perfectly readable. Again, I don't speak French so I cannot say for sure, but I do not believe that I would have gotten a significantly different experience had I read the Fable version instead of the Ellsworth version.

Now, it's possible (and likely) that this is somewhat self-fulfilling; Fable might have been trained using Ellsworth's translation and as such it's very directly able to crib from it; sadly since I do not speak any language outside of English, there's sort of a catch-22: the only way I can compare the accuracy of a translation is to compare against other translations, but if other translations exist then that will likely influence the results, and if a translation doesn't already exist then I have no way of auditing it.

I'm still going to continue reading through Ellsworth's translations for the subsequent stories simply because that feels more canonical, and as I said I do think the prose was a bit better.

▲

Wowfunhappy 5 hours ago | parent | next [-]

> Out of curiosity, I sic'd Claude Fable on the original French version of The Three Musketeers and told it to translate accurately, but also try and keep the same jovial tone as the original and do not censor anything. After it was done, I didn't read the entire output, but I did compare a few individual chapters between the Ellsworth translation and the Fable translation.

This isn’t a great test, because Claude almost certainly has multiple translations of The Three Musketeers in its training data.

▲

tombert 5 hours ago | parent [-]

Read the last two paragraphs :)

▲

svara 2 hours ago | parent | next [-]

The thing is, this is almost certainly what's happening.

You can (could, maybe they 'fixed' it by now) get sota LLMs to reproduce entire novels near verbatim.

The idea of giving it parallel texts of those novels in different languages, to train it on translation, is so obvious it'd just be strange if the AI labs didn't do it.

In fact DeepL was doing basically that more than 10 years ago.

▲

Wowfunhappy 5 hours ago | parent | prev | next [-]

Oops, I legitimately missed the second-to-last paragraph.

I still think there are better tests you could do. Ideally, you would choose a book that was published recently—after the model’s cut-off date—which is considered to be a good translation. But even something like The Girl With the Dragon Tattoo, which is not particularly new and by no means obscure, would be better than a famous work of literature like The Three Musketeers that has many translations.

▲

tombert 4 hours ago | parent [-]

Almost certainly correct, though I've noticed that these LLMs like to complain when you give them stuff that is still in copyright. The Three Musketeers is thoroughly public domain everywhere so in that sense it's a good test, but of course because it's public domain everywhere there are lots of translations to crib from so I acknowledge it's not a great test because the training data almost certainly contains a competent translation. Even if Fable didn't have Ellsworth's translation, it certainly has the William Barrow translation, which would still get it like 80+% of the way there.

My wife speaks Spanish, I should get her to do some kind of comparison with a Spanish book that doesn't have English translations.

▲

card_zero 5 hours ago | parent | prev [-]

They say "yes, I admit it, this is all invalid".

▲

tombert 3 hours ago | parent [-]

No, they are a disclaimer that it's possible that the data isn't conclusive. Not the same thing as saying "it's all invalid".

▲

geon 5 hours ago | parent | prev | next [-]

> I did compare a few individual chapters between the Ellsworth translation and the Fable translation.

I'm pretty sure the Ellsworth translation is in the corpus. You basically instructed Claude to regurgitate it.

The LLMs all have the more famous books memorized. You can trick them into reciting them more or less word for word.

▲

tombert 5 hours ago | parent [-]

I mentioned this specifically in my comment :)

▲

stdbrouw 3 hours ago | parent [-]

... yet you still conclude "AI translation has gotten so good", so which is it?

▲

tombert 3 hours ago | parent [-]

I do think it's gotten pretty good. I'm just acknowledging my limitations in the matter. It's not a contradiction.

▲

oytis 3 hours ago | parent [-]

Try translating some prose from English to another language, then, in a different model, back to English.
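
A minimal sketch of how that round trip could be scored, assuming you wrap each model in a plain function. `forward` and `backward` are hypothetical stand-ins (no real translation API is called here), and word-level overlap from Python's `difflib` is only a crude proxy: it penalizes legitimate paraphrase and says nothing about tone.

```python
from difflib import SequenceMatcher
from typing import Callable


def round_trip_score(
    text: str,
    forward: Callable[[str], str],   # hypothetical: English -> other language (model A)
    backward: Callable[[str], str],  # hypothetical: other language -> English (model B)
) -> float:
    """Translate out and back, then return word-level similarity in [0, 1]."""
    returned = backward(forward(text))
    original_words = text.lower().split()
    returned_words = returned.lower().split()
    # ratio() is 2 * matching_words / total_words across both sequences.
    return SequenceMatcher(None, original_words, returned_words).ratio()
```

A low score flags passages worth a human look; a high score does not prove the intermediate translation was good, since two models can make compatible mistakes.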

▲

j_w 4 hours ago | parent | prev | next [-]

As somebody who regularly reads translated works, including the occasional machine translation (MTL), they (MTL) suck. You got a hugely biased result, which you recognize.

Translation is hard. If you're familiar with reading translations from specific languages, MTL works have a very specific smell to them; it's a bit hard to describe but it's there. A good translation is miles (kilometers, for those outside of the US) above MTL.

That's not to say the latest LLMs won't have better translation abilities, but they are generally crap currently. Maybe they are fine for something very short, but absolutely not for longer content.

▲

Swizec 5 hours ago | parent | prev | next [-]

> As far as I could tell, nothing was substantially different between the Ellsworth translation and the Fable translation.

Crucially the full translation was part of ChatGPT’s training set. Recall is a pretty solved problem in machine learning.

How well does it translate a French novel published yesterday? Where neither the original novel nor any translations are in the training set yet? Or might not even exist!

I tried asking ChatGPT to translate a letter I wrote in Slovenian this weekend. It got the general gist but missed a lot of the nuance. Completely missed several of the little touches of tone where the right choice of synonym conveys a whole bunch of information.

▲

tombert 5 hours ago | parent [-]

Did no one actually finish reading my comment?

▲

Swizec 5 hours ago | parent | next [-]

I feel like that wasn’t there when I started writing my comment. I also have a bad habit of quickly posting and then adding over a few minutes.

Glad we agree :)

▲

tombert 5 hours ago | parent [-]

Guess I have no way of proving it, but I pinky swear that I didn't edit it in later!

But yeah, I broadly do agree; if I read other languages I could find a book that hadn't been thoroughly translated to English and then I could give a proper analysis on how good the translation is, but since I'm a very stereotypical American I know exactly one language (and sometimes my comprehension of even that is questionable).

▲

zipy124 5 hours ago | parent | prev [-]

Welcome to the internet

▲

no_multitudes 2 hours ago | parent | prev | next [-]

> Fable might have been trained using Ellsworth's translation and as such it's very directly able to crib from it

The `cp` program on my computer also has the remarkable ability to produce a faithful translation of The Three Musketeers when provided one as input.

▲

JeremyNT 3 hours ago | parent | prev | next [-]

Honestly, translations of fiction are themselves creative works, and the translator needs to really understand both cultures and needs to write cohesively throughout the work. I'm not sure this is even really a question of "can it translate" so much as "can it create a good work of fiction" which is a much higher bar. So maybe the model can mimic the style (especially given that it was probably trained on existing translations) but could it really do so from scratch in a way that is actually compelling? I'm not so sure.

Of course as for the poor OP... is this a majority of what working translators are paid to do?

I suspect a lot of translation is just grunt work - technical and business documents. The lack of a cohesive voice with considered style is perhaps not really much of an issue in those. The expectations are just much lower; text that conveys the basic meaning is a much lower bar to clear.

She's probably better than a bot at that stuff, at least for now, but my concern is that it won't be "enough" better for businesses to justify her continued employment. And this is my general feeling about this stuff across society, in basically all domains.

▲

exe34 5 hours ago | parent | prev | next [-]

> Again, I don't speak French so I cannot say for sure

This reminds me of the adage, that ChatGPT is really great at everything except my own work.

▲

tombert 5 hours ago | parent | next [-]

Yeah, that's why I put the caveat in there. I have no real way to verify the result outside of checking against "known good" translations, though if the known-good translation exists then there's not exactly a lot of reason to do the AI translation in the first place.

I suspect if I knew another language I would be able to find errors in the translation.

▲

rootusrootus 5 hours ago | parent | prev [-]

Yes, it is another variation on the Gell-Mann Amnesia Effect. I have a number of non-developers in my circle of friends who think Claude is about to put me out of work. They think it is just a great tool for them, not a replacement. Of course!

▲

layer8 5 hours ago | parent | prev | next [-]

I see the difficulties more in other areas, such as technical translations, specialist books, user manuals, and translating UIs, where contextual information and a back and forth with the client is needed to clarify details, and (for user manuals and UIs) the translator has to put themselves in the mind of the user and has to consider the possible contexts and use cases.

▲

bombcar 5 hours ago | parent | prev | next [-]

You're very likely to get a somewhat circular reference; the key (for me) is that for 90% of the usages, "standard translation LLMs" are just fine - I still recommend a translator but they're more of a proof-reader for both languages, catching where something slipped through.

▲

ixtli 5 hours ago | parent | prev | next [-]

This is sort of missing the point-- people who don't deal with linguistics don't understand that there are multiple types of translation. There's word for word (which is what you're talking about) and sense for sense. If you let an LLM do all of your translation you're letting it interpret huge amounts of intent and context it doesn't (and probably can't) access. The ways in which this impacts the translation will forever be unknown to you and in the worst case lost forever.

So I guess in the end it just matters how important the work is.

▲

tombert 5 hours ago | parent | next [-]

Actually I was talking about tonally as well. A raw "word for word" translation (which I also tried) made the story somewhat hard to follow and very dry, but just asking it to keep the same kind of jovial swashbuckling tone of the original made something pretty similar to Ellsworth's translation.

Again, before someone decides to "correct" me on this, I am aware that it's very likely that the Ellsworth translations are part of the training set so it's not directly a fair comparison.

▲

senordevnyc an hour ago | parent | prev | next [-]

> If you let an LLM do all of your translation you're letting it interpret huge amounts of intent and context it doesn't (and probably can't) access.

What’s the intent and context that a human translator of a text is typically privy to that an LLM is not?

▲

vel0city 3 hours ago | parent | prev [-]

> If you let an LLM do all of your translation you're letting it interpret huge amounts of intent and context it doesn't (and probably can't) access.

Assuming lots of material local to the context one is wanting to translate is included, why couldn't it potentially access that additional context?

▲

turtletontine 4 hours ago | parent | prev | next [-]

> … considered one of the more accurate translations of the work.

I think you’re missing a big point of translating literary works. A purely “accurate”, phrase-by-phrase translation is often not very good; the actual literary style, the feeling and the allusions and references, often get lost that way. A good translation of literary work requires a lot of deliberate choices by the translator to deviate from literal translations in ways that convey the style of the original, or an extra layer of meaning that would be lost by an “accurate” translation of a phrase. Also, being consistent with these choices matters a lot, which OP claims LLMs are less good at.

▲

jimbo808 5 hours ago | parent | prev | next [-]

LLMs are now being aggressively manipulated for propaganda purposes. Powerful people have realized that people believe LLMs, and treat them as authoritative sources of fact.

The number of lies, lies by omission, deceptive distortions, and fallacious argument tactics they generate is absurd, and increasing rapidly. Translation, when it is a service you are being paid for, can't be entrusted to propaganda bots.

▲

smallpipe 4 hours ago | parent [-]

Do you have examples?

▲

mjmsmith 5 hours ago | parent | prev | next [-]

An interesting counter-example: https://xcancel.com/ValerioCapraro/status/206506665753442336...

▲

layer8 5 hours ago | parent [-]

I wonder if “Just 3 words: you’re not alone” would have been acceptable. :)

▲

mjmsmith 4 hours ago | parent [-]

The Empire Strikes Back: "I'm your dad."

▲

paulddraper 5 hours ago | parent | prev | next [-]

▲

tombert 5 hours ago | parent [-]

Already mentioned in the comment lol.

▲

zuzululu 6 hours ago | parent | prev [-]

This moment is coming for software developers too

▲

tombert 6 hours ago | parent | next [-]

Yeah almost certainly, especially the ones who made a career out of "copypaste from StackOverflow", which is most engineers.

But even the good engineers should likely be a little worried.

▲

rootusrootus 5 hours ago | parent | prev | next [-]

More specifically, it is coming for coders. If you make your living by banging out lines of code all day, then you may want to be looking at adjusting your career trajectory. But if that is your job, you are either very junior, or a bit foolish for getting into that situation.

▲

zuzululu 5 hours ago | parent [-]

so what is a software developer doing if writing code is not part of their job

I don't see how not writing code is being offered as a moat, it seems like that is just translating business/stakeholder requirements to architecture/biz processes which is exactly the type of low hanging fruit that AI will capture first

or was it your point that the position sits closer to the stakeholders (relatively compared to those lifting) thus immune from replacement by AI

or is your argument that your taste is so exquisite that no AI will be able to match it, like it already has with software so far, and that it will not improve beyond the current state

▲

tombert 4 hours ago | parent | next [-]

If you get to senior level then most of your job probably is not writing code, but planning things out. The code is largely an implementation detail.

At least that's how it was for me, maybe other people's careers are different.

▲

lelanthran 3 hours ago | parent | next [-]

> If you get to senior level then most of your job probably is not writing code, but planning things out.

If they're so good at banging out code now, they're coming for that too, you know.

▲

tombert 3 hours ago | parent [-]

I don't necessarily disagree, but there's gotta be a name for some kind of "infinite extrapolation" fallacy, where you assume that the current rate of progress will continue indefinitely.

That might happen, but I don't think it's implied, at least given literally every other bit of technology that has ever happened in history ever.

▲

lelanthran 3 hours ago | parent [-]

> I don't necessarily disagree, but there's gotta be a name for some kind of "infinite extrapolation" fallacy, where you assume that the current rate of progress will continue indefinitely.

I am not assuming they'll continue indefinitely, but it's a small step from writing code to planning out the code to write, and another small step from planning a coding project to planning a software project, etc.

These are all small steps, and because the act of specification + planning paid less than specification + planning + programming, what reason do you have for thinking that specification + planning is valuable enough to keep the salaries the same as specification + planning + programming?

▲

bluefirebrand 4 hours ago | parent | prev [-]

Yes, my career has been different. At my workplaces seniors still have to code because they don't want to hire juniors.

The "planning things out" has moved to another layer, called "architects"

▲

pwython 4 hours ago | parent | prev | next [-]

Same thing architects do if drawing lines gets automated: architecture.

Would you trust living in a high rise designed by AI?

Designing a system that survives production is the job.

▲

skydhash 5 hours ago | parent | prev [-]

So what is a lab researcher doing if typing articles is not part of the job?

▲

jujube3 4 hours ago | parent [-]

Well--well look. I already told you: I deal with the god damn customers so the engineers don't have to. I have people skills; I am good at dealing with people. Can't you understand that? What the hell is wrong with you people?

https://www.reddit.com/r/ProductManagement/comments/uy1ot1/w...

▲

ixtli 5 hours ago | parent | prev | next [-]

I think this collapses a global, complex hierarchy of software engineering workers into a single monolith and serves only to advertise for frontier LLM providers. The point where you no longer need engineers is not going to be reached by making LLMs better and better.

▲

VBprogrammer 5 hours ago | parent | prev | next [-]

I think it is going to be a long time before all of the obscure knowledge of a decent software developer can be completely replaced by AI. Though the job is going to change beyond recognition. It already has in many ways.

▲

daveguy 5 hours ago | parent | prev [-]

But not before a huge crash in optimism about their capabilities. Specifically wrt accuracy, reliability, efficiency, and organization/architecture.