Machine Translation Showdown: 5 MTs tested using a classical Japanese excerpt

By locksleyu | October 12, 2020

As a person working in the IT industry who does literary translation on the side, I have a huge interest in machine translation technology. Part of me looks forward to future developments, but another part hopes that machines never surpass humans in the art of translation. 

Over time I’ve written a few articles on machine translation (MT), but I haven’t done much in this area lately. While I think it will still be quite some time until machines can translate literary works as well as an expert (at least for the English/Japanese language pair, which has so many key differences), I think translation technologies are gradually maturing, perhaps at a faster rate than I thought. Long before machine translation takes over completely, I think we will see more and more use of machines to help with the process, in the form of machine-aided translation. Eventually, machines may be able to do a reasonable first pass that only requires double-checking and refinement in a few areas.

In any case, I decided to do a comparison of MTs with a short literary excerpt as the input text. I chose the first paragraph of “Shell” (貝殼) by the well-known author Akutagawa Ryunosuke (芥川龍之介). It’s only three sentences, with relatively straightforward grammar, except perhaps for the last sentence, which is a bit tricky. Please note that this text uses older Japanese orthography, so you’ll see things like だつた instead of だった. (I have written an article with some hints for reading classical literature here.)

I ran these translations on October 8, 2020, and I am sure these machines will gradually evolve over time. For the input, I simply copied the source text and pasted it directly into each machine translator’s site, using the Chrome browser on a Mac.

I will first show the original text and then the translations from each of the five machine translation services I chose, plus my own human translation for comparison (this is not intended to be the “best” or even a “perfect” translation, only a comparison point against the MTs, and I purposely didn’t read the entire story, to keep the comparison fair). I will show them anonymized so you can decide for yourself which is the best machine translation and which is the human one. Then I will reveal the translations one by one, ranked from best MT to worst, with a few comments after each. I will focus on obvious errors and unnatural language, and will not point out everything I would if I were analyzing a human translation and giving feedback.

In the comments I will be using the terms S1, S2, S3 to refer to the first, second, and third sentence of the translation.

Original Japanese text from Aozora Bunko 

[Note: Furigana has been removed since it will not render properly]

彼等は田舎に住んでゐるうちに、猫を一匹飼ふことにした。猫は尾の長い黒猫だつた。彼等はこの猫を飼ひ出してから、やつと鼠の災難だけは免れたことを喜んでゐた。

Here are the anonymized human and MT entries, in random order:

[Entry #1]

They had a cat while living in the country. A cat is a black cat with a long tail. After they took out this cat, they rejoyed that only the calamity of him and his fox had been spared.

[Entry #2]

While they live in the country country, is 一匹飼 ふことにした in cats. Black cat だつた where the cat has a long tail. They were pleased with only the misfortune of a guy and the mouse having avoided 免 after 飼 ひ produced this cat.

[Entry #3]

They decided to keep a cat while they lived in the country. The cat was a long-tailed black cat. They were glad, after they had brought the cat, that they had escaped the misfortune of him and the mouse.

[Entry #4]

While living in the country, they decided to get a cat. It was a black cat with a long tail. They were glad that, since they had adopted the cat, they had been spared from the misfortune of him and the mice.

[Entry #5]

While living in the country, they decided to get a cat. It was a long-tailed black cat. They were glad that, with the cat’s arrival, at least their mouse problems were finally taken care of.

[Entry #6]

While living in the countryside, they decided to keep a cat. The cat was a long-tailed black cat. They were pleased that they had been spared only the calamity of him and the mouse since he raised the cat.

Machine translations ranked from best to worst (with comments)

DeepL [Entry #4]

While living in the country, they decided to get a cat. It was a black cat with a long tail. They were glad that, since they had adopted the cat, they had been spared from the misfortune of him and the mice.

Comments: 

This is by far the best translation and pretty impressive for a machine, if you ask me. I especially liked how it rendered S2 using “it”, whereas all the other MTs fell short here (either by using the incorrect “a”, or by repeating “cat”, which sounds a bit awkward despite it being that way in the original text). I also liked the verb “get” in S1, which sounds very natural and is not something I would expect a machine to come up with.

I see three big problems with this translation. One is the awkwardness of S3, especially regarding the commas. The second is that “him” in “the misfortune of him and the mice” in S3 is incorrect, as the misfortune is only caused by the mice, not the cat. Finally, the word やっと (やつと in the older writing style) is not reflected anywhere.

But overall I think this passage is not only readable, but even enjoyable to read.

Mirai Translator (trial version) [Entry #3]

They decided to keep a cat while they lived in the country. The cat was a long-tailed black cat. They were glad, after they had brought the cat, that they had escaped the misfortune of him and the mouse.

Comments:

While this passage (from the second-best MT in my comparison) is pretty good, it has a bunch of mistakes or areas of awkward phrasing. First, “keep” a cat in S1, while understandable and technically correct, sounds a bit odd to me. The repetition of “cat” in S2 also feels a bit odd, but it’s a pretty minor issue.

For S3, the three errors in the previous MT are also present here, but in addition to those I feel “brought the cat” is very unnatural, and “escaped the misfortune” is also a bit weird (the latter is translated nicely as “spared from the misfortune of” in the previous MT). Also, “the mouse” (singular) is used instead of the plural, which I think is the more natural choice.

Because of the mistakes and unnatural areas I feel the result is overall pretty awkward.

Google Translate [Entry #6]

While living in the countryside, they decided to keep a cat. The cat was a long-tailed black cat. They were pleased that they had been spared only the calamity of him and the mouse since he raised the cat.

Comments:

S1 and S2 are similar to the previous MT. However, S3 really gets butchered, especially the end part, where it is not clear what “he” refers to. The previous “him” refers to the cat, so did the cat raise itself? It can’t be the people living in the countryside, since they were identified as plural. The choice of “calamity” is also a bit odd, and the “only” sounds odd as well (although technically it is in the original text). The repeated “they” at the beginning of S3 is also a bit awkward.

By this point the resultant translation is not that useful for understanding or appreciating the text, as it will likely confuse the reader.

Microsoft Translator (through Bing) [Entry #1]

They had a cat while living in the country. A cat is a black cat with a long tail. After they took out this cat, they rejoyed that only the calamity of him and his fox had been spared.

Comments:

This translation is noticeably worse than the others, and has serious problems in each of the sentences. For example, in S1 it looks like the cat was there from the beginning, whereas in the original text the people living there decided to get a cat, which is a huge difference. In S2 we can see that the article for “cat” was improperly chosen, leading to an awkward sentence. Finally, in S3 there are several issues, including the odd appearance of a fox. Even if there is some old usage where 鼠 (nezumi) means “fox”, it should be clear from the context that it is a mouse (or mice) being discussed. Not to mention, is “rejoyed” really a word?

Weblio Translator [Entry #2]

While they live in the country country, is 一匹飼 ふことにした in cats. Black cat だつた where the cat has a long tail. They were pleased with only the misfortune of a guy and the mouse having avoided 免 after 飼 ひ produced this cat.

Comments: 

This translation is a total train wreck and doesn’t render any of the sentences even close to correct, and there are Japanese characters throughout, perhaps places where the logic didn’t know what to do. As a side note, I think some of the issues here (like “country country”) were caused by the embedded Furigana that came over with the cut-and-paste, although the other MTs seemed to handle that correctly.

Coming from what is apparently a Japanese company, this is quite surprising, but surely this MT will evolve along with the others.
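As an aside on the furigana issue mentioned above: one way to rule it out would be to strip the readings from the text before pasting it into a translator. The following is only a minimal Python sketch of that idea, and it was not part of the comparison in this article. It assumes the readings show up either in 《》 brackets after the word (with an optional ｜ marking where the word starts, as in Aozora Bunko’s plain-text files) or as kana in full-width parentheses left over from a copy-and-paste. The function name `strip_furigana` and the sample line are made up for illustration.

```python
import re

# Assumption: readings appear as 《...》 directly after the base text,
# as in Aozora Bunko's plain-text ruby notation.
RUBY_READING = re.compile(r"《[^》]*》")

# Assumption: copy-and-paste from a web page may instead leave a kana-only
# reading in full-width parentheses, e.g. 田舎（いなか）.
PAREN_READING = re.compile(r"（[ぁ-ゖァ-ヺー]+）")


def strip_furigana(text: str) -> str:
    """Remove furigana readings, leaving only the base text."""
    text = RUBY_READING.sub("", text)
    text = PAREN_READING.sub("", text)
    # Drop the optional full-width bar that marks where the base text begins.
    return text.replace("｜", "")


if __name__ == "__main__":
    # Hypothetical sample input with readings added for illustration.
    sample = "彼等は田舎《いなか》に住んでゐるうちに、猫を一匹｜飼《か》ふことにした。"
    print(strip_furigana(sample))
```

Note that the parenthesis rule would also delete a legitimate kana-only aside written in full-width parentheses, so the output is worth a quick check against the original before feeding it to an MT.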

Human Translation [Entry #5]

While living in the country, they decided to get a cat. It was a long-tailed black cat. They were glad that, with the cat’s arrival, at least their mouse problems were finally taken care of.

Conclusion

With a small sample text we saw a huge variation in translation quality: everything from DeepL’s surprisingly natural translation (despite a few issues) to Weblio’s embarrassingly error-filled one. Nevertheless, this little exercise confirmed my gut feeling that literary JP/EN MT has a long way to go, even for relatively simple texts.

By the way, I enjoyed writing up this comparison and hope to do more of these articles in the future. If you enjoyed this, please consider liking, commenting, or RTing it. Would you like to see a comparison using a longer text? A more modern one?

(Note: title image created using Canva.com)

***

Update [10/13]: It looks like they have already made changes to the DeepL algorithm, resulting in a worse translation than before (with at least one or two extra mistakes). I will not put the translation here since I am sure it will change again, but just keep in mind this stuff is in constant flux.


3 thoughts on “Machine Translation Showdown: 5 MTs tested using a classical Japanese excerpt”

  1. Jane

    This is very interesting, thank you. Please do more of this research and display the results for us. I’m interested in both old and new Japanese literary texts.

    1. locksleyu Post author

      Thanks for the comment, glad you found it interesting! I’ll see if I can do another followup article on this topic in the future.

