Pronominal object clitics in preverbal position are a hard nut to crack for Google Translate

Unlike English, some Romance languages not only allow but sometimes require pronominal object clitics in preverbal position (Hanson & Carlson, 2014; Labotka et al., 2023). For example, alongside the unmarked La maestra ha detto il nome (Italian) ‘The teacher has said the name’, Italian allows Il nome lo ha detto la maestra (literally, ‘The name it has said the teacher’), which could be translated as ‘The name has been said by the teacher’, ‘The teacher has said the name’, or even ‘It is the teacher that has said the name’. The form Il nome lo ha detto la maestra is a marked phrasing that draws attention to the agent of the action. Furthermore, when the clitic is in preverbal position, the degree of focus on the agent also depends on the context. For instance, the focus is light in Lo ha detto la maestra, whereas it is stronger in Lo ha detto la maestra, non l’assistente ‘It’s the teacher that’s said it, not the assistant’.

The agent focus can sometimes be dropped in translation (‘The teacher has said it’). In other cases, it must be preserved, either by placing intonational or written emphasis on the agent (‘The teacher has said it’, with stress on ‘the teacher’), by using the idiomatic form ‘The teacher said’, or by using a marked syntax: for instance, an it-cleft, as in ‘It’s the teacher that’s said it’, or an emphatic reflexive, as in ‘The teacher said it herself’.

Now, how does Google Translate (GT) deal with these translations as of May 2023? For Lo ha detto la maestra, GT opts for ‘The teacher said’, which is a good, idiomatic option. When translating into Spanish, GT returns ‘El profesor dijo’, which is the direct equivalent of the English translation.[1] This option is valid in some varieties of Spanish in the Americas. Nonetheless, it must be noted that a more direct translation of the Italian form would have worked very well in Spanish (Lo ha dicho la maestra).

GT has greater trouble when the content of the sentence is slightly less frequent. For instance, Lo cerca la maestra (literally, ‘Him is seeking the teacher’) is stripped of its markedness in GT’s English rendering, ‘The teacher is looking for him’. Preserving the agent focus (e.g., ‘It’s the teacher that’s looking for him’) would require some syntactic liberties, and hence entail some risks. So, playing it safe is understandable. Our next step is to check the translations into some Romance languages that allow the same movement to preverbal position found in the original Italian sentence, Lo cerca la maestra. Leaving GT aside, the sentence could be translated well into Romanian as Îl caută dăscăliţa, or into Spanish as Lo busca la profesora. In contrast, for both languages, GT returns the equivalents of the English translation, namely Profesorul îl caută and El profesor lo busca. It thus again discards the focus on the agent, unnecessarily in these cases, given the overlap between the grammars.[2]

Suggesting a better translation in Google Translate

Submitting a better translation is always an option.

In fairness, machine translation is an absolute feat overall, provided enough caution is practised. With the expansion of language models, machine translation is only going to improve. So, how much of a piece of cake will it be for GT to crack some of these syntactic details in time, and to preserve syntactic forms across languages when the systems match?

References

Hanson, A. E. S., & Carlson, M. T. (2014). The roles of first language and proficiency in L2 processing of Spanish clitics: Global effects. Language Learning, 64(2), 310-342. https://doi.org/10.1111/lang.12050

Labotka, D., Sabo, E., Bonais, R., Gelman, S. A., & Baptista, M. (2023). Testing the effects of congruence in adult multilingual acquisition with implications for creole genesis. Cognition, 235, 105387. https://doi.org/10.1016/j.cognition.2023.105387


  1. We can ignore the errors of lexical gender in the translations.

  2. We can again ignore the errors of lexical gender in the translations.
