What Are Hallucinations in Large Multilingual Translation Models?
Sometimes large language models produce facts; other times, their outputs are, well, fiction.
The paper “Hallucinations in Large Multilingual Translation Models” examines hallucinations in machine translation systems: cases where the generated translation diverges significantly from the meaning of the source. Hallucinations create a range of challenges, especially for systems designed to handle many languages, and they hit hardest in low-resource languages, where little training data is available for machine learning and language processing tasks.
The study sheds light on the causes, implications, and potential solutions to this critical issue. The authors, who include Dr Alexandra Birch, Head of Aveni Labs, and Dr Barry Haddow, Aveni’s Head of Natural Language Processing, also explore how hallucinations in large multilingual translation models differ from those in smaller bilingual models.
The paper highlights the risks associated with hallucinated translations, such as spreading misinformation or generating offensive content. It also compares how these large language models hallucinate with how the machine translation tools we already use hallucinate, and it calls for robust detection methods to catch these errors and mitigate them more effectively.
By analysing artificially induced hallucinations and methods for detecting them, the researchers provide valuable insight into why hallucinations occur and propose measures to improve overall translation quality.
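One simple detection strategy in this line of work is to flag translations whose cross-lingual similarity to the source is suspiciously low. The sketch below illustrates the idea, assuming the sentence-transformers package and its multilingual LaBSE model; the 0.4 threshold and the helper name `looks_hallucinated` are illustrative choices, not the paper's exact method.

```python
# Minimal sketch of a similarity-based hallucination detector.
# Assumes the sentence-transformers package and its multilingual LaBSE model;
# the paper's actual detectors may differ, and the 0.4 threshold is illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/LaBSE")

def looks_hallucinated(source: str, translation: str, threshold: float = 0.4) -> bool:
    """Flag a translation whose cross-lingual similarity to the source is suspiciously low."""
    src_emb, tgt_emb = model.encode([source, translation], convert_to_tensor=True)
    similarity = util.cos_sim(src_emb, tgt_emb).item()
    return similarity < threshold

# Example: a translation that ignores the source should score low and be flagged.
print(looks_hallucinated("Das Treffen findet morgen statt.", "I love pizza and video games."))
```

A low similarity score does not prove a hallucination, but it gives a cheap, reference-free signal that a translation deserves a second look.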
Key takeaways from “Hallucinations in Large Multilingual Translation Models”:
- Hallucinations in machine translation systems pose significant challenges, especially in multilingual and low-resource language settings.
- Toxic patterns in training data can contribute to the generation of hallucinated translations, highlighting the importance of data quality.
- Large language models produce qualitatively different hallucinations compared to neural machine translation models, making tailored evaluation methods necessary.
- Mitigation techniques, such as introducing slight variations to the input and falling back to backup systems with different strengths, can help reduce hallucinations and improve translation quality (see the sketch after this list).
- Addressing hallucinations in machine translation systems is crucial for ensuring accurate, reliable, and trustworthy translations.
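As a rough illustration of the mitigation takeaway above, the sketch below combines both ideas: re-translating a slightly perturbed source and falling back to a second system. The functions `translate`, `translate_backup`, and `looks_hallucinated` are hypothetical stand-ins for a large multilingual model, a backup MT system, and a detector such as the one sketched earlier; the paper evaluates these strategies far more rigorously than this toy pipeline.

```python
# Minimal sketch of hallucination mitigation: try the primary model, then a lightly
# perturbed source, then a backup system. All three callables are hypothetical
# stand-ins; this is an illustration of the idea, not the paper's implementation.
from typing import Callable

def mitigate_hallucination(
    source: str,
    translate: Callable[[str], str],
    translate_backup: Callable[[str], str],
    looks_hallucinated: Callable[[str, str], bool],
) -> str:
    """Return the first candidate translation that is not flagged as a hallucination."""
    candidate = translate(source)
    if not looks_hallucinated(source, candidate):
        return candidate

    # Slight input variations (here, toggling the final full stop) can nudge the
    # model out of a hallucination without changing the source meaning.
    perturbed_source = source.rstrip(".") if source.endswith(".") else source + "."
    perturbed = translate(perturbed_source)
    if not looks_hallucinated(source, perturbed):
        return perturbed

    # Otherwise fall back to a system with different strengths.
    backup = translate_backup(source)
    return backup if not looks_hallucinated(source, backup) else candidate
```

The design choice here is deliberately conservative: the original candidate is only replaced when an alternative passes the detector, so the fallback chain can reduce hallucinations without discarding translations that were fine to begin with.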