Translation with LLMs through Prompting with Long-Form Context
Rights: Attribution 4.0 International
Abstract
Stable generation of text in low-resource languages is an unsolved issue in large language models (LLMs). While LLMs can often produce good translations despite not being explicitly trained for the task, this does not hold for low-resource languages: LLMs are more likely to generate off-target text (text in a language other than the intended one) when prompted to translate into a low-resource language, and their translation quality is less stable across prompt templates. This study implemented a prompting method that prepends monolingual text in the target language, combined with context- and topic-aware few-shot machine translation (MT). We evaluated these methods on low-, mid-, and high-resource languages using OpenAI GPT-4o-mini and Google Gemini-1.5-flash. The Gemini results showed that context-topic-aware few-shot MT (CTAFSMT) significantly boosted performance across all three language categories; however, this was not consistently observed with ChatGPT. We found that the significance of the results depended on the language itself rather than on its resource level. This study is part of Stanford University's meta-study on whether LLMs can generate novel research ideas. The code, prompts, and results of the study can be found at https://github.com/HuthaifaAshqar/Translationwith-LLMs.
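To make the prompting strategy concrete, the sketch below illustrates how a prompt might combine prepended monolingual target-language text with context- and topic-aware few-shot examples before calling GPT-4o-mini. The prompt wording, helper function, and example sentences are illustrative assumptions, not the exact prompts or data used in the study; see the linked repository for the actual implementation.

```python
# Minimal sketch (assumed prompt format, not the study's exact prompts).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def build_prompt(source_text, target_lang, monolingual_context, few_shot_pairs, topic):
    """Prepend monolingual target-language text, then add topic information and
    few-shot translation examples, and finally the sentence to translate."""
    lines = [f"Text in {target_lang}:", monolingual_context, ""]
    lines.append(f"Topic: {topic}")
    for src, tgt in few_shot_pairs:
        lines.append(f"English: {src}")
        lines.append(f"{target_lang}: {tgt}")
    lines.append(f"English: {source_text}")
    lines.append(f"{target_lang}:")
    return "\n".join(lines)


prompt = build_prompt(
    source_text="The weather is nice today.",
    target_lang="Swahili",
    monolingual_context="Habari za leo? Leo ni siku nzuri.",  # illustrative monolingual text
    few_shot_pairs=[("Good morning.", "Habari za asubuhi.")],  # illustrative few-shot pair
    topic="small talk about the weather",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

The same prompt string could be sent to Gemini-1.5-flash through Google's generative AI client to reproduce the cross-model comparison described above.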
