LLMs Still Struggle with Non-English Languages

📅 May 12, 2026 📊 Source: SEACrowd

Large Language Models (LLMs) still perform worse in languages other than English, with quality typically lagging 12-18 months behind their English output, according to research from SEACrowd and other AI benchmarking organizations.

The Language Gap, By the Numbers

English: the baseline. GPT-4, Claude 3.5, and Gemini 2.5 are all trained primarily on English data
Chinese, Spanish, Japanese: 6-12 months behind English in quality
Hindi, Arabic, Portuguese: 12-18 months behind, with noticeable gaps in idioms
Indonesian, Vietnamese, Thai: 18-24 months behind
Low-resource languages: 2+ years behind, if supported at all

Why the Gap Exists

It comes down to training data. Roughly 60% of web content is in English, about 2% is in Chinese, and roughly 0.5% is in Indonesian. Models learn from that text, so a language with less of it generally gets worse performance.
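
To put those shares in perspective, here is a minimal Python sketch that turns them into ratios. The percentages are the approximate figures quoted above. The names WEB_SHARE_PERCENT and english_ratio are made up for illustration, and web share is only a rough proxy for what actually ends up in a model's training set.

```python
# Approximate share of web content per language (percent),
# taken from the rough figures quoted in this article.
WEB_SHARE_PERCENT = {
    "English": 60.0,
    "Chinese": 2.0,
    "Indonesian": 0.5,
}


def english_ratio(language, shares=WEB_SHARE_PERCENT):
    """Return how many times more English web content exists than
    content in `language`, based on the approximate shares above."""
    return shares["English"] / shares[language]


if __name__ == "__main__":
    for lang in WEB_SHARE_PERCENT:
        ratio = english_ratio(lang)
        print(f"{lang}: English has about {ratio:.0f}x as much web content")
```

By these figures, English has about 30 times as much web content as Chinese and about 120 times as much as Indonesian.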

The Gap Is Closing

The good news: organizations like SEACrowd, AI4Bharat, and EleutherAI are actively building multilingual datasets. Models like Llama 3 and Mistral have significantly improved non-English support.

What This Means for You

If you are using AI in a non-English language, set your expectations accordingly. For critical tasks — legal documents, medical information, financial advice — always have a human review the output.
