AI Power User

Local AI Translation Beats Google Translate — and It's Private

A Czech speaker's field test of on-device LLM translation against the cloud incumbents

By Jakub Jirák Jun 22, 2026 5 min read

I’m Czech. I live my professional life in English and my personal life in a language with seven grammatical cases, three genders, and a verb-aspect system that breaks machine translation in ways English speakers never see. Czech↔English is my daily, involuntary benchmark — and for years, Google Translate was the tool you tolerated, not the tool you trusted.

That’s over. A quantized open-weight model running on my Mac now translates better than Google Translate on the texts I actually care about — and the contract, the medical report, the personal letter never leave my machine. Here’s the evidence, the setup, and the honest limits.

Why LLMs translate differently — not just better

Google Translate’s lineage is sentence-level neural machine translation: it sees one sentence, emits one sentence, and forgets. An LLM translates the way a human does — holding the whole document in context. That difference shows up in exactly the places Czech punishes:

Formality. Czech, like German or French, distinguishes formal vykání from informal tykání. Google Translate chooses a form sentence by sentence, so the register can flip mid-document for no visible reason. An LLM told “this is a letter to a friend” stays informal for ten paragraphs.

Idioms. I tested the Czech idiom “mít máslo na hlavě” (literally “to have butter on one’s head” — to be guilty of the thing you criticize). Google Translate, June 2026: “to have butter on your head.” Qwen3 14B on my Mac: “the pot calling the kettle black” — and when I asked for a more literal register, it explained the idiom. You cannot ask Google Translate a follow-up question.

Gender and reference. Czech past-tense verbs agree with their subject’s gender, so first-person text encodes the gender of the speaker. Translating into Czech, sentence-level tools guess; an LLM instructed “the narrator is a woman” gets every verb form right across the whole text.

Tone. Asked to translate a sarcastic Czech email politely while keeping its edge, the LLM did it. This is steerable translation, a category the sentence-level world simply doesn’t have.
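In practice, steering of this kind means stating the context in the prompt before the text. A minimal sketch, assuming a hypothetical helper named steered_prompt (illustrative only, not part of Ollama or any other tool) and an English-to-Czech direction:

```shell
# Hypothetical helper (not part of Ollama or any tool): build a prompt that
# states the steering context before the text to translate.
# Usage: steered_prompt TEXT [NOTE...]
steered_prompt() {
  text="$1"
  shift
  echo "Translate the following text from English to Czech."
  for note in "$@"; do
    echo "Context: ${note}."
  done
  echo "Output only the translation, no commentary."
  echo
  printf '%s\n' "$text"
}

# Print an example prompt with two steering notes.
steered_prompt "I was tired, so I went home." \
  "the narrator is a woman" \
  "this is a letter to a friend, so use informal address"
```

The printed prompt can be handed to a local model as a single argument, for example ollama run qwen3:14b "$(steered_prompt ...)". How closely a given model follows each note is something to spot-check, not assume.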

The baseline nobody installs: Apple’s offline Translate

Before reaching for an LLM, know what’s already on your Mac. Apple’s Translate app (and the system-wide translate option in the right-click menu) supports fully offline mode: System Settings → General → Language & Region → Translation Languages, download Czech and English, then toggle On-Device Mode.

Quality is roughly “Google Translate minus a few percent” — fine for menus and quick gist, sentence-level, no context, no steering. It’s the floor. Everything below is about raising the ceiling.

The setup: Ollama plus one good prompt

Hardware-wise, any Apple Silicon Mac with 16 GB handles a capable translation model; 32 GB+ lets you run the genuinely good ones.

brew install ollama
brew services start ollama   # start the local server (or run `ollama serve` in another terminal)
ollama pull qwen3:14b
ollama pull gemma3:12b

After months of testing on Czech↔English, my ranking for European languages: Qwen3 14B is the best quality-per-GB I’ve found (Czech output is grammatical, natural, rarely calques English word order); Gemma 3 12B is a close second with slightly better English prose going the other direction; an 8B-class model is noticeably weaker on Czech morphology — fine for gist, not for documents. If you have 64 GB, a 32B model closes most of the remaining gap to DeepL/GPT-class quality.
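The memory tiers follow from simple arithmetic: a model's weights take roughly parameters times bits per weight, divided by 8. A rough sketch, assuming about 4.5 bits per weight for a 4-bit quantization (an illustrative assumption, not a measured figure), with a hypothetical helper name:

```shell
# Rough weight size in GB = parameters (billions) * bits per weight / 8.
# Hypothetical helper; 4.5 bits/weight is an assumed average for 4-bit quants.
estimate_weights_gb() {
  LC_ALL=C awk -v p="$1" -v b="${2:-4.5}" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Print the estimate for the three model sizes discussed above.
for size in 8 14 32; do
  echo "${size}B: ~$(estimate_weights_gb "$size") GB of weights"
done
```

By this estimate a 14B model needs about 7.9 GB for weights alone and a 32B model about 18 GB, before the context cache and everything else running on the machine.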

The prompt matters as much as the model. Mine, battle-tested:

Translate the following text from Czech to English.
Preserve the original tone, formality level, and formatting.
Translate idioms to natural English equivalents, not literally.
Output only the translation, no commentary.

A quick CLI test:

ollama run qwen3:14b "Translate to English, natural tone, \
translation only: 'To je přesně ono, trefil jste hřebíček \
na hlavičku.'"
# → "That's exactly it — you've hit the nail on the head."

Idiom in, equivalent idiom out. Offline. On a laptop.
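To avoid retyping the prompt, it can be wrapped in a pair of shell functions. This is a sketch under assumptions: build_prompt and translate are hypothetical names, and the defaults (Czech to English, qwen3:14b) simply mirror the examples above.

```shell
# Hypothetical helpers wrapping the prompt above; names are illustrative.

# Print the prompt for a language pair, followed by the text.
build_prompt() {
  src="$1"; dst="$2"; text="$3"
  printf '%s\n' \
    "Translate the following text from ${src} to ${dst}." \
    "Preserve the original tone, formality level, and formatting." \
    "Translate idioms to natural ${dst} equivalents, not literally." \
    "Output only the translation, no commentary." \
    "" \
    "${text}"
}

# Translate stdin with a local model.
# Assumes the ollama CLI is installed and the model has been pulled.
# Usage: translate [SRC] [DST] [MODEL] < input.txt
translate() {
  ollama run "${3:-qwen3:14b}" \
    "$(build_prompt "${1:-Czech}" "${2:-English}" "$(cat)")"
}
```

On a Mac, pbpaste | translate | pbcopy would translate the clipboard in place, and pbpaste | translate English Czech reverses the direction.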

A system-wide translate hotkey

Translation you have to open an app for is translation you won’t use. Two ways to make it ambient:

Raycast (my choice). Raycast’s AI Commands can point at local models via Ollama. Create a custom command — Settings → AI → Custom Commands — with the prompt above and {selection} as the input, model set to your Ollama instance. Bind it to ⌥T. Now: select any text in any app, hit ⌥T, the translation appears in a floating window with one-key copy. Round-trip on my M2 Max with Qwen3 14B: about 2 seconds for a paragraph.

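Shortcuts (free, no third-party apps). Build a shortcut: Receive Text from Quick Action → Get Contents of URL POSTing JSON to http://localhost:11434/api/generate, with the model name, the prompt wrapped around the input, and "stream" set to false so the reply arrives as a single JSON object → Get Dictionary Value response → Show Result. Enable it as a Quick Action with a keyboard shortcut, and it works from the Services menu in every Mac app. The shortcut also syncs to your iPhone’s Share Sheet, but there localhost means the phone itself, so it only works if you point the URL at a Mac on your network whose Ollama accepts remote connections (by default it listens only on localhost).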
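The request the shortcut sends can be reproduced from a terminal, which is handy for debugging before wiring it into Shortcuts. A minimal sketch, assuming the default Ollama port; json_escape, build_payload, and ask_ollama are hypothetical helper names, and the escaper is deliberately minimal:

```shell
# Minimal JSON string escaper: handles backslashes, double quotes, and
# newlines only. A sketch; use a real JSON tool for arbitrary input.
json_escape() {
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
    awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build the request body for POST /api/generate.
# "stream": false asks for a single JSON object rather than a stream of chunks.
build_payload() {
  model="$1"; prompt="$2"
  printf '{"model":"%s","prompt":"%s","stream":false}\n' \
    "$model" "$(printf '%s' "$prompt" | json_escape)"
}

# Send it to a local Ollama server (assumes the default port 11434);
# the translation is in the "response" field of the reply.
ask_ollama() {
  curl -s http://localhost:11434/api/generate -d "$(build_payload "$1" "$2")"
}
```

Calling ask_ollama qwen3:14b "Translate to English, translation only: Dobrý den" should print one JSON object whose response field holds the translation, which is the same value the Get Dictionary Value step reads.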

Document-level translation that keeps the formatting

The trick for whole documents: don’t translate documents, translate Markdown. Convert, translate, convert back, and the structure survives because the LLM treats ##, tables, and bold markers as tokens to preserve (the prompt’s “preserve formatting” line does real work here). Anything Markdown cannot express, such as custom styles or tracked changes, is lost in the round trip.

# .docx → Markdown
pandoc smlouva.docx -t gfm -o smlouva.md

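# translate (one file, one shot: a 14B model handles ~10 pages in a
# single 32k context comfortably, but Ollama's default context window
# may be much smaller and overlong prompts get truncated; raise it on
# the server first, e.g. start it with OLLAMA_CONTEXT_LENGTH=32768)
# Qwen3 is a reasoning model: if its thinking text ends up in the output
# file, recent Ollama versions accept --think=false on `ollama run`.
ollama run qwen3:14b "Translate this Czech document to English. \
Preserve ALL Markdown formatting exactly. Translation only: \
$(cat smlouva.md)" > smlouva_en.md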

# Markdown → .docx
pandoc smlouva_en.md -o smlouva_en.docx

For longer documents, split on ## headings and loop — chunking on structural boundaries keeps each chunk self-coherent. A 25-page contract takes my Mac Studio about 8 minutes unattended. Headings, numbered clauses, tables: intact.
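The split-and-loop approach can be sketched with awk and a for loop. This is one possible implementation, not necessarily the exact script behind the timing above; split_on_h2 and translate_chunks are hypothetical names, and the chunk file naming is an arbitrary choice.

```shell
# Split a Markdown file into chunk_000.md, chunk_001.md, ... in OUTDIR,
# starting a new chunk at every level-2 heading. Text before the first
# heading lands in chunk_000.md. Caveat: a line starting with "## " inside
# a code block also starts a chunk.
# Usage: split_on_h2 INPUT.md OUTDIR
split_on_h2() {
  mkdir -p "$2"
  awk -v dir="$2" '
    /^## / { if (file != "") close(file); n++ }
    { file = sprintf("%s/chunk_%03d.md", dir, n); print > file }
  ' "$1"
}

# Translate each chunk in order and concatenate the results on stdout.
# Assumes the ollama CLI is installed and qwen3:14b has been pulled.
# Usage: translate_chunks OUTDIR > translated.md
translate_chunks() {
  for chunk in "$1"/chunk_*.md; do
    ollama run qwen3:14b "Translate this Czech document to English. \
Preserve ALL Markdown formatting exactly. Translation only: \
$(cat "$chunk")"
    echo
  done
}
```

A full run would then be split_on_h2 smlouva.md chunks && translate_chunks chunks > smlouva_en.md, followed by the pandoc conversion back to .docx. Because each chunk is translated without seeing the others, terminology can drift between sections, so key terms are worth a consistency check afterwards.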

The privacy case is the whole case

Even if local translation were merely equal in quality, I’d still use it. Think about what people actually paste into translate boxes. Employment contracts. Medical discharge reports. Divorce paperwork. Letters to family. Visa documents.

Google’s free Translate processes your text on Google’s servers under terms that permit service improvement; DeepL Pro promises deletion, but it’s still a promise about someone else’s computer. When my mother needed her cardiology report understood in English for a second opinion, that document went through qwen3:14b on my desk with Wi-Fi off, so it verifiably went nowhere. For lawyers, doctors, and anyone under GDPR obligations, “the data never left the device” isn’t just a preference; depending on the data and the agreements in place, it can be the difference between allowed and prohibited. That’s not paranoia; that’s the actual compliance line.

Honest limits

Local LLM translation is not magic, and pretending otherwise helps nobody:

Rare language pairs. Czech↔English works because both are well represented in training data. Czech↔Vietnamese or Estonian↔Portuguese tend to come out as if silently pivoted through English, which compounds errors. For low-resource languages, dedicated systems and human translators still win decisively.

Highly technical jargon. Legal terms-of-art and clinical vocabulary are where a 14B model will confidently choose a plausible-but-wrong term. Promlčení (statute of limitations) once came out as the vaguer “lapse of claims.” For anything with consequences, I translate locally, then verify the load-bearing terms — and nothing with legal effect skips a human professional.

Omissions under length. Past its comfortable context, a model may quietly compress or skip sentences. Chunk long documents and spot-check that paragraph counts match.
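The paragraph-count spot-check is easy to automate. A minimal sketch, assuming paragraphs are separated by empty lines in both files; count_paragraphs and check_paragraphs are hypothetical helper names:

```shell
# Count paragraphs: blocks of text separated by one or more empty lines.
# (RS = "" puts awk in paragraph mode.)
count_paragraphs() {
  awk 'BEGIN { RS = "" } END { print NR }' "$1"
}

# Hypothetical helper: compare the paragraph counts of a source file and
# its translation; returns non-zero on a mismatch.
# Usage: check_paragraphs SOURCE.md TRANSLATION.md
check_paragraphs() {
  src_n=$(count_paragraphs "$1")
  dst_n=$(count_paragraphs "$2")
  if [ "$src_n" -eq "$dst_n" ]; then
    echo "OK: ${src_n} paragraphs in both files"
  else
    echo "MISMATCH: source has ${src_n} paragraphs, translation has ${dst_n}"
    return 1
  fi
}
```

Running check_paragraphs smlouva.md smlouva_en.md after a translation flags dropped or merged paragraphs. A matching count is only a coarse signal: it cannot detect a sentence skipped inside a paragraph, so reading a few samples still matters.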

The honest summary: for the 95% of translation that is real life — emails, articles, letters, reports — a local model on a Mac now beats the free cloud incumbents on quality and wins on privacy by an infinite margin. The 5% with legal or medical stakes still deserves a human. Set up the hotkey this weekend; you’ll stop noticing the language barrier within a week.