emu — AI For The Rest Of Us

emu is a local-first, multimodal AI desktop application that runs state-of-the-art AI models entirely on your own computer. Chat with text, images and audio; dictate by voice; have replies read aloud; and extract text from documents — all with a clean, minimal interface, and with no cloud account, no telemetry, and no network calls for inference.

A product of CloudKites AI Lab (CloudKites Pty Ltd), Sydney, Australia.

💚 Free to download, use, and share. emu is distributed free of charge. You may use it and redistribute the complete, unmodified installer for free (no selling, no paid bundling). See License below.

⬇️ Download (Windows)

Build	For	Installer
GPU (CUDA)	NVIDIA GPU (compute capability ≥ 7.0): fastest; real-time speech & OCR	emu-1.0.0-windows-cuda-setup.exe
CPU	Any 64-bit PC with AVX2 (Intel Haswell 2013+ / AMD 2015+)	emu-1.0.0-windows-cpu-setup.exe

The installers are code-signed and self-contained — the GPU build bundles the NVIDIA CUDA/cuDNN runtime, so there is nothing else to install. On first run, emu offers to download the model bundle(s) you choose into your local models folder.

⬇️ Download (Linux)

Build	For	Installer
GPU (CUDA)	NVIDIA GPU (compute capability ≥ 7.0): fastest; real-time speech & OCR	emu-1.0.0-linux-cuda.run
CPU	Any 64-bit PC with AVX2 (Intel Haswell 2013+ / AMD 2015+)	emu-1.0.0-linux-cpu.run

The Linux installers are self-contained (the Qt runtime, all required plugins, and — for the GPU build — the CUDA/cuDNN runtime are bundled). They are distributed over HTTPS and are not code-signed. After downloading, mark the installer executable and run it:

chmod +x emu-1.0.0-linux-cuda.run   # or emu-1.0.0-linux-cpu.run
./emu-1.0.0-linux-cuda.run

Launching emu after install: the installer adds emu to your application menu — just click it. Or run the installed binary directly:

"<install-folder>/emu"      # e.g. ~/emu/emu  — runs from anywhere, no wrapper needed

Requires a glibc-based 64-bit Linux (glibc ≥ 2.38) with a working X11/XCB display.

Getting a model: open the model menu at the top and pick one to download, or just type a prompt and press Enter. If no model is installed yet, emu downloads a recommended one automatically and sends your message once it's ready.
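To check the Linux requirements stated above (glibc ≥ 2.38, and AVX2 for the CPU build) before installing, something like the following may help. This is a minimal sketch that assumes a typical glibc-based distribution with GNU coreutils; `version_ge` is a hypothetical helper written for this example, not part of emu.

```shell
#!/bin/sh
# Sketch: check the stated requirements (glibc >= 2.38, AVX2 for the CPU build).
# version_ge is a hypothetical helper for this example, not part of emu.

# Succeeds when dotted version $1 is greater than or equal to $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# On glibc systems getconf prints e.g. "glibc 2.39"; keep only the number.
glibc_version=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}') || true

if [ -n "$glibc_version" ] && version_ge "$glibc_version" "2.38"; then
    echo "glibc $glibc_version: OK"
else
    echo "glibc ${glibc_version:-not found}: emu needs 2.38 or newer"
fi

# Linux lists CPU feature flags in /proc/cpuinfo.
if grep -qw avx2 /proc/cpuinfo 2>/dev/null; then
    echo "AVX2: OK"
else
    echo "AVX2: not detected (required by the CPU build)"
fi
```

The script only reports what it finds; it does not check the X11/XCB display or the GPU.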

Documentation: User Guide (PDF) · Product Information (PDF) · License

✨ Features

Multimodal chat — text, plus attach an image or audio clip.
🎤 Speech-to-text — dictate prompts on-device; auto-detect or choose from 20 languages.
🔊 Text-to-speech — read replies aloud with four engines, including multilingual output and on-device Vietnamese voice cloning.
📄 Document OCR — extract text and tables from images and PDFs.
Multilingual — chat models natively handle dozens of languages; English and Vietnamese are first-class across chat, speech and voice.
100% private & offline — every model runs on your CPU or NVIDIA GPU; nothing leaves your device. No account, no telemetry.
Persistent chats, Markdown rendering, model picker, and resource-aware loading.

🧠 Models (in this repository)

Each model is a single self-contained .nbq bundle (weights + tokenizer + helpers in one file), auto-downloaded by emu on first use. You only need the model(s) for the features you want.

File	Capability	Base model · Provider · License
`gemma-4-12b-it-UD-Q2_K_XL.nbq`	Chat (text) — compact 2-bit, ~5 GB	Gemma 4 12B · Google DeepMind · Gemma Terms
`gemma-4-12b-it-qat-q4_0.nbq`	Chat (text), higher accuracy — 4-bit, ~7.5 GB	Gemma 4 12B · Google DeepMind · Gemma Terms
`gemma-4-e4b-it-q4_k_m.nbq`	Chat + vision + audio (multimodal)	Gemma 4 E4B · Google DeepMind · Gemma Terms
`qwen3.5-4b-q4_k_m.nbq`	Chat (text), light & fast	Qwen 3.5 4B · Alibaba Qwen · Apache-2.0
`qwen3.5-9b-q4_k_m.nbq`	Chat (text), stronger	Qwen 3.5 9B · Alibaba Qwen · Apache-2.0
`qwen3-asr-0.6b.nbq`	Speech-to-text (20 languages)	Qwen3-ASR · Alibaba Qwen · Apache-2.0
`qwen3-tts-0.6b.nbq`	Text-to-speech (multilingual, preset voices)	Qwen3-TTS · Alibaba Qwen · Apache-2.0
`gwen-tts-vn-0.6b.nbq`	Text-to-speech — Vietnamese voice cloning	gwen-tts · G-Group AI Lab · MIT
`pocket-tts-en.nbq`	Text-to-speech — fast English streaming	Pocket TTS · Kyutai · CC-BY-4.0
`vieneu-tts.nbq`	Text-to-speech — Vietnamese / English	VieNeu-TTS · pnnbao-ump · OpenMOSS · Apache-2.0
`ppocr-v6.nbq`	Document OCR (detection + recognition)	PP-OCRv6 · PaddlePaddle · Apache-2.0

Bundled AI models remain under their own licenses — see THIRD_PARTY_LICENSES.md and each model's own terms. Use of the Gemma models is additionally subject to the Gemma Terms of Use and Prohibited Use Policy.

💻 Which model for my computer?

Your PC	Recommended chat model	Notes
NVIDIA GPU, 12 GB+ VRAM	Gemma 4 12B (4-bit); Qwen 3.5 9B	Highest quality, fast
NVIDIA GPU, ~8 GB VRAM	Gemma 4 12B (2-bit); Qwen 3.5 4B	Fits an 8 GB card
CPU only (AVX2, 16 GB+ RAM)	Qwen 3.5 4B; Gemma 4 E4B	Quick without a GPU
Image / audio understanding	Gemma 4 E4B (multimodal)	CPU or GPU

The GPU (CUDA) build delivers real-time text-to-speech and processes OCR in seconds per page. The CPU build runs every feature, though OCR and the largest models are slower. emu's Auto device setting uses the GPU when available and falls back to the CPU otherwise.
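The recommendation table above can be approximated with a short script that reads the GPU's total memory and prints a matching model file. This is only a sketch: `recommend_model`, `detect_vram_mib`, and the MiB cut-offs are illustrative assumptions, not emu's own selection logic, and the fallback for cards under 8 GB simply reuses the CPU-only recommendation.

```shell
#!/bin/sh
# Sketch: map detected VRAM to the chat-model recommendations in the table.
# The function names and MiB cut-offs are assumptions made for this example.

# $1 = GPU memory in MiB (0 when no NVIDIA GPU is present).
recommend_model() {
    vram_mib=$1
    if [ "$vram_mib" -ge 12000 ]; then
        echo "gemma-4-12b-it-qat-q4_0.nbq"      # Gemma 4 12B (4-bit)
    elif [ "$vram_mib" -ge 8000 ]; then
        echo "gemma-4-12b-it-UD-Q2_K_XL.nbq"    # Gemma 4 12B (2-bit)
    else
        echo "qwen3.5-4b-q4_k_m.nbq"            # Qwen 3.5 4B, light and fast
    fi
}

# Prints the total memory of the first GPU in MiB, or 0 without nvidia-smi.
detect_vram_mib() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -n 1
    else
        echo 0
    fi
}

vram=$(detect_vram_mib) || vram=0
# Treat empty or non-numeric output (for example "[N/A]") as no usable GPU.
case "$vram" in
    ''|*[!0-9]*) vram=0 ;;
esac

echo "Suggested chat model: $(recommend_model "$vram")"
```

For image or audio understanding the table recommends `gemma-4-e4b-it-q4_k_m.nbq` regardless of hardware, so that case is left out of the sketch.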

📜 License

✅ Free to use — personal or commercial, on any number of devices.
✅ Free to redistribute — you may share the complete, unmodified official installer, free of charge, with all notices intact.
❌ No selling / no paid bundling, no modification, no reverse-engineering.
The source code of emu and its Numbat ML toolkit is confidential and is not distributed.

Full terms: LICENSE. "emu", "Numbat" and "CloudKites" are trademarks of CloudKites Pty Ltd.

⚠️ AI output notice

emu uses generative AI models. Output may be inaccurate, incomplete, or fabricated even when it appears confident, and is not a substitute for professional advice — verify important information against trusted sources.

Contact: contact@cloudkites.com · cloudkites.com
