Higgs Audio v3 Studio Runtime Files

This repository hosts the downloadable runtime files for Higgs Audio v3 Studio, a Windows desktop app for local Higgs Audio v3 TTS, voice cloning, speech continuation, and multi-speaker generation.

GitHub app repository: https://github.com/Saganaki22/Higgs-Audio-v3-Studio

This repository is not the original upstream model release. It provides GGUF model builds, the Windows CUDA engine DLL package, checksums, and a manifest used by the desktop app downloader.

What This Is

Higgs Audio v3 Studio is a Rust/Tauri desktop application that runs a native C++/CUDA port of Higgs Audio v3 locally.

The app provides:

Local TTS generation
Voice clone workflow
Continue speech workflow
Multi-speaker workflow
Speaker Gallery for reusable speaker identities
WAV/MP3 export
Local API server
API streaming endpoint
Whisper-assisted reference transcript workflow
Model/engine download UI
Hardware telemetry and VRAM diagnostics
Engine dependency diagnostics for missing CUDA/MSVC runtime DLLs

App Download

Use the desktop app from GitHub releases:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio/releases

Source code:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio

Runtime Repository Layout

The app expects this Hugging Face repository layout:

/
├─ manifest.json
├─ checksums/
│  └─ SHA256SUMS.txt
├─ engines/
│  ├─ audiocpp_engine.dll
│  ├─ cublas64_13.dll
│  ├─ cublasLt64_13.dll
│  ├─ MSVCP140.dll
│  ├─ VCOMP140.DLL
│  ├─ VCRUNTIME140.dll
│  └─ VCRUNTIME140_1.dll
└─ models/
   ├─ higgs-q4_k_m/
   │  └─ q4_k_m.gguf
   ├─ higgs-q5_k/
   │  └─ q5_k.gguf
   ├─ higgs-q6_k/
   │  └─ q6_k.gguf
   ├─ higgs-q8_0/
   │  └─ q8_0.gguf
   └─ higgs-bf16/
      └─ bf16.gguf
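
As a rough illustration of how this layout maps to direct download links, the sketch below builds Hugging Face `resolve` URLs for each file. The repository id `your-namespace/your-runtime-repo` is a placeholder (this README does not state it), the `main` revision is an assumption, and the helper name `resolve_url` is hypothetical. The app's own downloader may work differently.

```python
from urllib.parse import quote

# Placeholder: substitute the real Hugging Face repository id.
REPO_ID = "your-namespace/your-runtime-repo"

# Relative paths copied from the layout above.
RUNTIME_FILES = [
    "manifest.json",
    "checksums/SHA256SUMS.txt",
    "engines/audiocpp_engine.dll",
    "engines/cublas64_13.dll",
    "engines/cublasLt64_13.dll",
    "engines/MSVCP140.dll",
    "engines/VCOMP140.DLL",
    "engines/VCRUNTIME140.dll",
    "engines/VCRUNTIME140_1.dll",
    "models/higgs-q4_k_m/q4_k_m.gguf",
    "models/higgs-q5_k/q5_k.gguf",
    "models/higgs-q6_k/q6_k.gguf",
    "models/higgs-q8_0/q8_0.gguf",
    "models/higgs-bf16/bf16.gguf",
]


def resolve_url(path, repo_id=REPO_ID, revision="main"):
    """Build a direct-download URL for one file in the repository."""
    # quote() keeps "/" intact and escapes anything unsafe in the path.
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{quote(path)}"


if __name__ == "__main__":
    for relative_path in RUNTIME_FILES:
        print(resolve_url(relative_path))
```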

Available Downloads

File	Purpose
`manifest.json`	App downloader manifest with file names, sizes, hashes, and recommended model metadata
`checksums/SHA256SUMS.txt`	SHA256 checksums for engine package and model files
`engines/audiocpp_engine.dll`	Windows CUDA engine DLL used by the Tauri desktop app
`engines/cublas64_13.dll`	NVIDIA CUDA 13 cuBLAS runtime DLL required by the engine
`engines/cublasLt64_13.dll`	NVIDIA CUDA 13 cuBLASLt runtime DLL required by cuBLAS
`engines/MSVCP140.dll`	Microsoft C++ runtime DLL
`engines/VCOMP140.DLL`	Microsoft OpenMP runtime DLL
`engines/VCRUNTIME140.dll`	Microsoft Visual C++ runtime DLL
`engines/VCRUNTIME140_1.dll`	Microsoft Visual C++ runtime DLL
`models/higgs-q4_k_m/q4_k_m.gguf`	Smaller quantized model
`models/higgs-q5_k/q5_k.gguf`	Balanced K-quant model
`models/higgs-q6_k/q6_k.gguf`	Higher-quality K-quant model
`models/higgs-q8_0/q8_0.gguf`	Recommended default model
`models/higgs-bf16/bf16.gguf`	Highest-fidelity BF16 model

Recommended VRAM

Model	Recommended VRAM	Notes
`q4_k_m`	8 GB	Smaller model for lower VRAM systems
`q5_k`	9 GB	Balanced K-quant option
`q6_k`	10 GB	Higher-quality K-quant option
`q8_0`	12 GB	Recommended default quality/speed balance
`bf16`	16 GB	Highest-fidelity testing model
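
The table above can be turned into a simple selection rule. The following is a minimal sketch, not the app's actual logic: it picks the largest model whose recommended VRAM fits the amount available. The function name `pick_model` is hypothetical.

```python
# Recommended VRAM in GB per model, copied from the table above and
# ordered from the smallest model to the largest.
RECOMMENDED_VRAM_GB = {
    "q4_k_m": 8,
    "q5_k": 9,
    "q6_k": 10,
    "q8_0": 12,
    "bf16": 16,
}


def pick_model(available_vram_gb):
    """Return the largest model whose recommended VRAM fits, or None."""
    fitting = [
        name
        for name, needed in RECOMMENDED_VRAM_GB.items()
        if needed <= available_vram_gb
    ]
    # Dicts preserve insertion order, so the last fitting entry is the largest.
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for vram in (6, 8, 12, 24):
        print(vram, "GB ->", pick_model(vram))
```

Treat the result as an upper bound. On 16 GB or more this rule returns `bf16`, while the table names `q8_0` as the recommended default for quality and speed.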

Engine Requirements

The prebuilt engine package is intended for:

Windows x64
NVIDIA RTX 30xx, 40xx, or 50xx GPU
CUDA 13-compatible NVIDIA driver
Higgs Audio v3 Studio 0.2.31 or newer (recommended)

The engines/ folder contains the app engine DLL plus the CUDA/MSVC runtime DLLs the current Windows engine build needs.

Important:

nvcuda.dll is not included and should not be uploaded here.
nvcuda.dll comes from the NVIDIA display driver.
Users still need a working NVIDIA driver installed.
Users should not need the full CUDA Toolkit installed if they use the app's Download Engine DLLs button.

The app can use either:

DLLs already installed on the user's system, such as CUDA/MSVC runtime DLLs found through system paths.
DLLs downloaded from this repository into the app's writable engine folder.

Engine Dependency Diagnostics

Higgs Audio v3 Studio checks the Windows loader dependencies before loading the engine.

The current validator checks for:

nvcuda.dll
cublas64_13.dll
cublasLt64_13.dll
MSVCP140.dll
VCOMP140.DLL
VCRUNTIME140.dll
VCRUNTIME140_1.dll

If a runtime DLL is missing, users can press Download Engine DLLs in the app to download the files from this repository.

If nvcuda.dll is missing, users need to install or update their NVIDIA driver.
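
To check a manually populated engine folder before starting the app, something like the sketch below may help. It is an illustration only, not the app's validator: it inspects a single folder, whereas the real preflight may also find DLLs through system paths. The function name `missing_dlls` is hypothetical, and `nvcuda.dll` is left out because it is installed by the NVIDIA driver rather than shipped here.

```python
from pathlib import Path

# Files expected in the engines/ folder of this repository.
# nvcuda.dll is excluded: it comes from the NVIDIA display driver.
BUNDLED_DLLS = [
    "audiocpp_engine.dll",
    "cublas64_13.dll",
    "cublasLt64_13.dll",
    "MSVCP140.dll",
    "VCOMP140.DLL",
    "VCRUNTIME140.dll",
    "VCRUNTIME140_1.dll",
]


def missing_dlls(engine_dir):
    """Return the expected DLL names that are absent from engine_dir."""
    folder = Path(engine_dir)
    if not folder.is_dir():
        return list(BUNDLED_DLLS)
    # Compare case-insensitively, matching Windows file name semantics.
    present = {entry.name.lower() for entry in folder.iterdir()}
    return [name for name in BUNDLED_DLLS if name.lower() not in present]


if __name__ == "__main__":
    for name in missing_dlls("engines"):
        print("missing:", name)
```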

C++ Port Overview

The desktop app uses a ported native C++/CUDA inference engine instead of running the original Python pipeline directly.

At a high level:

Higgs Audio v3 model weights are loaded from GGUF files.
The native engine exposes a C ABI through audiocpp_engine.dll.
The Rust/Tauri backend dynamically loads the DLL and calls the native API.
The TypeScript frontend talks to Rust through Tauri IPC.
The same engine path is used by the UI and the local API server.
Whisper.cpp support is integrated into the engine path for reference transcript assistance.

Main runtime layers:

Tauri UI
  ↓
Rust backend / local API / queue
  ↓
C ABI engine wrapper
  ↓
Ported C++ Higgs Audio v3 runtime
  ↓
ggml / CUDA backend
  ↓
GGUF Higgs model files

Runtime Optimisations

The current app and engine include several production-focused improvements:

Live streaming playback for generated audio
API streaming through newline-delimited JSON events
Unified UI/API generation queue
Cancel checks inside the native decode loop
Runtime graph cleanup after generation, cancellation, and error paths
Per-stage VRAM diagnostics in the app Command Centre
F16 decode KV cache enabled by default in the CUDA engine
F32 decode KV fallback available for diagnostics
Unsupported mixed K/V cache modes disabled to avoid CUDA flash-attention fatal paths
Saved speaker reference cache support through .hspkcache
Speaker ZIP import/export for reusable speaker identities
Manifest/checksum-based model and engine downloads
Engine dependency preflight for clearer Windows DLL loading errors

Quantization Notes

The GGUF builds in this repository are provided so users can choose a quality/VRAM tradeoff from inside the app.

q4_k_m: smallest supported Higgs model option in this repo
q5_k: middle option between Q4 and Q6
q6_k: higher-quality K-quant option below Q8
q8_0: recommended default for most users with enough VRAM
bf16: highest-fidelity build, largest VRAM requirement

Where possible, the small model assets and config files are bundled with the installer or portable package. The large GGUF weights are downloaded separately from this Hugging Face repository.

API Support

Higgs Audio v3 Studio includes a local API server.

Supported API capabilities include:

Plain TTS
Saved-speaker voice clone
Continue speech
WAV output
MP3 output
Streaming NDJSON audio/progress events
Saved speaker discovery
API Command Centre logs in the desktop app

For API examples and current endpoint details, see the GitHub README:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio#http-api-streaming-and-command-centre
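
A client consuming the NDJSON stream only needs to decode one JSON object per line. The sketch below shows that generic pattern. The event fields in `sample` (`type`, `value`) are invented for illustration; the real endpoint paths and event schema are defined in the GitHub README, not here.

```python
import json


def iter_ndjson(lines):
    """Yield one decoded JSON object per non-empty line."""
    for raw in lines:
        line = raw.strip()
        if line:  # skip blank keep-alive or trailing lines
            yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical events; the actual field names come from the app's API.
    sample = [
        '{"type": "progress", "value": 0.5}',
        "",
        '{"type": "done"}',
    ]
    for event in iter_ndjson(sample):
        print(event)
```

`iter_ndjson` accepts any iterable of lines, so it can be fed the line iterator of a streaming HTTP response as well as a list or an open file.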

Installation

Recommended path:

Download the latest installer or portable build from GitHub releases.
Open Higgs Audio v3 Studio.
If the engine or runtime DLLs are missing, click Download Engine DLLs.
Download or select a Higgs GGUF model.
Load the engine.
Load the model.
Generate audio.

GitHub releases:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio/releases

Manual Engine Folder

If placing files manually, keep the engine package together:

engines/
├─ audiocpp_engine.dll
├─ cublas64_13.dll
├─ cublasLt64_13.dll
├─ MSVCP140.dll
├─ VCOMP140.DLL
├─ VCRUNTIME140.dll
└─ VCRUNTIME140_1.dll

Do not upload or redistribute nvcuda.dll. It belongs to the NVIDIA driver.

Checksums

Checksums are provided in:

checksums/SHA256SUMS.txt

Both the desktop app and users can use this file to verify downloaded runtime and model files.
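
For manual verification, a sketch like the one below may be enough. It assumes `SHA256SUMS.txt` uses the common `sha256sum` layout (hex digest, whitespace, relative path, with an optional `*` before the path); check the actual file before relying on it. The helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large GGUF files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Map relative path -> expected lowercase hex digest (assumed format)."""
    expected = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary mode with a leading "*" on the name.
            expected[name.lstrip("*").strip()] = digest.lower()
    return expected


def verify(root, sums_text):
    """Return the listed paths that are missing or whose digest differs."""
    root = Path(root)
    failed = []
    for name, digest in parse_sums(sums_text).items():
        target = root / name
        if not target.is_file() or sha256_of(target) != digest:
            failed.append(name)
    return failed


if __name__ == "__main__":
    sums_file = Path("checksums/SHA256SUMS.txt")
    if sums_file.is_file():
        print(verify(".", sums_file.read_text(encoding="utf-8")) or "all files match")
```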

Safety And Responsible Use

Do not use Higgs Audio v3 Studio, Higgs Audio v3, or any voice cloning workflow to impersonate people without consent, create deceptive or malicious voices, defraud people, bypass identity checks, harass others, or cause harm.

Only generate speech when you have the rights and consent required for the source voice, transcript, and intended output use.

Credits

This project depends on the upstream Higgs Audio v3 model work by Boson AI.

Upstream model:

https://huggingface.co/bosonai/higgs-audio-v3-tts-4b

Desktop app and ported runtime repository:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio

Whisper.cpp:

https://github.com/ggml-org/whisper.cpp

NVIDIA CUDA runtime components are provided under NVIDIA's CUDA Toolkit license terms.

Microsoft Visual C++ runtime components are provided under Microsoft's Visual Studio / Visual C++ Redistributable license terms.

Citation

@misc{bosonai_higgs_audio_tts_v3_2026,
  title  = {Higgs TTS 3: Conversational Speech for Voice AI from Boson AI},
  author = {Boson AI},
  year   = {2026},
  howpublished = {https://huggingface.co/bosonai/higgs-tts-3-4b},
}

License

The Higgs Audio v3 / Higgs TTS 3 model weights and upstream model assets are governed by the Boson Higgs TTS 3 Research and Non-Commercial License.

The desktop app and native port code are maintained separately on GitHub:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio

Check all upstream licenses before redistributing weights, using generated audio commercially, or packaging this project into another product.
