Higgs Audio v3 Studio Runtime Files

This repository hosts the downloadable runtime files for Higgs Audio v3 Studio, a Windows desktop app for local Higgs Audio v3 TTS, voice cloning, speech continuation, and multi-speaker generation.

GitHub app repository: https://github.com/Saganaki22/Higgs-Audio-v3-Studio

This repository is not the original upstream model release. It provides GGUF model builds, the Windows CUDA engine DLL package, checksums, and a manifest used by the desktop app downloader.

What This Is

Higgs Audio v3 Studio is a Rust/Tauri desktop application that runs a ported native C++/CUDA implementation of Higgs Audio v3 locally.

The app provides:

  • Local TTS generation
  • Voice clone workflow
  • Continue speech workflow
  • Multi-speaker workflow
  • Speaker Gallery for reusable speaker identities
  • WAV/MP3 export
  • Local API server
  • API streaming endpoint
  • Whisper-assisted reference transcript workflow
  • Model/engine download UI
  • Hardware telemetry and VRAM diagnostics
  • Engine dependency diagnostics for missing CUDA/MSVC runtime DLLs

App Download

Use the desktop app from GitHub releases:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio/releases

Source code:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio

Runtime Repository Layout

The app expects this Hugging Face repository layout:

/
├─ manifest.json
├─ checksums/
│  └─ SHA256SUMS.txt
├─ engines/
│  ├─ audiocpp_engine.dll
│  ├─ cublas64_13.dll
│  ├─ cublasLt64_13.dll
│  ├─ MSVCP140.dll
│  ├─ VCOMP140.DLL
│  ├─ VCRUNTIME140.dll
│  └─ VCRUNTIME140_1.dll
└─ models/
   ├─ higgs-q4_k_m/
   │  └─ q4_k_m.gguf
   ├─ higgs-q5_k/
   │  └─ q5_k.gguf
   ├─ higgs-q6_k/
   │  └─ q6_k.gguf
   ├─ higgs-q8_0/
   │  └─ q8_0.gguf
   └─ higgs-bf16/
      └─ bf16.gguf
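
The layout above is regular enough that file paths can be derived from the quantization name. The following is a minimal sketch, not the app's downloader: it assumes files are fetched through Hugging Face's standard `resolve` URL pattern, and `repo_id` is a placeholder for this repository's identifier, which is not stated here.

```python
from urllib.parse import quote

# Quantization names taken from the models/ folder in the layout above.
QUANTS = ("q4_k_m", "q5_k", "q6_k", "q8_0", "bf16")


def model_path(quant: str) -> str:
    """Return the repository-relative path of a GGUF build."""
    if quant not in QUANTS:
        raise ValueError(f"unknown quantization: {quant!r}")
    return f"models/higgs-{quant}/{quant}.gguf"


def resolve_url(repo_id: str, path: str, revision: str = "main") -> str:
    """Build a Hugging Face 'resolve' download URL for one file.

    repo_id is a placeholder such as "namespace/repo-name" (assumption).
    """
    # quote() leaves "/" untouched, so nested paths stay intact.
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{quote(path)}"
```

For example, `resolve_url("namespace/repo-name", model_path("q8_0"))` yields a URL ending in `models/higgs-q8_0/q8_0.gguf`.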

Available Downloads

| File | Purpose |
| --- | --- |
| manifest.json | App downloader manifest with file names, sizes, hashes, and recommended model metadata |
| checksums/SHA256SUMS.txt | SHA256 checksums for engine package and model files |
| engines/audiocpp_engine.dll | Windows CUDA engine DLL used by the Tauri desktop app |
| engines/cublas64_13.dll | NVIDIA CUDA 13 cuBLAS runtime DLL required by the engine |
| engines/cublasLt64_13.dll | NVIDIA CUDA 13 cuBLASLt runtime DLL required by cuBLAS |
| engines/MSVCP140.dll | Microsoft C++ runtime DLL |
| engines/VCOMP140.DLL | Microsoft OpenMP runtime DLL |
| engines/VCRUNTIME140.dll | Microsoft Visual C++ runtime DLL |
| engines/VCRUNTIME140_1.dll | Microsoft Visual C++ runtime DLL |
| models/higgs-q4_k_m/q4_k_m.gguf | Smaller quantized model |
| models/higgs-q5_k/q5_k.gguf | Balanced K-quant model |
| models/higgs-q6_k/q6_k.gguf | Higher-quality K-quant model |
| models/higgs-q8_0/q8_0.gguf | Recommended default model |
| models/higgs-bf16/bf16.gguf | Highest-fidelity BF16 model |

Recommended VRAM

| Model | Recommended VRAM | Notes |
| --- | --- | --- |
| q4_k_m | 8 GB | Smaller model for lower VRAM systems |
| q5_k | 9 GB | Balanced K-quant option |
| q6_k | 10 GB | Higher-quality K-quant option |
| q8_0 | 12 GB | Recommended default quality/speed balance |
| bf16 | 16 GB | Highest-fidelity testing model |
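
As an illustration of how the table above might be applied, the sketch below picks a model for a given amount of VRAM. The function name and the `ceiling` parameter are hypothetical and are not part of the app. By default the choice is capped at q8_0, since that is the recommended default and bf16 is described as a testing model.

```python
from typing import Optional

# (model, recommended VRAM in GB) from the table above, smallest first.
RECOMMENDED_VRAM_GB = [
    ("q4_k_m", 8),
    ("q5_k", 9),
    ("q6_k", 10),
    ("q8_0", 12),
    ("bf16", 16),
]


def pick_model(vram_gb: float, ceiling: str = "q8_0") -> Optional[str]:
    """Return the largest model whose recommended VRAM fits, up to `ceiling`.

    Returns None when even the smallest build exceeds the available VRAM.
    """
    names = [name for name, _ in RECOMMENDED_VRAM_GB]
    limit = names.index(ceiling)  # raises ValueError for an unknown name
    choice = None
    for name, needed in RECOMMENDED_VRAM_GB[: limit + 1]:
        if vram_gb >= needed:
            choice = name
    return choice
```

Passing `ceiling="bf16"` lets a 16 GB or larger card select the BF16 build. These figures are recommendations, so actual headroom may differ with prompt length and other GPU workloads.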

Engine Requirements

The prebuilt engine package is intended for:

  • Windows x64
  • NVIDIA RTX 30xx, 40xx, or 50xx GPU
  • CUDA 13-compatible NVIDIA driver
  • Higgs Audio v3 Studio 0.2.31 or newer recommended

The engines/ folder contains the app engine DLL plus the CUDA/MSVC runtime DLLs the current Windows engine build needs.

Important:

  • nvcuda.dll is not included and should not be uploaded here.
  • nvcuda.dll comes from the NVIDIA display driver.
  • Users still need a working NVIDIA driver installed.
  • Users should not need the full CUDA Toolkit installed if they use the app's Download Engine DLLs button.

The app can use either:

  1. DLLs already installed on the user's system, such as CUDA/MSVC runtime DLLs found through system paths.
  2. DLLs downloaded from this repository into the app's writable engine folder.

Engine Dependency Diagnostics

Higgs Audio v3 Studio checks the Windows loader dependencies before loading the engine.

The current validator checks for:

  • nvcuda.dll
  • cublas64_13.dll
  • cublasLt64_13.dll
  • MSVCP140.dll
  • VCOMP140.DLL
  • VCRUNTIME140.dll
  • VCRUNTIME140_1.dll

If a runtime DLL is missing, users can press Download Engine DLLs in the app to download the files from this repository.

If nvcuda.dll is missing, users need to install or update their NVIDIA driver.
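
For a manually managed engine folder, a similar check can be approximated outside the app. This is a rough sketch and not the app's validator: it only looks inside one folder, whereas the app can also use DLLs found through system paths, and it leaves out nvcuda.dll because that file ships with the NVIDIA driver. The function name is hypothetical.

```python
from pathlib import Path

# Files this repository ships in engines/ (nvcuda.dll is deliberately absent).
BUNDLED_DLLS = (
    "audiocpp_engine.dll",
    "cublas64_13.dll",
    "cublasLt64_13.dll",
    "MSVCP140.dll",
    "VCOMP140.DLL",
    "VCRUNTIME140.dll",
    "VCRUNTIME140_1.dll",
)


def missing_bundled_dlls(engine_dir):
    """Return the bundled DLL names that are not present in engine_dir.

    Names are compared case-insensitively, matching Windows file lookup.
    """
    folder = Path(engine_dir)
    if folder.is_dir():
        present = {p.name.lower() for p in folder.iterdir() if p.is_file()}
    else:
        present = set()
    return [name for name in BUNDLED_DLLS if name.lower() not in present]
```

An empty list means the folder holds the complete package. It does not prove that the engine will load, since the driver-provided nvcuda.dll is still required.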

C++ Port Overview

The desktop app uses a ported native C++/CUDA inference engine instead of running the original Python pipeline directly.

At a high level:

  • Higgs Audio v3 model weights are loaded from GGUF files.
  • The native engine exposes a C ABI through audiocpp_engine.dll.
  • The Rust/Tauri backend dynamically loads the DLL and calls the native API.
  • The TypeScript frontend talks to Rust through Tauri IPC.
  • The same engine path is used by the UI and the local API server.
  • Whisper.cpp support is integrated into the engine path for reference transcript assistance.

Main runtime layers:

Tauri UI
  ↓
Rust backend / local API / queue
  ↓
C ABI engine wrapper
  ↓
Ported C++ Higgs Audio v3 runtime
  ↓
ggml / CUDA backend
  ↓
GGUF Higgs model files

Runtime Optimisations

The current app and engine include several production-focused improvements:

  • Live streaming playback for generated audio
  • API streaming through newline-delimited JSON events
  • Unified UI/API generation queue
  • Cancel checks inside the native decode loop
  • Runtime graph cleanup after generation, cancellation, and error paths
  • Per-stage VRAM diagnostics in the app Command Centre
  • F16 decode KV cache enabled by default in the CUDA engine
  • F32 decode KV fallback available for diagnostics
  • Unsupported mixed K/V cache modes disabled to avoid CUDA flash-attention fatal paths
  • Saved speaker reference cache support through .hspkcache
  • Speaker ZIP import/export for reusable speaker identities
  • Manifest/checksum-based model and engine downloads
  • Engine dependency preflight for clearer Windows DLL loading errors

Quantization Notes

The GGUF builds in this repository are provided so users can choose a quality/VRAM tradeoff from inside the app.

  • q4_k_m: smallest supported Higgs model option in this repo
  • q5_k: middle option between Q4 and Q6
  • q6_k: higher-quality K-quant option below Q8
  • q8_0: recommended default for most users with enough VRAM
  • bf16: highest-fidelity build, largest VRAM requirement

The app keeps the small model assets/config files bundled with the installer/portable package where possible, while the large GGUF weights are downloaded separately from this Hugging Face repository.

API Support

Higgs Audio v3 Studio includes a local API server.

Supported API capabilities include:

  • Plain TTS
  • Saved-speaker voice clone
  • Continue speech
  • WAV output
  • MP3 output
  • Streaming NDJSON audio/progress events
  • Saved speaker discovery
  • API Command Centre logs in the desktop app

For API examples and current endpoint details, see the GitHub README:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio#http-api-streaming-and-command-centre
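
Because the streaming endpoint emits newline-delimited JSON, a client can decode it one line at a time. The sketch below shows only that generic decoding step. It does not reproduce the app's endpoint paths or event schema, which are documented in the GitHub README, and any field names used with it are placeholders.

```python
import json


def iter_ndjson(lines):
    """Yield one decoded JSON value per non-empty line.

    `lines` may be any iterable of str or bytes, for example a file object
    or the line iterator of a streaming HTTP response.
    """
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        line = raw.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        yield json.loads(line)
```

A caller would typically loop over `iter_ndjson(response_lines)` and dispatch on whatever event-type field the real API defines.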

Installation

Recommended path:

  1. Download the latest installer or portable build from GitHub releases.
  2. Open Higgs Audio v3 Studio.
  3. If the engine or runtime DLLs are missing, click Download Engine DLLs.
  4. Download or select a Higgs GGUF model.
  5. Load the engine.
  6. Load the model.
  7. Generate audio.

GitHub releases:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio/releases

Manual Engine Folder

If placing files manually, keep the engine package together:

engines/
├─ audiocpp_engine.dll
├─ cublas64_13.dll
├─ cublasLt64_13.dll
├─ MSVCP140.dll
├─ VCOMP140.DLL
├─ VCRUNTIME140.dll
└─ VCRUNTIME140_1.dll

Do not upload or redistribute nvcuda.dll. It belongs to the NVIDIA driver.

Checksums

Checksums are provided in:

checksums/SHA256SUMS.txt

The desktop app and users can use this file to verify downloaded runtime/model files.
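
Manual verification could look like the sketch below. It assumes SHA256SUMS.txt uses the common `sha256sum` layout of one `<hex digest>  <relative path>` pair per line, which is not confirmed here. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte GGUF files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Parse 'HEX  path' lines into {path: hex} (assumed file format)."""
    sums = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, _, name = line.partition(" ")
        # sha256sum marks binary mode with a leading '*' on the path.
        sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def verify(root, sums):
    """Return {path: matches} for every listed file that exists under root."""
    root = Path(root)
    return {
        name: sha256_of(root / name) == expected
        for name, expected in sums.items()
        if (root / name).is_file()
    }
```

Files that are listed but not yet downloaded are skipped rather than reported as failures, so the result covers only what is present locally.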

Safety And Responsible Use

Do not use Higgs Audio v3 Studio, Higgs Audio v3, or any voice cloning workflow to impersonate people without consent, create deceptive or malicious voices, defraud people, bypass identity checks, harass others, or cause harm.

Only generate speech when you have the rights and consent required for the source voice, transcript, and intended output use.

Credits

This project depends on the upstream Higgs Audio v3 model work by Boson AI.

Upstream model:

https://huggingface.co/bosonai/higgs-audio-v3-tts-4b

Desktop app and ported runtime repository:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio

Whisper.cpp:

https://github.com/ggml-org/whisper.cpp

NVIDIA CUDA runtime components are provided under NVIDIA's CUDA Toolkit license terms.

Microsoft Visual C++ runtime components are provided under Microsoft's Visual Studio / Visual C++ Redistributable license terms.

Citation

@misc{bosonai_higgs_audio_tts_v3_2026,
  title  = {Higgs TTS 3: Conversational Speech for Voice AI from Boson AI},
  author = {Boson AI},
  year   = {2026},
  howpublished = {https://huggingface.co/bosonai/higgs-tts-3-4b},
}

License

The Higgs Audio v3 / Higgs TTS 3 model weights and upstream model assets are governed by the Boson Higgs TTS 3 Research and Non-Commercial License.

The desktop app and native port code are maintained separately on GitHub:

https://github.com/Saganaki22/Higgs-Audio-v3-Studio

Check all upstream licenses before redistributing weights, using generated audio commercially, or packaging this project into another product.
