voxtral.cpp

A ggml-based C++ implementation of Voxtral Realtime 4B.

Quickstart

1. Download the model

Download the pre-converted GGUF model from Hugging Face:

# Default: Q4_0 quantization
./tools/download_model.sh Q4_0

2. Build

Build the project using CMake:

cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j

3. Audio Preparation

The model expects mono 16-bit PCM WAV input sampled at 16 kHz. Use ffmpeg to convert audio in other formats:

ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
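Before running inference, it can help to confirm that a converted file really has the expected layout. The sketch below is not part of this repository; it uses only Python's standard `wave` module, and the helper name `check_wav` is hypothetical.

```python
import wave


def check_wav(path):
    """Return a list of format problems; an empty list means the file
    is mono, 16-bit PCM, 16 kHz as described above."""
    problems = []
    # wave.open raises wave.Error for files it cannot parse as PCM WAV.
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1:
            problems.append(f"expected 1 channel, got {wf.getnchannels()}")
        if wf.getsampwidth() != 2:  # sample width is reported in bytes
            problems.append(f"expected 16-bit samples, got {8 * wf.getsampwidth()}-bit")
        if wf.getframerate() != 16000:
            problems.append(f"expected 16000 Hz, got {wf.getframerate()} Hz")
    return problems
```

Calling `check_wav("output.wav")` on a file produced by the ffmpeg command above should return an empty list; any entries describe what to fix before passing the file to the model.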

4. Run Inference

Pass the model file, the prepared WAV file, and a thread count to the voxtral binary:

./build/voxtral \
  --model models/voxtral/Q4_0.gguf \
  --audio path/to/input.wav \
  --threads 8
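To transcribe several files in one go, the same command can be driven from a short script. This is a minimal sketch, not something shipped with the project: the helper names `build_command` and `transcribe_all` are hypothetical, the default paths simply mirror the command above, and the script only captures whatever the binary prints to standard output without assuming a particular output format.

```python
import subprocess
from pathlib import Path


def build_command(audio, model="models/voxtral/Q4_0.gguf",
                  binary="./build/voxtral", threads=8):
    """Assemble the argument list using the flags shown above."""
    return [str(binary), "--model", str(model),
            "--audio", str(audio), "--threads", str(threads)]


def transcribe_all(wav_dir, **kwargs):
    """Run the binary once per *.wav file in wav_dir.

    Returns a dict mapping file name to captured standard output.
    Raises subprocess.CalledProcessError if a run exits with an error.
    """
    results = {}
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        proc = subprocess.run(build_command(wav, **kwargs),
                              capture_output=True, text=True, check=True)
        results[wav.name] = proc.stdout
    return results
```

For example, `transcribe_all("samples", threads=4)` would process every WAV file in the samples directory with four threads.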

Advanced Usage

Manual Quantization

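You can quantize an existing GGUF file using the native quantizer. The positional arguments are the input file, the output file, and the target quantization type; the trailing 8 is presumably a thread count: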

./build/voxtral-quantize \
  models/voxtral/voxtral.gguf \
  models/voxtral/voxtral-q6_k.gguf \
  Q6_K \
  8

Testing

The test suite runs over the WAV files matching samples/*.wav.

Numeric Parity Check

To verify numeric parity against the reference implementation:

python3 tests/test_voxtral_reference.py

Custom Tolerances

The test reads the following environment variables; the first two override the comparison tolerances:

VOXTRAL_TEST_ATOL (default: 1e-2)
VOXTRAL_TEST_RTOL (default: 1e-2)
VOXTRAL_TEST_THREADS
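This README does not say how the absolute and relative tolerances are combined. A common convention, used for instance by `numpy.isclose`, accepts a value when |actual - expected| <= atol + rtol * |expected|. The sketch below assumes that convention and may differ from what tests/test_voxtral_reference.py actually does; the helper names are hypothetical.

```python
import os


def read_tolerances(env=None):
    """Read the tolerance variables, falling back to the documented 1e-2 defaults."""
    env = os.environ if env is None else env
    atol = float(env.get("VOXTRAL_TEST_ATOL", "1e-2"))
    rtol = float(env.get("VOXTRAL_TEST_RTOL", "1e-2"))
    return atol, rtol


def within_tolerance(actual, expected, atol, rtol):
    """Element-wise check, assuming the numpy.isclose-style rule:
    |actual - expected| <= atol + rtol * |expected|."""
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol + rtol * abs(e)
               for a, e in zip(actual, expected))
```

Under this rule, a tighter run could be requested by prefixing the test command with an assignment such as VOXTRAL_TEST_ATOL=1e-3, which leaves the relative tolerance at its default.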
