Kokoro TTS is an open-weight text-to-speech model with 82 million parameters. It is small enough to run locally on many Macs, but there is more than one way to use it.
The best setup depends on your Mac and what you are building. The official package is the easiest place to start. Other runtimes are useful for portable inference, Apple Silicon optimization, local web apps, and native Mac apps.
This guide compares the practical options and shows the shortest path for each one. For model architecture, voices, and language details, read the Kokoro technical guide. For a complete voiceover workflow, see the Kokoro YouTube voiceover guide.
Ways to Run Kokoro TTS Locally on Mac
| Method | Best for | Intel Mac | Apple Silicon Mac |
|---|---|---|---|
| Official Python package with PyTorch | Learning Kokoro and writing Python scripts | Yes | Yes |
| ONNX Runtime | Portable CPU inference and quantized model choices | Yes | Yes |
| MLX with mlx-audio or kokoro-mlx | Apple-focused inference and local APIs | No | Yes |
| JavaScript with kokoro-js | Browser, Node.js, and Electron apps | Yes | Yes |
| Swift packages | Native macOS app development | Depends on runtime | Yes |
| Ready-made desktop apps | Using Kokoro without writing code | Check the app | Check the app |
If you only want one recommendation, start with the official Python package. Use ONNX when portability matters. Use MLX when you have an M-series Mac and want an Apple Silicon-focused runtime.
System Requirements
Kokoro can run locally on Intel Macs and Apple Silicon Macs. That includes MacBook Air and MacBook Pro laptops, plus Mac mini, iMac, Mac Studio, and Mac Pro desktops. Apple Silicon models with M1, M2, M3, M4, or M5-series chips can also use MLX-based options. Intel Mac users should choose Python with PyTorch or ONNX Runtime instead.
For the Python examples, you need:
- macOS
- Python 3.10, 3.11, or 3.12 for the official package
- Homebrew
- Several GB of free disk space for dependencies, caches, and model files
The model is lightweight, but generation speed depends on your chip, memory, runtime, text length, and output settings.
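The requirements above can be checked with a short preflight script before installing anything. This is a minimal sketch using only the standard library; the helper names `suggest_runtimes` and `python_is_supported` are made up for illustration and are not part of any Kokoro package.

```python
import platform
import sys

# Python minor versions this guide lists for the official package.
SUPPORTED_PYTHON = {(3, 10), (3, 11), (3, 12)}


def suggest_runtimes(system: str, machine: str) -> list[str]:
    """Return the runtimes from this guide that fit the given platform strings."""
    if system != "Darwin":
        return []  # this guide only covers macOS
    runtimes = ["official Python package (PyTorch)", "ONNX Runtime"]
    if machine == "arm64":
        # Apple Silicon reports "arm64"; the MLX options apply only here.
        runtimes.append("MLX (mlx-audio or kokoro-mlx)")
    return runtimes


def python_is_supported(version: tuple[int, int]) -> bool:
    """Check a (major, minor) pair against the versions listed above."""
    return version in SUPPORTED_PYTHON


print("Python OK:", python_is_supported(sys.version_info[:2]))
print("Runtimes:", suggest_runtimes(platform.system(), platform.machine()))
```

Note that a Python build running under Rosetta reports `x86_64` even on an Apple Silicon Mac, so run the check with a native arm64 interpreter if you plan to use MLX.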
Option 1: Official Python Package With PyTorch
The official kokoro package is the clearest way to understand the model. It uses PyTorch and downloads the model files when needed.
The official pipeline uses eSpeak NG for English fallback cases and some non-English languages, so the setup below also installs it with Homebrew.
Create a project and virtual environment:
mkdir kokoro-mac
cd kokoro-mac
python3 -m venv .venv
source .venv/bin/activate
pip install "kokoro>=0.9.4" soundfile
brew install espeak-ng
Create run_kokoro.py:
from kokoro import KPipeline
import soundfile as sf
# "a" selects the American English pipeline.
pipeline = KPipeline(lang_code="a")
generator = pipeline(
    "Kokoro is running locally on this Mac.",
    voice="af_heart",
)
# Each result carries the audio for one text segment; save every segment at 24 kHz.
for index, (_, _, audio) in enumerate(generator):
    sf.write(f"kokoro-{index}.wav", audio, 24000)
Run it:
python run_kokoro.py
The script writes one or more WAV files in the current folder. The lang_code value "a" selects American English. The official repository also documents British English, Spanish, French, Hindi, Italian, Japanese, Brazilian Portuguese, and Mandarin Chinese pipelines.
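To switch languages, you pass a different single-letter code to `KPipeline`. The lookup below is a sketch, not an official API: the letters reflect the official repository's README as I understand it at the time of writing, and `lang_code_for` is a hypothetical helper. Confirm the codes against the current README before relying on them.

```python
# Single-letter codes as listed in the official README at the time of writing.
# Treat this table as an assumption and re-check it against the repository.
LANG_CODES = {
    "American English": "a",
    "British English": "b",
    "Spanish": "e",
    "French": "f",
    "Hindi": "h",
    "Italian": "i",
    "Japanese": "j",
    "Brazilian Portuguese": "p",
    "Mandarin Chinese": "z",
}


def lang_code_for(language: str) -> str:
    """Look up the pipeline code for a language name used in this guide."""
    try:
        return LANG_CODES[language]
    except KeyError:
        raise ValueError(f"no known Kokoro pipeline for {language!r}") from None
```

In the script above, this would be used as `KPipeline(lang_code=lang_code_for("British English"))`, paired with a voice that belongs to that language.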
Apple Silicon MPS Fallback
On Apple Silicon, the official repository recommends enabling the PyTorch MPS fallback if your script hits a CPU-versus-MPS tensor error:
PYTORCH_ENABLE_MPS_FALLBACK=1 python run_kokoro.py
This lets unsupported operations fall back to the CPU while the rest of the pipeline uses Apple’s Metal backend.
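If you would rather not type the variable on every run, the same flag can be set at the top of the script. This sketch assumes PyTorch reads the variable when it is first imported, so the assignment has to come before any import of torch or kokoro. The helper name `enable_mps_fallback` is illustrative only.

```python
import os


def enable_mps_fallback() -> None:
    """Set the fallback flag for this process. Call before importing torch or kokoro."""
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"


enable_mps_fallback()

# Import torch-based packages only after the flag is set, for example:
# from kokoro import KPipeline
```

Setting the flag inside the script keeps the behavior the same whether the script is launched from a terminal, an editor, or another program.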
Option 2: ONNX Runtime
ONNX Runtime is useful when you want a portable inference format, CPU-friendly deployment, or quantized model choices. This is a community-supported route rather than the official Hexgrad package.
One practical wrapper is kokoro-onnx. Create a separate environment:
python3 -m venv .venv-onnx
source .venv-onnx/bin/activate
pip install kokoro-onnx soundfile
Download a compatible ONNX model file and voice file by following the wrapper’s current README. Then create run_kokoro_onnx.py:
from kokoro_onnx import Kokoro
import soundfile as sf
# Paths to the model and voice files downloaded in the previous step.
kokoro = Kokoro("kokoro-v1.0.onnx", "voices-v1.0.bin")
samples, sample_rate = kokoro.create(
    "Kokoro ONNX is running locally on this Mac.",
    voice="af_sarah",
    speed=1.0,
    lang="en-us",
)
sf.write("kokoro-onnx.wav", samples, sample_rate)
Run it:
python run_kokoro_onnx.py
The Kokoro-82M-v1.0-ONNX model card lists current ONNX variants, including quantized files. Smaller quantized models can reduce memory use, with a possible quality tradeoff.
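To get a feel for why quantization reduces memory use, you can estimate weight storage from the parameter count alone. The arithmetic below is a back-of-the-envelope sketch: it assumes every weight is stored at the same precision and ignores file overhead, so real ONNX file sizes on the model card will differ.

```python
PARAMETERS = 82_000_000  # parameter count stated in this guide


def approx_weight_megabytes(parameters: int, bits_per_weight: int) -> float:
    """Rough weight storage in MB (10**6 bytes), ignoring file overhead."""
    return parameters * bits_per_weight / 8 / 1_000_000


for label, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: about {approx_weight_megabytes(PARAMETERS, bits):.0f} MB")
```

Under these assumptions, halving the bits per weight halves the storage, which is the tradeoff the quantized variants offer against possible quality loss.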
Option 3: MLX on Apple Silicon
MLX is Apple’s machine learning framework for Apple Silicon. Choose this path for an M1, M2, M3, M4, or M5-series MacBook, Mac mini, iMac, Mac Studio, or Mac Pro. It is not an Intel Mac runtime.
The most complete MLX option is mlx-audio. Kokoro support in mlx-audio also uses Misaki for text processing. The toolkit supports quantized Kokoro models, a Python API, a CLI, and an OpenAI-compatible local API server.
Install it:
python3 -m venv .venv-mlx
source .venv-mlx/bin/activate
pip install mlx-audio misaki
Generate audio from the command line:
mlx_audio.tts.generate \
  --model mlx-community/Kokoro-82M-bf16 \
  --text "Kokoro MLX is running locally on Apple Silicon." \
  --voice af_heart
For lower memory use, replace the model with mlx-community/Kokoro-82M-8bit or mlx-community/Kokoro-82M-4bit.
mlx-audio also provides a Python API when you need to integrate generation into an application.
Run an MLX Local API
mlx-audio can also expose an OpenAI-compatible local endpoint:
mlx_audio.server --host 127.0.0.1 --port 8000
In another terminal:
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"mlx-community/Kokoro-82M-bf16","input":"Hello from Kokoro on this Mac.","voice":"af_heart"}' \
  --output speech.wav
For a smaller dedicated wrapper, also review kokoro-mlx. Use mlx-audio when you want a broader toolkit or a local API server.
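The curl request above can also be sent from Python with only the standard library. This sketch mirrors the same endpoint path, JSON fields, and port. The helper names `build_speech_request` and `save_speech` are hypothetical, and the response body is saved exactly as the server returns it, without checking the audio format.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str = "af_heart",
    model: str = "mlx-community/Kokoro-82M-bf16",
    base_url: str = "http://127.0.0.1:8000",  # matches the --host and --port used above
) -> urllib.request.Request:
    """Build the same POST request that the curl example sends."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text: str, out_path: str = "speech.wav") -> None:
    """Send the request to the running local server and save the response body."""
    with urllib.request.urlopen(build_speech_request(text)) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio_bytes)
```

With the server running, `save_speech("Hello from Kokoro on this Mac.")` should produce the same file as the curl command. Building the request separately from sending it makes the payload easy to inspect without a live server.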
Option 4: JavaScript and Browser Apps
kokoro-js is the JavaScript route for local browser, Node.js, and Electron projects. It uses ONNX models and can run with browser-oriented runtimes such as WebGPU or WASM, depending on the environment.
Install the package:
npm install kokoro-js
Because browser runtime support and package APIs can evolve, use the current kokoro-js package examples when wiring it into a project. You can also test the model in the Kokoro Web demo before building your own local interface.
Choose JavaScript when the speech generator should live inside a web app, desktop web wrapper, or front-end prototype.
Option 5: Native Swift Integrations
Native macOS developers can integrate Kokoro through community Swift ports. These options are useful for packaged Mac apps, but they require more application code than the Python examples.
| Package | Runtime | Notes |
|---|---|---|
| kokoro-ios | MLX Swift | Supports macOS 15 or newer and requires you to bundle model and voice files |
| SwiftKokoroONNX | ONNX | Useful when an ONNX-based Swift package fits your app |
| swift-coreml-kokoro | Core ML | Community Core ML option for Apple-platform deployment |
These are community projects, so check each repository’s current requirements and releases before choosing one for production work.
Option 6: Ready-Made Desktop Apps
If you want to generate speech without writing code, community desktop apps can package Kokoro behind a graphical interface. For example, Freekoko is a Mac app with a local interface and HTTP API.
Desktop apps are convenient for testing voices and generating occasional audio. For repeatable automation or product integration, use Python, ONNX, MLX, JavaScript, or Swift directly.
Which Kokoro Setup Should You Choose?
Use the official Python package if you are learning Kokoro, writing a script, or following the YouTube voiceover workflow.
Use ONNX Runtime if you want portable inference, CPU-friendly deployment, or quantized model files on either Intel or Apple Silicon Macs.
Use mlx-audio if you have an Apple Silicon Mac and want an MLX-native Python API, lower-memory quantized options, or a local OpenAI-compatible endpoint.
Use kokoro-js for browser, Electron, and JavaScript application workflows.
Use a Swift port when Kokoro needs to ship inside a native macOS app.
Troubleshooting
espeak-ng Is Missing
Install it with Homebrew:
brew install espeak-ng
The official pipeline uses espeak-ng for English fallback cases and some non-English languages.
PyTorch Reports a CPU and MPS Tensor Error
Use the fallback environment variable documented by the official repository:
PYTORCH_ENABLE_MPS_FALLBACK=1 python run_kokoro.py
MLX Does Not Work on an Intel Mac
MLX is designed for Apple Silicon. Use the official Python package or ONNX Runtime on Intel Macs.
The First Run Takes Longer
Most setups download model files the first time you use them. Later runs can reuse the cached files.
Long Text Produces Multiple Files
Kokoro pipelines commonly split longer text into segments. Keep the numbered WAV files, join them after generation, or split your input into paragraphs before synthesis.
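One way to join the numbered files is with Python's built-in `wave` module. This sketch assumes the segments are named `kokoro-0.wav`, `kokoro-1.wav`, and so on, as in the official-package example, and that they are plain PCM WAV files that `wave` can read. The helper names and the output name `kokoro-joined.wav` are illustrative.

```python
import re
import wave
from pathlib import Path

# Matches the file names written by run_kokoro.py: kokoro-0.wav, kokoro-1.wav, ...
SEGMENT_NAME = re.compile(r"kokoro-(\d+)\.wav")


def find_segments(folder: str) -> list[Path]:
    """Return segment files sorted by their numeric index, not alphabetically."""
    found = []
    for path in Path(folder).iterdir():
        match = SEGMENT_NAME.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def join_wavs(folder: str, out_name: str = "kokoro-joined.wav") -> int:
    """Concatenate the segments into one WAV file and return the frame count."""
    segments = find_segments(folder)
    if not segments:
        raise FileNotFoundError(f"no kokoro-N.wav files in {folder}")
    total_frames = 0
    first_format = None
    with wave.open(str(Path(folder) / out_name), "wb") as out:
        for segment in segments:
            with wave.open(str(segment), "rb") as src:
                fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if first_format is None:
                    # Copy the audio format from the first segment.
                    first_format = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != first_format:
                    raise ValueError(f"{segment.name} has a different audio format")
                out.writeframes(src.readframes(src.getnframes()))
                total_frames += src.getnframes()
    return total_frames
```

Calling `join_wavs(".")` in the project folder writes `kokoro-joined.wav` next to the segments. Sorting by the numeric index matters once there are ten or more files, because an alphabetical sort would place `kokoro-10.wav` before `kokoro-2.wav`.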
Run Kokoro Privately on Your Mac
Kokoro is small enough to make local TTS practical across several Mac workflows. Start with Python, then move to ONNX, MLX, JavaScript, or Swift when your deployment needs become clearer.
If you want a hosted workflow without maintaining local runtimes, try Spokio.
