NeuTTS Nano and Air: Neuphonic's On-Device TTS That Runs on a Raspberry Pi

Most on-device TTS involves a tradeoff: smaller footprint means lower quality. Neuphonic’s NeuTTS models challenge that assumption.

NeuTTS Nano and NeuTTS Air are designed for on-device inference. They target useful speech quality while running on constrained hardware such as Raspberry Pi-class devices, phones, and lower-memory laptops.

Here is what makes them worth knowing about.

The two models

	NeuTTS Nano	NeuTTS Air
Parameters	~120M	~552M
CPU speed	Check current benchmarks	Check current benchmarks
Real-time factor	Hardware-dependent	Hardware-dependent
Voice cloning	Zero-shot workflows	Zero-shot workflows
Languages	English, Spanish, German, French, Urdu, Japanese, Korean, Chinese, Portuguese	English
Format	GGUF / GGML	GGUF / GGML
GPU needed	No	No (GPU recommended for production)

NeuTTS Nano is the compact option. At roughly 120M parameters, it is designed for fast CPU-oriented generation, though real speed depends on hardware, quantization, runtime, and text length.

NeuTTS Air trades some speed for quality. It is the model to evaluate when Nano’s quality is not enough but GPU budget is limited.
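Because the table above lists real-time factor as hardware-dependent, it is worth measuring on the target device. The sketch below shows one common way to compute it: wall-clock synthesis time divided by the duration of the generated audio, so values below 1.0 mean faster than real time. The `synthesize` callable, the `fake_synthesize` stand-in, and the 24 kHz sample rate are illustrative assumptions, not part of any NeuTTS API; swap in whatever call your runtime actually exposes.

```python
import time
from typing import Callable, Sequence


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate: int,
) -> float:
    """Return synthesis wall-clock time divided by generated audio duration."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start

    audio_seconds = len(samples) / sample_rate
    if audio_seconds == 0:
        raise ValueError("synthesize returned no audio")
    return elapsed / audio_seconds


def fake_synthesize(text: str) -> list:
    # Hypothetical stand-in for a real model call:
    # returns 1200 silent samples per input character.
    return [0.0] * (len(text) * 1200)


if __name__ == "__main__":
    # 24000 Hz is an assumed sample rate for this illustration only.
    rtf = real_time_factor(fake_synthesize, "Hello from the edge.", 24000)
    print(f"real-time factor: {rtf:.4f}")
```

Run the measurement several times with texts of different lengths, since a single short utterance can be dominated by warm-up cost.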

Why model size matters for on-device TTS

Many larger TTS models — including Qwen3-TTS, Chatterbox-family models, and Orpheus — are commonly evaluated with GPU-oriented workflows. That can make them harder to deploy for:

Mobile apps that cannot assume a GPU
Edge devices like Raspberry Pi or IoT hardware
Low-power environments where a GPU draws too much power
Applications where the TTS model shares resources with other compute

Neuphonic’s approach is different. The models are distributed in GGUF/GGML-style formats and follow a CPU-oriented deployment path, so they can target any device with enough RAM to load them.
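To get a rough sense of whether a device has enough RAM, a back-of-the-envelope estimate multiplies parameter count by bits per weight. The sketch below uses the approximate parameter counts from the table above; the bit widths are assumptions chosen for illustration, and real GGUF files also carry metadata and quantization scales, while the runtime needs extra memory for the audio codec and working buffers. Treat the result as a lower bound, not a published figure.

```python
def estimated_weight_bytes(params: float, bits_per_weight: float) -> float:
    """Lower-bound estimate of weight storage: parameters times bits, in bytes."""
    return params * bits_per_weight / 8


# Approximate parameter counts taken from the comparison table.
MODELS = {"NeuTTS Nano": 120e6, "NeuTTS Air": 552e6}

# Assumed bit widths for illustration; actual quantization options may differ.
BIT_WIDTHS = (16, 8, 4)

if __name__ == "__main__":
    for name, params in MODELS.items():
        for bits in BIT_WIDTHS:
            mib = estimated_weight_bytes(params, bits) / 2**20
            print(f"{name} at {bits}-bit: roughly {mib:.0f} MiB of weights")
```

Comparing these estimates with the free memory on the target device is a quick first filter before downloading and benchmarking a model.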

Voice cloning without fine-tuning

Both models are described as supporting zero-shot voice cloning from short reference audio. Neuphonic calls this “infinite cloning”; for production use, review the current license, consent requirements, and product limits.

The key product idea is that cloning can happen on-device. For applications that need personalized voices at scale — think language learning apps, audiobook generators, or accessibility tools — this is a meaningful capability if quality and licensing fit the use case.

Running on constrained hardware

NeuTTS Nano is positioned for constrained hardware such as:

Raspberry Pi-class devices
M-series MacBook Air-class laptops
Recent iPhones
Recent Android phones

This level of portability can open TTS use cases that are difficult with heavier GPU-first models: offline navigation voice, on-device accessibility tools, battery-powered assistants, and privacy-sensitive workflows.

The watermarking angle

Neuphonic describes PerTh watermarking for generated speech provenance. This is increasingly relevant as voice cloning becomes more accessible and policymakers scrutinize synthetic voice disclosure.

As with any watermarking claim, developers should review current documentation and test how the watermark behaves under compression, editing, and distribution.

Where Neuphonic fits in the TTS ecosystem

Neuphonic appears focused on a different part of the spectrum than larger GPU-oriented TTS models: devices where GPU access is limited or nonexistent.

For developers building mobile apps, edge AI products, or battery-constrained systems, NeuTTS Nano is worth evaluating for quality-per-watt tradeoffs.

Where Spokio fits

Spokio runs on Mac, including Apple Silicon and Intel Macs, so Raspberry Pi-class deployment is not its core use case. However, the trend Neuphonic represents matters: the TTS ecosystem is splitting into heavier high-quality workflows and smaller edge-oriented workflows.

For Mac users who want local TTS today, Spokio is powered by Chatterbox Turbo and supports local voice cloning and batch export. It exports MP3, WAV, AIFF, and M4A, and does not upload text, audio, or voice samples to cloud services.
