Neuphonic introduces NeuTTS Air, a realistic open source speech language model that runs locally on devices, with no GPU servers or internet connection required, and supports instant voice cloning.
Neuphonic has released NeuTTS Air as an open source project: a speech language model that runs directly on local devices. Advanced speech synthesis systems have traditionally been available only through cloud APIs, whereas this model works with no internet connection at all. The company describes it as the first speech synthesis model with this level of realism that can run entirely on the user's own device.
The model is built on Qwen 0.5B, a lightweight language model optimized for text understanding and generation, combined with NeuCodec, Neuphonic's own neural audio codec. This architecture lets the system run in real time even on mid-range devices, including laptops, mobile phones, and Raspberry Pi boards. The company distributes NeuTTS Air in GGML format, which is designed for efficient on-device inference without specialized hardware.
One of the system's standout features is instant voice cloning, which creates a personalized voice profile from just three seconds of reference audio. Cloning runs entirely on the local device, so voice data is never transmitted to external servers. Neuphonic notes that this approach addresses privacy and regulatory compliance concerns, which matter most in applications that handle sensitive data.
The model generates voices with a high degree of naturalness for its size, balancing audio quality with processing speed and storage requirements. The architecture combines a compact language model with an audio codec that achieves high quality at reduced bitrates through the use of a single codebook. According to the company, this balance enables real-time applications on devices with limited resources.
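The link between a single codebook and a low bitrate can be illustrated with a short calculation. The sketch below is not taken from Neuphonic's documentation: the function name `codec_bitrate_bps` is hypothetical, and the frame rate and codebook sizes are illustrative assumptions, not published NeuCodec figures. It only shows that a codec's token bitrate is the product of frames per second, codebooks per frame, and bits per token.

```python
import math


def codec_bitrate_bps(frames_per_second: float,
                      codebook_size: int,
                      num_codebooks: int = 1) -> float:
    """Bits per second needed to carry a codec's discrete tokens.

    Each frame emits one token per codebook, and each token costs
    log2(codebook_size) bits.
    """
    bits_per_token = math.log2(codebook_size)
    return frames_per_second * num_codebooks * bits_per_token


# Illustrative values only (assumptions, not NeuCodec's specification).
single = codec_bitrate_bps(frames_per_second=50, codebook_size=65536, num_codebooks=1)
stacked = codec_bitrate_bps(frames_per_second=50, codebook_size=1024, num_codebooks=8)

print(single)   # 800.0 bits per second
print(stacked)  # 4000.0 bits per second
```

Under these assumed numbers, one large codebook needs a fifth of the bits of eight stacked smaller ones, and the language model has to predict only one token per audio frame.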
NeuTTS Air is freely available on Hugging Face under an open source license. The model's audio output is watermarked for identification purposes. Neuphonic says the system's power consumption has been optimized specifically for mobile and embedded devices, making it suitable for applications ranging from voice assistants and interactive toys to tools that must comply with strict privacy regulations.