Supertonic - Lightning-Fast, On-Device Multilingual TTS

Powerful Features

Blazing Fast

Low-latency, real-time synthesis across desktop, browser, mobile, and edge. Fast enough to turn an entire webpage into audio in under a second.

Real-time processing
Sub-second latency
Optimized for performance

31-Language Multilingual

Synthesize directly from text across 31 languages, or pass lang="na" for language-agnostic processing.

No language adapters needed
Automatic language detection
Native language support

Compact & Efficient

99M-parameter open-weight model, a fraction of the size of 0.7B to 2B class open TTS systems, for faster downloads and a lower memory footprint.

Smaller model size
Faster cold starts
Lower memory usage

Edge-Device Ready

Runs locally on desktop, mobile, browsers, and resource-constrained hardware such as a Raspberry Pi or an e-reader, with zero network dependency.

No GPU required
Complete privacy
Offline capable

Studio-Quality Audio

Outputs studio-grade 44.1kHz 16-bit WAV directly, ready for production playback without any external upsampler.

44.1kHz output
16-bit precision
Production ready
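
One way to confirm these properties on a generated file is to read its header with Python's standard wave module. This is a generic check, not part of the Supertonic SDK, and the helper name describe_wav is made up for illustration.

```python
import wave


def describe_wav(path):
    """Return sample rate (Hz), bit depth, channel count, and duration (s) of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        sample_rate = wf.getframerate()
        bit_depth = wf.getsampwidth() * 8  # sample width is reported in bytes
        channels = wf.getnchannels()
        duration = wf.getnframes() / sample_rate
    return {
        "sample_rate": sample_rate,
        "bit_depth": bit_depth,
        "channels": channels,
        "duration": duration,
    }
```

For a file written by the SDK, describe_wav("output.wav") should report a sample rate of 44100 and a bit depth of 16 if the output matches the description above.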

Expression Tags

10 inline tags (e.g. <laugh>, <breath>, <sigh>) bring natural human nuance into generated speech.

No prompt engineering
Reference audio free
Natural expression
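
Because the tags are written inline, a small pre-flight check can catch typos before synthesis. The sketch below is only illustrative. Just <laugh>, <breath>, and <sigh> are named here, so KNOWN_TAGS lists those three and the remaining seven would need to be added from the official documentation. The helper find_unknown_tags is hypothetical, not an SDK function.

```python
import re

# Only the three tags named in this document; extend with the rest of the
# supported set from the official documentation.
KNOWN_TAGS = {"laugh", "breath", "sigh"}

# Matches simple inline tags such as <laugh>; it does not handle attributes.
TAG_PATTERN = re.compile(r"<([A-Za-z_]+)>")


def find_unknown_tags(text):
    """Return inline tag names in `text` that are not in KNOWN_TAGS, in order of appearance."""
    return [name for name in TAG_PATTERN.findall(text) if name not in KNOWN_TAGS]


text = "Well <sigh> that took a while, but it works <laugh>"
print(find_unknown_tags(text))            # []
print(find_unknown_tags("Oops <luagh>"))  # ['luagh']
```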

Superior Performance

Reading Accuracy

Evaluated on the Minimax-MLS-test benchmark, Supertonic 3 stays within a competitive word error rate (WER) and character error rate (CER) range against much larger open TTS models while preserving a lightweight on-device deployment path.

Average WER across all languages: 2.5%
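
For reference, word error rate is conventionally the word-level edit distance between a reference transcript and the transcript recognized from the synthesized audio, divided by the number of reference words. The sketch below shows that standard definition. It is not the benchmark's evaluation script, which likely applies its own text normalization and speech recognizer.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words (classic Levenshtein dynamic program).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("the quick brown fox", "the quick brown box"))  # 0.25
```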

Runtime Efficiency

Supertonic 3 runs fast on CPU, even compared with larger baselines measured on an A100 GPU, and uses substantially less memory.

CPU inference speed: 3-5x faster than GPU baselines

Memory usage: ~2GB RAM

Model size: ~500MB (ONNX)
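
To check figures like these on your own hardware, a common measure is the real-time factor (RTF): wall-clock synthesis time divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below assumes synthesize is any callable that returns (wav, duration) with the audio length in seconds at duration[0], as in the Python example in the Quick Start Guide. real_time_factor itself is a hypothetical helper.

```python
import time


def real_time_factor(synthesize, text, **kwargs):
    """Return wall-clock synthesis time divided by generated audio duration."""
    start = time.perf_counter()
    _wav, duration = synthesize(text=text, **kwargs)
    elapsed = time.perf_counter() - start
    audio_seconds = float(duration[0])
    if audio_seconds <= 0:
        raise ValueError("synthesis produced no audio")
    return elapsed / audio_seconds
```

With the SDK this would be called as real_time_factor(tts.synthesize, text, lang="en", voice_style=style). Running one warm-up synthesis first is advisable, since the first call may include model loading.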

Model Size Comparison

At about 99M parameters across the public ONNX assets, Supertonic 3 is much smaller than 0.7B to 2B class open TTS systems.

Supertonic 3: 99M parameters

Other open TTS systems: 700M-2B+ parameters
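
The gap can be put in numbers with simple arithmetic. The parameter counts are the ones quoted above. The 4 bytes per parameter is an assumption (32-bit float weights), not a statement about how the published ONNX files are stored.

```python
SUPERTONIC_PARAMS = 99e6        # from the comparison above
BASELINE_PARAMS = (0.7e9, 2e9)  # 0.7B to 2B class systems

BYTES_PER_PARAM = 4             # assumption: float32 weights

for baseline in BASELINE_PARAMS:
    ratio = baseline / SUPERTONIC_PARAMS
    print(f"{baseline / 1e9:.1f}B model is about {ratio:.1f}x larger")
# 0.7B model is about 7.1x larger
# 2.0B model is about 20.2x larger

weights_mb = SUPERTONIC_PARAMS * BYTES_PER_PARAM / 1e6
print(f"99M float32 weights: about {weights_mb:.0f} MB")
# 99M float32 weights: about 396 MB
```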

31 Languages Supported

Not sure which language your text is in? Pass lang="na" and Supertonic will handle it automatically.

Arabic Bulgarian Croatian Czech Danish Dutch English Estonian Finnish French German Greek Hindi Hungarian Indonesian Italian Japanese Korean Latvian Lithuanian Polish Portuguese Romanian Russian Slovak Slovenian Spanish Swedish Turkish Ukrainian Vietnamese
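
A minimal sketch of the language-agnostic path, reusing the SDK calls and parameter values from the Python example in the Quick Start Guide (get_voice_style, synthesize). The wrapper name synthesize_any_language is hypothetical. It only fills in lang="na" when the caller does not supply a language code.

```python
def synthesize_any_language(tts, text, lang=None, voice_name="M1"):
    """Synthesize `text`, falling back to lang="na" when no language code is given.

    `tts` is expected to be a supertonic TTS instance (or any object exposing
    the same get_voice_style/synthesize methods).
    """
    style = tts.get_voice_style(voice_name=voice_name)
    return tts.synthesize(
        text=text,
        lang=lang or "na",  # "na" leaves language handling to Supertonic
        voice_style=style,
        total_steps=8,      # same values as the Quick Start example
        speed=1.05,
    )
```

With the SDK installed, a call would look like wav, duration = synthesize_any_language(TTS(auto_download=True), "Bonjour tout le monde.").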

Quick Start Guide

Install the Python SDK

On the first run, Supertonic downloads the model assets automatically.

pip install supertonic

Python Example

from supertonic import TTS

# First run downloads the model automatically
tts = TTS(auto_download=True)

style = tts.get_voice_style(voice_name="M1")

text = "Supertonic is a lightning fast, on-device TTS system."

wav, duration = tts.synthesize(
    text=text,
    lang="en",                      # Language code
    voice_style=style,              # Voice style
    total_steps=8,                  # Quality: 5-12
    speed=1.05,                     # Speed: 0.7-2.0
)

tts.save_audio(wav, "output.wav")
print(f"Generated {duration[0]:.2f}s of audio")
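
The comments in the example give working ranges for total_steps (5-12) and speed (0.7-2.0). If those values come from user input, a small guard can keep them in range before calling synthesize. This is a sketch: clamp_synthesis_params is a hypothetical helper, and whether the SDK itself rejects or accepts out-of-range values is not stated here.

```python
# Ranges taken from the comments in the example above.
TOTAL_STEPS_RANGE = (5, 12)
SPEED_RANGE = (0.7, 2.0)


def clamp_synthesis_params(total_steps, speed):
    """Clamp total_steps and speed to the documented ranges."""
    lo_steps, hi_steps = TOTAL_STEPS_RANGE
    lo_speed, hi_speed = SPEED_RANGE
    steps = max(lo_steps, min(hi_steps, int(total_steps)))
    clamped_speed = max(lo_speed, min(hi_speed, float(speed)))
    return steps, clamped_speed


print(clamp_synthesis_params(20, 0.5))  # (12, 0.7)
print(clamp_synthesis_params(8, 1.05))  # (8, 1.05)
```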

Local HTTP Server

Run Supertonic as a local HTTP service for integration with other tools.

pip install 'supertonic[serve]'
supertonic serve --host 127.0.0.1 --port 7788

Available Endpoints:

POST /v1/tts (Native)
POST /v1/audio/speech (OpenAI compatible)
GET /docs (API Documentation)
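
A minimal client sketch for the OpenAI-compatible endpoint, using only the Python standard library. The host and port match the supertonic serve command above. The JSON field names (model, input, voice, response_format) follow OpenAI's speech API, and the values "supertonic" and "M1" are assumptions. Check GET /docs on the running server for the schema it actually accepts.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7788"  # matches `supertonic serve --host 127.0.0.1 --port 7788`


def build_speech_request(text, voice="M1", model="supertonic"):
    """Build a POST request for /v1/audio/speech.

    Field names follow OpenAI's speech API; the model and voice values are
    assumptions, not confirmed by the Supertonic server documentation.
    """
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": "wav",
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, path):
    """Send the request to the local server and write the returned audio bytes to `path`."""
    with urllib.request.urlopen(build_speech_request(text)) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
```

With the server running, save_speech("Hello from Supertonic.", "hello.wav") would write the response body to disk. Building the request separately from sending it makes the payload easy to inspect or adjust once the real schema is known.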

Live Demos & Use Cases

Raspberry Pi Demo

Real-time text-to-speech on Raspberry Pi, demonstrating on-device performance.

E-Reader Integration

Experience Supertonic on an Onyx Boox e-reader in airplane mode with zero network dependency.

Chrome Extension

Turn any webpage into audio in under one second with complete privacy.

Interactive Demo

Try Supertonic directly in your browser with our interactive demo. Experience real-time synthesis with different languages and voice styles.

Multi-Runtime SDKs

Ready-to-use examples built on ONNX Runtime for multiple languages and platforms.

Python

ONNX Runtime inference with pip installation

Node.js

Server-side JavaScript implementation

Browser

WebGPU/WASM inference in the browser

Java

Cross-platform JVM implementation

C++

High-performance C++ implementation

C#

.NET ecosystem implementation

Go

Go implementation with ONNX Runtime

Swift

macOS and iOS applications

Complete SDK List

Additional SDKs are available for Rust, iOS, and Flutter.

Rust

Memory-safe implementation

iOS

Native iOS apps

Flutter

Cross-platform apps

Voice Cloning Made Simple

Voice Builder

Turn your voice into a deployable, edge-native TTS voice that you own permanently. Create custom voice profiles for both Supertonic 2 and Supertonic 3.

Permanent custom voice profile
Version-specific JSON files
Complete ownership
Easy integration

For Commercial Use

Need more voices or enterprise features? Check out our commercial offerings:

Supertone Play

700+ commercially usable preset voices

Supertone API

Hosted TTS and voice services

Built With Supertonic

TLDRL

Free, on-device TTS extension for reading any webpage

Read Aloud

Open-source TTS browser extension

PageEcho

E-Book reader app for iOS

VoiceChat

On-device voice-to-voice LLM chatbot in the browser

OmniAvatar

Talking avatar video generator from photo + speech

CopiloTTS

Kotlin Multiplatform TTS SDK via ONNX Runtime

Choose Your Version

Latest

Supertonic 3

31 Languages
Expression Tags
Improved Accuracy
99M Parameters

Main Branch Models

Stable

Supertonic 2

5 Languages
Production Ready
66M Parameters
Backward Compatible

Release Branch Models

Legacy

Supertonic 1

1 Language (EN)
Basic Features
66M Parameters
Deprecated

Ready to Experience Lightning-Fast TTS?

Join thousands of developers who trust Supertonic for their on-device text-to-speech needs.
