supertonic-3-mlx/NOTICE
transcrilive 12dbf4a821 v0.1.0 — initial release
MLX-native port of Supertone's Supertonic 3 multilingual TTS. Runs the
full flow-matching + classifier-free-guidance pipeline at ~100x realtime
on Apple Silicon, with an audio cosine similarity of 1.0 vs the cached
MLX path and 0.98 vs the upstream ONNX Runtime reference.

Weights are hosted at https://huggingface.co/ambassadia/supertonic-3-mlx
and auto-downloaded on first use; this repository ships the port code,
the model card, audio samples, and a zero-config setup_and_test.sh.

Install:
    pip install git+https://gitea.tavportal.com/olivier/supertonic-3-mlx.git

Quick test:
    git clone https://gitea.tavportal.com/olivier/supertonic-3-mlx.git
    cd supertonic-3-mlx && ./setup_and_test.sh

Licenses (dual): model weights = BigScience Open RAIL-M (Section 4
propagation), port code = Apache-2.0. See LICENSE, LICENSE-CODE, NOTICE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 09:17:05 +02:00

supertonic-3-mlx
================
This release is a derivative of the upstream Supertone Supertonic 3
text-to-speech model and consists of two artefact classes governed by
two different licenses:

1. The model weights (under ./weights/*.safetensors) are released under
   the BigScience Open RAIL-M License. The full text is in ./LICENSE and
   was copied verbatim from
   https://huggingface.co/Supertone/supertonic-3/blob/main/LICENSE
   The use restrictions (Section 5 and Attachment A, clauses (a)-(m))
   apply to all downstream use of the model and of any output
   generated by the model.

2. The MLX port code (under ./src/supertonic_3_mlx/) is released under
   the Apache License, Version 2.0. The full text is in ./LICENSE-CODE.

Attribution and modifications statement (BigScience Open RAIL-M Section 4.c):

  Copyright (c) 2026 Supertone Inc. - original model weights and reference
  Python/ONNX implementation. Distributed at
  https://huggingface.co/Supertone/supertonic-3

  Copyright (c) 2026 Olivier Dupont - MLX-native port code, weight format
  conversion (ONNX → safetensors via the 3-stage extractor in
  ``src/supertonic_3_mlx/pipeline.py:_convert_onnx``), and pipeline
  optimisations (``mx.compile`` of the CFG Euler loop, cross-attention
  K/V cache shared across the 5 Euler steps). Distributed at
  https://huggingface.co/ambassadia/supertonic-3-mlx
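
For readers unfamiliar with the optimisation named above, the following is a minimal NumPy sketch of a fixed-step Euler sampler with classifier-free guidance. It is illustrative only and is not the port's code: ``velocity_fn``, ``cond_kv``, ``uncond_kv``, and the ``guidance_scale`` value are hypothetical names and values chosen for the example. The two ``*_kv`` arguments stand in for cross-attention keys/values that are computed once and reused at every step.

```python
import numpy as np


def cfg_euler_sample(velocity_fn, x0, cond_kv, uncond_kv,
                     guidance_scale=3.0, num_steps=5):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler.

    cond_kv and uncond_kv are assumed to be precomputed by the caller
    (for example, cross-attention keys/values) and are passed unchanged
    to every step, so they are never recomputed inside the loop.
    """
    x = np.asarray(x0, dtype=np.float32)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v_cond = velocity_fn(x, t, cond_kv)      # conditional branch
        v_uncond = velocity_fn(x, t, uncond_kv)  # unconditional branch
        # Classifier-free guidance: extrapolate away from the
        # unconditional prediction towards the conditional one.
        v = v_uncond + guidance_scale * (v_cond - v_uncond)
        x = x + dt * v
    return x
```

With a toy velocity function that simply returns its conditioning tensor, the guided velocity is constant and the result can be checked by hand.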

The MLX port does not modify the model's learned parameters in any
semantic sense: the only weight-level transformation is a tensor-shape
re-layout to match the MLX memory model (e.g. depthwise Conv1d
``(C, 1, K)`` → ``(C, K, 1)``). Audio output matches the upstream ONNX
Runtime reference to within FP32 accumulation noise (cosine similarity
≥ 0.98 on the full pipeline, 1.00 on the vocoder).
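
As an illustration of the two claims above, the sketch below shows the depthwise Conv1d re-layout as a pure axis swap and a cosine-similarity check of the kind quoted. It uses NumPy rather than MLX so it runs anywhere; the function names are hypothetical and are not taken from the port.

```python
import numpy as np


def relayout_depthwise_conv1d(weight):
    """Re-layout a depthwise Conv1d kernel from (C, 1, K) to (C, K, 1).

    Only the last two axes are swapped; no value is changed.
    """
    weight = np.asarray(weight)
    if weight.ndim != 3 or weight.shape[1] != 1:
        raise ValueError(f"expected shape (C, 1, K), got {weight.shape}")
    return np.ascontiguousarray(np.transpose(weight, (0, 2, 1)))


def cosine_similarity(a, b):
    """Cosine similarity between two arrays, flattened to 1-D."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy kernel: 4 channels, kernel size 3.
w_onnx = np.arange(1, 13, dtype=np.float32).reshape(4, 1, 3)
w_mlx = relayout_depthwise_conv1d(w_onnx)
```

Because the swapped axis has length 1, the re-laid-out kernel holds the same values in the same order, which is why the transformation leaves the learned parameters unchanged.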

No use of the Supertone trademarks, logos, or trade dress is asserted or
permitted by this release (BigScience Open RAIL-M Section 8).