2.4 KiB
markovian-rsa-mlx
First MLX implementation of Zyphra's Markovian RSA test-time compute methodology, targeting ZAYA1-8B on Apple Silicon. Boosts reasoning accuracy by sampling N parallel reasoning traces, extracting their tails, and feeding aggregation prompts back to the model.
Status : v0.1.0. Aggregation prompt is
zaya_v1(reverse-engineered ; paper does not publish the co-trained format). HMMT'25 5-problem smoke shows ≥ 0 pp lift on M2 Pro.
Install
uv add "markovian-rsa-mlx @ git+https://gitea.tavportal.com/olivier/markovian-rsa-mlx.git"
This pulls in mlx-lm from kyr0's feat/zaya-support branch automatically (until upstream PR #1261 merges).
Quickstart
Python API:
from markovian_rsa_mlx import MarkovianRSAOrchestrator, RSAConfig
orch = MarkovianRSAOrchestrator.from_pretrained("kyr0/zaya1-base-8b-MLX")
cfg = RSAConfig.default_16gb() # parallel=2, chunk=16K — fits 16 GB Mac
text, audit = orch.solve(
"Compute the integral of x^2 from 0 to 5",
config=cfg, return_audit=True, audit_path="run.jsonl",
)
print(text)
CLI:
markovian-rsa-mlx solve "Compute the integral of x^2 from 0 to 5" \
--profile default-16gb --audit run.jsonl
Profiles
| Profile | rounds | parallel | chunk | Mem | Notes |
|---|---|---|---|---|---|
default-16gb |
2 | 2 | 16 K | ~ 8 GB | safest on M2 16 GB |
paper-16k |
2 | 4 | 16 K | ~ 16-24 GB | paper "deployment" profile |
paper-headline-40k |
2 | 16 | 40 K | 32+ GB | paper headline (HMMT'25 89.6) |
Audit JSONL
Every event of the run is one line. Schema in
docs/superpowers/specs/2026-05-10-markovian-rsa-mlx-design.md Section 2.
Bench
uv run python scripts/bench_hmmt.py --n-problems 5 --rounds 2 --parallel 4 \
--output bench-out/hmmt_smoke.json
Architecture
orchestrator.py: drives N parallel traces + T rounds.prompts.py: round-0 +zaya_v1aggregation template.batching.py: dispatches between serial andBatchGeneratorpaths.audit.py: streaming JSONL writer + event types.guards.py: memory + context budget checks.
License
MIT. See LICENSE.
Model weights are governed by the upstream Zyphra licence ; see Zyphra/ZAYA1-8B.
Provenance
Spec produced via 2-round Codex (gpt-5.5 xhigh) brainstorming. Implementation by Olivier Dupont with code-review assistance.