markovian-rsa-mlx

First MLX implementation of Zyphra's Markovian RSA test-time compute methodology, targeting ZAYA1-8B on Apple Silicon. The method boosts reasoning accuracy by sampling N parallel reasoning traces, extracting their tails, and feeding those tails back to the model inside aggregation prompts over T rounds.
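
The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: `generate` stands in for any text-generation callable, `tail_chars` and the prompt wording are hypothetical (the real zaya_v1 template is not reproduced here), `rounds` is assumed to count aggregation rounds after round 0, and every trace in a round is assumed to see all tails.

```python
from typing import Callable, List


def tail(text: str, max_chars: int) -> str:
    """Keep only the last max_chars characters of a trace."""
    return text[-max_chars:] if max_chars > 0 else ""


def build_aggregation_prompt(question: str, tails: List[str]) -> str:
    """Hypothetical aggregation prompt; NOT the zaya_v1 template."""
    parts = [f"Problem:\n{question}", "Candidate solution endings:"]
    parts += [f"[{i + 1}] {t}" for i, t in enumerate(tails)]
    parts.append("Combine the candidates into one improved solution.")
    return "\n\n".join(parts)


def markovian_rsa(
    question: str,
    generate: Callable[[str], str],
    parallel: int = 2,
    rounds: int = 2,
    tail_chars: int = 200,
) -> List[str]:
    """Return the traces produced by the final round."""
    # Round 0: sample `parallel` independent traces from the bare question.
    traces = [generate(question) for _ in range(parallel)]
    for _ in range(rounds):
        # Only the tails are carried forward, so the prompt size stays
        # bounded regardless of how long the previous traces were.
        tails = [tail(t, tail_chars) for t in traces]
        prompt = build_aggregation_prompt(question, tails)
        traces = [generate(prompt) for _ in range(parallel)]
    return traces


if __name__ == "__main__":
    # Stub generator for demonstration; a real run would call the model.
    def echo_model(prompt: str) -> str:
        return f"(reasoning over {len(prompt)} prompt chars) answer: 42"

    for trace in markovian_rsa("What is 6 * 7?", echo_model):
        print(trace)
```

Carrying only the tails forward is what keeps each round's context independent of the full history, at the cost of discarding earlier reasoning.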

Status: v0.1.0. The aggregation prompt is zaya_v1 (reverse-engineered; the paper does not publish the co-trained format). A 5-problem HMMT'25 smoke test shows a lift of ≥ 0 pp on an M2 Pro.

Install

uv add "markovian-rsa-mlx @ git+https://gitea.tavportal.com/olivier/markovian-rsa-mlx.git"

This pulls in mlx-lm from kyr0's feat/zaya-support branch automatically (until upstream PR #1261 merges).

Quickstart

Python API:

from markovian_rsa_mlx import MarkovianRSAOrchestrator, RSAConfig

orch = MarkovianRSAOrchestrator.from_pretrained("kyr0/zaya1-base-8b-MLX")
cfg = RSAConfig.default_16gb()  # parallel=2, chunk=16K — fits 16 GB Mac
text, audit = orch.solve(
    "Compute the integral of x^2 from 0 to 5",
    config=cfg, return_audit=True, audit_path="run.jsonl",
)
print(text)

CLI:

markovian-rsa-mlx solve "Compute the integral of x^2 from 0 to 5" \
  --profile default-16gb --audit run.jsonl

Profiles

| Profile | Rounds | Parallel | Chunk | Memory | Notes |
|---|---|---|---|---|---|
| default-16gb | 2 | 2 | 16K | ~8 GB | safest on M2 16 GB |
| paper-16k | 2 | 4 | 16K | ~16-24 GB | paper "deployment" profile |
| paper-headline-40k | 2 | 16 | 40K | 32+ GB | paper headline (HMMT'25 89.6) |
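
To compare the profiles at a glance, the sketch below mirrors the profile values in a small dataclass and computes how many tokens can be in flight in one round. The `Profile` class and its field names are hypothetical (they are not `RSAConfig`), "K" is assumed to mean 1024 tokens, and `chunk` is assumed to be a per-trace generation cap. The figure is a rough size comparison, not a memory model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical mirror of one profile row; not the package's RSAConfig."""
    name: str
    rounds: int
    parallel: int
    chunk_tokens: int

    def tokens_per_round(self) -> int:
        # Upper bound on tokens generated in one round, assuming each of
        # the `parallel` traces may use its full chunk.
        return self.parallel * self.chunk_tokens


K = 1024  # assumption: "16K" means 16 * 1024 tokens

PROFILES = {
    p.name: p
    for p in (
        Profile("default-16gb", rounds=2, parallel=2, chunk_tokens=16 * K),
        Profile("paper-16k", rounds=2, parallel=4, chunk_tokens=16 * K),
        Profile("paper-headline-40k", rounds=2, parallel=16, chunk_tokens=40 * K),
    )
}

if __name__ == "__main__":
    for profile in PROFILES.values():
        print(f"{profile.name}: {profile.tokens_per_round()} tokens per round")
```

Under these assumptions the headline profile keeps 20 times as many tokens in flight per round as default-16gb, which is consistent with its much larger memory requirement.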

Audit JSONL

Each event of a run is written as one line of JSON. The schema is in docs/superpowers/specs/2026-05-10-markovian-rsa-mlx-design.md, Section 2.
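
Because the audit file holds one event per line, it can be inspected with the standard library alone. The sketch below assumes only that each non-empty line is a JSON object; it does not assume any field names from the schema, and the helper names `iter_events` and `key_histogram` are illustrative.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Dict, Iterator, Union


def iter_events(path: Union[str, Path]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed event per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)


def key_histogram(path: Union[str, Path]) -> Counter:
    """Count how often each top-level key appears across all events."""
    counts: Counter = Counter()
    for event in iter_events(path):
        counts.update(event.keys())
    return counts
```

Calling `key_histogram("run.jsonl").most_common()` on the file written by the quickstart gives a quick view of which fields a run actually emitted, before consulting the schema document.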

Bench

uv run python scripts/bench_hmmt.py --n-problems 5 --rounds 2 --parallel 4 \
  --output bench-out/hmmt_smoke.json

Architecture

  • orchestrator.py: drives N parallel traces + T rounds.
  • prompts.py: round-0 + zaya_v1 aggregation template.
  • batching.py: dispatches between serial and BatchGenerator paths.
  • audit.py: streaming JSONL writer + event types.
  • guards.py: memory + context budget checks.
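
As an illustration of the kind of context budget check a guard module might perform, the sketch below does the token arithmetic for an aggregation prompt. It is not the contents of guards.py: the function names, the `overhead_tokens` parameter, and the assumption that an aggregation prompt holds the question plus one tail per parallel trace are all hypothetical.

```python
def fits_context(prompt_tokens: int, max_new_tokens: int, context_limit: int) -> bool:
    """True if the prompt plus the generation budget fits in the context window."""
    return prompt_tokens + max_new_tokens <= context_limit


def max_tail_tokens(
    context_limit: int,
    question_tokens: int,
    parallel: int,
    max_new_tokens: int,
    overhead_tokens: int = 0,
) -> int:
    """Largest per-trace tail length that still leaves room to generate.

    Assumes the aggregation prompt contains the question, some fixed
    template overhead, and one tail per parallel trace.
    """
    if parallel < 1:
        raise ValueError("parallel must be at least 1")
    free = context_limit - question_tokens - overhead_tokens - max_new_tokens
    # Split the remaining room evenly; never return a negative length.
    return max(free // parallel, 0)


if __name__ == "__main__":
    limit, question, overhead, new_tokens, n = 32768, 200, 100, 16384, 4
    tail_len = max_tail_tokens(limit, question, n, new_tokens, overhead)
    prompt_len = question + overhead + n * tail_len
    print(tail_len, fits_context(prompt_len, new_tokens, limit))
```

Checking the budget before generation starts lets a run fail fast with a clear error instead of truncating a prompt silently.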

License

MIT. See LICENSE.

Model weights are governed by the upstream Zyphra license; see Zyphra/ZAYA1-8B.

Provenance

Spec produced via 2-round Codex (gpt-5.5 xhigh) brainstorming. Implementation by Olivier Dupont with code-review assistance.