From 1c4b9e8652273b5b917b30f0ad9c64ff63ce6861 Mon Sep 17 00:00:00 2001
From: transcrilive
Date: Sun, 10 May 2026 03:22:17 +0200
Subject: [PATCH] docs(README): full quickstart + profiles + bench section for v0.1.0

---
 README.md | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 71 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index a26efe6..bb3b3d4 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,74 @@
 # markovian-rsa-mlx
 
-First MLX implementation of Zyphra's Markovian RSA test-time compute
-methodology, targeting ZAYA1-8B on Apple Silicon.
+First MLX implementation of Zyphra's **Markovian RSA** test-time compute methodology, targeting **ZAYA1-8B** on Apple Silicon. Boosts reasoning accuracy by sampling N parallel reasoning traces, extracting their tails, and feeding aggregation prompts back to the model.
 
-Status : work in progress — see `pyproject.toml` for current version.
+> **Status :** v0.1.0. Aggregation prompt is `zaya_v1` (reverse-engineered ; paper does not publish the co-trained format). HMMT'25 5-problem smoke shows ≥ 0 pp lift on M2 Pro.
+
+## Install
+
+```bash
+uv add "markovian-rsa-mlx @ git+https://gitea.tavportal.com/olivier/markovian-rsa-mlx.git"
+```
+
+This pulls in `mlx-lm` from kyr0's `feat/zaya-support` branch automatically (until upstream PR #1261 merges).
+
+## Quickstart
+
+Python API:
+
+```python
+from markovian_rsa_mlx import MarkovianRSAOrchestrator, RSAConfig
+
+orch = MarkovianRSAOrchestrator.from_pretrained("kyr0/zaya1-base-8b-MLX")
+cfg = RSAConfig.default_16gb()  # parallel=2, chunk=16K — fits 16 GB Mac
+text, audit = orch.solve(
+    "Compute the integral of x^2 from 0 to 5",
+    config=cfg, return_audit=True, audit_path="run.jsonl",
+)
+print(text)
+```
+
+CLI:
+
+```bash
+markovian-rsa-mlx solve "Compute the integral of x^2 from 0 to 5" \
+    --profile default-16gb --audit run.jsonl
+```
+
+## Profiles
+
+| Profile | rounds | parallel | chunk | Mem | Notes |
+|---|---:|---:|---:|---:|---|
+| `default-16gb` | 2 | 2 | 16 K | ~ 8 GB | safest on M2 16 GB |
+| `paper-16k` | 2 | 4 | 16 K | ~ 16-24 GB | paper "deployment" profile |
+| `paper-headline-40k` | 2 | 16 | 40 K | 32+ GB | paper headline (HMMT'25 89.6) |
+
+## Audit JSONL
+
+Every event of the run is one line. Schema in
+[`docs/superpowers/specs/2026-05-10-markovian-rsa-mlx-design.md`](docs/superpowers/specs/2026-05-10-markovian-rsa-mlx-design.md) Section 2.
+
+## Bench
+
+```bash
+uv run python scripts/bench_hmmt.py --n-problems 5 --rounds 2 --parallel 4 \
+    --output bench-out/hmmt_smoke.json
+```
+
+## Architecture
+
+- `orchestrator.py` : drives N parallel traces + T rounds.
+- `prompts.py` : round-0 + `zaya_v1` aggregation template.
+- `batching.py` : dispatches between serial and `BatchGenerator` paths.
+- `audit.py` : streaming JSONL writer + event types.
+- `guards.py` : memory + context budget checks.
+
+## License
+
+MIT. See [LICENSE](LICENSE).
+
+Model weights are governed by the upstream Zyphra licence ; see [`Zyphra/ZAYA1-8B`](https://huggingface.co/Zyphra/ZAYA1-8B).
+
+## Provenance
+
+Spec produced via 2-round Codex (gpt-5.5 xhigh) brainstorming. Implementation by Olivier Dupont with code-review assistance.
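
Reviewer note on the Audit JSONL section: because the README states that every run event is written as one line, an audit file such as `run.jsonl` can be inspected with the standard library alone. The sketch below is only an illustration under assumptions: the event schema is defined in the design spec and is not shown in this patch, so the `type` key and the helper name `summarize_events` are hypothetical and would need to be adjusted to the real field names.

```python
import json
from collections import Counter


def summarize_events(lines, type_key="type"):
    """Count JSONL events per value of `type_key`.

    `type_key` is a hypothetical field name; the real audit schema
    lives in the design spec and may use a different key.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)  # assumes each line is one JSON object
        counts[event.get(type_key, "<unknown>")] += 1
    return counts
```

A file handle can be passed directly, for example `with open("run.jsonl", encoding="utf-8") as fh: print(summarize_events(fh))`, since iterating over a text file yields one line at a time.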