markovian-rsa-mlx/CHANGELOG.md

Changelog

v0.1.1 — 2026-05-10

Added

  • RSAConfig.enable_thinking field (default False). Toggling <think> mode in the chat template substantially affects output quality on math problems.
  • The bench script scripts/bench_hmmt.py now uses corrected gold answers for the placeholder problems HMMT-1 (66, was 100) and HMMT-5 (1, was 76).

Changed

  • Default enable_thinking flipped to False. Empirical testing shows <think> mode causes the model to narrate the aggregation prompt ("We have a user message: ...") instead of solving. Direct mode produces math reasoning immediately.
  • _render_chat(messages, *, enable_thinking) signature now takes an explicit kwarg (was hardcoded to True).

Bench results

  • Vanilla and RSA both score 5/5 on the corrected HMMT subset, so lift_pp is +0.00pp (a ceiling effect: vanilla is already at 100%).
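
For reference, a short sketch of the arithmetic behind the lift figure. It assumes lift_pp is the RSA accuracy minus the vanilla accuracy, expressed in percentage points; the function below is illustrative and is not taken from the bench harness.

```python
def lift_pp(vanilla_correct: int, rsa_correct: int, total: int) -> float:
    """Accuracy difference (RSA minus vanilla) in percentage points.

    Assumed definition for illustration; the harness may compute it differently.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    vanilla_acc = vanilla_correct / total
    rsa_acc = rsa_correct / total
    return (rsa_acc - vanilla_acc) * 100.0


# Both runs solve all 5 problems, so there is no headroom left to measure.
result = lift_pp(vanilla_correct=5, rsa_correct=5, total=5)
print(f"{result:+.2f}pp")  # +0.00pp
```

With vanilla already at 5/5, the difference can only be zero or negative, which is why this subset cannot show a positive lift.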

v0.1.0 — 2026-05-10

Initial public release. T=2 N=4 RSA orchestrator with audit JSONL + CLI + HMMT bench harness.