Version 1 · Archive
ChronoBench
Measuring how far different language models can progress in Chrono Trigger using an autonomous vision-based game agent.
Last updated: 2026-04-18 · Frozen at 17 runs.
These runs were produced by the v1 harness. For current runs on the v3 architecture, see Version 3, or browse the intermediate v2 archive.
Leaderboard
| Model | Provider | Last Checkpoint | CP 1 | CP 2 | CP 3 | CP 4 | CP 5 | CP 6 | Cycles | Stuck | Tokens In | ms/cycle | Est. Cost | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| google/gemma-4-26b-a4b | Local | marle_met | 18 | 24 | 48 | 180 | — | — | 200* | 49 | 2,018,372 | 24,398 | Free | 2026-04-11 |
| google/gemini-3-flash-preview | OpenRouter | marle_met | 24 | 30 | 72 | 90 | — | — | 200* | 52 | 3,726,277 | 13,687 | $2.05 | 2026-04-16 |
| google/gemini-3-flash-preview | OpenRouter | marle_met | 14 | 21 | 72 | 137 | — | — | 200* | 53 | 4,465,031 | 12,932 | $2.42 | 2026-04-17 |
| google/gemini-3-flash-preview | OpenRouter | marle_met | 15 | 18 | 33 | 69 | — | — | 200* | 68 | 4,446,619 | 14,076 | $2.41 | 2026-04-17 |
| google/gemini-3-flash-preview | OpenRouter | marle_met | 14 | 42 | 51 | 74 | — | — | 200* | 33 | 4,576,344 | 14,267 | $2.47 | 2026-04-17 |
| google/gemini-3-flash-preview | Local | marle_met | 14 | 63 | 92 | 99 | — | — | 200* | 68 | 3,158,124 | 13,429 | Free | 2026-04-18 |
| google/gemma-4-26b-a4b | Local | fair_entered | 18 | 48 | 66 | — | — | — | 100* | 31 | 1,055,470 | 25,174 | Free | 2026-04-09 |
| google/gemma-4-26b-a4b | Local | fair_entered | 18 | 24 | 36 | — | — | — | 100* | 42 | 1,062,862 | 25,482 | Free | 2026-04-09 |
| google/gemma-4-26b-a4b | Local | fair_entered | 24 | 48 | 60 | — | — | — | 200* | 50 | 2,077,876 | 24,953 | Free | 2026-04-11 |
| google/gemma-4-26b-a4b | Local | fair_entered | 24 | 30 | 84 | — | — | — | 200* | 49 | 2,391,209 | 23,392 | Free | 2026-04-12 |
* = cycle budget exhausted
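
The leaderboard columns can be combined into a few derived figures. The sketch below is not part of the ChronoBench harness; it computes wall-clock time (Cycles × ms/cycle), an effective cost per million input tokens (Est. Cost ÷ Tokens In), and a stuck ratio for two rows copied from the table. The `Run` class and the function names are hypothetical. Two readings are assumptions: that Stuck counts cycles, and that Est. Cost may also cover output tokens, so the effective rate is only a ratio taken from this table, not a provider price.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """One leaderboard row (hypothetical container, not the harness's own type)."""
    model: str
    cycles: int
    stuck: int
    tokens_in: int
    ms_per_cycle: int
    est_cost_usd: Optional[float]  # None stands for "Free" (local runs)


def wall_clock_minutes(run: Run) -> float:
    # Total run time, assuming ms/cycle is the mean over all cycles.
    return run.cycles * run.ms_per_cycle / 60_000


def stuck_ratio(run: Run) -> float:
    # Assumption: "Stuck" counts cycles, so this is a fraction of the run.
    return run.stuck / run.cycles


def usd_per_million_tokens_in(run: Run) -> Optional[float]:
    # Effective rate from the table; may include output-token cost.
    if run.est_cost_usd is None:
        return None
    return run.est_cost_usd / run.tokens_in * 1_000_000


# Two rows copied from the leaderboard above.
runs = [
    Run("google/gemma-4-26b-a4b", 200, 49, 2_018_372, 24_398, None),
    Run("google/gemini-3-flash-preview", 200, 52, 3_726_277, 13_687, 2.05),
]

for r in runs:
    rate = usd_per_million_tokens_in(r)
    rate_text = "free" if rate is None else f"${rate:.2f}/M tokens in"
    print(f"{r.model}: {wall_clock_minutes(r):.1f} min, "
          f"stuck ratio {stuck_ratio(r):.1%}, {rate_text}")
```

On these two rows, the local Gemma run takes roughly 81 minutes for 200 cycles, while the Gemini run takes roughly 46 minutes at an effective rate of about $0.55 per million input tokens.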
[Chart: Cycles per checkpoint]
[Chart: Estimated cost per checkpoint]
All runs
| Model | Last checkpoint | Cycles | Stuck | Date |
|---|---|---|---|---|
| google/gemma-4-26b-a4b | house_exit | 100* | 43 | 2026-04-09 |
| google/gemma-4-26b-a4b | fair_entered | 100* | 31 | 2026-04-09 |
| google/gemma-4-26b-a4b | fair_entered | 100* | 42 | 2026-04-09 |
| claude-haiku-4-5-20251001 | — | 100* | 12 | 2026-04-09 |
| google/gemma-4-26b-a4b | fair_entered | 200* | 50 | 2026-04-11 |
| google/gemma-4-26b-a4b | marle_met | 200* | 49 | 2026-04-11 |
| google/gemma-4-26b-a4b | fair_entered | 200* | 49 | 2026-04-12 |
| google/gemma-4-26b-a4b | fair_entered | 200* | 54 | 2026-04-12 |
| google/gemma-4-31b | fair_entered | 200* | 60 | 2026-04-12 |
| google/gemini-3-flash-preview | — | 1 | 0 | 2026-04-16 |
| google/gemini-3-flash-preview | marle_met | 200* | 52 | 2026-04-16 |
| google/gemini-3-flash-preview | marle_met | 200* | 53 | 2026-04-17 |
| google/gemini-3-flash-preview | marle_met | 200* | 68 | 2026-04-17 |
| google/gemini-3-flash-preview | marle_met | 200* | 33 | 2026-04-17 |
| google/gemini-3-flash-preview | — | 91 | 37 | 2026-04-17 |
| google/gemini-3-flash-preview | marle_met | 200* | 68 | 2026-04-18 |
| google/gemma-4-26b-a4b | fair_entered | 200* | 31 | 2026-04-18 |
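
The 17 rows above can be grouped by model to see how many runs each model had and the furthest checkpoint any of them reached. The following is a minimal sketch, assuming the progression order `house_exit`, then `fair_entered`, then `marle_met`; that order is not stated in the table and is labeled as an assumption in the code. `CHECKPOINT_ORDER`, `ALL_RUNS`, and `summarize` are hypothetical names.

```python
from collections import Counter

# Assumed progression order, earliest first (not stated in the source).
CHECKPOINT_ORDER = ["house_exit", "fair_entered", "marle_met"]

GEMMA_26B = "google/gemma-4-26b-a4b"
GEMMA_31B = "google/gemma-4-31b"
HAIKU = "claude-haiku-4-5-20251001"
GEMINI = "google/gemini-3-flash-preview"

# (model, last checkpoint) copied from the "All runs" table, in table order.
# None stands for a run that reached no checkpoint.
ALL_RUNS = [
    (GEMMA_26B, "house_exit"), (GEMMA_26B, "fair_entered"),
    (GEMMA_26B, "fair_entered"), (HAIKU, None),
    (GEMMA_26B, "fair_entered"), (GEMMA_26B, "marle_met"),
    (GEMMA_26B, "fair_entered"), (GEMMA_26B, "fair_entered"),
    (GEMMA_31B, "fair_entered"), (GEMINI, None),
    (GEMINI, "marle_met"), (GEMINI, "marle_met"),
    (GEMINI, "marle_met"), (GEMINI, "marle_met"),
    (GEMINI, None), (GEMINI, "marle_met"),
    (GEMMA_26B, "fair_entered"),
]


def summarize(runs):
    """Return {model: (run_count, furthest_checkpoint_or_None)}."""
    counts = Counter(model for model, _ in runs)
    best_rank = {}
    for model, checkpoint in runs:
        rank = -1 if checkpoint is None else CHECKPOINT_ORDER.index(checkpoint)
        best_rank[model] = max(best_rank.get(model, -1), rank)
    return {
        model: (counts[model],
                CHECKPOINT_ORDER[best_rank[model]] if best_rank[model] >= 0 else None)
        for model in counts
    }


summary = summarize(ALL_RUNS)
for model, (n, furthest) in summary.items():
    print(f"{model}: {n} run(s), furthest checkpoint: {furthest or 'none'}")
```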