Run Archive

All benchmark runs, newest first. Download raw data for any run.

Date | Scenario Set | Models | Scenarios | Status | Data
2026-05-11 | latest2.12358 | complete | scored.json, report.md