SpatialBench was released as a suite of 146 verifiable spatial-biology problems drawn from real workflows to evaluate agentic tools in biology. The benchmark spans five platforms (MERFISH, Seeker, 10x Visium, Xenium, Atlasxomics) and seven task categories, with each problem set at a real decision point in an analysis pipeline. Its developers argue that SpatialBench closes the demo-to-deployment gap by grading tools on verifiable outcomes rather than curated examples. The resource pressures model builders to handle messy images and count matrices and to produce reproducible code, key capabilities for automated analysis in modern spatial assays.