SpatialBench introduced a suite of 146 verifiable problems drawn from real spatial biology workflows to benchmark agents that analyze spatial transcriptomics and imaging data. The authors argue that current agent demos overfit to toy tasks, and that SpatialBench's problems, which span MERFISH, Visium, Xenium, and other platforms, demand end-to-end data handling, coding, and reproducible biological answers. The benchmark gives labs and vendors a concrete yardstick for evaluating AI agents on spatial biology analysis. For context, spatial assays measure molecular patterns across intact tissue sections, so analyzing them requires combining image processing with expression-matrix analysis.
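To make the "combined image and matrix analysis" point concrete, the sketch below loads a 10x Visium dataset with scanpy's `read_visium` loader and inspects both sides of the data. It is purely illustrative and not drawn from SpatialBench; the dataset path is a placeholder.

```python
# Illustrative sketch (not SpatialBench code): a Visium sample bundles a
# gene-by-spot expression matrix with per-spot coordinates and a tissue image.
import scanpy as sc

# Placeholder path to a 10x Space Ranger output directory
adata = sc.read_visium("path/to/visium_output")
adata.var_names_make_unique()

# The "matrix" side: spots x genes expression counts
print(adata.shape)

# The "image" side: spot coordinates plus the underlying tissue image
print(adata.obsm["spatial"][:5])                      # x/y positions of spots
library_id = list(adata.uns["spatial"])[0]
img = adata.uns["spatial"][library_id]["images"]["hires"]
print(img.shape)                                      # H x W x 3 image array
```

Any end-to-end analysis, and hence any agent evaluated on such tasks, has to reason jointly over both representations, which is what distinguishes these workflows from matrix-only single-cell analysis.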