SpatialBench is a benchmark suite of 146 verifiable problems drawn from authentic spatial biology workflows, designed to evaluate agent-style models on spatial assay analysis. The benchmark spans multiple platforms (MERFISH, Visium, Seeker, Xenium, DBiT-seq) and tasks such as quality control (QC), clustering, and cell typing, aiming to close the demo-to-deployment gap for computational agents in biology. The authors built the tasks from snapshots of real analyses and stress-tested them against shortcut solutions, so that scores reflect substantive analytical ability rather than pattern matching. SpatialBench gives tool developers, lab bioinformaticians, and funders a quantifiable baseline for prioritizing models that produce verifiable, reproducible biology.
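To make the notion of a "verifiable problem" concrete, here is a minimal sketch of how such a task might be represented and scored. The `Task` schema, the `check_answer` function, and all example values are hypothetical illustrations under assumed conventions, not SpatialBench's actual format or API.

```python
# Hypothetical sketch of a verifiable benchmark task; not SpatialBench's
# real schema or scoring code. All names and values are illustrative.
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    platform: str   # e.g. "MERFISH", "Visium", "Xenium"
    category: str   # e.g. "qc", "clustering", "cell_typing"
    prompt: str     # analysis question posed to the agent
    expected: float # ground-truth value from the real analysis snapshot
    tolerance: float  # how close the agent's answer must be to count


def check_answer(task: Task, answer: float) -> bool:
    """Score an agent's numeric answer against the recorded ground truth."""
    return abs(answer - task.expected) <= task.tolerance


# Example: a QC task asking for the median genes detected per spot.
task = Task(
    task_id="qc-001",
    platform="Visium",
    category="qc",
    prompt="Report the median number of genes detected per spot after QC.",
    expected=2450.0,   # illustrative value only
    tolerance=10.0,
)
print(check_answer(task, 2448.0))  # True: within tolerance
```

Representing each task as a snapshot plus a machine-checkable expected answer is one plausible way a benchmark could verify results automatically rather than relying on human judgment of free-form output.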